- moving the interrupt handler callback traversal into a separate
function.
- using evt_iterate_bits() to scan through the pending bitfield
- removing cross-cpu pending actions - events recieved on the wrong
vcpu are re-routed via hypervisor_send_event().
- simplifying nested while() loops by encapsulating them in
equivalent functions.
Many thanks for multiple reviews by bouyer@ and jym@
unwanted sideeffects in the RB_ASKNAME case. This fixes PR/46732.
No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().
Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().
Make cpu_rootconf(9) describe the calling order.
avoids exposing the MD phys_to_machine/machine_to_phys tables directly.
Added:
- xpmap_ptom handles PFN (pseudo physical) to MFN (machine frame number)
translations, and is under control of the domain.
- xpmap_mtop is its counterpart (MFN to PFN), and is under control of
hypervisor.
xpmap_ptom_map() map a pseudo-phys address to a machine address
xpmap_ptom_unmap() unmap a pseudo-phys address (invalidation)
xpmap_ptom_isvalid() check for pseudo-phys address validity
The parameters are physical/machine addresses, like bus_dma/bus_space(9).
As x86 MFNs are tracked by u_long (Xen's choice) while machine addresses
can be 64 bits entities (PAE), use ptoa() to avoid truncation when bit
shifting by PAGE_SHIFT.
I kept the same namespace (xpmap_) to avoid code churn.
[1] http://mail-index.netbsd.org/port-xen/2009/05/09/msg004951.html
XXX will document ptoa/atop/trunc_page separately.
http://mail-index.netbsd.org/port-xen/2012/06/25/msg007431.html
The xen_p2m API comes next.
ok bouyer@.
Tested on i386 PAE and amd64 (Xen 3.3 on private test bed, and
Xen 3.4 for Amazon EC2).
FWIW, Amazon always reported:
hypervisor0 at mainbus0: Xen version 3.4.3-kaos_t1micro
multiple times for Europe and US West-1, so I guess they are now at
3.4 (32 and 64 bits).
save/restore.
For an unknown reason (to me) Xen refuses to update VM translations
when the entry is pointing back to itself (which is precisely
what our recursive VM model does). So enable the functions that take
care of this, which will avoid all sort of memory corruption upon restore
leading domU to trample upon itself.
Save/restore works again for amd64. The occasional domU frontend corruption is
still present, but is harmless to dom0. Now we have a working shell and
ddb inside domU, that helps debugging a tiny bit.
XXX pull-up to -6.
- cpu_load_pmap: use atomic kcpuset(9) operations; fixes rare crashes.
- Add kcpuset_copybits(9) and replace xen_kcpuset2bits(). Avoids incorrect
ncpu problem in early boot. Also, micro-optimises xen_mcast_invlpg() and
xen_mcast_tlbflush() routines.
Tested by chs@.
kcpuset(9) and thus replace hardcoded CPU bitmasks. This removes the
limitation of maximum CPUs.
- Support up to 256 CPUs on amd64 architecture by default.
Bug fixes, improvements, completion of Xen part and testing on 64-core
AMD Opteron(tm) Processor 6282 SE (also, as Xen HVM domU with 128 CPUs)
by Manuel Bouyer.
calls after pmap_{,un}map_recursive_entries() so that pmap's handlers
handle the flush themselves.
Now pmap_{,un}map_recursive_entries() do what their names imply, nothing more.
Fix pmap_xen_suspend()'s comment: APDPs are now gone.
pmap's handlers are called deep during kernel save/restore. We already
are at IPL_VM + kpreemption disabled. No need to wrap the xpq_flush_queue()
with splvm/splx.
(this include the physical->machine table).
(vaddr_t)(KERNBASE + NKL2_KIMG_ENTRIES * NBPD_L2) is after text+data+bss but,
on a domU with lots of RAM (more than 4GB) (so large
xpmap_phys_to_machine_mapping table) this can point to some of Xen's data
setup at bootstrap (either the xpmap_phys_to_machine_mapping table,
some page shared with the hypervisor, or our kernel page table). Using it for
early_zerop will cause of these pages to be unmapped after bootstrap.
This will cause a kernel page fault for the domU, either immediatly or
eventually much later, depending on where early_zerop points to.
To fix this, account for early_zerop when building the bootstrap pages,
and its VA from here.
May fix PR port-xen/38699
problem reported here but is part of the solution):
xen_kpm_sync() is not working as expected,
leading to races between CPUs.
1 the check (xpq_cpu != &x86_curcpu) is always false because we
have different x86_curcpu symbols with different addresses in the kernel.
Fortunably, all addresses dissaemble to the same code.
Because of this we always use the code intended for bootstrap, which doesn't
use cross-calls or lock.
2 once 1 above is fixed, xen_kpm_sync() will use xcalls to sync other CPUs,
which cause it to sleep and pmap.c doesn't like that. It triggers this
KASSERT() in pmap_unmap_ptes():
KASSERT(pmap->pm_ncsw == curlwp->l_ncsw);
3 pmap->pm_cpus is not safe for the purpose of xen_kpm_sync(), which
needs to know on which CPU a pmap is loaded *now*:
pmap->pm_cpus is cleared before cpu_load_pmap() is called to switch
to a new pmap, leaving a window where a pmap is still in a CPU's
ci_kpm_pdir but not in pm_cpus. As a virtual CPU may be preempted
by the hypervisor at any time, it can be large enough to let another
CPU free the PTP and reuse it as a normal page.
To fix 2), avoid cross-calls and IPIs completely, and instead
use a mutex to update all CPU's ci_kpm_pdir from the local CPU.
It's safe because we just need to update the table page, a tlbflush IPI will
happen later. As a side effect, we don't need a different code for bootstrap,
fixing 1). The mutex added to struct cpu needs a small headers reorganisation.
to fix 3), introduce a pm_xen_ptp_cpus which is updated from
cpu_pmap_load(), whith the ci_kpm_mtx mutex held. Checking it with
ci_kpm_mtx held will avoid overwriting the wrong pmap's ci_kpm_pdir.
While there I removed the unused pmap_is_active() function;
and added some more details to DIAGNOSTIC panics.
point in enabling them.
Avoids:
- a warning logged by hypervisor when a domain attempts to modify the PAT
MSR.
- an error during domain resuming, where a PAT flag has been set on a page
while the hypervisor does not allow it.
ok releng@.
Update ci->ci_kpm_pdir from user pmap, not global pmap_kernel() entry which may get clobbered by other CPUs.
XXX: Look into why we use pmap_kernel() userspace entries at all.