to restore the spl (shouldn't happen, but ...), we could end up with a
higther pending ipl set and never cleared because iplbit already
handled this level. while (iplmask != 0) {} would then never be true,
and we'd end up triggering KASSERT(iplbit != 0).
Now print the faultly handler and reset the IPL, and start the whole
pending IPL handling again if needed.
affinity (cpu_lock protects these operations now).
- Disallow setting of state of CPU to to offline, if there are bound LWPs,
which have no CPU to migrate.
- Disallow setting of affinity for the LWP(s), if all CPUs in the dynamic
CPU-set are offline.
- sched_setaffinity: fix invalid check of kcpuset_isset().
- Rename cpu_setonline() to cpu_setstate().
Should fix PR/39349.
- add BEST_SUSPENDED macro, the xennet equivalent of BLKIF_STATE_SUSPENDED
for xbd (indicate that backend is in suspended state)
- use GNTST_okay to check the grant table setup status returned by hypercall,
rather than 0
No functional changes.
pseudo-physical and machine frame numbers.
- add HYPERVISOR_crash() for i386 and amd64. Intended to be used by a domain
to notify Xen that it crashed on purpose, and request a dump (if applicable).
No functional changes intended.
Reviewed by Christoph (cegger@).
- change unbind_[pv]irq_from_evtch() so that they now return the event
channel the [PV]IRQ was bound to. It reflects the opposite behaviour of the
bind_[pv]irq_to_evtch() functions.
- remove xenbus_suspend() and xenbus_resume() prototypes, as they are not
used anywhere else, and will conflict with the xenbus pmf(9) handlers.
- make start_info aligned on a page boundary, as Xen expects it to be so.
- mask event channel during xbd detach before removing its handler (can
avoid spurious events).
- add the "protocol" entry in xenstore during xbd initialization. Normally
created during domU's boot by xentools, it is under domU's responsibility
in all other cases (save/restore, hot plugging, etc.).
- modifications to xs_init(), so that it can properly return an error.
Reviewed by Christoph (cegger@).
- fix and add comments
- make some panic/error messages more relevant
- remove last '\n' in DPRINTK() macros, not required as it is already part of format string.
No functional changes.
but use the map's protection bits, as the hypervisor may refuse read/write
mappings for some entries. Now suspend/resume of domUs should work
from a NetBSD dom0, provided that the domU's kernel supports it.
From Jean-Yves Migeon.
> xm list
Domain-Unnamed 1 467 1 ---s-- 46.0
The root cause is a discrepancy in the error *value* codes:
BSD uses the AT&T Unix Version 6 error codes, while Xen uses
Unix System V error codes (or actually what Linux/i386 has taken over from it).
After shutting down (or rebooting) a domU, the guest container gets destroyed.
This implies freeing resources used by the guest (RAM, internal management structures, etc.).
The destroy process is an asynchronous process in order to not block the Dom0 (and other DomUs).
The destroy process works this way:
The XEN_DOMCTL_destroydomain is invoked from the xentools (python, libxc code).
XEN_DOMCTL_destroydomain hypercall calls domain_kill().
domain_kill() calls domain_relinquish_resources().
domain_relinquish_resources() calls relinquish_memory().
relinquish_memory() calls hypercall_preempt_check().
hypercall_preempt_check() makes all this asynchronous.
It fails, if there's an other hypercall pending.
In that case relinquish_memory() returns EAGAIN, which means, just retry to continue the destroy process.
EAGAIN is passed through the return path back into the python code
(= userspace). The python code checks for EAGAIN and *should*
retry, but it didn't.
In Unix System V / Linux, EAGAIN has the error code value 11.
In BSD, EAGAIN has the error code value 35 and EDEADLK has the error code value 11.
This means, Xen returning EAGAIN means for the python code EDEADLK.
This lead to the confusing error message 'domain destroy failed due to Resource Deadlock avoided'.
We finally convert the error code from the Xen hypercall to BSD before passing it upstream.
Since we have to treat 0x0 as a valid page, this got broken in rev. 1.26.
Introduce INVALID_PAGE as magic value and restore the check.
This unbreaks IOCTL_PRIVCMD_MMAPBATCH while allowing to launch HVM guests.
hvm: Avoid need for ugly setcpucontext() in HVM domain builder by
pre-setting the vcpu0 to runnable inside Xen, and have the builder
insert a JMP instruction to reach the hvmloader entry point from
address 0x0.
So we have to treat guest physical address 0x0 like every one
or we end in a page fault loop when launching a HVM guest, otherwise.
XXX Keep this for Xen2 as this change hasn't been tested there.
for ioapics and faked up for others. Add it to "struct ioapic_softc"
for now, until device/softc get split.
This required all typecasts between "struct pic" and "struct ioapic_softc"
to be replaced, I hope I got them all.
functionally tested on i386, compile-tested on xen, untested on amd64
callbacks are always called in decreasing IPL order. Although it's not
strictly a bug, it makes the code easier to read, and avoids processing
the whole IPL range several times.
Select the Xen idle routine for Xen, mwait if supported by the CPU and
it is not AMD and halt otherwise. As reported by Christoph Egger,
AMD Barcelona keeps the CPU in C0 state with MWAIT, contrary to HLT,
which uses C1 and therefore much less power.
Important note: This does not break backward-compatibility. It is still possible to run on Xen 3.1.4.
Hint for developers: Use the xen_version hypercall to determine at runtime if a new hypercall will work. Also check the hypercall return code.
Tested by me and bouyer. OK bouyer.
in cpu_attach(), and doing it here will overwrite the cpu_number of the
physical CPU with the one from the virtual CPU (which is always 0).
XXX is ioapic_bsp_id read somewhere ?
here will reenable interrupts as a side effect, and we don't want it here.
It could cause some event handlers to run twice (which should be harmless),
and trap() to be called on the wrong LWP in doreti_checkast (which can
probably cause some damage).
- Add a lot of missing selinit() and seldestroy() calls.
- Merge selwakeup() and selnotify() calls into a single selnotify().
- Add an additional 'events' argument to selnotify() call. It will
indicate which event (POLL_IN, POLL_OUT, etc) happen. If unknown,
zero may be used.
Note: please pass appropriate value of 'events' where possible.
Proposed on: <tech-kern>
them in the mi "files" file, and remove include statements from md files.
These shouldn't pull in additional kernel code when not in use, so it
shouldn't do any harm except a risk of namespace collisions which
should be easy to fix.
on amd64). Make sure to use the right type to store and manipulate them.
This fixes amd64, where basically any event channel > 31 was not working
(and you get there after starting/stopping a domU a few times). Things
would occasionally unwedge though the spllower() callbacks.
Register a ipl callback for IPL_HIGH.
if the current ipl level is too high, just record the event in a bitmap,
and record IPL_HIGH as pending. The callback will process the pending events.
of xentools3.
XXX ignore those in Makefile.ioctl-c, they don't compile properly outside
of the Xen context and the ioctls from xenio.h conflicts with
soundcard.h
some hypercalls results are returned though the error path (depending on
sign, it's an error code or a result), and some results are interpreted in
a special way by the NetBSD kernel (e.g. -1).
Add a 'retval' member to the privcmd_hypercall_t argument, which holds
the hypercall result if it completed without error. The error code
is returned via the usual error path.
Handle the old IOCTL_PRIVCMD_HYPERCALL under COMPAT_40.
While there, make the double-inclusion protection #define match the
convention in xenio3.h and xenio.h.
- All three functions are included in the kernel by default.
They call a backend function cpu_in_cksum after possibly
computing the checksum of the pseudo header.
- cpu_in_cksum is the core to implement the one-complement sum.
The default implementation is moderate fast on most platforms
and provides a 32bit accumulator with 16bit addends for L32 platforms
and a 64bit accumulator with 32bit addends for L64 platforms.
It handles edge cases like very large mbuf chains (could happen with
native IPv6 in the future) and provides a good base for new native
implementations.
- Modify i386 and amd64 assembly to use the new interface.
This disables the MD implementations on !x86 until the conversion is
done. For Alpha, the portable version is faster.
(domU only). PAE support is enabled by 'options PAE', see the new XEN3PAE_DOMU
and INSTALL_XEN3PAE_DOMU kernel config files.
See the comments in arch/i386/include/{pte.h,pmap.h} to see how it works.
In short, we still handle it as a 2-level MMU, with the second level page
directory being 4 pages in size. pmap switching is done by switching the
L2 pages in the L3 entries, instead of loading %cr3. This is almost required
by Xen, which handle the last L2 page (the one mapping 0xc0000000 - 0xffffffff)
in a very special way. But this approach should also work for native PAE
support if ever supported (in fact, the pmap should almost suport native
PAE, what's missing is bootstrap code in locore.S).
pmap_bootstrap()/init386() wants to map a few additionnal things after
first_avail that we didn't account for, before pmap_growkernel() is
used/functionnal, and if the loaded kernel is close to the end of
the last L2 slot we loose. Should fix port-xen/37761 by YAMAMOTO Takashi.
Fix a XENPRINTF() so that low debug builds again.
branch is still active and will see i386PAE support developement).
Sumary of changes:
- switch xeni386 to the x86/x86/pmap.c, and the xen/x86/x86_xpmap.c
pmap bootstrap.
- merge back most of xen/i386/ to i386/i386
- change the build to reduce diffs between i386 and amd64 in file locations
- remove include files that were identical to the i386/amd64 counterparts,
the build will find them via the xen-ma/machine link.
- make tss per-cpu. this considerably speeds up context switch for,
at least, pentium4, where ltr instruction seems very slow.
i386, xen:
- kill cpu_maxproc.
kvm86:
- adapt to per-cpu tss.
- cleanup and simplify.
- move kvm86_mp_lock to more meaningful place.
- disable preemption during a call.
- introduce a function to reserve an idt entry and use it instead of
manipulating idt_allocmap directly.
- rename idt to xen_idt for amd64 xen. add missing #ifdef XEN.
ACPI wakeup code and teach it how to start the APs again. As a side
effect the CPU_START interface allows choosing between different
bootstrap codes more easily now.
- Reduce available SPL levels for hardware devices to none, vm, sched, high.
- Acquire kernel_lock only for interrupts at IPL_VM.
- Implement threaded soft interrupts.
the process page table, properly uvm_unmap1() the VA range and enter a new
uvm_map backed by an object. There is a know race between uvm_unmap1() and
uvm_map(), where another thread could get our VA range; in practice
I think none of the softwares using this interface are multithreaded (or at
last they are single-threaded when using it).