TPR change. This is supported on all Intel CPUs, and not-too-old AMD CPUs.
The reason for wanting this option is that certain OSes (like Win10 64bit)
manage interrupt priority in hardware via CR8 directly, and for these OSes,
the emulator may want to sync its internal TPR state on each change.
Add two new fields in cap.arch, to report the conf capabilities. Report TPR
only on Intel for now, not AMD, because I don't have a recent AMD CPU on
which to test.
- Sync the naming with reality.
- Replace "relevant" by "desired" and "virtualizer" by "emulator", closer
to what I meant.
- Add a "VCPU Configuration" section.
- Add a "Machine Ownership" section.
For some reason I had initially concluded that it wasn't doable; verily it
is, so let's do it.
The reserved 'flags' argument of nvmm_gpa_map() becomes 'prot' and takes
mmap-like protection codes.
the exit structure provided by the kernel. This saves an MMU translation,
and sometimes complex address computation (eg SIB).
Drop the GVA field, it is not useful to virtualizers.
Kernel driver:
* Don't take an extra (unneeded) reference to the UAO.
* Provide npc for HLT. I'm not really happy with it right now, will
likely be revisited.
* Add the INT_SHADOW, INT_WINDOW_EXIT and NMI_WINDOW_EXIT states. Provide
them in the exitstate too.
* Don't take the TPR into account when processing INTs. The virtualizer
can do that itself (Qemu already does).
* Provide a hypervisor signature in CPUID, and hide SVM.
* Ignore certain MSRs. One special case is MSR_NB_CFG in which we set
NB_CFG_INITAPICCPUIDLO. Allow reads of MSR_TSC.
* If the LWP has pending signals or softints, leave, rather than waiting
for a rescheduling to happen later. This reduces interrupt processing
time in the guest (Qemu sends a signal to the thread, and now we leave
right away). This could be improved even more by sending an actual IPI
to the CPU, but I'll see later.
Libnvmm:
* Fix the MMU translation of large pages, we need to add the lower bits
too.
* Change the IO and Mem structures to take a pointer rather than a
static array. This provides more flexibility.
* Batch together the str+rep IO transactions. We do one big memory
read/write, and then send the IO commands to the hypervisor all at
once. This considerably increases performance.
* Decode MOVZX.
With these changes in place, Qemu+NVMM works. I can install NetBSD 8.0
in a VM with multiple VCPUs, connect to the network, etc.
* Change the Assist API. Rather than passing callbacks in each call, the
callbacks are now registered beforehand. Then change the I/O Assist to
fetch MMIO data via the Mem callback. This allows a guest to perform an
I/O string operation on a memory that is itself an MMIO.
* Introduce two new functions internal to libnvmm, read_guest_memory and
write_guest_memory. They can handle mapped memory, MMIO memory and
cross-page transactions.
* Allow nvmm_gva_to_gpa and nvmm_gpa_to_hva to take non-page-aligned
addresses. This simplifies a lot of things.
* Support the MOVS instruction, and add a test for it. This instruction
is special, in that it takes two implicit memory operands. In
particular, it means that the two buffers can both be in MMIO memory,
and we handle this case.
* Fix gross copy-pasto in nvmm_hva_unmap. Also fix a few things here and
there.
Until now, the "owner" of the memory was the guest, and by calling
nvmm_gpa_map(), the virtualizer was creating a view towards the guest
memory.
Qemu expects the contrary: it wants the owner to be the virtualizer, and
nvmm_gpa_map should just create a view from the guest towards the
virtualizer's address space. Under this scheme, it is legal to have two
GPAs that point to the same HVA.
Introduce nvmm_hva_map() and nvmm_hva_unmap(), that map/unamp the HVA into
a dedicated UOBJ. Change nvmm_gpa_map() and nvmm_gpa_unmap() to just
perform an enter into the desired UOBJ.
With this change in place, all the mapping-related problems in Qemu+NVMM
are fixed.
and smallkern, there is little interest installing them by default,
rather they can be downloaded from www. It's better this way.
While here add NVMM(4) in "SEE ALSO".
software to effortlessly create and manage virtual machines via NVMM.
It is mostly complete, only nvmm_assist_mem needs to be filled -- I have
a draft for that, but it needs some more care. This Mem Assist should
not be needed when emulating a system in x2apic mode, so theoretically
the current form of libnvmm is sufficient to emulate a whole class of
systems.
Generally speaking, there are so many modes in x86 that it is difficult
to handle each corner case without introducing a ton of checks that just
slow down the common-case execution. Currently we check a limited number
of things; we may add more checks in the future if they turn out to be
needed, but that's rather low priority.
Libnvmm is compiled and installed only on amd64. A man page (reviewed by
wiz@) is provided.