Pass -DACPI_MISALIGNMENT_NOT_SUPPORTED under kUBSan enabled. This option
is dedicated for alignment sensitive CPUs in acpica. It was originally
designed for Itanium CPUs, but nowadays it's wanted for aarch64 as well.
Define it in acpica code under kUBSan in order to pacify Undefined Behavior
reports on all ports (in particular x86). The number of reports is now
halved with this patch applied. The remaining alignment alarms in acpica
will be addressed in future.
Patch contributed by <Akul Pillai>
VMs on Intel CPUs. Overall this implementation is fast and reliable, I am
able to run NetBSD VMs with many VCPUs on a quad-core Intel i5.
NVMM-Intel applies several optimizations already present in NVMM-AMD, and
has a code structure similar to it. No change was needed in the NVMM MI
frontend, or in libnvmm.
Some differences exist against AMD:
- On Intel the ASID space is big, so we don't fall back to a shared ASID
when there are more VCPUs executing than available ASIDs in the host,
contrary to AMD. There are enough ASIDs for the maximum number of VCPUs
supported by NVMM.
- On Intel there are two TLBs we need to take care of, one for the host
(EPT) and one for the guest (VPID). Changes in EPT paging flush the
host TLB, changes to the guest mode flush the guest TLB.
- On Intel there is no easy way to set/fetch the VTPR, so we intercept
reads/writes to CR8 and maintain a software TPR, that we give to the
virtualizer as if it was the effective TPR in the guest.
- On Intel, because of SVS, the host CR4 and LSTAR are not static, so
we're forced to save them on each VMENTRY.
- There is extra Intel weirdness we need to take care of, for example the
reserved bits in CR0 and CR4 when accesses trap.
While this implementation is functional and can already run many OSes, we
likely have a problem on 32bit-PAE guests, because they require special
care on Intel CPUs, and currently we don't handle that correctly; such
guests may misbehave for now (without altering the host stability). I
expect to fix that soon.
the three event types available on AMD, but Intel has seven of them, all
with weird and twisted meanings, and they require extra parameters.
Software interrupts should not be used anyway.
The idea is that under NVMM, we don't want to implement the hypervisor page
tables manually in NVMM directly, because we want pageable guests; that is,
we want to allow UVM to unmap guest pages when the host comes under
pressure.
Contrary to AMD-SVM, Intel-VMX uses a different set of PTE bits from
native, and this has three important consequences:
- We can't use the native PTE bits, so each time we want to modify the
page tables, we need to know whether we're dealing with a native pmap
or an EPT pmap. This is accomplished with callbacks, that handle
everything PTE-related.
- There is no recursive slot possible, so we can't use pmap_map_ptes().
Rather, we walk down the EPT trees via the direct map, and that's
actually a lot simpler (and probably faster too...).
- The kernel is never mapped in an EPT pmap. An EPT pmap cannot be loaded
on the host. This has two sub-consequences: at creation time we must
zero out all of the top-level PTEs, and at destruction time we force
the page out of the pool cache and into the pool, to ensure that a next
allocation will invoke pmap_pdp_ctor() to create a native pmap and not
recycle some stale EPT entries.
To create an EPT pmap, the caller must invoke pmap_ept_transform() on a
newly-allocated native pmap. And that's about it, from then on the EPT
callbacks will be invoked, and the pmap can be destroyed via the usual
pmap_destroy(). The TLB shootdown callback is not initialized however,
it is the responsibility of the hypervisor (NVMM) to set it.
There are some twisted cases that we need to handle. For example if
pmap_is_referenced() is called on a physical page that is entered both by
a native pmap and by an EPT pmap, we take the Accessed bits from the
two pmaps using different PTE sets in each case, and combine them into a
generic PP_ATTRS_U flag (that does not depend on the pmap type).
Given that the EPT layout is a 4-Level tree with the same address space as
native x86_64, we allow ourselves to use a few native macros in EPT, such
as pmap_pa2pte(), rather than re-defining them with "ept" in the name.
Even though this EPT code is rather complex, it is not too intrusive: just
a few callbacks in a few pmap functions, predicted-false to give priority
to native. So this comes with no messy #ifdef or performance cost.
run without a XEN domain loader having previously overwritten the
hypercall page with its hypercall trampoline machine code, we still
get to detect its presence by calling the xen_version hypercall stub.
We use this hack to detect the presence or absence of the hypervisor,
without relying on the MSR support on HVM domains.
This works as an added sanity check that the hypercall page
registration has indeed succeeded in HVM mode.
29 November 2018: Wouter
- Tag for 4.1.26rc1.
27 November 2018: Wouter
- Fix parsezone failure in 4194 fix.
26 November 2018: Wouter
- Fix to not set GLOB_NOSORT so the nsd.conf include: files are
sorted and in a predictable order.
- Added nsd-control changezone. nsd-control changezone name pattern
allows the change of a zone pattern option without downtime for
the zone, in one operation.
- Fix#3433: document that reconfig does not change per-zone stats.
20 November 2018: Wouter
- Fix#4205: enable-recvmmsg in mixed IPv4/IPv6 environment fails.
This sets the msg_hdr.msg_namelen correctly after receipt.
19 November 2018: Wouter
- Support SO_REUSEPORT_LB in FreeBSD 12 with the reuseport: yes
option in nsd.conf.
- Fix#4202: nsd-control delzone incorrect exit code on error.
- Tab style fix to use tab for 8 spaces, from Xiaobo Liu.
25 October 2018: Wouter
- Adjust dnstap socket path for chroot.
22 October 2018: Wouter
- Fix#4194: Zone file parser derailed by non-FQDN names in RHS of
DNSSEC RRs.
- Fix some more, neater code and checks for domain length limit.
- check that the dnstap socket file can be opened and exists, print
error if not.
4 October 2018: Wouter
- dnstap work, the dnstap.proto is a copy of the file from Unbound,
also dnstap.m4 configure include file.
- dnstap collector: free eventbase and memclean nicer.
- dnstap collector: send data and read it in collector.
- dnstap/dnstap.c and .h from Unbound's contribution from
Farsight Security, added to then adapt it for dnstap logging in NSD.
- dnstap.c with auth query and auth response, and called from
the collector.
- dnstap work, config nsd.conf parse.
- dnstap example config.
25 September 2018: Wouter
- NSD 4.1.25 released, trunk has 4.1.26 in development.
18 September 2018: Wouter
- tag for NSD 4.1.25rc1.
17 September 2018: Wouter
- Fix#4156: Fix systemd service manager state change notification
14 September 2018: Wouter
- Remove unused if clause during server service startup.
13 September 2018: Wouter
- Fix typo in clang analysis test.
- Annotate exit functions with noreturn.
- nsd-control prints neater errors for file failures.
12 September 2018: Wouter
- clang analysis test.
11 September 2018: Wouter
- Fix to combine the same error function into one, from Xiaobo Liu.
- Fix initialisation in remote.c.
- please clang analyzer and fix parse of IPSECKEY with bad gateway.
- Fix unit test code for clang analyzer.
- Fix nsd-checkconf fail on bad zone name.
10 September 2018: Wouter
- Fix coding style in nsd.c
7 September 2018: Wouter
- append_trailing_slash has one implementation and is not repeated
differently.
4 September 2018: Wouter
- Fix codingstyle in nsd-checkconf.c in patch from Sharp Liu.
15 August 2018: Wouter
- Fix use_systemd typo/leftover in remote.c.
VMENTRY, so clear it ourselves, to avoid uselessly flushing the guest
TLB. While here also fix the processing of EFER-induced flushes, they
shouldn't be delayed.
final file marks by opening and immediately closing the device
in O_WRONLY mode. That code has not been working since around 1998.
It can now be enabled with options ST_SUNCOMPAT.
Fix inconsistent/incomplete file mark handling to conform again
to mtio(4) at close(2) time. This was necessary as the PREVENT/ALLOW
bracket was reduced from a whole mount session to cover only the
open(2)/close(2) time on ~2002-03-22. The rationale was to allow
robots and humans to change the media during a mount session.
Unfortunately this lead to file marks being written to potentially other
media at the beginning on drives that used the two file marks as EOM
pattern. In order for that to happen the media had to be removed after
data and at most one file mark had been written before removal.
The mount error message has been clarified and a warning about
potential data/file mark lossage on UNIT ATTENTION
during an active mount session with unfinished file marks has been
added.
While there, fix, but disable the commented SUN compatibility to write
final file marks by opening and immediately closing the device
in O_WRONLY mode. That code has not been working since around 1998.
It can now be enabled with options ST_SUNCOMPAT.
Additionally debug output coverage has been extended.
- nvpair_create_stringv: free the temporary string; this fix affects
nvlist_add_stringf() and nvlist_add_stringv().
- nvpair_remove_nvlist_array (NV_TYPE_NVLIST_ARRAY case): free the chain
of nvpairs (as resetting it prevents nvlist_destroy() from freeing it).
Note: freeing the chain in nvlist_destroy() is not sufficient, because
it would still leak through nvlist_take_nvlist_array(). This affects
all nvlist_*_nvlist_array() users.
Found by clang/gcc ASAN. These fixes have been contributed to the
upstream (FreeBSD) repository.
on sparc and sparc64, don't remove .eh_frame section. it leads
to failure as something is referenced, and objcopy ends up
emitting a broken binary that can't be run -- it attempts to
load at va=0, beyond having missing referenced data.
also, on sparc64 also don't remove .note.netbsd.mcmodel.
the former should be revised when we can avoid it.
subsystem into a separate namespace where it can co-exist with the
native equivalent in PVHVM mode.
On PV, we alias and export the native symbols - this means that
although the namespace is different, the semantics must be identical.
Eg: xen_intr_establish_xname() vs. intr_establish_xname().
The specific functions we need in PVHVM are:
- spllower, xen_spllower (for native as well as XEN event spl
despatch/defer)
- xen_disable_intr()/xen_enable_intr() ,
x86_disable_intr()/x86_enable_intr()
- xen_read_psl()/xen_write_psl(),
x86_read_psl()/x86_write_psl()
- intr_establish() et. al, xen_intr_establish() et. al.
This gives us the ability to manage Paravirtualised drivers such as
xbd(4) as well as fully emulated ones such as wd(4)., for eg