argument. Use this and replace the inline assembly (mul + div using the
64bit intermediate result) with normal 32bit multiplication and
division. The compiler can turn the division into a multiplication and
shift, making it even cheaper then the original assembly. For extreme
long delays, just use 64bit arithmetic.
- reduce differences between amd64 and i386. notably, share pmap.c
between them. it makes several i386 pmap improvements available to
amd64, including tlb shootdown reduction and bug fixes from Stephan Uphoff.
- implement deferred pmap switching for amd64.
- remove LARGEPAGES option. always use large pages if available.
also, make it work on amd64.
This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.
TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.
NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
- Replace most inline assembly with proper functions. As a side effect
this reduces the size of amd64 GENERIC by about 120kB, and i386 by a
smaller amount. Nearly all of the inlines did something slow, or something
that does not need to be fast.
- Make curcpu() and curlwp functions proper, unless __GNUC__ && _KERNEL.
In that case make them inlines. Makes curlwp LKM and preemption safe.
- Make bus_space and bus_dma more LKM friendly.
- Share a few more files between the ports.
- Other minor changes.
This removes a hard limit that would prevent a guest from running with more
than 7 virtual network interface.
If running on a hypervisor that doesn't support GNTTABOP_query_size, fall back
to a 4-pages grant table, so this change is backward-compatible.
interrupt stuff. This is mostly comprised of changes to the pmap modules to
work on multiprocessor systems without kernel_lock, and changes to speed up
tlb shootdowns.
from doc/BRANCHES:
idle lwp, and some changes depending on it.
1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.
Seperate "cpubus" and "ioapicbus" -- while they share a common "address
space" (the apic id), the kernel doesn't use this fact. There are different
data passed to cpus and apics, which caused some ugly polymorphism. This
also saves the special "submatch" functions needed to distingush cpus
and ioapics for autoconf. (And it makes that "apid" locators wired
in the kernel configuration are honored now; this allows one to dumb down
an mp box to singleprocessor by userconfig.)
Print "apid" locators in the buses "print" function "as everyone does",
so the per-port cpu drivers don't need to do it.
Being here, constify "struct cpu_functions" and g/c the unused MP_PICMODE
flag.
- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.
so that our kernels works with newer xen-3 hypervisors; and correct the value
of VIRT_BASE for dom0.
Now that we can embed the values of KERNBASE and KERNTEXTOFF in the binary
for Xen, make the domU memory layout the same as dom0 for Xen3 (making
it the other way round doens't work; probably because of alignement
constraints in the hypervisor). The old domU layout is used if options
XEN_COMPAT_030001 is present in the kernel config file. Enable this the
domU kernel config files for now, in case someone wants to run a NetBSD
domU on an older Xen3 installation.
CPUs are now configured on mainbus only in dom0, and only to know about
their APIC id. virtual CPUs are attached to hypervisor as:
vcpu* at hypervisor?
and this is what's used as curcpu(). The kernel config files needs to be
updated for this, see XEN3_DOM0 or XEN3_DOMU for examples.
XEN3_DOM0 now has acpi, MPBIOS and ioapic by default.
Note that a Xen dom0 kernel doens't have access to the lapic.
Old hypercall call method still still available with
options XEN_NO_HYPERCALLPAGE
but this is disabled by default (xen-3.0.2-2 supports hypercall call page
just fine).
While there add a VIRT_BASE= string in __xen_guest section; from
Bastian Blank on port-xen@.
- Attempt to gracefully recover from a failed decrease_reservation or
increase_reservation, by avoiding physical memory loss.
- always store a machine address in ds_addr; this avoids some mistakes
where machine address would in some case be freed at physical address, or
mapped as physical address.
As we want some control on the name the backend driver will have we
can't use autoconf(9) here. Instead backend drivers registers to
xenbus, which will call a create callback when a new device is there.
Backend devices won't have a "struct device" in xenbus, use a void pointer
instead.
The rest of suspend/resume isn't there yet, but the parts that touch
regular usage have been tested. Discussed on port-xen on 2006-04-25;
approved by bouyer@.
- keep a linked list of xenbus_device in the xenbus_infrastructure, and
keep a pointer to struct device for each xenbus_device
- xenbus_probe_device_type(): check that the device is not already attached
- when we get a frontend_changed callback, call xenbus_probe_device_type()
- When a device changes to state XenbusStateClosed, config_detach() it
and free the structures.
While there, move xbusd_path[] to the end of struct xenbus_device, and
allocate only the space needed to store the path. Garbage-collect
struct xenbus_driver, it's not needed.
which fills in multicall arguments for __HYPERVISOR_update_va_mapping
and __HYPERVISOR_update_va_mapping_otherdomain dealing with
differences between i386 and amd64.
- kernel (both dom0 and domU) boot, console is functionnal and it can starts
software from a ramdisk
- there is no driver front-end expect console for domU yet.
- dom0 can probe devices and ex(4) work when Xen3 is booted without acpi
and apic support. But the on-board IDE doens't get interrupts.
The PCI code still needs work (it's hardcoded to mode 1). Some of this
code should be shared with ../x86
The physical insterrupt code needs to get MPBIOS and ACPI support, and
do interrupt routing to properly interract with Xen.
To enable Xen-3.0 support, add
options XEN3
to your kernel config file (this will disable Xen2 support)
Changes affecting Xen-2.0 support (no functionnal changes intended):
- get more constants from genassym for assembly code
- remove some unneeded registers move from start()
- map the shared info page from start(), and remove the pte = 0xffffffff hack
- vector.S: in hypervisor_callback() make sure %esi points to
HYPERVISOR_shared_info before accessing the info page. Remplace some
hand-written assembly with the equivalent macro defined in frameasm.h
- more debug code, dissabled by default.
while here added my copyright on some files I worked on in 2005.
- move copyin and friends from locore.S to their own file, copy.S.
share it between i386 and xen.
- defparam KERNBASE and kill KERNBASE_LOCORE hack.
- add more symbols to assym.h and use it where appropriate.
send queue. This give upper layer an opportunity to queue up all available
packets before starting to process them. This reduce the number of interrupt
generated on the backend, and the time spent doing hypercalls in a significant
way.
instead to always trying PG_RW and falling back to PG_RO if this fails.
Use uvm_map_checkprot() in IOCTL_PRIVCMD_MMAP and IOCTL_PRIVCMD_MMAP_BATCH
to compute the appropriate vm_prot_t for pmap_remap_pages().
Thanks to Jed Davis for pointing out uvm_map_checkprot().
- Define _BUS_AVAIL_END to 0xffffffff, as we don't have an easy way to
find the upper bound for our machine address space (and this can change
when we swap pages with the hypervisor).
- implement _xen_bus_dmamem_alloc_range(), which will request a contigous
set of pages to the hypervisor if the pages returned by uvm_pglistalloc()
don't fit the constraints.
We can't deal with the low/high constraints yet, because Xen doesn't offer a
way to get pages in a specific ranges of addresses.
Based on patches from Dave Thompson (in private mail), with heavy hacking
by me.
the generic pci_enumerate_bus() in Xen because the hypervisor may
hide the function 0, but give access to other functions of the same device
and pci_enumerate_bus() will stop if function 0 isn't available.
#define clockframe somethingelse
to:
struct clockframe {
struct somethingelse cf_se;
};
and change access macros accordingly.
That means that, at least for that very issue, things will not go
ka-boomy if you don't have the actual definition of struct clockframe
before including systm.h.
disks to other domains) from Jed Davis, <jld@panix.com>:
* Issue multiple requests when necessary rather than
assuming that arbitrary requests can be mapped into single
contiguous virtual address ranges.
* Don't assume that all data for a request is consecutive
in memory. With some client OSes, it's not.
The above two changes fix data corruption issues with Linux
clients with certain filesystem block sizes.
* Gracefully handle memory or pool allocation failures after
beginning to handle a request from the ring.
* Merge contiguous requests to avoid the "64K turns into 44K + 20K
and doubles the transactions per second at the disk" problem
caused by the 11-page limit caused by the structure of Xen
ring entries. This causes a very slight performance decrease
for sequential 64K I/O if the disk is not already saturated with
requests (about 1%) but halves the transactions per second we
hit the disk with -- or better. It even compensates for bizarre
Linux behaviour like breaking long requests up into 5.5K pieces.
* Probably some stuff I forgot to mention.
Disk throughput (though not latency) is now much, much closer to the
"raw hardware" case than it was before.