(domU only). PAE support is enabled by 'options PAE'; see the new XEN3PAE_DOMU
and INSTALL_XEN3PAE_DOMU kernel config files.
See the comments in arch/i386/include/{pte.h,pmap.h} for how it works.
In short, we still handle it as a 2-level MMU, with the second-level page
directory being 4 pages in size. pmap switching is done by switching the
L2 pages in the L3 entries, instead of loading %cr3. This is almost required
by Xen, which handles the last L2 page (the one mapping 0xc0000000 - 0xffffffff)
in a very special way. But this approach should also work for native PAE
support if that is ever added (in fact, the pmap almost supports native
PAE already; what's missing is bootstrap code in locore.S).
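
A minimal sketch of the switching scheme described above, assuming
illustrative structure and function names rather than the real i386/Xen pmap
code: each pmap owns four L2 (page directory) pages, and activating a pmap
rewrites the user L3 slots to point at them instead of reloading %cr3.

    #include <stdint.h>

    typedef uint64_t pd_entry_t;        /* PAE entries are 64-bit */
    typedef uint64_t paddr_t;

    #define PAE_L3_NENTRIES 4           /* 4 L3 slots, each -> one L2 page */
    #define PTE_P           0x1         /* "present" bit, illustrative */

    struct pae_pmap {
        /* physical addresses of this pmap's four L2 pages */
        paddr_t pm_l2_pa[PAE_L3_NENTRIES];
    };

    /*
     * Activate 'pm' on the CPU whose L3 table is 'l3'.  Only the first
     * three slots are per-process; the last slot maps the kernel range
     * (0xc0000000 - 0xffffffff), is shared by all pmaps, and is the part
     * Xen treats specially, so it is left alone here.
     */
    void
    pae_pmap_switch(volatile pd_entry_t *l3, const struct pae_pmap *pm)
    {
        for (int i = 0; i < PAE_L3_NENTRIES - 1; i++)
            l3[i] = pm->pm_l2_pa[i] | PTE_P;
        /* the TLB still has to be flushed after rewriting the L3 slots */
    }
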
- use a hash rather than SPLAY trees.
A SPLAY tree is the wrong data structure to use here; this will be
revisited if it slows down anything other than micro-benchmarks.
- optimize the single mapping case (it's the common case) by
embedding an entry into mdpage (see the sketch after this list).
- don't keep a pmap pointer as it can be obtained from the ptp.
(discussed on port-i386 some years ago.)
ideally, a single paddr_t should be enough to describe a pte,
but that needs some more thought as it can increase computational
cost.
- pmap_enter: simplify and fix races with pmap_sync_pv.
- don't bother to lock pm_obj[i] where i > 0, unless DIAGNOSTIC.
- kill mp_link to save space.
- add many KASSERTs.
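
A minimal sketch of the "embed an entry into mdpage" idea from the list
above; the structures, field names, and pv_alloc() are illustrative
assumptions, not the real pv-tracking code.  The point is that the common
single-mapping case needs no allocation, and only additional mappings fall
back to dynamically allocated entries.

    #include <stdbool.h>
    #include <stddef.h>

    struct pmap;                        /* opaque for this sketch */

    struct pv_entry {
        struct pmap     *pv_pmap;       /* pmap this mapping belongs to */
        unsigned long    pv_va;         /* virtual address of the mapping */
        struct pv_entry *pv_next;       /* overflow chain for extra mappings */
    };

    /* per-page machine-dependent data, one per physical page */
    struct mdpage {
        struct pv_entry  mdpg_first;    /* embedded entry for the common,
                                           single mapping: no allocation */
        bool             mdpg_used;     /* is the embedded entry in use? */
        struct pv_entry *mdpg_more;     /* rarely used overflow list */
    };

    struct pv_entry *pv_alloc(void);    /* hypothetical allocator */

    /* record a new mapping of the page; false if an allocation failed */
    bool
    pv_enter(struct mdpage *md, struct pmap *pm, unsigned long va)
    {
        if (!md->mdpg_used) {           /* fast path: embed in mdpage */
            md->mdpg_first.pv_pmap = pm;
            md->mdpg_first.pv_va = va;
            md->mdpg_used = true;
            return true;
        }
        struct pv_entry *pv = pv_alloc();   /* slow path: extra mapping */
        if (pv == NULL)
            return false;
        pv->pv_pmap = pm;
        pv->pv_va = va;
        pv->pv_next = md->mdpg_more;
        md->mdpg_more = pv;
        return true;
    }
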
branch is still active and will see i386 PAE support development).
Summary of changes:
- switch xeni386 to x86/x86/pmap.c and to the xen/x86/x86_xpmap.c
pmap bootstrap.
- merge back most of xen/i386/ to i386/i386
- change the build to reduce diffs between i386 and amd64 in file locations
- remove include files that were identical to the i386/amd64 counterparts;
the build will find them via the xen-ma/machine link.
- reduce differences between amd64 and i386. notably, share pmap.c
between them, which makes several i386 pmap improvements available to
amd64, including tlb shootdown reduction and bug fixes from Stephan Uphoff.
- implement deferred pmap switching for amd64 (sketched after this list).
- remove the LARGEPAGES option; always use large pages if available.
also, make this work on amd64.
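
A minimal sketch of what "deferred pmap switching" means here, assuming it
follows the i386-style scheme of marking the switch pending at context-switch
time and only reloading %cr3 on the way back to user mode (or before touching
user memory); the names are illustrative, not the real amd64 code.

    #include <stdbool.h>
    #include <stdint.h>

    struct pmap {
        uint64_t pm_cr3;            /* physical address of the top-level table */
    };

    /* per-CPU state (illustrative) */
    struct cpu_info {
        struct pmap *ci_curpmap;    /* pmap currently loaded in %cr3 */
        struct pmap *ci_wantpmap;   /* pmap the current lwp wants */
        bool         ci_want_pmapload;
    };

    /* hypothetical stand-in for loading %cr3 */
    static void
    lcr3(uint64_t pa)
    {
        (void)pa;
    }

    /*
     * Called at context switch: do not touch %cr3.  Switches to kernel
     * threads (and back) therefore never pay for a TLB flush.
     */
    void
    pmap_activate_deferred(struct cpu_info *ci, struct pmap *newpm)
    {
        ci->ci_wantpmap = newpm;
        ci->ci_want_pmapload = (newpm != ci->ci_curpmap);
    }

    /* Called on the return-to-user path: perform the deferred switch
     * only if one is actually pending. */
    void
    pmap_load_if_needed(struct cpu_info *ci)
    {
        if (ci->ci_want_pmapload) {
            lcr3(ci->ci_wantpmap->pm_cr3);
            ci->ci_curpmap = ci->ci_wantpmap;
            ci->ci_want_pmapload = false;
        }
    }
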
This branch was a major cleanup and rototill of many of the OEA-CPU-based
PPC ports, focused on sharing as much code as possible between the various
ports to eliminate near-identical copies of files in every tree.
Additionally, there is a new PIC system that unifies the interface to the
interrupt code across the different OEA ppc arches. The work for this
branch was done by a variety of people, too many to list here.
TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated.
amigappc was not attempted.
NOTES:
pmppc was removed as an arch and moved to an evbppc target.
interrupt stuff. This mostly consists of changes to the pmap modules so they
work on multiprocessor systems without kernel_lock, and changes to speed up
tlb shootdowns.
static limits to meet modern bloat requirements.
VM_PHYSSEG_MAX needs to be raised to run on Intel's D946GZIS motherboard, as
reported by rix on #NetBSD-code on freenode. This has a consequence for the
initial number of possible extent allocations for iomem_ex, so increase that
value too.
While there, clarify the action to be taken when VM_PHYSSEG_MAX is maxed
out.
Do that on both amd64 and i386 because the causes, the effects and the code
are mostly the same.
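
An illustrative sketch of why VM_PHYSSEG_MAX is a hard limit; the names and
the VM_PHYSSEG_MAX value below are placeholders, not the real uvm code.  The
physical segments handed to UVM at boot live in a fixed-size table, so a
firmware memory map with more usable ranges than the table can hold cannot be
fully described, hence the bump described above.

    #define VM_PHYSSEG_MAX  16              /* placeholder value */

    struct phys_seg {
        unsigned long start, end;           /* page frame numbers */
    };

    static struct phys_seg vm_physmem[VM_PHYSSEG_MAX];
    static int vm_nphysseg;

    /* register one physical memory range; -1 if the static table is full */
    int
    physload(unsigned long start, unsigned long end)
    {
        if (vm_nphysseg >= VM_PHYSSEG_MAX)
            return -1;                      /* the static limit bites here */
        vm_physmem[vm_nphysseg].start = start;
        vm_physmem[vm_nphysseg].end = end;
        vm_nphysseg++;
        return 0;
    }
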
start up without unlimit.
- Bump max data size from 2G to 3G. The actual space we are allowed to allocate
is somewhere between 2G and 3G, so trying to allocate above that will fail.
- Bump max stack size from 32M to 64M (both limits are sketched below).
Approved by fvdl
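
A minimal sketch of the two limits being bumped, using the traditional BSD
vmparam.h names MAXDSIZ and MAXSSIZ; the exact form of the real i386
definitions is not quoted here, only the values from the change above.

    #define MAXDSIZ  (3UL * 1024 * 1024 * 1024)  /* max data size: 3G (was 2G) */
    #define MAXSSIZ  (64 * 1024 * 1024)          /* max stack size: 64M (was 32M) */
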
and make the stack and heap non-executable by default. the changes
fall into two basic categories:
- pmap and trap-handler changes. these are all MD:
= alpha: we already track per-page execute permission with the (software)
PG_EXEC bit, so just have the trap handler pay attention to it.
= i386: use a new GDT segment for %cs for processes that have no
executable mappings above a certain threshold (currently the
bottom of the stack). track per-page execute permission with
the last unused PTE bit.
= powerpc/ibm4xx: just use the hardware exec bit.
= powerpc/oea: we already track per-page exec bits, but the hardware only
implements non-exec mappings at the segment level. so track the
number of executable mappings in each segment and turn on the no-exec
segment bit iff the count is 0. adjust the trap handler to deal (this
counting scheme is sketched below).
= sparc (sun4m): fix our use of the hardware protection bits.
fix the trap handler to recognize text faults.
= sparc64: split the existing unified TSB into data and instruction TSBs,
and only load TTEs into the appropriate TSB(s) for the permissions.
fix the trap handler to check for execute permission.
= not yet implemented: amd64, hppa, sh5
- changes in all the emulations that put a signal trampoline on the stack.
instead, we now put the trampoline into a uvm_aobj and map that into
the process separately.
originally from openbsd, adapted for netbsd by me.
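
A minimal sketch of the powerpc/oea counting idea described above, assuming
sixteen 256MB segments with a per-segment no-execute bit; the names and the
segment-register update are illustrative, not the real pmap code.

    #include <stdbool.h>
    #include <stdint.h>

    #define NSEGMENTS 16                   /* 4GB address space / 256MB segments */

    static unsigned int exec_count[NSEGMENTS];  /* executable mappings per segment */

    /* would set or clear the segment's no-execute bit; stubbed out here */
    static void
    set_segment_noexec(int seg, bool noexec)
    {
        (void)seg;
        (void)noexec;
    }

    /* call when an executable mapping is entered into a segment */
    void
    segment_exec_enter(uintptr_t va)
    {
        int seg = (va >> 28) & 0xf;        /* 256MB segment index */
        if (exec_count[seg]++ == 0)
            set_segment_noexec(seg, false);   /* first exec mapping: allow exec */
    }

    /* call when an executable mapping is removed from a segment */
    void
    segment_exec_remove(uintptr_t va)
    {
        int seg = (va >> 28) & 0xf;
        if (--exec_count[seg] == 0)
            set_segment_noexec(seg, true);    /* last one gone: no-exec again */
    }
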
topdown option) so that including it directly before including
uvm/uvm_param.h (or uvm/uvm_extern.h, which includes uvm/uvm_param.h)
and attempting to use topdown won't result in a compiler error.
Problem noted in private email.
means that the dynamic linker gets mapped in at the top of available
user virtual memory (typically just below the stack), shared libraries
get mapped downwards from that point, and calls to mmap() that don't
specify a preferred address will get mapped in below those.
This means that the heap and the mmap()ed allocations will grow
towards each other, allowing one or the other to grow larger than
before. Previously, the heap was limited to MAXDSIZ by the placement
of the dynamic linker (and the process's rlimits) and the space
available to mmap was hobbled by this reservation.
This is currently only enabled via an *option* for the i386 platform
(though other platforms are expected to follow). Add "options
USE_TOPDOWN_VM" to your kernel config file, rerun config, and rebuild
your kernel to take advantage of this.
Note that the pmap_prefer() interface has not yet been modified to
play nicely with this, so those platforms require a bit more work
(most notably the sparc) before they can use this new memory
arrangement.
This change also introduces a VM_DEFAULT_ADDRESS() macro that picks
the appropriate default address based on the size of the allocation or
the size of the process's text segment. Several drivers and the SYSV
SHM address assignment were changed to use this instead of each one
picking its own "default".
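
A minimal sketch of the topdown placement logic described above; the function
name, the constants, and the exact formula are illustrative assumptions, not
the actual VM_DEFAULT_ADDRESS() definition.

    #include <stdint.h>

    typedef uint32_t vaddr_t;
    typedef uint32_t vsize_t;

    #define VM_MAXUSER_ADDRESS  0xbfc00000u         /* top of user VA (example) */
    #define MAXSSIZ             (64u * 1024 * 1024) /* stack reservation */
    #define PAGE_SIZE           4096u

    #define round_page(x)   (((x) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

    /*
     * With topdown VM, an mmap() without a preferred address is hinted to
     * just below the stack reservation (where the dynamic linker and shared
     * libraries already live), so mappings grow downwards toward the heap.
     * Without it, the classic hint is just above the end of the data
     * segment, so mappings grow upwards.
     */
    vaddr_t
    default_mmap_hint(int topdown, vaddr_t data_end, vsize_t len)
    {
        if (topdown)
            return VM_MAXUSER_ADDRESS - MAXSSIZ - round_page(len);
        return round_page(data_end);
    }
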
to the user and adjust some comments (which were not accurate anyway
since NOREDZONE)
binary compatibility note: changing VM_MAXUSER_ADDRESS might influence
some sanity check in kvm_proc, where arguments on the stack are dealt
with, but it was a variable anyway and no one cared...
amount of kvm used for buffers was set at 70%, some 188M. Then
the total amount of kvm became 1G, and the amount for buffers
thus became some 716M. This is really too much, and some
device drivers want to map quite a bit of kvm these days.
So, cap it at 384M, which gives each buffer a little over 8k (the
default FFS blocksize) physical in a 1G physram configuration.
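
A minimal sketch of the capping policy described above; BUFCACHE_PCT and the
function are illustrative assumptions, not the actual sizing code.

    #include <stddef.h>

    #define MB(x)           ((size_t)(x) * 1024 * 1024)
    #define BUFCACHE_PCT    70              /* historical: 70% of kernel VM */
    #define BUFKVA_CAP      MB(384)         /* the new hard cap */

    /* size the buffer-cache kvm from the total kernel VM, then apply the cap */
    size_t
    bufkva_size(size_t kernel_vm)
    {
        size_t want = kernel_vm / 100 * BUFCACHE_PCT;
        return want > BUFKVA_CAP ? BUFKVA_CAP : want;
    }
    /* with 1G of kvm, 70% would be some 716M; the cap brings that back to 384M */
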
each vm_page structure. Add a VM_MDPAGE_INIT() macro to init this
data when pages are initialized by UVM. These macros are mandatory,
but ports may #define them to nothing if they are not needed/used.
This deprecates struct pmap_physseg. As a transitional measure,
allow a port to #define PMAP_PHYSSEG so that it can continue to
use it until its pmap is converted to use VM_MDPAGE_MEMBERS.
Use all this stuff to eliminate a lot of extra work in the Alpha
pmap module (it's smaller and faster now). Changes to other pmap
modules will follow.
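
A minimal sketch of the interface described above, with made-up field names;
a real port's header supplies its own members, or defines both macros to
nothing if it needs no per-page data.

    #include <stddef.h>

    struct pv_entry;                /* forward declaration for the example */

    /* expanded inside struct vm_page; the macro supplies its own ';' */
    #define VM_MDPAGE_MEMBERS                                         \
            struct {                                                  \
                    struct pv_entry *mdpg_pvhead;  /* pv list head */ \
                    int              mdpg_attrs;   /* ref/mod bits */ \
            } mdpage;

    /* invoked by UVM when it initializes each page */
    #define VM_MDPAGE_INIT(pg)                                        \
            do {                                                      \
                    (pg)->mdpage.mdpg_pvhead = NULL;                  \
                    (pg)->mdpage.mdpg_attrs = 0;                      \
            } while (/* CONSTCOND */ 0)

    /* machine-independent page structure picking up the MD members */
    struct vm_page {
        /* ... MI fields (queues, flags, physical address, ...) ... */
        VM_MDPAGE_MEMBERS
    };
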
of virtual address space, leaving userland with 3G, and update comments
to match the new reality.
We knew we were going to have to bite this bullet eventually, and there
are a couple of outstanding PRs related to this issue (9389 and 9313).
A complete solution to those PRs is going to involve some sort of run-time
decision on how large kmem_map should be, as well as changing some data
structure allocation strategies in UVM. However, this change will at
least allow the PR submitter to simply throw resources at the problem.
allocation strategy no longer works at all. Move pmap.new.* to pmap.*.
To read the revision history of PMAP_NEW up until this merge, use cvs
rlog of the old pmap.new.* files.
written by chuck cranor. thanks to mycroft for helping me find the
one little line of code i accidentally deleted while merging it.
this is not enabled by default. `options MACHINE_NEW_NONCONTIG'
will use this code. eventually, this should go into <machine/vmparam.h>
instead of MACHINE_NONCONTIG.