existing CPU power management implementations could peacefully coexist with
the acpicpu(4) driver. The following options can not be used with acpicpu(4):
ENHANCED_SPEEDSTEP, INTEL_ONDEMAND_CLOCKMOD, POWERNOW_K7, and POWERNOW_K8.
the acpicpu(4) driver should attach even if the existing frequency management
code fails to attach, mainly because ACPI is the only proper way to deal
with EST on new Intel system.
Use a more drastic hack to deal with this: when acpicpu(4) attachs, it tears
down any existing sysctl(8) controls and installs identical ones in place.
Upon detachment, the initialization function of the existing EST is called.
This patch is inspired by work previously done by Jeremy Morse, ported by me
to -current, merged with the work previously done for port-xen, together with
additionals fixes and improvements.
PAE option is disabled by default in GENERIC (but will be enabled in ALL in
the next few days).
In quick, PAE switches the CPU to a mode where physical addresses become
36 bits (64 GiB). Virtual address space remains at 32 bits (4 GiB). To cope
with the increased size of the physical address, they are manipulated as
64 bits variables by kernel and MMU.
When supported by the CPU, it also allows the use of the NX/XD bit that
provides no-execution right enforcement on a per physical page basis.
Notes:
- reworked locore.S
- introduce cpu_load_pmap(), used to switch pmap for the curcpu. Due to the
different handling of pmap mappings with PAE vs !PAE, Xen vs native, details
are hidden within this function. This helps calling it from assembly,
as some features, like BIOS calls, switch to pmap_kernel before mapping
trampoline code in low memory.
- some changes in bioscall and kvm86_call, to reflect the above.
- the L3 is "pinned" per-CPU, and is only manipulated by a
reduced set of functions within pmap. To track the L3, I added two
elements to struct cpu_info, namely ci_l3_pdirpa (PA of the L3), and
ci_l3_pdir (the L3 VA). Rest of the code considers that it runs "just
like" a normal i386, except that the L2 is 4 pages long (PTP_LEVELS is
still 2).
- similar to the ci_pae_l3_pdir{,pa} variables, amd64's xen_current_user_pgd
becomes an element of cpu_info (slowly paving the way for MP world).
- bootinfo_source struct declaration is modified, to cope with paddr_t size
change with PAE (it is not correct to assume that bs_addr is a paddr_t when
compiled with PAE - it should remain 32 bits). bs_addrs is now a
void * array (in bootloader's code under i386/stand/, the bs_addrs
is a physaddr_t, which is an unsigned long).
- fixes in multiboot code (same reason as bootinfo): paddr_t size
change. I used Elf32_* types, use RELOC() where necessary, and move the
memcpy() functions out of the if/else if (I do not expect sym and str tables
to overlap with ELF).
- 64 bits atomic functions for pmap
- all pmap_pdirpa access are now done through the pmap_pdirpa macro. It
hides the L3/L2 stuff from PAE, as well as the pm_pdirpa change in
struct pmap (it now becomes a PDP_SIZE array, with or without PAE).
- manipulation of recursive mappings ( PDIR_SLOT_{,A}PTEs ) is done via
loops on PDP_SIZE.
See also http://mail-index.netbsd.org/port-i386/2010/07/17/msg002062.html
No objection raised on port-i386@ and port-xen@R for about a week.
XXX kvm(3) will be fixed in another patch to properly handle both PAE and !PAE
kernel dumps (VA => PA macros are slightly different, and need proper 64 bits
PA support in kvm_i386).
XXX Mixing PAE and !PAE modules may lead to unwanted/unexpected results. This
cannot be solved easily, and needs lots of thinking before being declared
safe (paddr_t/bus_addr_t size handling, PD/PT macros abstractions).
Add MI flags PMAP_WRITE_COMBINE, PMAP_WRITE_BACK, PMAP_NOCACHE_OVR.
Update pmap(9) manpage.
hppa: Remove MD PMAP_NOCACHE flag as it exists as MI flag
mips: Rename MD PMAP_NOCACHE to PGC_NOCACHE.
x86: Implement new MI flags using Page-Attribute Tables.
x86: Implement BUS_SPACE_MAP_PREFETCHABLE.
Patch presented on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2010/06/30/msg008458.html
No comments on this last version.
we're ELF now, and there are many missing checks against OBJECT_FMT.
if we ever consider switching, the we can figure out what new ones
we need but for now it's just clutter.
this doesn't remove any of the support for exec_aout or any actually
required-for-boot a.out support, only the ability to build a netbsd
release in a.out format. ie, most of this code has been dead for
over a decade.
i've tested builds on vax, amd64, i386, mac68k, macppc, sparc, atari,
amiga, shark, cats, dreamcast, landisk, mmeye and x68k. this covers
the 5 MACHINE_ARCH's affected, and all the other arch code touched.
it also includes some actual run-time testing of sparc, i386 and
shark, and i performed binary comparison upon amiga and x68k as well.
some minor details relevant:
- move shlib.[ch] from ld.aout_so into ldconfig proper, and cut them
down to only the parts ldconfig needs
- remove various unused source files
- switch amiga bootblocks to using elf2bb.h instead of aout2bb.h
kernels, and use them in the bus_space(9) implementation instead of ugly
Xen #ifdef-age. In a non-Xen kernel, the _ma() functions either call or
alias the equivalent _pa() functions.
Reviewed on port-xen@netbsd.org and port-i386@netbsd.org. Passes
rmind@'s and bouyer@'s inspection. Tested on i386 and on Xen DOMU /
DOM0.
bus_space_tag. For now, bus_space_tag's only member is
bst_type, the type of space, which is either X86_BUS_SPACE_IO
or X86_BUS_SPACE_MEM. In the future, new bus_space_tag members
will refer to override-functions installed by a new function,
bus_space_tag_create(9).
Add pointers to constant struct bus_space_tag, x86_bus_space_io and
x86_bus_space_mem. Use them to replace most uses of X86_BUS_SPACE_IO
and X86_BUS_SPACE_MEM.
Add an x86-specific bus_space_is_equal(9) implementation that compares
the two tags' bst_type.
per-page execution right was disabled (therefore leading to the inability
of the kernel to detect fraudulent use of memory mappings marked as not
being executable).
- replace cpu_feature and ci_feature_flags variables by cpu_feature and
ci_feat_val arrays. This makes it cleaner and brings kernel code closer
to the design of cpuctl(8). A warning will be raised for each CPU that
does not expose the same features as the Boot Processor (BP).
- the blacklist of CPU features is now a macro defined in the
specialreg.h header, instead of hardcoding it inside MD initialization
code; fix comments.
- replace checks against CPUID_TSC with the cpu_hascounter() function.
- clean up the code in init_x86_64(), as cpu_feature variables are set
inside cpu_probe().
- use cpu_init_msrs() for i386. It will be eventually used later for NX
feature under i386 PAE kernels.
- remove code that checks for CPUID_NOX in amd64 mptramp.S, this is already
performed by cpu_hatch() through cpu_init_msrs().
- remove cpu_signature and feature_flags members from struct mpbios_proc
(they were never used).
This patch was tested with i386 MONOLITHIC, XEN3PAE_DOM0 and XEN3_DOM0 under
a native i386 host, and amd64 GENERIC, XEN3_DOM0 via QEMU virtual machines.
XXX Should kernel rev be bumped?
XXX A similar patch should be pulled-up for NetBSD-5, hopefully tomorrow.
it's perfectly valid for bus_dmamap_create() to do so (a contigous
transfers will then split in multiple segment).
Fix _xen_bus_dmamem_alloc_range() and _bus_dmamem_alloc_range() to
allow a boundary limit smaller than size:
- compute appropriate boundary for uvm_pglistalloc(), wich doesn't
accept boundary < size
- also take care of boundary when deciding to start a new segment.
While there, remove useless boundary argument to _xen_alloc_contig().
Fix the boundary-related issue of PR port-amd64/42980
to the tag that pc inherits its behavior from. Add code to deal with
pc.pc_super.
Pull identical declarations out of xen/include/pci_machdep.h and
x86/include/pci_machdep.h into x86/include/pci_machdep_common.h.
by XENMEM_decrease_reservation, it is checked by the hypervisor. In certain
circumstances (stack leak), the field could have an improper value, leading
to a fail of the hypercall.
Set it to 0 ("no addressing restriction") to avoid that.
Patch tested by Sam Fourman and haad@.
This should fix the rare "failed allocating DMA memory" encountered
under NetBSD dom0. Will ask for a pull-up.
- NBPD_* macros are set to the types that better match their architecture
(UL for i386 and amd64, ULL for i386 PAE) - will revisit when paddr_t is
set to 64 bits for i386 non-PAE.
- type fixes in printf/printk messages (Use PRIxPADDR when printing paddr_t
values, instead of %lx - paddr_t/pd_entry_t being 64 bits with PAE)
- remove casts that are no more needed now that Xen2 support has been dropped
Some fixes are from jmorse@ patches for PAE.
Compile + tested for i386 GENERIC and XEN3 kernels. Only compile tested for
amd64.
Reviewed by bouyer@.
See also http://mail-index.netbsd.org/tech-kern/2010/02/22/msg007373.html
and non-const types, and the kernel uses both const and non-const
PMF qualifiers and device suspensors, so change the pmf_qual_t and
device_suspensor_t typedefs from "pointers to const" to non-pointer,
non-const types.
pinning/unpinning, set_ldt, invlpg) operations cannot be queued in the
xpq_queue[] any more, as they use their own specific hypercall, mmuext_op().
Their associated xpq_queue_*() functions already call xpq_flush_queue()
before issuing the mmuext_op() hypercall, which makes these xpq_flush_queue()
calls not necessary.
Rapidly discussed with bouyer@ in private mail. XEN3_DOM0/XEN3PAE_DOM0 tested
through a build.sh release, amd64 was only compile tested. No regression
expected.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.
Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.
Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.
read/write/accept, then the expectation is that the blocked thread will
exit and the close complete.
Since only one fd is affected, but many fd can refer to the same file,
the close code can only request the fs code unblock with ERESTART.
Fixed for pipes and sockets, ERESTART will only be generated after such
a close - so there should be no change for other programs.
Also rename fo_abort() to fo_restart() (this used to be fo_drain()).
Fixes PR/26567
as xkbd backend. This problem was reported by Hugo Silva on port-xen.
Now we call DIOCGWEDGEINFO for all partitions, when it is not implemented
we use DIOCGPART to get information about volume size.
Fix oked by jym@.