across the "coma bug" workaround to avoid clearing the MAPEN bit if it
was originally set by firmware. This appears necessary for proper
functioning of SMM on Geode CPUs, and thus for proper emulation (ugh)
of access to certain PCI configuration registers or VGA register
spaces. With this change, VGA and soundblaster emulation work on Geode
NX1 systems.
This was also the underlying problem that led to the introduction of
the PCI_QUIRK_SKIP_FUNCn quirks in pci_quirks.c, which are no longer
necessary (and counterproductive if you want to use VGA or a
Geode-specific audio driver). See the thread "pci probe" on port-i386
in August 2003 (The Soekris 4801, apparently the most popular
Geode-based NetBSD box, has neither VGA nor audio, which may explain
why this wasn't noticed at the time).
which is very handy on a laptop to control EST through another program that
you don't necessarily want to run as root (in my case, gkrellm).
The option is named EST_FREQ_USERWRITE and is disabled by default.
have the DNA trap handler point to npxdna_empty() by default. This way, if
there are no npx devices found and MATH_EMULATE is not configured, we go back
to the old behavior of issuing a SIGKILL and printing:
pid XXX killed due to lack of floating point
rather than panicking.
to all GENERIC-like kernel config files where SYSV* options were already
present (commented out if the SYSV* options are commented out).
Fix lib/25897 and lib/25898.
off, or priority inversion can occur, which can lead to IPI deadlocks.
Leaves interrupts off for a bit longer, sadly, but with no noticeable
effects on the systems I tested on.
From YAMAMOTO Takashi.
This allows boot1() to change the sector number (of the boot partition)
that bootxx.S passes through to boot2().
This means that boot2() will find the correct partition when boot1()
reads /boot from the 'a' partition instead of the mbr boot partition.
This all happens when you update a system that used a small 'wd0h' partition
to boot a raid1 set to the new bootcode. Deleting /boot from the 'wd0h'
partition will make the new bootcode find /boot and the root filesystem
inside the raid set.
registers are registers that overlap with others on many controllers, but
which may actually be distinct on some controllers. Right now, the two
shadows are:
- wd_status (usually overlaps wd_command)
- wd_features (usually overlaps wd_error)
Add a new helper function, wdc_init_shadow_regs(), used to initialize
the shadow register handles on controllers where they do actually overlap.
Partially from Jordan Rhody @ Wasabi Systems, Inc.
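
A minimal sketch of the shadow-register idea, not the actual wdc code. The
struct, the handle type and the function name below are hypothetical
stand-ins; only the two register pairings come from the description above.
On a controller where the shadows overlap, the helper simply aliases each
shadow handle to the register it overlaps:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bus-space register handle. */
typedef uintptr_t reg_handle_t;

/* Hypothetical per-channel register handles (illustration only). */
struct chan_regs_sketch {
	reg_handle_t wd_error;
	reg_handle_t wd_command;
	reg_handle_t wd_features;	/* shadow, usually overlaps wd_error */
	reg_handle_t wd_status;		/* shadow, usually overlaps wd_command */
};

/*
 * For controllers where the shadow registers really do overlap:
 * point each shadow handle at the register it overlaps.
 */
static void
init_shadow_regs_sketch(struct chan_regs_sketch *r)
{
	assert(r != NULL);
	r->wd_status = r->wd_command;
	r->wd_features = r->wd_error;
}
```

A front-end whose controller has distinct registers would skip this helper
and fill in the two shadow handles itself.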
width implementation was a rather poor choice. Per discussion with
Charles Hannum.
Note: While this is technically an ABI change I believe it is a
change that we can afford at this time (and to be pulled up to
2.0). The types are not widely used yet, and a survey of pkgsrc
has not shown uses that would be adversely affected by it.
Kanaoka. I've been sitting on this code for 3 years, and have not done
anything better with it. It is ugly, it needs to be handled better, but
it is better to have it #ifdef'ed out rather than nothing.
Michael Eriksson posted to port-i386 on 20031102, with various
modifications by me to work in the new sysctl(9) framework.
The code is enabled with 'options ENHANCED_SPEEDSTEP', and if
the CPU supports EST the following sysctl(8) nodes appear
(with the values that a Dell Inspiron 8600 + WUXGA with a
1.4GHz Pentium M CPU supports):
machdep.est.cpu_brand = Intel(R) Pentium(R) M processor 1400MHz
machdep.est.frequency.target = 1400
machdep.est.frequency.current = 1400
machdep.est.frequency.available = 1400 1200 1000 800 600
If EST support isn't available, the "machdep.est" sysctl sub-MIB
is not created.
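
As an illustration of how a userland tool might consume the "available"
list, here is a small sketch. The helper name and its selection policy
(highest listed frequency not above the request) are assumptions made for
this example, not part of the driver:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: given the space-separated value of
 * machdep.est.frequency.available, return the highest listed
 * frequency (MHz) that does not exceed `want`, or -1 if none does.
 */
static int
est_pick_frequency(const char *available, int want)
{
	const char *p = available;
	char *end;
	int best = -1;

	assert(available != NULL);
	for (;;) {
		long mhz = strtol(p, &end, 10);
		if (end == p)		/* no more numbers in the list */
			break;
		if (mhz <= want && mhz > best)
			best = (int)mhz;
		p = end;
	}
	return best;
}
```

The result would then be written to machdep.est.frequency.target, for
example with sysctl(8).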
Once we have a more general "CPU frequency" control API we can
migrate this code to using that.
Thanks to Michael Eriksson for providing this code!
written by Michael Eriksson and posted to port-i386 on 20031102.
(This is the driver "as is" - I'll commit the code to integrate it
into -current separately)
Having the table in the 'standard' mbr allows fdisk to write in bootsel
menu items and only ask about updating the mbr code before exit.
Sysinst validates that the mbr code contains the bootselect table for
all the mbr code variants it reads - because it might want to write the table
and doesn't really want to make the validation dependent on what it is
going to do later.
Fixes install/25235, but sysinst needs some changes (like reporting the
failure to write the mbr) before the pr itself is closed.
this fixes PR#25014. i386 GENERIC can re-enable PERFCTRS by default now
(it was disabled when x86 SMP support was committed to the trunk.)
XXX: should add P4 support
XXX: should add MP support
otherwise an interrupt vector using a task gate (ie. ddbipi) messes it up.
- defer LDTR loading as well as cr3.
- tweak comments to make three copies of switching code more synchronized.
causing Mobile Pentium 4 to be shown as a Mobile Celeron.
- fix intel_family6_name() for brand=0xB && signature >= 0xF13
- fix a potential out-of-bounds array reference
(ICH2 and later), which fixes PR/23700.
The changes are from Hiroyuki Bessho and Masanori Kanaoka in PR/23700
with a small modification of the interrupt router lookup by me.
leave 4 bytes for the Windows NT Drive Serial Number (DSN) at 440-443
(as mbr_sector.mbr_dsn).
Ensure that all the MBR & PBR code reserves space for mbr_sector.mbr_dsn.
Leave the bootsel magic number at 444-445 as mbr_sector.mbr_bootsel_magic
(instead of mbr_sector.mbr_bootsel.mbrbs_magic), but use 0xb5e1 (MBR_BS_MAGIC)
instead of 0xaa55 (MBR_MAGIC) to indicate that this change has occurred.
Rework MBR_BS_NEWMBR to mean "mbr_bootsel has moved to 400".
Modify fdisk(8) to automatically relocate the mbr_bootsel from 404 to 400
if mbr_bootsel_magic is the old value (0xaa55), and unset MBR_BS_NEWMBR
to flag that new mbr_bootsel code must be used if updating the MBR.
These changes fix a problem where Windows 2000 or Windows XP would corrupt
the last 3 bytes + NUL of MBR partition 3's bootsel name if the bootsel name
was 5 characters long, replacing bytes 6-9 with the DSN.
Also, by explicitly reserving the space for the DSN we prevent problems in the
future if non bootsel MBR or PBR code had other information at bytes 440-443.
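
The byte offsets described above can be sketched as a C layout with
compile-time checks. This is an illustration only: the struct name and the
field names other than mbr_dsn and mbr_bootsel_magic are placeholders, and
the real definition is in <sys/bootblock.h>. Byte arrays are used
throughout so that no compiler padding is involved:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBR_MAGIC	0xaa55
#define MBR_BS_MAGIC	0xb5e1

/* Illustrative 512-byte sector layout (placeholder names). */
struct mbr_sector_sketch {
	uint8_t code[400];		/* bootstrap code */
	uint8_t bootsel_area[40];	/* room for mbr_bootsel, now at 400 */
	uint8_t mbr_dsn[4];		/* 440-443: Windows NT DSN */
	uint8_t mbr_bootsel_magic[2];	/* 444-445 */
	uint8_t parts[64];		/* 446-509: partition table */
	uint8_t mbr_magic[2];		/* 510-511: sector signature */
};

static_assert(offsetof(struct mbr_sector_sketch, mbr_dsn) == 440, "DSN");
static_assert(offsetof(struct mbr_sector_sketch, mbr_bootsel_magic) == 444,
    "bootsel magic");
static_assert(offsetof(struct mbr_sector_sketch, parts) == 446, "table");
static_assert(sizeof(struct mbr_sector_sketch) == 512, "sector size");

/* On-disk values are little-endian. */
static unsigned
le16_read(const uint8_t b[2])
{
	return (unsigned)b[0] | ((unsigned)b[1] << 8);
}

/* Old-style sector: bootsel still at 404, magic still 0xaa55. */
static int
bootsel_needs_relocation(const struct mbr_sector_sketch *m)
{
	return le16_read(m->mbr_bootsel_magic) == MBR_MAGIC;
}
```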
- move per VP data into struct sadata_vp referenced from l->l_savp
* VP id
* lock on VP data
* LWP on VP
* recently blocked LWP on VP
* queue of LWPs woken which ran on this VP before sleep
* faultaddr
* LWP cache for upcalls
* upcall queue
- add current concurrency and requested concurrency variables
- make process exit run LWP on all VPs
- make signal delivery consider all VPs
- make timer events consider all VPs
- add sa_newsavp to allocate new sadata_vp structure
- add sa_increaseconcurrency to prepare new VP
- make sys_sa_setconcurrency request new VP or wakeup idle VP
- make sa_yield lower current concurrency
- set sa_cpu = VP id in upcalls
- maintain cached LWPs per VP
drivers that attach to it. This allows for other host interface chips
that use the same keyboards and mice, such as the ones in the ARM
IOMD20, ARM7500, and SA-1111. The PC-compatible driver is still
called pckbc(4), and the new abstraction layer is "pckbport", so the
child devices have moved from sys/dev/pckbc to sys/dev/pckbport, which
also contains some code shared between all host controllers. To avoid
incompatibility, pckbdreg.h is still installed in
/usr/include/dev/pckbc.
In theory, this shouldn't cause any behavioural changes in the drivers
concerned. They just use rather more function pointers than before. Tested
on i386 and (with a new host driver) acorn32. Compiled on several other
affected architectures.
- clear PSL_NT. it can be set by userland because setting it
isn't a privileged operation.
(cf. DSA-336-1, CVE-2002-0429)
- set PSL_I. otherwise, if SIGSEGV is ignored, we'll
end up in an infinite loop, generating the same traps, with
interrupts disabled.
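
The two flag adjustments amount to the following. This is a standalone
sketch rather than the kernel code itself: the function name is
hypothetical, and the two constants are the architectural EFLAGS bit
positions for IF and NT:

```c
#include <assert.h>
#include <stdint.h>

#define PSL_I	0x00000200u	/* interrupt enable (EFLAGS.IF, bit 9) */
#define PSL_NT	0x00004000u	/* nested task (EFLAGS.NT, bit 14) */

static_assert((PSL_I & PSL_NT) == 0, "flags must be distinct bits");

/*
 * Hypothetical helper: sanitize the saved flags before they are used
 * again. NT is dropped because userland can set it freely; IF is
 * forced on so the process cannot keep running with interrupts off.
 */
static uint32_t
sanitize_eflags_sketch(uint32_t eflags)
{
	eflags &= ~PSL_NT;
	eflags |= PSL_I;
	return eflags;
}
```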
to only call pckbc_machdep_cnattach() if this is present. This allows
pckbc_machdep_cnattach() to be omitted entirely on most ports, where it only
returns ENXIO anyway.
The devices with this attribute at the moment are pc(4) on i386 and bebox, and
pckbc on sparc, where pckbc_machdep_cnattach() mysteriously returns 0 rather
than ENXIO.
* lpt device is defined in MI place (dev/ppbus/files.ppbus), dev/ic/lpt.c
is included there too; dev/ic/lpt.c is not included if ppbus is
configured or if there is alternative platform lpt (like for pc532)
* g/c MD lpt definitions and custom puc/upc attachments,
glue moved to conf/files and dev/pci/files.pci respectively; remove
device lpt definition from dev/isa/files.isa
* add ppbus parport attribute, atppc device attachments, adjust plip and lpt
glue
is empty besides calling switch_exit(). So, rename switch_exit() to
cpu_exit() and modify the routine to call lwp_exit2() directly.
This saves a couple of cycles on the exit path.
process context ('reaper').
From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now the same
for both 'process' and 'lwp' exit
uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now a private function within uvm_glue.c.
MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.
g/c now unneeded routines and variables, including the reaper kernel thread
- wdc_xfer to ata_xfer
- channel_queue to ata_queue
and move them to <dev/ata/atavar.h> so they can be used by non-wdc ATA
controllers. Clean up the member names of these structures while at it.
clients, and a pseudo-device for userspace access.
The attribute is named `opencrypto'. The pseudo-device is renamed to
"crypto", which has a dependency on "opencrypto". The sys/conf/majors
entry and pseudo-device attach entrypoint are updated to match the
new pseudo-device name.
Fast IPsec (sys/netipsec/files.ipsec) now lists a dependency on the
"opencrypto" attribute. Drivers for crypto accelerators (ubsec,
hifn775x) also pull in opencrypto, as providers of opencrypto transforms.
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.
This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.
On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.
- nothing needs to be done if ci_want_resched is already set.
- if the cpu isn't running any lwp, send a no-op ipi to it
so that it can resume immediately from halting in idle loop
without having to wait until the next clock tick.
some advice from Stephan Uphoff.
- pmap_enter: zap PTE and read attributes atomically to
eliminate a race window which could cause loss of attributes.
- reduce the number of TLB shootdowns by using some assumptions
about PTE handling.
for more details, see "SMP improvements for pmap" thread on port-i386@
around May 2003.
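
The "zap and read atomically" step can be pictured with a C11 atomic
exchange. This is only a sketch of the technique, not the pmap code: the
function name is hypothetical and the PG_* values are the usual i386 PTE
bits, repeated here for illustration:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define PG_V	0x001u	/* valid */
#define PG_U	0x020u	/* used (accessed) */
#define PG_M	0x040u	/* modified */

static_assert((PG_U & PG_M) == 0, "attribute bits must be distinct");

/*
 * Hypothetical helper: clear the PTE and fetch its old contents in a
 * single atomic operation. With a separate read followed by a write,
 * another CPU could set PG_U or PG_M in between and the update would
 * be lost.
 */
static uint32_t
pte_zap_and_get_attrs(_Atomic uint32_t *pte)
{
	uint32_t opte = atomic_exchange(pte, 0);

	return opte & (PG_U | PG_M);
}
```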
address currently in effect does not always work: There might be more
instances of the code segment selector in other threads, on other CPUs
and in *jmp_bufs.
So always check whether the CS needs updating, if it is not already
set to the "BIG" value.
This code needs more cleanup, this is considered a stopgap fix only.
Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al. Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.
Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded. Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.
All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.
PS - I'm sorry, but there's a distinct lack of documentation at the
moment. I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
bswapl, and bf_cbc.S uses it. Unfortunately, this means that GENERIC
will no longer use the asm code -- though it will still use the asm
for the basic Blowfish transform. This won't slow down the KAME IPsec
(since it rolls its own CBC) but may slow down fast-ipsec in kernels
that have I386_CPU defined.
(it only allowed booting an nfs /netbsd automatically)
To make it work for people who can't tell the DHCP server to pass
the right kernel file to pxeboot, without losing flexibility for
people who can, do the following:
Use the filename given by the DHCP server if it contains a ":". A ":"
was already used to separate filesystem and filename, so we don't
lose anything. Otoh, a path to pxeboot usually doesn't contain a ":",
so it should still work if we got the old pxeboot filename again.
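
The rule can be sketched as below. The function name and the idea of
passing the default as a second argument are assumptions made for this
example; only the ":" test comes from the description above:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: use the DHCP-supplied filename only if it
 * contains a ':' (filesystem:filename); otherwise it is probably the
 * path to pxeboot itself, so fall back to the built-in default.
 */
static const char *
choose_boot_file(const char *dhcp_file, const char *fallback)
{
	assert(fallback != NULL);
	if (dhcp_file != NULL && strchr(dhcp_file, ':') != NULL)
		return dhcp_file;
	return fallback;
}
```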
install media and the kernels (and sysinst) will still run on a 16MB system.
(They haven't run on an 8MB system for a while - might affect 12MB though.)
The additional space in the root filesystem lets sysinst core dump properly!
make absolutely sure the top 16 bits of returned values are zero.
Ralf's list says that some BIOSes need %eax = 0x0000e820 in getmementry.
Add a few comments.
Might fix problems with memory size detection on some systems.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)
it can now accommodate multiple _CIDs
sizeof(ACPI_DEVICE_INFO) should not be used
* make ad_devinfo member in acpi_devnode a pointer
* implement acpi_match_hid() to simplify matching devices;
_CIDs are also taken into account now as well as _HID
containing signal posting, kernel-exit handling and sa_upcall processing.
XXX the pc532, sparc, sparc64 and vax ports should have their
XXX userret() code rearranged to use this.
uvm_swapout_threads will swapout LWPs which are running on another CPU:
- uvm_swapout_threads considers LWPs running on another CPU for swapout
if their l_swtime is high
- uvm_swapout_threads considers LWPs on the runqueue for swapout if their
l_swtime is high but these LWPs might be running by the time uvm_swapout
is called
symptoms of failure: panic in setrunqueue
fixes PR kern/23095
FAT16 (11+51) except when booting from FAT{12,16,32}, which needs FAT32 (11+79).
We still reserve the BPB for non-bootxx_msdos PBR bootblocks because
they may be installed as a floppy boot record (and those need a BPB).
Remove some redundant wording in an error message, saving 6 bytes.
* _UC_MACHINE_PC() - access the program counter
* _UC_MACHINE_INTRV() - access the integer return value register
* _UC_MACHINE_SET_PC() - set the program counter (this requires
special handling on some platforms).
i386-MD installboot.
They haven't been enabled for a while, keeping them here is just
confusing, and they're still going to be in the CVS repo attic...
<sys/bootblock.h>:
* Added definitions for the Master Boot Record (MBR) used by
a variety of systems (primarily i386), including the format
of the BIOS Parameter Block (BPB).
This information was cribbed from a variety of sources
including <sys/disklabel_mbr.h> which this is a superset of.
As part of this, some data structure elements and #defines
were renamed to be more "namespace friendly" and consistent
with other bootblocks and MBR documentation.
Update all uses of the old names to the new names.
<sys/disklabel_mbr.h>:
* Deprecated in favor of <sys/bootblock.h> (the latter is more
"host tool" friendly).
amd64 & i386:
* Renamed /usr/mdec/bootxx_dosfs to /usr/mdec/bootxx_msdos, to
be consistent with the naming convention of the msdosfs tools.
* Removed /usr/mdec/bootxx_ufs, as it's equivalent to bootxx_ffsv1
and it's confusing to have two functionally equivalent bootblocks,
especially given that "ufs" has multiple meanings (it could be
a synonym for "ffs", or the group of ffs/lfs/ext2fs file systems).
* Rework pbr.S (the first sector of bootxx_*):
+ Ensure that BPB (bytes 11..89) and the partition table
(bytes 446..509) do not contain code.
+ Add support for booting from FAT partitions if BOOT_FROM_FAT
is defined. (Only set for bootxx_msdos).
+ Remove "dummy" partition 3; if people want to installboot(8)
these to the start of the disk they can use fdisk(8) to
create a real MBR partition table...
+ Compile with TERSE_ERROR so it fits because of the above.
Whilst this is less user friendly, I feel it's important
to have a valid partition table and BPB in the MBR/PBR.
* Renamed /usr/mdec/biosboot to /usr/mdec/boot, to be consistent
with other platforms.
* Enable SUPPORT_DOSFS in /usr/mdec/boot (stage2), so that
we can boot off FAT partitions.
* Crank version of /usr/mdec/boot to 3.1, and fix some of the other
entries in the version file.
installboot(8) (i386):
* Read the existing MBR of the filesystem and retain the BIOS
Parameter Block (BPB) in bytes 11..89 and the MBR partition
table in bytes 446..509. (Previously installboot(8) would
trash those two sections of the MBR.)
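
A sketch of the preserve-and-merge step, assuming both the new bootstrap
and the existing first sector are already in 512-byte buffers. The
function and macro names are hypothetical; the byte ranges are the ones
quoted above:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE	512
#define BPB_FIRST	11	/* BPB occupies bytes 11..89 */
#define BPB_LAST	89
#define PTABLE_FIRST	446	/* partition table occupies 446..509 */
#define PTABLE_LAST	509

static_assert(PTABLE_LAST < SECTOR_SIZE, "ranges must fit in one sector");

/*
 * Hypothetical helper: copy the BPB and the MBR partition table from
 * the sector currently on disk into the new bootstrap image, so that
 * writing the image does not destroy them.
 */
static void
merge_preserved_areas(uint8_t *newboot, const uint8_t *ondisk)
{
	memcpy(newboot + BPB_FIRST, ondisk + BPB_FIRST,
	    BPB_LAST - BPB_FIRST + 1);
	memcpy(newboot + PTABLE_FIRST, ondisk + PTABLE_FIRST,
	    PTABLE_LAST - PTABLE_FIRST + 1);
}
```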
mbrlabel(8):
* Use sys/lib/libkern/xlat_mbr_fstype.c instead of homegrown code
to map the MBR partition type to the NetBSD disklabel type.
Test built "make release" for i386, and new bootblocks verified to work
(even off FAT!).
Right now the only flag is used to indicate if a ksiginfo_t is a
result of a trap. Add a predicate macro to test for this flag.
* Add initialization macros for ksiginfo_t's.
* Add accessor macro for ksi_trap. Expands to 0 if the ksiginfo_t was
not the result of a trap. This matches the sigcontext trapcode semantics.
* In kpsendsig(), use KSI_TRAP_P() to select the lwp that gets the signal.
Inspired by Matthias Drochner's fix to kpsendsig(), but correctly handles
the case of non-trap-generated signals that have a > 0 si_code.
This patch fixes a signal delivery problem with threaded programs noted by
Matthias Drochner on tech-kern.
As discussed on tech-kern. Reviewed and OK'd by Christos.
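
The flag, predicate and accessor described above might look roughly like
this. Only KSI_TRAP_P() is named in this message; the struct, the flag
name and value, and the accessor name below are placeholders chosen for
illustration:

```c
#include <assert.h>

/* Placeholder stand-in for ksiginfo_t (illustration only). */
struct ksiginfo_sketch {
	int ksi_flags;
	int ksi_code;		/* si_code; may be > 0 without a trap */
	int ksi_trapcode;	/* meaningful only for trap signals */
};

#define KSI_TRAP_FLAG	0x01	/* hypothetical flag value */

static_assert(KSI_TRAP_FLAG != 0, "flag must be a non-zero bit");

/* True if the signal was generated by a trap. */
#define KSI_TRAP_P(ksi)	(((ksi)->ksi_flags & KSI_TRAP_FLAG) != 0)

/* Trap code, or 0 if the signal did not come from a trap. */
#define KSI_TRAPCODE_SKETCH(ksi) \
	(KSI_TRAP_P(ksi) ? (ksi)->ksi_trapcode : 0)
```

Testing the flag rather than si_code is what keeps a non-trap signal with
a positive si_code from being treated as a trap.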
which is automatically included during kernel config, and add comments
to individual machine-dependent majors.* files to assign new MI majors
in MI file.
Range 0-191 is reserved for machine-specific assignments; range
192 and up is for MI assignments.
Follows recent discussion on tech-kern@
most polling.
2) Clean up some goofiness in pciide -- get rid of the whole "candisable" path
(it's gratuitous) and simplify the code by calling pciide_map_compat_intr(),
*_set_modes() and wdc_print_modes() from central locations.
3) Add a register writability and register ghost test to eliminate phantom
drives more quickly.