- Always look for labels in the first two sectors
- Update all existing labels, but only write new labels to netbsd mbr
partitions, and the first/second sector of disks that don't have an mbr.
Moving disks between systems still sucks bigtime!
- raw partition is either c or d; if d then c is reserved
- max partitions might be 8 or 16, nothing in the label gives the maximum
- endianness
the time pool_get() calls pool_catchup(), pp has been free'd but it is still
in the "entered" state. The chain pool_catchup() -> pool_allocator_alloc()
-> pool_reclaim() on pp fails because pp is still in the "entered" state.
Call pr_leave() before calling pool_catchup() to avoid this.
Thanks for the excellent analysis!
Be more flexible in what we accept as a valid LINTSTUB directive.
Don't abort on first error.
Separate LINTSTUB comments look ugly if the function/variable already
has a descriptive comment. People don't like to write ugly code.
Now one can write:
/*
* LINTSTUB: Func: type function(args)
* Some descriptive comment about the function.
*/
buffer when the passed nested buffer has no B_ERROR flag set but not all
was transferred for the nested iobuf extent.
Discussed on tech-kern and ok'd by Takashi
otherwise generate a UVM trap or will access random memory. This is due to
the dereference of vp->v_specmountpoint that is really
vp->v_specinfo->si_mountpoint. The field v_specinfo is multiplexed with
other structs in the vun union in struct vnode like struct socket.
The patch adds a sanity check for accessing the specinfo fields by only
allowing VBLK nodes to be passed. In theory VCHR could also be valid since
it's also a special node, though mounting is only done on VBLK, so be strict.
Ok'd by yamt.
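The check described above can be sketched in isolation as follows. This is
only an illustration under stated assumptions: the struct layouts, the union
member names (v_un, vu_socket, vu_specinfo) and the helper spec_mountpoint()
are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel types. */
enum vtype { VNON, VREG, VDIR, VBLK, VCHR, VSOCK };

struct mount;                           /* opaque here */
struct socket;                          /* opaque here */
struct specinfo { struct mount *si_mountpoint; };

struct vnode {
	enum vtype v_type;
	union {                         /* one pointer slot, many meanings */
		struct socket   *vu_socket;
		struct specinfo *vu_specinfo;
	} v_un;
};

/*
 * Return the mount point of a block device vnode, or NULL when the
 * vnode is not VBLK and the union may hold something other than a
 * specinfo pointer.
 */
struct mount *
spec_mountpoint(const struct vnode *vp)
{
	assert(vp != NULL);
	if (vp->v_type != VBLK)
		return NULL;            /* be strict: VBLK only */
	return vp->v_un.vu_specinfo->si_mountpoint;
}
```

A socket vnode handed to spec_mountpoint() yields NULL instead of having its
socket pointer misread as a specinfo pointer.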
- for structure fields that are conditionally present,
make those fields always present.
- for functions which are conditionally inline, make them never inline.
- remove some other functions which are conditionally defined but
don't actually do anything anymore.
- make a lock-debugging function conditional on only LOCKDEBUG.
as discussed on tech-kern some time back.
for each fork-wait cycle.
- updatepri: factor out the code to decay estcpu so that it can be used
by scheduler_wait_hook.
- scheduler_fork_hook: record how much estcpu is inherited from
the parent process.
- scheduler_wait_hook: don't add back inherited estcpu to the parent.
otherwise, once the corresponding bit in the inode bitmap is cleared,
an unrelated inode with the same inode number can be allocated and
ufs_ihashget() picks a stale in-core vnode for it.
PR/32301 by Matthias Scheler.
- pool_allocator_alloc: drain ourselves as well,
so that pool_cache on us is drained as well.
- pool_cache_put_paddr: destruct objects if underlying pool is starved.
- pool_get: on kva starvation, wake up once a second and try again.
Fixes:
PR/32287: Processes hang in "mclpl"
PR/32330: shark kernel hangs under memory load.
- don't compare a scaled value with an unscaled value.
- actually, 7 times the loadfactor is necessary to decay p_estcpu enough,
even before the recent p_estcpu changes.
after the recent p_estcpu change, 8 times loadavg decay is needed.
- fix a comment to match with the recent reality.
It breaks the code that generates a default label (with partition 'a'
covering the entire volume - which is what everything expects) for disks that
don't have a NetBSD label nor an MBR, but do have a single filesystem covering
the entire volume.
return NULL even though no disklabel was found, making callers assume that a
valid disklabel WAS found when they were instead presented with the dummy
disklabel that is created.
If the rval is SCAN_CONTINUE it now returns a standard error that no
disklabel was found instead of the NULL.
shortcut to the process of the passed lwp panicked the kernel since lwp
could/can be passed as NULL in VOP_WRITE().
This was happening when ktracing to NFS. The function ktrwrite() set the
uio_lwp to NULL and then calls VOP_WRITE() with this argument. nfs_write()
then accessed lwp *l->l_proc which panicked.
Thanks to David Laight for his help on tracking it down.
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
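The registration and reference counting scheme in the list above can be
sketched like this. It is a minimal userland model, not the kernel code:
the struct fields, the function signatures, the lookup function name and the
EEXIST/ENOENT return values are assumptions, and locking is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down line discipline record. */
struct linesw {
	const char    *l_name;
	int            l_refcnt;        /* references handed out by lookup */
	struct linesw *l_next;          /* linked list instead of an array */
};

struct linesw *ldisc_list;

/* Register a discipline; fails if the name is already taken. */
int
ttyldisc_attach(struct linesw *disc)
{
	for (struct linesw *p = ldisc_list; p != NULL; p = p->l_next)
		if (strcmp(p->l_name, disc->l_name) == 0)
			return EEXIST;
	disc->l_refcnt = 0;
	disc->l_next = ldisc_list;
	ldisc_list = disc;
	return 0;
}

/* Look a discipline up by name and take a reference on it. */
struct linesw *
ttyldisc_lookup(const char *name)
{
	for (struct linesw *p = ldisc_list; p != NULL; p = p->l_next)
		if (strcmp(p->l_name, name) == 0) {
			p->l_refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference obtained from ttyldisc_lookup(). */
void
ttyldisc_release(struct linesw *disc)
{
	assert(disc->l_refcnt > 0);
	disc->l_refcnt--;
}

/* Unregister a discipline; refuses with EBUSY while it is in use. */
int
ttyldisc_detach(struct linesw *disc)
{
	if (disc->l_refcnt != 0)
		return EBUSY;
	for (struct linesw **pp = &ldisc_list; *pp != NULL; pp = &(*pp)->l_next)
		if (*pp == disc) {
			*pp = disc->l_next;
			return 0;
		}
	return ENOENT;
}
```

A provider attaches its discipline once; a detach attempted while a lookup
reference is still held returns EBUSY and succeeds after the release.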
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.
Thanks to dyoung@, scw@, and perry@ for help testing.
2005-08-30 15:27 avatar
Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.
Submitted by: sam
X-MFC-With: other ic_curchan changes
2005-08-13 18:50 sam
revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.
Noticed by: Michal Mertl
2005-08-13 18:31 sam
Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.
Submitted by: Michal Mertl (original version)
MFC after: 2 weeks
2005-08-10 18:42 sam
Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.
Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks
2005-08-10 17:22 sam
Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine
Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks
2005-08-09 11:19 rwatson
Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.
Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.
Reviewed by: pjd, bz
MFC after: 7 days
2005-08-08 19:46 sam
Split crypto tx+rx key indices and add a key index -> node mapping table:
Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)
Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)
Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api
These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.
Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks
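The key index -> node mapping table described in this entry can be pictured
with the sketch below. Everything except the names ieee80211_keyix and
IEEE80211_KEYIX_NONE is hypothetical (struct node, struct keyix_map and the
keyix_map_* helpers), the 16-bit width of the index type is an assumption,
and locking is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint16_t ieee80211_keyix;       /* h/w key index; width assumed */
#define IEEE80211_KEYIX_NONE ((ieee80211_keyix)-1)

struct node { int id; };                /* hypothetical stand-in */

struct keyix_map {
	struct node   **km_nodes;       /* indexed by rx key index */
	ieee80211_keyix km_max;         /* max key index set by the driver */
};

/* Allocate the table once the driver has reported its max key index. */
int
keyix_map_init(struct keyix_map *km, ieee80211_keyix max_keyid)
{
	assert(max_keyid > 0);
	km->km_nodes = calloc(max_keyid, sizeof(*km->km_nodes));
	km->km_max = max_keyid;
	return km->km_nodes != NULL ? 0 : -1;
}

/* Bounds-checked lookup; NULL means "fall back to the hash lookup". */
struct node *
keyix_map_find(const struct keyix_map *km, ieee80211_keyix keyix)
{
	if (keyix == IEEE80211_KEYIX_NONE || keyix >= km->km_max)
		return NULL;
	return km->km_nodes[keyix];
}

/* Record, or clear with ni == NULL, the node owning an rx key index. */
void
keyix_map_set(struct keyix_map *km, ieee80211_keyix keyix, struct node *ni)
{
	if (keyix != IEEE80211_KEYIX_NONE && keyix < km->km_max)
		km->km_nodes[keyix] = ni;
}
```

Deferring the allocation until the driver has supplied its maximum key index
is what lets keyix_map_init() size the table correctly.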
2005-08-08 06:49 sam
use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking
MFC after: 1 week
2005-08-08 04:30 sam
Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard
Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week
2005-08-06 05:57 sam
fix debug msg typo
MFC after: 3 days
2005-08-06 05:56 sam
Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.
MFC after: 5 days
2005-07-31 07:12 sam
close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations
MFC after: 5 days
2005-07-27 05:41 sam
when bridging internally bypass the bss node as traffic to it
must follow the normal input path
Submitted by: Michal Mertl
MFC after: 5 days
2005-07-27 03:53 sam
bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure
Reviewed by: avatar, David Young
MFC after: 5 days
2005-07-23 01:16 sam
the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table
2005-07-23 00:25 sam
o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)
MFC after: 3 days
2005-07-22 22:11 sam
split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning
MFC after: 3 days
2005-07-22 21:48 sam
split 802.11 frame xmit setup code into ieee80211_send_setup
MFC after: 3 days
2005-07-22 18:57 sam
simplify ic_newassoc callback
MFC after: 3 days
2005-07-22 18:54 sam
simplify ieee80211_ibss_merge api
MFC after: 3 days
2005-07-22 18:50 sam
add stats we know we'll need soon and some spare fields for future expansion
MFC after: 3 days
2005-07-22 18:45 sam
simplify tim callback api
MFC after: 3 days
2005-07-22 18:42 sam
don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped
Obtained from: Atheros
MFC after: 3 days
2005-07-22 18:36 sam
simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's
MFC after: 3 days
2005-07-22 18:31 sam
simplify ieee80211_send_nulldata api
MFC after: 3 days
2005-07-22 18:29 sam
simplify rate set api's by removing ic parameter (implicit in node reference)
MFC after: 3 days
2005-07-22 18:21 sam
reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion
Obtained from: Atheros
MFC after: 3 days
2005-07-22 18:16 sam
missed one in last commit; add device name to discard msgs
2005-07-22 18:13 sam
include device name in discard msgs
2005-07-22 18:12 sam
add diag msgs for frames discarded because the direction field is wrong
2005-07-22 18:08 sam
split data frame delivery out to a new function ieee80211_deliver_data
2005-07-22 18:00 sam
o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD
MFC after: 3 days
2005-07-22 17:55 sam
o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h
2005-07-22 17:50 sam
diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1
2005-07-22 17:37 sam
add flags missed in last merge
2005-07-22 17:36 sam
Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros
MFC after: 3 days
2005-07-22 06:17 sam
send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong
2005-07-22 06:15 sam
remove excess whitespace
2005-07-22 05:55 sam
use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's
MFC after: 3 days
2005-07-11 04:06 sam
Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.
Approved by: re (scottl)
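Collecting one cipher block from an arbitrarily fragmented chain can be
sketched as below. struct chunk is a hypothetical stand-in for an mbuf, and
gather_block() is an illustrative helper, not the routine changed by this
commit; it only shows the idea of walking as many buffers as needed with
signed length accounting.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define AES_BLOCK_LEN 16

/* Hypothetical minimal buffer chain, standing in for struct mbuf. */
struct chunk {
	const unsigned char *data;
	int                  len;       /* signed, so bad lengths are caught */
	struct chunk        *next;
};

/*
 * Copy one AES block starting at offset off of chunk c, walking as
 * many chunks as needed.  Returns 0 on success, -1 if the chain is
 * malformed or runs out before 16 bytes are collected.
 */
int
gather_block(const struct chunk *c, int off, unsigned char out[AES_BLOCK_LEN])
{
	int need = AES_BLOCK_LEN;

	assert(out != NULL);
	while (need > 0) {
		if (c == NULL || c->len < 0 || off < 0 || off > c->len)
			return -1;
		int n = c->len - off;
		if (n > need)
			n = need;
		memcpy(out + (AES_BLOCK_LEN - need), c->data + off, (size_t)n);
		need -= n;
		c = c->next;            /* continue in the next buffer */
		off = 0;
	}
	return 0;
}
```

Because the loop has no fixed limit on the number of buffers, a block split
over three or more small fragments is handled the same way as one split
over two.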
2005-07-11 04:00 sam
nuke assert that duplicates real check
Reviewed by: avatar
Approved by: re (scottl)
in the case where we're sending SIGKILL but all LWPs are not signalable.
some LWP will wake up soon enough to process the signal, and there may
not be any LWPs in the cache to wake up anyway. fixes PR 28886 and PR 26771.
also, add a missing "break" pointed out by yamt.
by making p_estcpu fixpt_t. PR/31542.
1. schedcpu() decreases p_estcpu of all processes
every second, by at least 1 regardless of load average.
2. schedclock() increases p_estcpu of curproc by 1,
at about 16 hz.
as a consequence, if a system has >16 processes
with runnable lwps, their p_estcpu is not likely to increase.
by making p_estcpu fixpt_t, we can decay it more slowly
when loadavg is high. (ie. solve #1.)
i left kinfo_proc2::p_estcpu (ie. ps -O cpu) scaled because i have
no idea about its absolute value's usage other than debugging,
for which raw values are more valuable.
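The effect of the fixed-point representation can be shown with a small
sketch. The decay filter used here is the classic 4.4BSD form
cpu * loadfac / (loadfac + 1) with loadfac = 2 * loadavg; the FSHIFT value
of 11 and the function names are assumptions made for the example and are
not taken from this commit.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t fixpt_t;
#define FSHIFT 11                       /* fractional bits (assumed) */
#define FSCALE (1 << FSHIFT)

/* loadfac = 2 * loadavg, both in fixed point. */
fixpt_t
loadfactor(fixpt_t loadav)
{
	assert(loadav <= UINT32_MAX / 2);       /* avoid overflow */
	return 2 * loadav;
}

/* cpu * loadfac / (loadfac + 1), with the "1" expressed as FSCALE. */
uint32_t
decay_cpu(fixpt_t loadfac, uint32_t cpu)
{
	return (uint32_t)(((uint64_t)loadfac * cpu) / (loadfac + FSCALE));
}
```

With a load average of 20, an integer estcpu of 1 decays to 0 in one step
(a loss of a whole unit), while the same value stored as 1 * FSCALE = 2048
decays to 1998, a loss of about 2.4%.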
in some drivers including wd and scsi.
- physio: if a caller provided a buf, stick to use it
because some drivers use it as an identifier.
- sprinkle simple_locks.
- scsistrategy: rather than issuing an async request and
waiting for its completion, simply issue a sync request.
the way to wait for the completion had an assumption that
B_CALL is never used. it isn't the case after the recent
physio() changes.
pointed/analyzed/tested by Martin Husemann.
we can't do an SA context switch after all, we need to clear the sau from
the LWP's arg. sa_switch() frees the sau in this case, but if we don't
reset the LWP's state and the process exits, then the exiting LWP will
try to free the sau again.
also, change the sadebug printf stuff to use printf_nolog(), since
otherwise we deadlock because we're already holding sched_lock and
the normal printf() will try to wakeup the log reader.
code.
- To achieve COMPAT_NETBSD32 compatibility, introduce a parameter to
kevent1 that points to functions that do the actual copyin/copyout
operations. This is similar to what was done in FreeBSD by Paul Saab.
- Add the COMPAT_NETBSD32 definitions and hooks.
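The idea of passing the copy routines into the common code can be sketched
as follows. The record layouts, the callback type fetch_changes_t and the
function kevent1_sketch() are hypothetical; the sketch runs in userland and
uses memcpy() where a kernel would use copyin().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct kev   { long ident; int filter; };   /* cut-down native record */
struct kev32 { int  ident; int filter; };   /* hypothetical 32-bit layout */

/* Fetch the index-th change into a native record; 0 on success. */
typedef int (*fetch_changes_t)(const void *ulist, size_t index,
    struct kev *out);

int
fetch_native(const void *ulist, size_t index, struct kev *out)
{
	memcpy(out, (const struct kev *)ulist + index, sizeof(*out));
	return 0;
}

int
fetch_compat32(const void *ulist, size_t index, struct kev *out)
{
	struct kev32 k32;

	memcpy(&k32, (const struct kev32 *)ulist + index, sizeof(k32));
	out->ident = k32.ident;         /* widen field by field */
	out->filter = k32.filter;
	return 0;
}

/*
 * Layout-agnostic core: it never reads the caller's buffer directly,
 * only through the supplied callback.  Summing the idents stands in
 * for real event registration.
 */
int
kevent1_sketch(const void *changelist, size_t nchanges,
    fetch_changes_t fetch, long *ident_sum)
{
	struct kev kev;
	int error;

	assert(fetch != NULL && ident_sum != NULL);
	*ident_sum = 0;
	for (size_t i = 0; i < nchanges; i++) {
		if ((error = fetch(changelist, i, &kev)) != 0)
			return error;
		*ident_sum += kev.ident;
	}
	return 0;
}
```

The native entry point would pass fetch_native and the compatibility entry
point fetch_compat32, so the core loop is written only once.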
discussed in the PR.
- introduce sys/timevar.h to hold kernel-specific stuff relevant to
sys/time.h. Ideally, timevar.h would contain all (or almost) of the
#ifdef _KERNEL part of time.h, but that's a pretty big and tedious
change to make. For now, it will contain only the prototypes I
introduced when working on COMPAT_NETBSD32.
- split copyinout_t into copyin_t and copyout_t, it makes prototypes more
explicit about the meaning of a given argument. Suggested by yamt@.
- move copyinout_t definition in sys/time.h to systm.h as copyin_t and
copyout_t
- make everything use the new types and include the proper headers at
the proper places.
the original code since if fullgroups was empty and partgroups wasn't, we
would not clean up partgroups (pointed out by yamt). Well, this one has
different semantics from the original; they are the correct ones, I think.
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)
COMPAT_16 and earlier that results in a current shared linker running at
address 0 (and thus allows NULL pointer derefs to work).
As noted by Matthias Drochner, this "fix" just checks the first psection
and not the first loadable psection. This isn't a problem with the
binutils up to now, but might be in the future.
This closes a hole pointed out by Thor Lancelot Simon on tech-kern ~3
years ago.
The problem was with running binaries from remote storage, where our
kernel (and Veriexec) has no control over any changes to files.
An attacker could, after the fingerprint has been verified and
program loaded to memory, inject malicious code into the backing
store on the remote storage, followed by a forced flush, causing
a page-in of the malicious data from backing store, bypassing
integrity checks.
Initial implementation by Brett Lymn.
since both pool_get() and pool_put() can call wakeup().
instead, allocate the struct sadata_upcall before taking
sched_lock in mi_switch() and free it after releasing sched_lock.
clean up some modularity warts by adding a callback to
struct sadata_upcall for freeing sa_arg.
split the single list of pool cache groups into three lists:
completely full, partially full, and completely empty.
use LIST instead of TAILQ where appropriate.
- Remove all NFS related stuff from file system specific code.
- Drop the vfs_checkexp hook and generalize it in the new nfs_check_export
function, thus removing redundancy from all file systems.
- Move all NFS export-related stuff from kern/vfs_subr.c to the new
file sys/nfs/nfs_export.c. The former was becoming large and its code
is always compiled, regardless of the build options. Using the latter,
the code is only compiled in when NFSSERVER is enabled. While doing this,
also make some functions in nfs_subs.c conditional to NFSSERVER.
- Add a new command in nfssvc(2), called NFSSVC_SETEXPORTSLIST, that takes a
path and a set of export entries. At the moment it can only clear the
exports list or append entries, one by one, but it is done in a way that
allows setting the whole set of entries atomically in the future (see the
comment in mountd_set_exports_list or in doc/TODO).
- Change mountd(8) to use the nfssvc(2) system call instead of mount(2) so
that it becomes file system agnostic. In fact, this whole thing was
done to remove a 'XXX' block from this utility!
- Change the mount*, newfs and fsck* userland utilities to not deal with NFS
exports initialization; done internally by the kernel when initializing
the NFS support for each file system.
- Implement an interface for VFS (called VFS hooks) so that several kernel
subsystems can run arbitrary code upon receipt of specific VFS events.
At the moment, this only provides support for unmount and is used to
destroy NFS exports lists from the file systems being unmounted, though it
has room for extension.
Thanks go to yamt@, chs@, thorpej@, wrstuden@ and others for their comments
and advice in the development of this patch.
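A hook mechanism of the kind described for unmount events can be sketched
as below. All names (vfs_hook_unmount_register, vfs_hooks_unmount,
nfs_export_unmount, MAX_HOOKS) and the fixed-size table are assumptions
made for the example; the actual interface may differ.

```c
#include <assert.h>
#include <stddef.h>

struct mount { int mnt_id; };           /* hypothetical stand-in */

typedef void (*vfs_unmount_hook_t)(struct mount *);

#define MAX_HOOKS 8
vfs_unmount_hook_t unmount_hooks[MAX_HOOKS];
size_t n_unmount_hooks;

/* Subsystems call this once, for example at initialisation time. */
int
vfs_hook_unmount_register(vfs_unmount_hook_t fn)
{
	assert(fn != NULL);
	if (n_unmount_hooks == MAX_HOOKS)
		return -1;
	unmount_hooks[n_unmount_hooks++] = fn;
	return 0;
}

/* The unmount path calls this; every registered hook sees the event. */
void
vfs_hooks_unmount(struct mount *mp)
{
	for (size_t i = 0; i < n_unmount_hooks; i++)
		unmount_hooks[i](mp);
}

/* Example subscriber: pretend to destroy the export list of mp. */
int exports_destroyed;

void
nfs_export_unmount(struct mount *mp)
{
	(void)mp;
	exports_destroyed++;
}
```

The file system code only calls vfs_hooks_unmount() and needs no knowledge
of NFS; the NFS server code registers nfs_export_unmount() itself.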
filedescriptors passed in this message - the counterpart in
unp_externalize does this as well.
Note that CMSG_SPACE(0) does not make sense, since it does not invoke
the alignment magic - so use CMSG_SPACE(sizeof(int)) and adjust the
calculated total later.
This fixes the postfix connection cache for 64bit platforms. Previously
the number of passed file descriptors (nfds) would have been
calculated too high, causing the fdrelease() of uninitialized junk.
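The alignment issue can be illustrated from userland with the standard
CMSG_LEN() and CMSG_SPACE() macros. This is not the kernel change itself:
scm_rights_nfds() is a hypothetical helper showing one way to recover the
descriptor count from cmsg_len without counting header padding as payload.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Number of descriptors carried by an SCM_RIGHTS control message whose
 * cmsg_len field is cmsg_len.  CMSG_LEN(0) is the header size including
 * any alignment padding before the data, so subtracting it leaves only
 * the payload bytes.
 */
size_t
scm_rights_nfds(size_t cmsg_len)
{
	assert(cmsg_len >= CMSG_LEN(0));
	return (cmsg_len - CMSG_LEN(0)) / sizeof(int);
}
```

On platforms where the header is padded to a wider alignment than int,
subtracting plain sizeof(struct cmsghdr) instead would count the padding as
descriptors, which is the kind of overcount described above.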
by changing the symlink one to set vap's vatype to VLNK. All the other three
already set vatype to the correct type. Note that, however, in the mkdir
case (and now symlink too) this is not strictly necessary.
used in ioctl routines to do the right thing when the FKIOCTL flag is
passed to the IOCTL routine indicating it's an in-kernel VOP_IOCTL call and
indirect addresses provided in the arguments are to be seen as kernel
addresses rather than userland addresses.
A simple substitution and prepending of the `flags' passed on to the ioctl
handler is enough to DTRT.
we can implement a universal submatch() function covering all
the standard cases:
if (<configured> != <wildcard> && <configured> != <real>)
then fail
else
ask device match function
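The pseudocode above translates fairly directly into C. In this sketch the
function name generic_submatch(), its signature, the per-locator wildcard
array and the match_fn callback type are all assumptions; only the
comparison rule comes from the text.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*match_fn)(void *aux);     /* hypothetical device match hook */

/*
 * Compare each configured locator with the real one; a configured value
 * equal to the wildcard matches anything.  Only if all locators agree
 * is the device's own match function consulted.
 */
int
generic_submatch(const int *configured, const int *real,
    const int *wildcard, size_t nlocs, match_fn match, void *aux)
{
	assert(match != NULL);
	for (size_t i = 0; i < nlocs; i++) {
		if (configured[i] != wildcard[i] && configured[i] != real[i])
			return 0;       /* fail */
	}
	return match(aux);              /* ask device match function */
}

/* Trivial match function used for illustration only. */
int
always_match(void *aux)
{
	(void)aux;
	return 1;
}
```

With wildcard -1, a configuration of { -1, 3 } matches a device found at
{ 5, 3 }, while { 2, 3 } is rejected before the match function is called.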
explicitly by a plain integer array
the length is now known to all relevant parties, so this avoids
duplication of information, and we can allocate that thing in
drivers without hacks