NetBSD

Author	SHA1	Message	Date
elad	36ec4b320c	When reporting open files using sysctl, don't use 'filehead' to fetch files, as we don't have a process context to authorize on. Instead, traverse the file descriptor table of each process -- as we already do in one case. Introduce a "marker" we can use to mark files we've seen in an iteration, as the same file can be referenced more than once. Hopefully this availability of filtering by process also makes life easier for those who are interested in implementing process "containers" etc.	2009-12-24 19:01:12 +00:00
mbalmer	1ce3f76abb	Fix typo, no code change.	2009-12-23 09:23:53 +00:00
pooka	3142d3ac31	Define namei flag INRENAME and set it if a lookup operation is part of rename. This helps with building better asserts for rename in the DELETE lookup ... the RENAME lookup is quite obviously a part of rename.	2009-12-23 01:09:24 +00:00
elad	4f2529fdb9	Including sysctl.h once is enough.	2009-12-23 00:21:38 +00:00
dsl	668acfeeca	Use sizeof correct type, not pointer to wrong type. Fixes PR/42498. This has been wrong since the initial import!	2009-12-22 20:50:46 +00:00
rmind	4fff15550a	Add comment about locking.	2009-12-20 23:00:59 +00:00
mrg	9a7ae38999	remove dated and wrong comments about curlwp being NULL. _kernel_{,un}lock() always assume it is valid now.	2009-12-20 20:42:23 +00:00
pooka	f015d3c5a1	Add a pointer to an explanation of why we have #ifdef pmax stuff in here.	2009-12-20 19:06:44 +00:00
dsl	2a54322c7b	If a multithreaded app closes an fd while another thread is blocked in read/write/accept, then the expectation is that the blocked thread will exit and the close complete. Since only one fd is affected, but many fd can refer to the same file, the close code can only request the fs code unblock with ERESTART. Fixed for pipes and sockets, ERESTART will only be generated after such a close - so there should be no change for other programs. Also rename fo_abort() to fo_restart() (this used to be fo_drain()). Fixes PR/26567	2009-12-20 09:36:05 +00:00
rmind	3c74cdf150	signal(9) code: add some comments, improve/fix wrong ones. While here, kill trailing whitespaces, wrap long lines, etc. No functional changes intended.	2009-12-20 04:49:09 +00:00
martin	cecef5e6d5	Use the kernel space version of the vfs name, not the original userspace pointer. Avoids crashes on archs with completely separate userspace VA.	2009-12-19 20:28:27 +00:00
rmind	ebd0ab14ab	sigtimedwait: fix a memory leak (which happens since newlock2 times). Allocate ksiginfo on stack since it is safe and sigget() assumes that it is not allocated from pool (pending signals via sigput()/sigget() "mill" should be dynamically allocated, however). Might be useful to revisit later. Likely the cause of PR/40750 and indirect cause of PR/39283.	2009-12-19 18:25:54 +00:00
rmind	1069745866	Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.	2009-12-17 01:25:10 +00:00
dsl	bc86c9b425	Don't ERESTART write() calls for now. I suspect some programs don't allow for the partial transfer.	2009-12-15 18:35:18 +00:00
dyoung	62f43df82a	Per rmind@'s suggestion, avoid an acquire/release-mutex dance by collecting garbage in two phases: in the first stage, with alldevs_mtx held, gather all of the objects to be freed onto a list. Drop alldevs_mtx, and in the second stage, free all the collected objects. Also per rmind@'s suggestion, remove KASSERT(!mutex_owned(&alldevs_mtx)) throughout, it is not useful. Find a free unit number and allocate it for a new device_t atomically. Before, two threads would sometimes find the same free unit number and race to allocate it. The loser panicked. Now there is no race. In support of the changes above, extract some new subroutines that are private to this module: config_unit_nextfree(), config_unit_alloc(), config_devfree(), config_dump_garbage(). Delete all of the #ifdef __BROKEN_CONFIG_UNIT_USAGE code. Only the sun3 port still depends on __BROKEN_CONFIG_UNIT_USAGE, it's not hard for the port to do without, and port-sun3@ had fair warning that it was going away (>1 week, or a few years' warning, depending how far back you look!).	2009-12-15 03:02:24 +00:00
matt	15aa4c53c9	Regen (new makesyscalls.sh)	2009-12-14 00:53:32 +00:00
matt	e110dba586	Merge from matt-nb5-mips64	2009-12-14 00:47:10 +00:00
dsl	723a159171	Another, better, fix for PR/26567. Only sleep once within each pipe_read/pipe_write call. If there is no data/space available after we wake up, return ERESTART so that the 'fd' number is validated again. A simple broadcast of the cvs is then enough to evict the correct threads when close() is called from an active thread.	2009-12-13 20:02:23 +00:00
dsl	e19cad8fcc	Revert most of the previous change. Only one fd needs clobbering, not all fds that reference the pipe. This may be what ad@ realised when he tried to add the same code to sockets. Unfixes part of PR/26567.	2009-12-13 18:27:02 +00:00
matt	dfa7467a6e	Pullup from matt-nb5-mips64. For each syscall, add a flag for the return value or an argument indicating that it is a 64-bit argument. Also include the number of 64-bit arguments. In theory this could get most of the code in compat/netbsd32/netbsd32_netbsd.c but not at the moment due to multiply defined structures.	2009-12-13 04:47:45 +00:00
dsl	c7517e0921	Add support for unblocking read/write when close called. Fixes PR/26567 for pipes. (NB ad backed out the fix for sockets)	2009-12-12 21:28:04 +00:00
dsl	9987412565	Fix comment for arg types of sys_profil().	2009-12-12 17:48:54 +00:00
dsl	ef379fcb95	Bounding the 'nfds' arg to poll() at the current process limit for actual open files is rather gross - the poll map isn't required to be dense. Instead limit to a much larger value (1000 + dt_nfiles) so that user programs cannot allocate indefinite sized blocks of kvm. If the limit is exceeded, then return EINVAL instead of silently truncating the list. (The silent truncation in select isn't quite as bad - although even there any high bits that are set ought to generate an EBADF response.) Move the code that converts ERESTART and EWOULDBLOCK into common code. Effectively fixes PR/17507 since the new limit is unlikely to be detected.	2009-12-12 17:47:05 +00:00
dsl	17a42f25f1	Report L_INMEM in the lwp info as well.	2009-12-12 17:29:34 +00:00
dsl	f537a9ce5f	Always set L_INMEM to maintain binary compatibility.	2009-12-12 17:03:19 +00:00
tsutsui	428585a7d8	Remove `volatile' qualifier from argument types of struct timeval passed to todr_gettime(9) and todr_settime(9). We no longer have an ancient and volatile struct timeval `time' global since we have switched to MI timercounter(9) on all ports. XXX1: some of these RTC drivers still assume 32bit time_t XXX2: some of these should be rewritten to use todr_[gs]ettime_ymdhms() XXX3: todr(9) man page doesn't mention todr_[gs]ettime_ymdhms()	2009-12-12 15:10:34 +00:00
tsutsui	a49264523b	Use bool where appropriate.	2009-12-12 11:35:16 +00:00
tsutsui	efd28fda6a	Don't use int to get delta of time_t values.	2009-12-12 11:28:40 +00:00
dsl	eff3e2124a	Avoid leaking a mutex_obj when pipe_create() fails for the read pipe. Remove the unused argument from pipeclose().	2009-12-10 20:55:17 +00:00
matt	6a9e4e8eeb	Change u_long to vaddr_t/vsize_t in exec code where appropriate (mostly involves setregs and vmcmds). Should result in no code differences.	2009-12-10 14:13:48 +00:00
drochner	a1a04dd1be	If a struct sigevent with SIGEV_SIGNAL is passed to timer_create(2), check the signal number to be in the allowed range. An invalid signal number could crash the kernel by overflowing the sigset_t array. More checks would be good, and SIGEV_THREAD shouldn't be dropped silently, but this fixes at least the local DOS vulnerability.	2009-12-10 12:39:12 +00:00
drochner	fe1db36da9	fix some security critical bugs: -an invalid signal number passed to mq_notify(2) could crash the kernel on delivery -- add a boundary check -mq_receive(2) from an empty queue crashed the kernel by NULL dereference in timeout calculation -- handle the NULL case -likewise for mq_send(2) to a full queue -a user could set mq_maxmsg (the maximal number of messages in a queue) to a huge value on mq_open(O_CREAT) and later use up all kernel memory by mq_send(2) -- add a sysctl'able limit which defaults to 16*mq_def_maxmsg (mq_notify(2) should get some more checks, and SIGEV_ values other than SIGEV_SIGNAL should be handled somehow, but this doesn't look security critical)	2009-12-10 12:22:48 +00:00
dsl	7a42c833db	Rename fo_drain() to fo_abort(), 'drain' is used to mean 'wait for output to drain' in many places, whereas fo_drain() was called in order to force blocking read()/write() etc calls to return to userspace so that a close() call from a different thread can complete. In the sockets code comment out the broken code in the inner function, it was being called from compat code.	2009-12-09 21:32:58 +00:00
dsl	43bac9730d	Correct comment, pipelock() no longer releases the mutex.	2009-12-06 20:26:55 +00:00
pooka	d2445bdd09	tsleep() on lbolt is now illegal. Convert cv_wakeup(&lbolt) to cv_broadcast(&lbolt) and get rid of the prior.	2009-12-05 22:38:19 +00:00
pooka	faa8e1b3e3	Convert tsleep(&lbolt) to kpause(). Make ltsleep/mtsleep on lbolt illegal. I examined all places where lbolt is referenced to make sure there were no pointer aliases of it passed to tsleep, but put a KASSERT in m/ltsleep() just to be sure.	2009-12-05 22:34:43 +00:00
pooka	debaf78619	explicitly initialize static boolean	2009-11-30 15:37:56 +00:00
pooka	051b421f3f	Create CTL_HW before creating nodes on top of it (sysctl constructors run in "random" order).	2009-11-30 11:28:35 +00:00
pooka	0fb0ab1101	Fix kernel build on platforms which define __BROKEN_CONFIG_UNIT_USAGE and therefore don't take config_alldevs_lock() in config_devalloc().	2009-11-29 15:17:30 +00:00
dsl	454df0687b	When truncating a request in bounds_check_with_mediasize() multiply by the provided sector size instead of 512. Fixes last bit of PR/31565	2009-11-28 22:38:07 +00:00
bouyer	8c392da154	Previous did cause a deadlock with layered FS: the vrele thread can sleep on the vnode lock, while vget is sleeping on the VI_INACTNOW flag (or the vget caller is looping on vget returning failure because of the VI_INACTNOW flag). With layered FSes, the upper and lower vnodes share the same lock, so the vget() caller above can be already holding the vnode lock. Fix by dropping VI_INACTNOW before sleeping on the vnode lock in vrelel(), and check the ref count again once we have the lock. If the vnode has more than one reference, don't VOP_INACTIVE it. Fix PR kern/42318 and PR kern/42377. Patch tested by Hisashi T Fujinaka, Joachim König, Stephen Borrill and Matthias Scheler.	2009-11-28 10:10:17 +00:00
pooka	bbc50ef41d	Due to the schizophrenic nature of kobj (mem + vfs source), split the module in twain to subr_kobj.c (master + mem) and subr_kobj_vfs.c (vfs).	2009-11-27 17:54:11 +00:00
pooka	8102fe7341	Move rootfs-related init from init_main() to vfs_mountroot(). Reduces code re-written in rump.	2009-11-27 16:43:51 +00:00
pooka	8257134a74	Make this work on some m68k ports which like putting the disklabel in the third sector (or have copypasted disklabel.h from a port which likes doing that ;).	2009-11-27 13:29:33 +00:00
tsutsui	c48b085654	u_short -> uint16_t, some KNF.	2009-11-27 11:23:50 +00:00
pooka	1798957738	Add DV_VIRTUAL for non-backed virtual devices and allow to mount root from a DV_VIRTUAL device.	2009-11-26 20:52:19 +00:00
pooka	baffc0cbae	typo in comment (it actually breaks the script totally. i wish more typos in comments were as effective)	2009-11-26 17:23:48 +00:00
pooka	91ac00ac3a	pipe +RUMP	2009-11-26 17:20:20 +00:00
pooka	67ff6315cd	Add rump support for the special handling required by pipe(2).	2009-11-26 17:19:54 +00:00
pooka	a91020162b	Instead of a single register_t as the retval of rump syscalls, use an array of two. No functional change ... yet.	2009-11-26 16:34:24 +00:00
pooka	024c040316	modctl +RUMP	2009-11-26 09:00:45 +00:00
matt	11af2f9cfa	Kill proc0paddr. Use lwp0.l_addr instead.	2009-11-26 00:19:11 +00:00
pooka	64ab232858	make WAPBL_DEBUG_PRINT compile	2009-11-25 14:43:31 +00:00
pooka	5fc3d70195	Remove highly questionable assert which demands that the kernel symbol table is in memory at a lower address than the string table.	2009-11-25 13:16:55 +00:00
rmind	606b1d9782	Add assert that ce->ce_func is not NULL.	2009-11-24 20:11:50 +00:00
dyoung	c8fed843e1	Address some of the concerns that SPLDEBUG is not machine-independent, Part 1 of N: There is not an MI ordering of interrupt priority levels, so use == IPL_HIGH and != IPL_HIGH instead of >= IPL_HIGH and < IPL_HIGH. Ignore 'cold' and always use curcpu(), since cpu_info_primary is MD. Other changes: There is no need to create symbols named _spldebug_* and strong aliases to them. Just use symbols spldebug_*, instead. Use a temporary variable instead of repeated cpu_index(9) calls. KASSERT() that cpu_index(9) is < MAXCPUS.	2009-11-24 17:28:32 +00:00
pooka	09dbb89b44	If cpu_disklabel includes struct dkbad, define __HAVE_DISKLABEL_DKBAD. This allows use of subr_disk_mbr on all archs. Default to it for the rump disk component. No functional change for regular kernels. (The other option would've been to include dkbad in disklabels everywhere, but arguably this approach has less possible side-effects, especially given that wedges and related magic will take over the world any second now).	2009-11-23 13:40:08 +00:00
mbalmer	0ae57f90dd	more s/the the/the/	2009-11-22 19:09:15 +00:00
enami	07ab814664	Fix indentation, wrap long line and remove unused variable.	2009-11-19 03:01:05 +00:00
enami	9f91c09ebc	Add missing vfs_unbusy() call in error path of sysctl_kern_vnode(). This allows us to reboot machine successfully even if pstat -v fails once.	2009-11-19 02:59:33 +00:00
pooka	a8ed404de6	* make it possible to include kern_module in a kernel without vfs support, i.e. move vfs functionality to a separate module (kern_module_vfs.c) * make module proplist size an MI constant (now 8k) instead of PAGE_SIZE * change some error values to something else than the karmic EINVAL	2009-11-18 17:40:45 +00:00
yamt	d8b340409c	turnstile_block: reduce code duplication.	2009-11-18 12:26:22 +00:00
yamt	e8ed984955	turnstile_block: turn a comment into KASSERTs.	2009-11-18 12:25:15 +00:00
bouyer	e3c6fd050a	Fix getcleanvnode() in previous: in the if (vp->v_usecount != 0) case we didn't bump the refcount, so don't decrease it through vrelel(). call mutex_exit() on v_interlock directly instead.	2009-11-17 22:20:14 +00:00
pooka	1d8a950195	Add a comment saying "name" to pool_init() is never freed (fixing requires touching pool implementation). No biggie, though, since the pools themselves are never freed.	2009-11-17 14:38:31 +00:00
elad	903af42390	Include miscfs/specfs/specdev.h for spec_init().	2009-11-15 02:37:13 +00:00
rmind	16347a5be7	kpsignal2: do not make the signal pending twice when tracing the process, also update a comment and add an assert. Fixes PR/42309 by Nicolas Joly.	2009-11-14 19:06:54 +00:00
elad	1570e68c40	- Move kauth_init() a little bit higher. - Add spec_init() to authorize special device actions (and passthru too for the time being). Move policy out of secmodel_suser.	2009-11-14 18:36:56 +00:00
dsl	e6a11930a4	Christos was worried about clrbits() being called with a length of zero. This can't happen, but rework so it doesn't matter. Remove 'optimisation' for length 1, that doesn't happen often enough.	2009-11-14 13:18:41 +00:00
dsl	f3583ee6ce	Fix clrbits() so that it doesn't mask no bits out of the byte after the range (when the last bit to be cleared is the msb of a byte). Fixes PR/42312 in a slightly better way than proposed.	2009-11-13 19:15:24 +00:00
dsl	be258d919e	Change args to clrbits() to be unsigned for efficiency.	2009-11-13 19:00:15 +00:00
dyoung	3ea78c91dc	Use TAILQ_FOREACH() instead of open-coding it. I applied this patch with Coccinelle's semantic patch tool, spatch(1). I installed Coccinelle from pkgsrc: devel/coccinelle/. I wrote tailq.spatch and kdefs.h (see below) and ran this command, spatch -debug -macro_file_builtins ./kdefs.h -outplace \ -sp_file sys/kern/tailq.spatch sys/kern/subr_autoconf.c which wrote the transformed source file to /tmp/subr_autoconf.c. Then I used indent(1) to fix the indentation. :::::::::::::::::::: ::: tailq.spatch ::: :::::::::::::::::::: @@ identifier I, N; expression H; statement S; iterator name TAILQ_FOREACH; @@ - for (I = TAILQ_FIRST(H); I != NULL; I = TAILQ_NEXT(I, N)) S + TAILQ_FOREACH(I, H, N) S ::::::::::::::: ::: kdefs.h ::: ::::::::::::::: #define MAXUSERS 64 #define _KERNEL #define _KERNEL_OPT #define i386 /* * Tail queue definitions. */ #define _TAILQ_HEAD(name, type, qual) \ struct name { \ qual type *tqh_first; /* first element */ \ qual type *qual *tqh_last; /* addr of last next element */ \ } #define TAILQ_HEAD(name, type) _TAILQ_HEAD(name, struct type,) #define TAILQ_HEAD_INITIALIZER(head) \ { NULL, &(head).tqh_first } #define _TAILQ_ENTRY(type, qual) \ struct { \ qual type *tqe_next; /* next element */ \ qual type *qual *tqe_prev; /* address of previous next element */\ } #define TAILQ_ENTRY(type) _TAILQ_ENTRY(struct type,) #define PMF_FN_PROTO1 pmf_qual_t #define PMF_FN_ARGS1 pmf_qual_t qual #define PMF_FN_CALL1 qual #define PMF_FN_PROTO , pmf_qual_t #define PMF_FN_ARGS , pmf_qual_t qual #define PMF_FN_CALL , qual #define __KERNEL_RCSID(a, b)	2009-11-12 23:16:28 +00:00
dyoung	972989f5e3	Move a device-deactivation pattern that is replicated throughout the system into config_deactivate(dev): deactivate dev and all of its descendants. Block all interrupts while calling each device's activation hook, ca_activate. Now it is possible to simplify or to delete several device-activation hooks throughout the system. Do not deactivate a driver while detaching it! If the driver was already deactivated (because of accidental/emergency removal), let the driver cope with the knowledge that DVF_ACTIVE has been cleared. Otherwise, let the driver access the underlying hardware (so that it can flush caches, restore original register settings, et cetera) until it exits its device-detachment hook. Let multiple readers and writers simultaneously access the system's device_t list, alldevs, from either interrupt or thread context: postpone changing alldevs linkages and freeing autoconf device structures until a garbage-collection phase that runs after all readers & writers have left the list. Give device iterators (deviter(9)) a consistent view of alldevs no matter whether device_t's are added and deleted during iteration: keep a global alldevs generation number. When an iterator enters alldevs, record the current generation number in the iterator and increase the global number. When a device_t is created, label it with the current global generation number. When a device_t is deleted, add a second label, the current global generation number. During iteration, compare a device_t's added- and deleted-generation with the iterator's generation and skip a device_t that was deleted before the iterator entered the list or added after the iterator entered the list. The alldevs generation number is never 0. The garbage collector reaps device_t's whose delete-generation number is non-zero. Make alldevs private to sys/kern/subr_autoconf.c. Use deviter(9) to access it.	2009-11-12 19:10:30 +00:00
rmind	ad4f42d499	workqueue_finiqueue: remove unused variable.	2009-11-11 14:54:40 +00:00
rmind	1283950019	- selcommon/pollcommon: drop redundant l argument. - Use cached curlwp->l_fd, instead of p->p_fd. - Inline selscan/pollscan.	2009-11-11 09:48:50 +00:00
rmind	e6f025f1da	Add a small comment on buffer cache locking, fix mark letter b_objlock.	2009-11-11 09:15:42 +00:00
rmind	484f70316c	G/C unused breada() and bdirty().	2009-11-11 07:22:33 +00:00
cegger	9480c51b04	Add a flags argument to pmap_kenter_pa(9). Patch showed on tech-kern@ http://mail-index.netbsd.org/tech-kern/2009/11/04/msg006434.html No objections.	2009-11-07 07:27:40 +00:00
pooka	1dac1a8cbc	g/c M_SOFTINTR	2009-11-06 13:32:41 +00:00
dyoung	fbe2bb0ace	Use deviter(9) instead of accessing alldevs directly.	2009-11-05 18:07:19 +00:00
pooka	11b02a2b55	Excommunicate comment not abiding to the 80col dogma. (well, turns out it was no longer valid either)	2009-11-05 16:15:51 +00:00
pooka	35a75982e4	expose module_{lookup,enqueue}()	2009-11-05 14:09:14 +00:00
bouyer	6b8161200e	getcleanvnode(): don't vclean() the vnode if it has gained another reference while we were getting the v_interlock. vget(): attempt to prevent it from returning a clean vnode: if the vnode is being inactivated (by vrelel()), wait for vrelel() to complete (or return EBUSY if we can't wait), and return ENOENT if the vnode has been vclean'ed by vrelel(). Fix kern/41147 in a better way, hopefully fix other related race conditions.	2009-11-05 08:18:02 +00:00
rmind	4c1098f541	do_sys_wait(): fix previous by checking for ru != NULL. Noticed by Onno van der Linden. Also, remove redundant arguments (seems that was_zombie was not used since rev 1.177 ?).	2009-11-04 21:23:02 +00:00
pooka	fcc20a4ba1	Split uiomove() and high-level copy routines out of the crowded kern_subr and into their own cozy home in subr_copy.	2009-11-04 16:54:00 +00:00
pooka	ab72032a6c	nuke unused local variable	2009-11-04 15:35:09 +00:00
pooka	83685e650c	Heave-ho mutex/rwlock object routines into separate modules -- they don't have anything to do with the lock internals.	2009-11-04 13:29:45 +00:00
dyoung	e48f8429d1	Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available on i386 and Xen kernels, today. 'options SPLDEBUG' adds instrumentation to spllower() and splraise() as well as routines to start/stop debugging and to record IPL transitions: spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().	2009-11-03 05:23:27 +00:00
dyoung	648f423c6f	Make lockdebug_lock_print(NULL, ...) dump all locks. Now, in ddb, 'show lock 0x0' dumps all of the locks. XXX I still need to fix 'show all lock'.	2009-11-03 00:29:11 +00:00
rmind	b9a294cf04	- Move inittimeleft() and gettimeleft() to subr_time.c, where they belong. - Move abstimeout2timo() there too and export. Use it in lwp_park().	2009-11-01 21:46:09 +00:00
rmind	1ceff942e5	Move common logic in selcommon() and pollcommon() into sel_do_scan(). Avoids code duplication. XXX: pollsock() should be converted too, except it's a bit ugly.	2009-11-01 21:14:21 +00:00
rmind	1ff7612225	do_sys_wait: clear rusage, instead of returning garbage. Patch from dholland@ via PR/40717, with minor change by me.	2009-11-01 21:05:30 +00:00
rmind	5ccbe1e208	orphanpg: remove no longer used variable.	2009-11-01 20:59:24 +00:00
njoly	b83467c466	Make flock(2) more robust to invalid operation, such as (LOCK_EX|LOCK_SH).	2009-10-28 18:24:44 +00:00
rmind	e4be2748a3	- Amend fd_hold() to take an argument and add assert (reflects two cases, fork1() and the rest, e.g. kthread_create(), when creating from lwp0). - lwp_create(): do not touch filedesc internals, use fd_hold().	2009-10-27 02:58:28 +00:00
rmind	0ca6708c13	- Use pool(9) for pmf_event_workitem_t, instead of pool_cache(9). Still, meta-data of this pool takes more space than the actual data.. - Reduce lowat/hiwat to 1..8, since intensity is very low. - Remove unused pew_next_free from pmf_event_workitem_t.	2009-10-27 02:55:07 +00:00
rmind	c32b625d4c	Update comment about proc0_init().	2009-10-26 19:03:17 +00:00
rmind	554a0142dc	Initialise struct emul members by name (it is readable now and one can search them in the tree).	2009-10-25 01:14:03 +00:00
rmind	33963b1448	Avoid #ifndef __NO_CPU_LWP_FREE, only ia64 is missing cpu_lwp_free routines and it can/should provide stubs.	2009-10-22 22:28:57 +00:00
rmind	30d0b02e57	Make lwp_park_sobj and lwp_park_tab static. Wrap long lines while here.	2009-10-22 13:12:47 +00:00
rmind	40cf6f3659	Remove uarea swap-out functionality: - Addresses the issue described in PR/38828. - Some simplification in threading and sleepq subsystems. - Eliminates pmap_collect() and, as a side note, allows pmap optimisations. - Eliminates XS_CTL_DATA_ONSTACK in scsipi code. - Avoids few scans on LWP list and thus potentially long holds of proc_lock. - Cuts ~1.5k lines of code. Reduces amd64 kernel size by ~4k. - Removes __SWAP_BROKEN cases. Tested on x86, mips, acorn32 (thanks <mpumford>) and partly tested on acorn26 (thanks to <bjh21>). Discussed on <tech-kern>, reviewed by <ad>.	2009-10-21 21:11:57 +00:00
jym	de3d6f78cf	Fix a bug where on MP systems, pool_cache_invalidate(9) could be called early during boot, just after CPUs are attached but before they are marked as running. This will result in a list of CPUs without the SPCF_RUNNING flag set, and will trigger the 'KASSERT(xc_tailp < xc_headp)' in xc_lowpri() as no cross call is issued. Bug reported and patch tested by tron@. See also http://mail-index.netbsd.org/tech-kern/2009/10/19/msg006293.html	2009-10-20 17:24:22 +00:00
snj	07ce40632e	Follow upstream's lead and remove third and fourth clauses (except from usr.sbin/mopd/common/pf.c, where only the ad clause is removed, because it has a shared UCB copyright) on Mats O Jansson's files. thorpej OK'd usr.sbin/rpc.yppasswdd/yppasswdd_mkpw.c, where he shares copyright.	2009-10-20 00:51:13 +00:00
snj	4968c04d96	Move Eduardo Horvath's license to 2 clause. OK eeh@.	2009-10-19 18:12:37 +00:00
jnemeth	30d0592bd3	allow passing a NULL proplib dictionary to modctl(MODCTL_LOAD, ...)	2009-10-16 00:27:07 +00:00
thorpej	1f59a448f4	- pool_cache_invalidate(): broadcast a cross-call to drain the per-CPU caches before draining the global cache. - pool_cache_invalidate_local(): remove.	2009-10-15 20:50:12 +00:00
pooka	624234c0c5	Generate scheduling points around rump vnode operations.	2009-10-15 00:29:40 +00:00
dsl	931ac5949a	Error out of ptcread() if the uio length supplied is zero before the code has a chance to panic in ureadc().	2009-10-14 19:25:39 +00:00
pooka	ddc943db02	regen: fix rump varargs syscalls prototypes	2009-10-13 21:57:52 +00:00
pooka	0d8bdf6131	For varargs syscalls, create rump prototypes which match the regular system call counterparts, e.g.: open(const char *, int, mode_t) -> open(const char *, int, ...)	2009-10-13 21:54:29 +00:00
yamt	e894729250	sys___aio_suspend50, sys_lio_listio: - fix the buffer sizes. - use kmem_alloc instead of kmem_zalloc for buffers which we will overwrite soon.	2009-10-12 23:43:13 +00:00
yamt	29e552b036	wrap long lines. no functional changes.	2009-10-12 23:38:08 +00:00
yamt	b8562be527	make aio_worker static.	2009-10-12 23:36:56 +00:00
yamt	5873138145	constify	2009-10-12 23:36:02 +00:00
yamt	28bf72b353	fix KMEM_SIZE vs KMEM_GUARD	2009-10-12 23:35:09 +00:00
yamt	de25ce6a4c	remove no longer necessary include of drvctl.h	2009-10-12 23:33:02 +00:00
yamt	199e4526f3	aio_suspend1: fix a double free bug.	2009-10-12 23:31:59 +00:00
dsl	65dd100015	Check for zero length read here - and return zero. Most times we've come through spec_read() which has already done the test, but not always (eg pty with ptsfs mounted). Without this there is a simple local-user panic in ureadc(). Noted by Matthew Mondor on tech-kern.	2009-10-11 17:20:48 +00:00
dsl	270307174b	Fix locking when collecting pt_read and pt_ucntl.	2009-10-11 08:08:32 +00:00
jym	31629a1342	Add pool_cache_invalidate_local() to the pool_cache(9) API, to permit per-CPU objects invalidation when cached in the pool cache. See http://mail-index.netbsd.org/tech-kern/2009/10/05/msg006206.html . Reviewed by bouyer@. Thanks!	2009-10-08 21:54:45 +00:00
elad	2cb56be586	Add a (weak aliased) machdep_init() as a place to do machdep initialization that can't happen as early as the other init functions as called from cpu_startup() -- for example, register kauth(9) listeners. Put unprivileged policy in the x86 code; used by i386, amd64, and xen.	2009-10-06 21:07:05 +00:00
elad	756638cf95	Factor out a block of code that appears in three places (Veriexec, keylock, and securelevel) so that others can use it as well.	2009-10-06 04:28:10 +00:00
rmind	c9a5a18df3	mq_timedsend/mq_timedreceive: timeout value is absolute, not relative. While here, drop unnecessary (since fdesc API changes) lwp_t arguments. Bug reported by Stathis Kamperis, thanks!	2009-10-05 23:49:46 +00:00
rmind	5503429772	shmexit: simplify a lot by avoiding unnecessary memory allocations, since it is a last reference, just re-lock and check mapping list again. Often there won't be re-locks at all, moreover, shm_lock is not contended at all.	2009-10-05 23:47:04 +00:00
rmind	c3a98b4c87	semu_alloc: simplify a little.	2009-10-05 23:46:02 +00:00
rmind	ac8f63538a	Convert cpu_number(), which can be sparse, to cpu_index(), which is MI.	2009-10-05 23:39:27 +00:00
elad	4c9fcb77c3	- Add usermount_common_policy() that implements some common (everything but access control) user mounting policies: enforced MNT_NOSUID and MNT_NODEV, no MNT_EXPORT, MNT_EXEC propagation. This can be useful for secmodels that are interested in simply adding finer grained user mount support. - Add a mount subsystem listener for KAUTH_REQ_SYSTEM_MOUNT_GET.	2009-10-05 04:20:13 +00:00
elad	fa69dc186a	Install floppies (haha) don't get built with ktrace/ptrace, so they don't include kern/sys_process.c. Move proc_uidmatch() to kern/kern_proc.c which always gets built instead. Pointed out by Kurt Schreiner on current-users@: http://mail-index.netbsd.org/current-users/2009/10/03/msg010745.html	2009-10-04 03:15:08 +00:00
elad	b2f3768346	- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it really belongs (suggested by rmind@), - Rename sched_init() to synch_init(), and introduce a new sched_init() in sys_sched.c where we (a) initialize the sysctl node (no more link-set) and (b) listen on the process scope with sched_listener. Reviewed by and okay rmind@.	2009-10-03 22:32:56 +00:00
elad	458410e7b5	Oops, forgot to make sched_listener static. Pointed out by rmind@, thanks!	2009-10-03 21:21:56 +00:00
elad	54d08ac134	Update a comment. No functional change.	2009-10-03 21:03:55 +00:00
elad	a39251ecc2	Introduce time_wraps() to check if setting the time will wrap it (or close to it). Useful for secmodels. Replace open-coded form with it in secmodel code (securelevel, keylock). Note: I need to find a way to make secmodel_keylock.c ~<100 lines.	2009-10-03 20:48:42 +00:00
elad	7f720ad562	KAUTH_GENERIC_CANSEE -> KAUTH_REQ_NETWORK_SOCKET_CANSEE. Not quite the same semantics but it's okay. Once our sockets have credentials (and they will) it's all the same.	2009-10-03 20:24:39 +00:00
elad	5b3a96a24d	Move KAUTH_NETWORK_BIND::KAUTH_REQ_NETWORK_BIND_PORT policy back to the subsystem (or close to it). Note: Revisit KAUTH_REQ_NETWORK_BIND_PRIVPORT.	2009-10-03 03:59:39 +00:00
elad	82ce55ed44	Move policies for KAUTH_PROCESS_{CANSEE,CORENAME,STOPFLAG,FORK} back to the subsystem. Note: Consider killing the signal listener and sticking KAUTH_PROCESS_SIGNAL here as well.	2009-10-03 03:38:31 +00:00
elad	111de3833c	Finish moving socket policy to the subsystem.	2009-10-03 01:41:39 +00:00
elad	452ced03bd	Move sched policy back to the subsystem.	2009-10-03 01:30:25 +00:00
elad	212f5fa214	Move kevent policy back to the subsystem.	2009-10-03 00:14:07 +00:00
elad	abc7a4290b	Put module loading policy back in the subsystem. Revisit: consider moving kauth_init() above module_init() in main().	2009-10-03 00:06:37 +00:00
elad	1f98cab201	Put the tty opening policy back in the subsystem. Remove include we don't need from the secmodel code.	2009-10-02 23:58:53 +00:00
elad	510083464f	Move some of the socket policy back to the subsystem. Remove include we don't need in the secmodel code.	2009-10-02 23:50:16 +00:00
elad	8751f894d8	Put signal delivery policy back in the subsystem.	2009-10-02 23:24:15 +00:00
elad	09f3ac9e2f	Stick nice policy in its own subsystem and call the listener "resource" rather than "rlimit"...	2009-10-02 22:46:18 +00:00
elad	bcc5014bd0	Move rlimit policy back to the subsystem. For this we needed proc_uidmatch() exposed, which makes a lot of sense, so put it back in sys_process.c for use in other places as well.	2009-10-02 22:38:45 +00:00
elad	2ae3a70827	Move ptrace's security policy back to the subsystem itself. Add a ptrace_init() so we have a place to register the listener; called next to ktrinit().	2009-10-02 22:18:56 +00:00
elad	40cc528a28	Move psets security policy back to the subsystem and keep suser logic only in the suser secmodel code.	2009-10-02 21:56:28 +00:00
elad	932cd15f91	Move ktrace's subsystem security policy to the subsystem itself, and keep just the suser-related logic in the suser secmodel.	2009-10-02 21:47:35 +00:00
elad	53ca19a3b3	First part of secmodel cleanup and other misc. changes: - Separate the suser part of the bsd44 secmodel into its own secmodel and directory, pending even more cleanups. For revision history purposes, the original location of the files was src/sys/secmodel/bsd44/secmodel_bsd44_suser.c src/sys/secmodel/bsd44/suser.h - Add a man-page for secmodel_suser(9) and update the one for secmodel_bsd44(9). - Add a "secmodel" module class and use it. Userland program and documentation updated. - Manage secmodel count (nsecmodels) through the module framework. This eliminates the need for secmodel_{,de}register() calls in secmodel code. - Prepare for secmodel modularization by adding relevant module bits. The secmodels don't allow auto unload. The bsd44 secmodel depends on the suser and securelevel secmodels. The overlay secmodel depends on the bsd44 secmodel. As the module class is only cosmetic, and to prevent ambiguity, the bsd44 and overlay secmodels are prefixed with "secmodel_". - Adapt the overlay secmodel to recent changes (mainly vnode scope). - Stop using link-sets for the sysctl node(s) creation. - Keep sysctl variables under nodes of their relevant secmodels. In other words, don't create duplicates for the suser/securelevel secmodels under the bsd44 secmodel, as the latter is merely used for "grouping". - For the suser and securelevel secmodels, "advertise presence" in relevant sysctl nodes (sysctl.security.models.{suser,securelevel}). - Get rid of the LKM preprocessor stuff. - As secmodels are now modules, there's no need for an explicit call to secmodel_start(); it's handled by the module framework. That said, the module framework was adjusted to properly load secmodels early during system startup. - Adapt rump to changes: Instead of using empty stubs for securelevel, simply use the suser secmodel. Also replace secmodel_start() with a call to secmodel_suser_start(). - 5.99.20. Testing was done on i386 ("release" build). 
Separated module_init() changes were tested on sparc and sparc64 as well by martin@ (thanks!). Mailing list reference: http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html	2009-10-02 18:50:12 +00:00
pooka	68f37adaa6	Give humanize_number & format_bytes their own spots in the sun and move from kern_subr to subr_humanize.	2009-10-02 15:48:41 +00:00
pooka	bea18fb702	Add dealloccnt to list of things to be considered in the stetson-harrison decision making algorithm for flushing a wapbl transaction.	2009-10-01 12:28:34 +00:00
pooka	5b19885537	Turn a KASSERT into a panic. I don't want us to be randomly overwriting memory on non-DIAGNOSTIC kernels if resource estimation fails.	2009-10-01 07:42:45 +00:00
dyoung	e533051d0f	#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL definition, drvctl_init() is not called, the drvctl_eventq is not initialized, and the kernel will panic in devmon_insert() when a device is detached. Thanks to Jared McNeill for pointing out the panic.	2009-09-29 22:40:15 +00:00
pooka	8de13bd4c6	regen: remove VNODE_LOCKDEBUG	2009-09-29 11:54:52 +00:00
pooka	ab3237b942	Add a switch on whether to create VNODE_LOCKDEBUG checks or not. Since VNODE_LOCKDEBUG has never been generally useful, default to off. However, the checks can still be generated by flipping the switch for the isolated cases where this form of dynamic analysis is useful and the person using it knows what she is doing.	2009-09-29 11:51:02 +00:00
dholland	8d36057243	Move a big wodge of symlink-following code from nfsd to inside lookup_for_nfsd(). This code is, or at least should be, the same as the regular symlink-following code plus an extra flag nfsd needs. The two lots of code can/will be merged in the future.	2009-09-27 17:23:53 +00:00
dholland	fb458255a3	Rename lookup() to lookup_for_nfsd(), to make it clear just whose private backdoor entry point this is. Also, clone the lookup_for_nfsd() entry point as lookup_for_nfsd_index(), for use by a different call site in nfsd that does different unclean things with nameidata.	2009-09-27 17:19:07 +00:00
dyoung	7e8a3f8dc1	Replace 'struct device *' with 'device_t', throughout. No functional change intended.	2009-09-25 19:21:09 +00:00
yamt	d571330722	cwdinit: whitespace fix. no functional changes.	2009-09-24 06:14:22 +00:00
pooka	9b040bc3a9	Split config_init() into config_init() and config_init_mi() to help platforms which want to call config_init() very early in the boot.	2009-09-21 12:14:46 +00:00
jmcneill	ae17b8bef2	If vfs_mountroot fails, print a list of supported file systems. If no file systems are supported by the kernel, print a big fat warning instead.	2009-09-19 16:20:41 +00:00
pooka	26e4989d18	Provide unwind log for bufq sysctls, since (theoretically) bufq might not be initialized during kernel bootstrap and therefore "permanent" nodes can be created only with an unwind log.	2009-09-17 09:54:27 +00:00
pooka	8a9910b608	Can't use CTLFLAG_PERMANENT here without providing a rollback log, since accept filters aren't (necessarily) added during kernel boot phase. pointed out & tested by Geoff Wing	2009-09-17 08:09:49 +00:00
dyoung	8497597988	Nothing calls config_activate(9) any longer, so delete it.	2009-09-16 22:45:23 +00:00
dyoung	36fffd8d02	In pmf(9), improve the implementation of device self-suspension and make suspension by self, by drvctl(8), and by ACPI system sleep play nice together. Start solidifying some temporary API changes. 1. Extract a new header file, <sys/device_if.h>, from <sys/device.h> and #include it from <sys/pmf.h> instead of <sys/device.h> to break the circular dependency between <sys/device.h> and <sys/pmf.h>. 2. Introduce pmf_qual_t, an aggregate of qualifications on a PMF suspend/resume call. Start to replace instances of PMF_FN_PROTO, PMF_FN_ARGS, et cetera, with a pmf_qual_t. 3. Introduce the notion of a "suspensor," an entity that holds a device in suspension. More than one suspensor may hold a device at once. A device stays suspended as long as at least one suspensor holds it. A device resumes when the last suspensor releases it. Currently, the kernel defines three suspensors, 3a the system-suspensor: for system suspension, initiated by 'sysctl -w machdep.sleep_state=3', by lid closure, by power-button press, et cetera, 3b the drvctl-suspensor: for device suspension by /dev/drvctl ioctl, e.g., drvctl -S sip0. 3c the system self-suspensor: for device drivers that suspend themselves and their children. Several drivers for network interfaces put the network device to sleep while it is not administratively up, that is, after the kernel calls if_stop(, 1). The self-suspensor should not be used directly. See the description of suspensor delegates, below. A suspensor can have one or more "delegates". A suspensor can release devices that its delegates hold suspended. Right now, only the system self-suspensor has delegates. For each device that a self-suspending driver attaches, it creates the device's self-suspensor, a delegate of the system self-suspensor. Suspensors stop a system-wide suspend/resume cycle from waking devices that the operator put to sleep with drvctl before the cycle. 
They also help self-suspension to work more simply, safely, and in accord with expectations. 4. Add the notion of device activation level, devact_level_t, and a routine for checking the current activation level, device_activation(). Current activation levels are DEVACT_LEVEL_BUS, DEVACT_LEVEL_DRIVER, and DEVACT_LEVEL_CLASS, which respectively indicate that the device's bus is active, that the bus and device are active, and that the bus, device, and the functions of the device's class (network, audio) are active. Suspend/resume calls can be qualified with a devact_level_t. The power-management framework treats a devact_level_t that qualifies a device suspension as the device's current activation level; it only runs hooks to reduce the activation level from the presumed current level to the fully suspended state. The framework treats a devact_level_t qualifying device resumption as the target activation level; it only runs hooks to raise the activation level to the target. 5. Use pmf_qual_t, devact_level_t, and self-suspensors in several drivers. 6. Temporarily add an unused power-management workqueue that I will remove or replace, soon.	2009-09-16 16:34:49 +00:00
pooka	11281f01a0	Replace a large number of link set based sysctl node creations with calls from subsystem constructors. Benefits both future kernel modules and rump. no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL	2009-09-16 15:23:04 +00:00
pooka	41c00db98c	Chop init_sysctl into base nodes (init_sysctl_base.c) and the kitchen sink (init_sysctl.c). Further surgery may be needed down the line.	2009-09-16 15:03:56 +00:00
pooka	fbd53556dc	Wipe out the last vestiges of POOL_INIT with one swift stroke. In most cases, use a proper constructor. For proplib, give a local equivalent of POOL_INIT for the kernel object implementation. This way the code structure can be preserved, and a local link set is not hazardous anyway (unless proplib is split to several modules, but that'll be the day). tested by booting a kernel in qemu and compile-testing i386/ALL	2009-09-13 18:45:10 +00:00
bouyer	b21564d63d	PR kern/41923: assertion "cur != owner" failed In the for(;;) loop of turnstile_block(), the lock owner can change while cur's lock is released (cur's lock is also the tschain_t's mutex). Remove the KASSERT about owner being invariant and try to deal with the fact that the owner can change instead. http://mail-index.netbsd.org/tech-kern/2009/08/24/msg005957.html and followups.	2009-09-13 14:38:20 +00:00
dyoung	c5d5f7697a	Make ifconfig(8) set and display preference numbers for IPv6 addresses. Make the kernel support SIOC[SG]IFADDRPREF for IPv6 interface addresses. In in6ifa_ifpforlinklocal(), consult preference numbers before making an otherwise arbitrary choice of in6_ifaddr. Otherwise, preference numbers are not consulted by the kernel, but that will be rather easy for somebody with a little bit of free time to fix. Please note that setting the preference number for a link-local IPv6 address does not work right, yet, but that ought to be fixed soon. In support of the changes above, 1 Add a method to struct domain for "externalizing" a sockaddr, and provide an implementation for IPv6. Expect more work in this area: it may be more proper to say that the IPv6 implementation "internalizes" a sockaddr. Add sockaddr_externalize(). 2 Add a subroutine, sofamily(), that returns a struct socket's address family or AF_UNSPEC. 3 Make a lot of IPv4-specific code generic, and move it from sys/netinet/ to sys/net/ for re-use by IPv6 parts of the kernel and ifconfig(8).	2009-09-11 22:06:29 +00:00
apb	7ab65de0a9	Expose the kernel's boothowto(9) variable through the sysctl kern.boothowto variable. Part of the /etc/rc silent changes requested in PR 41946 and proposed in tech-userlevel.	2009-09-11 18:14:58 +00:00
dyoung	3d4351e682	Delete whitespace at ends of lines.	2009-09-08 18:01:34 +00:00
pooka	f926eb58c3	Remove autoconf dependency on vfs and dk: opendisk() -> kern/subr_disk_open.c config_handle_wedges -> dev/dkwedge/dk.c	2009-09-06 16:18:55 +00:00
pooka	5e46a7c29a	Move configure() and configure2() from subr_autoconf.c to init_main.c, since they are only peripherally related to the autoconf subsystem and more related to boot initialization. Also, apply _KERNEL_OPT to autoconf where necessary.	2009-09-03 15:20:08 +00:00
jmcneill	56614eff97	In bdev_strategy, return ENXIO instead of panicking if the block device has disappeared. ok pooka@	2009-09-03 11:42:21 +00:00
elad	a162140107	Implement the vnode scope and adapt tmpfs to use it. Mailing list reference: http://mail-index.netbsd.org/tech-kern/2009/07/04/msg005404.html	2009-09-03 04:45:27 +00:00
tls	fd671f648a	Add a direction argument to socket upcalls, so they can tell why they've been called when, for example, they're waiting for space to write. From Ritesh Agrawal at Coyote Point.	2009-09-02 14:56:57 +00:00
pooka	5523d7f5c9	Initialize devsw (lock) early so that subsystems may play with it.	2009-09-02 08:07:05 +00:00
rmind	e24f6c0896	Turn off pipe's direct I/O again, it corrupts the data (although build and various activity survived while testing this). Corruptions also happen on sparc64 where emap is not in effect, therefore bugs are in direct I/O code.	2009-08-31 20:48:14 +00:00
rmind	3a8481feb4	Make pool_head static.	2009-08-29 00:09:02 +00:00
rmind	924c9047ea	- Re-enable direct I/O with emap for pipe. - While not used, #ifdef KVA allocation in emap (so it won't burn the space).	2009-08-29 00:06:43 +00:00
bouyer	389f5178ad	In uipc_usrreq(PRU_ACCEPT), grab the unp_streamlock before unp_setpeerlocks(). This fixes a race where, for a short period of time, so->so_lock and so2->so_lock are not sync. This makes solocked2() and solocked() unreliable and cause DIAGNOSTIC kernel panics. This also fixes a possible panic in unp_setaddr() which expects the socket locked. Should fix kern/38968, fix proposed in http://mail-index.netbsd.org/tech-kern/2009/08/17/msg005863.html	2009-08-26 22:34:47 +00:00
dyoung	2d89489416	In sysctl_create(), the first character of sysctl_name is sysctl_name[0], so write that instead of sysctl_name[sz] (where sz just happened to be set to 0 in the previous line). Also in sysctl_create(), give the length of the sysctl_name its own variable, nsz, and reserve sz for expressing the size of the node's value. No functional change intended.	2009-08-24 20:53:00 +00:00
manu	dd47ec7336	Back out previous change: do not skip the test on rootspec, but make it a simple attempt instead of an authoritative answer. The failure of the rootspec test could be machine-dependent. Thanks to martin@ for pointing that out.	2009-08-23 12:10:50 +00:00
dyoung	210a227e29	In sysctl_realloc(), don't make 'i' act as both an child-array iterator and the length of the old child array, but introduce a new variable, 'olen', for the latter purpose. In sysctl_alloc(), name a constant. Introduce sysctl_log_print(), a handy debug routine. No functional changes intended.	2009-08-21 22:51:00 +00:00
dyoung	5a3627a2a6	Make sure that a sysctlnode's child nodes, even nodes that are not yet in service, have a correct pointer to their parent, sysctl_parent. This fixes a bug where sysctl_teardown(9) could not clean up a network interface's sysctl(9) trees when I detached it, because the wrong log had been recorded.	2009-08-21 22:43:32 +00:00
manu	61a1c8cdd1	When netbooting, rootspec is now "md0a", and it has no chance to match an interface name, so do not give it a try.	2009-08-21 09:20:47 +00:00
yamt	f97310f398	whitespace fixes. no functional changes.	2009-08-18 02:43:49 +00:00
christos	a9d1bfd0c5	provide compatibility for the older variant of kern.consdev, which used a 32 bit dev_t. Reported by mrg.	2009-08-16 20:28:19 +00:00
yamt	273f17a18a	kauth_cred_free: add an assertion.	2009-08-16 11:01:12 +00:00
yamt	77d977dcbc	assertion	2009-08-16 11:00:20 +00:00
yamt	d59302b0e4	struct lwp -> lwp_t for consistency	2009-08-16 10:59:25 +00:00
haad	5f6671a94a	Allow undescribed, direct ioctls as used by Unix. This capability was removed in BSD, presumably because nothing used it any more. Third party system software written for Unix (like ZFS) requires this to work without significant modifications. Ok supremeleader@	2009-08-13 08:57:43 +00:00
haad	5200b9b492	Add enum uio_seg argument to do_sys_mknod and do_sys_mkdir so these functions can be called from kernel, too. Change needed for zfs device node creation, until we have proper devfs. Oked by ad@.	2009-08-09 22:49:00 +00:00
dholland	f821ac304a	Begin splitting lookup() into more tractable pieces too.	2009-08-09 07:27:54 +00:00
dholland	40c09fbf2c	Begin splitting up namei into smaller pieces.	2009-08-09 03:28:35 +00:00
dsl	3d8c11d579	ktrace the arguments to script interpreters that come from the script. Fixes PR/33021	2009-08-06 21:33:54 +00:00
dsl	8b926bc93c	Fix ktrace of data from iovec based system calls. Fixes PR/41819	2009-08-05 19:53:42 +00:00
dsl	8129ef72eb	lockf() passes its arguments through to fcntl() but is supposed to support -ve lengths (lock area before current offset). Nothing in libc or the kernel allowed for this, so some random part of the file would get locked (no idea which bits). Although this could probably be fixed in libc, the stubs for posix file locks for emulations could easily get into the kernel with -ve lengths. So fixing in the kernel avoids those problems. This also fixes PR/41620 (attempting to lock negative offsets) - which is what I was looking into!	2009-08-05 19:39:50 +00:00
bad	0152c542e8	Add a note to change_root() that the callers need to authorize the operation. As requested by elad@.	2009-08-02 20:44:55 +00:00
christos	f1cd8c73cb	Don't return EWOULDBLOCK on an O_NONBLOCK tty file descriptor that has vmin > 0 and vtime > 0. It should be allowed to go to sleep for the sleep interval indicated in vtime. Reported by der Mouse a long while ago, and this is what other unixes do.	2009-08-01 23:07:05 +00:00
bad	02bcf17298	As discussed on tech-kern: Factor out common code of chroot-like syscalls into change_root() and export that function for use in other parts of the kernel. Rename change_dir() to chdir_lookup() as the latter describes better what the function does. While there, move the namei_data initialisation into chdir_lookup(), too. And export chdir_lookup().	2009-08-01 21:17:11 +00:00
mbalmer	9d8b69b23a	Do not attach gpiosim(4) at root, but make it a pseudo device. With help from Matthias Drochner, thanks!	2009-07-27 17:40:57 +00:00
mbalmer	953ebaaf3d	Allow gpiosim(4) to attach if configured in the kernel configuration.	2009-07-25 16:23:39 +00:00
christos	47736ab62e	check return code from soreserve() (Sean Boudreau)	2009-07-24 01:09:49 +00:00
pooka	39de73aae0	+fhopen, +fhstatvfs1 RUMP	2009-07-21 23:59:00 +00:00
yamt	0436400c70	set LP_RUNNING when starting lwp0 and idle lwps. add assertions.	2009-07-19 10:11:55 +00:00
rmind	db98cd9499	Regen.	2009-07-19 02:54:21 +00:00
rmind	7512d1e720	Make POSIX message queues a kernel module.	2009-07-19 02:50:44 +00:00
rmind	b95f99b9f9	Fix previous, so that it actually works, correctly.	2009-07-19 02:26:49 +00:00
ad	5c5bb856e1	Don't send the quiet banner to the log, since the usual noise gets dumped there anyway.	2009-07-17 23:31:51 +00:00
dyoung	b734bafe0e	Fix spelling: situatations -> situations.	2009-07-17 22:17:37 +00:00
dyoung	b43b2d186c	A definition in aic79xxvar.h somehow shadows pci_attach_args (ctags bug?), so leave it out of the tags computation for now.	2009-07-16 23:53:10 +00:00
rmind	569aa0de8b	Revert previous: disable direct I/O on pipe, it caught a problem with emap.	2009-07-15 21:09:41 +00:00
apb	dfcfba79d8	Convert free text inside #ifdef to a proper comment. Inspired by PR 41255 from Kurt Lidl.	2009-07-14 20:59:00 +00:00
tsutsui	46133c54ef	Add a workaround for some traditional ports (amiga and atari): - Defer callout_setfunc() call after config_init() call in configure(). Fixes silent hang before consinit() at least on atari. These traditional ports use config(9) structures and autoconf(9) functions to detect console devices, and config_init() is called at very early stage at boot where mutex(9) is not ready. Actually config_init() has been split out from configure() for these ports: http://cvsweb.NetBSD.org/bsdweb.cgi/src/sys/kern/subr_autoconf.c#rev1.74 while x68k has been fixed properly: http://mail-index.NetBSD.org/source-changes/2009/01/17/msg215673.html See also: http://mail-index.NetBSD.org/port-x68k/2008/12/31/msg000006.html http://mail-index.NetBSD.org/port-atari/2009/07/03/msg000419.html	2009-07-14 13:24:00 +00:00
rmind	f80b636295	Re-enable direct I/O for pipe: - Larger writes (2 or more pages) will use emap. - Might help to catch rare hang (some very old bug).	2009-07-13 02:49:08 +00:00
rmind	7e069f82fb	- Make insertion to message queue O(1) by using bitmap and array. However, mq_prio_max is dynamic, and sorted list is used for custom setup, when user manually sets higher priority range. - Cache mq->mq_attrib in some places. Change msg_ptr type to uint8_t. - Update copyright, misc.	2009-07-13 02:37:12 +00:00
rmind	b83b94a98e	mq_send/mq_receive: while permission may allow that, return EBADF if sending to read-only queue, or receiving from write-only queue. From Stathis Kamperis, thanks!	2009-07-13 00:41:08 +00:00
dyoung	2261ca8c07	In lwp_create(), take a reference to l2's filedesc_t instead of taking a reference to curlwp's by calling fd_hold(). If lwp_create() is called from fork1(), then l2 != curlwp, but l2's and not curlwp's filedesc_t whose reference we should take. This change stops the problem I describe in <http://mail-index.netbsd.org/tech-kern/2009/07/09/msg005422.html>, where /dev/rsd0a is never properly closed after fsck / runs on it. This change seems to quiet my USB backup drive, sd0 at scsibus0 at umass0, which had stopped spinning down when it was not in use: The unit probably stayed open after mount(8) tried (and failed: errant fstab entry) to mount it. I am confident that this change is an improvement, but I doubt that it is the last word on the matter. I hate to get under the filedesc_t abstraction by fiddling with fd_refcnt, and there may be something I have missed, so somebody with greater understanding of the file descriptors code should have a look.	2009-07-10 23:07:54 +00:00
dyoung	bfd7452af9	pmf_event_inject(9) may be called from interrupt context, so we must not allocate a pmf_event_workitem_t using kmem_alloc(9). Use pool_cache(9), instead, because it is safe in interrupt context. Thanks, rmind@, for catching the problem and suggesting the solution.	2009-07-08 18:53:36 +00:00
joerg	73df1b22f7	Remove unused include.	2009-07-06 12:37:17 +00:00
elad	518bb3e503	Message queues also use genfs_can_access() to control access. Since the latter might lose its KAUTH_GENERIC_ISSUSER check soon, add an internal function, mqueue_access(), and call genfs_can_access() from it instead so we don't pollute the main code path once we need to add a special kauth(9) check for message queues. No functional change, error codes preserved. Related mailing list thread: http://mail-index.netbsd.org/tech-kern/2009/06/21/msg005311.html	2009-07-03 21:32:09 +00:00
pooka	1a0b832e88	expose mkdir to in-kernel consumers	2009-07-02 12:53:47 +00:00
martin	53822d1e78	Update fd_freefile when kqueue descriptors are not copied from parent to child. From Wolfgang Solfrank in PR kern/41651. Approved by Andrew Doran.	2009-06-30 20:32:49 +00:00
yamt	6d375e715d	update a comment	2009-06-29 23:39:00 +00:00
dyoung	c53d86bdf2	Fix a typo in last (coda/ exclusion).	2009-06-29 18:03:37 +00:00
dholland	effcf1af5c	Convert 67 namei call sites to use namei_simple, in these functions: check_console, veriexecclose, veriexec_delete, veriexec_file_add, emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib, compat_20_sys_statfs, compat_20_netbsd32_statfs, ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs, ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib, osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs, ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4), adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount, ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount, ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags, sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown, sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs, sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl, sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file, sys_extattr_get_link, sys_extattr_delete_file, sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link, sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr, sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr All have been scrutinized (several times, in fact) and compile-tested, but not all have been explicitly tested in action. XXX: While I haven't (intentionally) changed the use or nonuse of XXX: TRYEMULROOT in any of these places, I'm not convinced all the XXX: uses are correct; an audit might be desirable.	2009-06-29 05:08:15 +00:00
dholland	acfecf55d7	Add namei_simple_kernel and namei_simple_user. These provide the common case functionality of namei in a simple package with only a couple flags. A substantial majority of the namei call sites in the kernel can use this interface; this will isolate those areas from the changes arising as the internals of namei are fumigated.	2009-06-29 05:00:14 +00:00
rmind	fe55ad324c	panic: use MI cpu_index(), instead of cpu_number(), which could be sparse.	2009-06-28 15:30:30 +00:00
rmind	5c68e5d0ee	Ephemeral mapping (emap) implementation. Concept is based on the idea that activity of other threads will perform the TLB flush for the processes using emap as a side effect. To track that, global and per-CPU generation numbers are used. This idea was suggested by Andrew Doran; various improvements to it by me. Notes: - For now, zero-copy on pipe is not yet enabled. - TCP socket code would likely need more work. - Additional UVM loaning improvements are needed. Proposed on <tech-kern>, silence there. Quickly reviewed by <ad>.	2009-06-28 15:18:50 +00:00
rmind	7b7c187a92	Amend previous.	2009-06-28 14:34:48 +00:00
rmind	39b52425ff	- Convert some #ifdefs to KASSERT()s. - KNF, style, no parameters in function declarations. - No functional changes.	2009-06-28 14:22:11 +00:00
yamt	85542b11cd	wrap a long line.	2009-06-28 11:42:07 +00:00
ad	5b4feac126	idle_loop: explicitly go to spl0() to sidestep potential MD bugs.	2009-06-28 09:25:05 +00:00
dyoung	b4f24be356	sys/coda/ rudely re-#defines some kernel constants and such, so leave it out of the tags for now.	2009-06-26 22:59:25 +00:00
dyoung	9d9978e5a5	Switch to kmem(9). (void *)pew is one way to get a struct work *, but let's write &pew->pew_work, instead. It is more defensive and persuasive. Make miscellaneous changes in support of tearing down arbitrary stacks of filesystems and devices during shutdown: 1 Move struct shutdown_state, shutdown_first(), and shutdown_next(), from kern_pmf.c to subr_autoconf.c. Rename detach_all() to config_detach_all(), and move it from kern_pmf.c to subr_autoconf.c. Export all of those routines. 2 In pmf_system_shutdown(), do not suspend user process scheduling, and do not detach all devices: I am going to do that in cpu_reboot(), instead. (Soon I will do it in an MI cpu_reboot() routine.) Do still call PMF shutdown hooks. 3 In config_detach(), add a DIAGNOSTIC assertion: if we're exiting config_detach() at the bottom, alldevs_nwrite had better not be 0, because config_detach() is a writer of the device list. 4 In deviter_release(), check to see if we're iterating the device list for reading, first, and if so, decrease the number of readers. Used to be that if we happened to be reading during shutdown, we ran the shutdown branch. Thus the number of writers reached 0, the number of readers remained > 0, and no writer could iterate again. Under certain circumstances that would cause a hang during shutdown.	2009-06-26 19:30:45 +00:00
dyoung	57a3ffeae7	Cosmetic: remove #if 1 / #endif.	2009-06-26 18:58:14 +00:00
dyoung	0b429bf76a	Keep a generation number, mountgen, that increases every time a filesystem is mounted. Synchronize access to the number with a mutex. When a struct mount, mp, is allocated, assign the current generation number to mp->mnt_gen. Introduce vfs_unmount_forceone() that forcefully unmounts the most recently mounted filesystem. Refactor: extract vfs_shutdown1() from vfs_shutdown(). Extract vfs_sync_all() from vfs_shutdown1(). Print more progress indications while we're unmounting all of the filesystems during shutdown. We increase the reference count on mp before calling dounmount(mp), but we do not decrease it if dounmount(mp) fails, and neither does dounmount(mp). So decrease the reference count if dounmount(mp) fails. Change the loop terminating condition in vfs_unmountall1() to (mp != (void *)&mountlist) from !CIRCLEQ_EMPTY(&mountlist), because we may not ever empty the list, especially if we're not forcing the filesystems to unmount.	2009-06-26 18:53:07 +00:00
christos	2ee7096547	magic symlink cleanup: - use size_t for len - don't call strlen multiple times in macro - add gid - off by one in bounds calculation	2009-06-26 15:49:03 +00:00
elad	55f182207a	Wow... too much Python. Fix DIAGNOSTIC build breakage: print -> printf. Pointed out by Kurt Schreiner on current-users@: http://mail-index.netbsd.org/current-users/2009/06/23/msg009815.html	2009-06-23 23:04:11 +00:00
elad	870920260d	Move the implementation of vaccess() to genfs_can_access(), in line with the other routines of the same spirit. Adjust file-system code to use it. Keep vaccess() for KPI compatibility and to keep element of least surprise. A "diagnostic" message warning that vaccess() is deprecated will be printed when it's used (obviously, only in DIAGNOSTIC kernels). No objections on tech-kern@: http://mail-index.netbsd.org/tech-kern/2009/06/21/msg005310.html	2009-06-23 19:36:38 +00:00
cegger	4765113ada	Return type of cpu_number(9) is cpuid_t which is effectively unsigned long. So cast return type to unsigned long. Fixes build for alpha GENERIC kernel.	2009-06-20 11:10:40 +00:00
mrg	8520c31093	when printing a ddb stack trace when entering ddb, include the cpu number	2009-06-18 06:26:58 +00:00
dyoung	61fa5bb9be	Make kobj_stat() return ENOSYS instead of panicking ("not modular") on non-MODULAR kernels. Make a few kobj_stat() callers check for a non-zero return code and deal gracefully.	2009-06-17 21:04:25 +00:00
kardel	a888100516	Make PPS work with fast time counters (> 2GHz) by making the pps count time stamp and the update time stamp u_int64. The time delta between two PPS events can now be correctly calculated avoiding any unaccounted for wraps with 32-bit counters.	2009-06-14 13:16:32 +00:00
plunky	6e74f4625b	Writes on the controlling tty were not being awoken from blocks, use the correct condvar to make this happen. this fixes PR/41566	2009-06-12 09:26:50 +00:00
yamt	1a7984dbf3	do_posix_fadvise: - deactivate pages on POSIX_FADV_DONTNEED. - more sanity checks. fix a panic in genfs_getpages introduced by the previous (rev.1.15).	2009-06-10 23:48:10 +00:00
yamt	724fd50176	don't make F_GETLK or the common case of F_UNLCK fail for per-user limit.	2009-06-10 22:34:35 +00:00
yamt	5216f042b0	lf_split: cv_destroy a condvar before clobbering it.	2009-06-10 22:23:15 +00:00
yamt	1763b7795c	do_posix_fadvise: on POSIX_FADV_WILLNEED, start prefetching of object's pages.	2009-06-10 01:56:34 +00:00

7430 Commits