Add interface in ptrace(2) to track thread (LWP) events:
- birth,
- termination.
The purpose of this interface is to keep track of the current thread state
in a tracee and apply e.g. per-thread hardware-assisted watchpoints.
This interface reuses the EVENT_MASK and PROCESS_STATE interface, and
shares it with PTRACE_FORK, PTRACE_VFORK and PTRACE_VFORK_DONE.
Change the following structure:
typedef struct ptrace_state {
	int	pe_report_event;
	pid_t	pe_other_pid;
} ptrace_state_t;
to
typedef struct ptrace_state {
	int	pe_report_event;
	union {
		pid_t	_pe_other_pid;
		lwpid_t	_pe_lwp;
	} _option;
} ptrace_state_t;
#define pe_other_pid _option._pe_other_pid
#define pe_lwp _option._pe_lwp
This keeps the size of ptrace_state_t unchanged, as both pid_t and lwpid_t
are int32_t-like integers. The change does not break existing prebuilt
software and requires minimal, if any, source-code changes. In summary, it
should be binary compatible and should not break the build of existing
software.
Introduce a new siginfo(5) type for LWP events under the SIGTRAP signal:
TRAP_LWP. This helps debuggers distinguish the exact source of a SIGTRAP.
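A minimal sketch of debugger-side usage (assuming the request names
PT_SET_EVENT_MASK / PT_GET_PROCESS_STATE and the ptrace_event_t field
pe_set_event; error handling omitted):

/* Sketch: subscribe to LWP events in an attached, stopped tracee and
 * identify which event fired after the next SIGTRAP stop. */
#include <sys/types.h>
#include <sys/ptrace.h>
#include <stdio.h>

static void
watch_lwp_events(pid_t pid)
{
	ptrace_event_t pe;
	ptrace_state_t st;

	pe.pe_set_event = PTRACE_LWP_CREATE | PTRACE_LWP_EXIT;
	ptrace(PT_SET_EVENT_MASK, pid, &pe, sizeof(pe));
	/* ... PT_CONTINUE the tracee, wait(2) for a SIGTRAP stop ... */
	ptrace(PT_GET_PROCESS_STATE, pid, &st, sizeof(st));
	if (st.pe_report_event == PTRACE_LWP_CREATE)
		printf("LWP %d was created\n", st.pe_lwp);
	else if (st.pe_report_event == PTRACE_LWP_EXIT)
		printf("LWP %d exited\n", st.pe_lwp);
}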
Add two basic t_ptrace_wait* tests:
lwp_create1:
Verify that 1 LWP creation is intercepted by ptrace(2) with
EVENT_MASK set to PTRACE_LWP_CREATE
lwp_exit1:
Verify that 1 LWP exit is intercepted by ptrace(2) with
EVENT_MASK set to PTRACE_LWP_EXIT
All tests are passing.
This change rides the previous kernel ABI bump to 7.99.59 for PTRACE_VFORK{,_DONE}.
Sponsored by <The NetBSD Foundation>
PTRACE_VFORK is supposed to be used to track vfork(2)-like events, where a
parent spawns a new child process and stops until the child exits or calls
exec().
Currently PTRACE_VFORK is a stub.
PTRACE_VFORK_DONE notifies a debugger that a parent has resumed after a
vfork(2)-like action.
PTRACE_VFORK_DONE raises SIGTRAP with TRAP_CHLD.
Sponsored by <The NetBSD Foundation>
The kernel raises SIGTRAP if EVENT_MASK (ptrace_event) enables PTRACE_FORK.
This new si_code helps debuggers distinguish the exact source of the signal
delivered to a debugger.
Another purpose of TRAP_CHLD is to retain the same behavior inside the
NetBSD kernel for process child traps while providing an interface to
monitor them.
The exact event and extended properties of a process child trap can be
retrieved with PT_GET_PROCESS_STATE.
There is no behavior change for existing software.
This si_code value is a NetBSD extension.
Sponsored by <The NetBSD Foundation>
This removes dead code introduced with the following commit:
date: 2012-07-27 22:52:49 +0200; author: christos; state: Exp; lines: +8 -2;
revert racy vfork() parent-blocking-before-child-execs-or-exits code.
ok rmind
This interface is designed to read signal information emitted to a tracee
and to fake that signal with a new value.
This functionality is required to distinguish the types of events that
occurred in the tracee and were intercepted by a debugger.
These accessors introduce a new structure type ptrace_siginfo:
/*
* Signal Information structure
*/
typedef struct ptrace_siginfo {
	siginfo_t	psi_siginfo;	/* signal information structure */
	lwpid_t		psi_lwpid;	/* destination LWP of the signal;
					 * value 0 means the whole process
					 * (route signal to all LWPs) */
} ptrace_siginfo_t;
Include <sys/siginfo.h> in <sys/ptrace.h> in order not to break existing
software due to the unknown symbol siginfo_t.
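A minimal sketch of the intended use (assuming the accessor requests are
named PT_GET_SIGINFO and PT_SET_SIGINFO; error handling omitted):

/* Sketch: read the pending signal information from a stopped tracee,
 * classify it, and rewrite ("fake") the signal before delivery. */
#include <sys/types.h>
#include <sys/ptrace.h>
#include <signal.h>

static void
rewrite_signal(pid_t pid)
{
	struct ptrace_siginfo psi;

	ptrace(PT_GET_SIGINFO, pid, &psi, sizeof(psi));
	if (psi.psi_siginfo.si_signo == SIGTRAP &&
	    psi.psi_siginfo.si_code == TRAP_LWP) {
		/* an LWP event, not a breakpoint */
	}
	psi.psi_siginfo.si_signo = SIGINT;	/* fake the signal value */
	ptrace(PT_SET_SIGINFO, pid, &psi, sizeof(psi));
}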
This interface has been proposed to the tech-kern@ mailing list.
Sponsored by <The NetBSD Foundation>
On exec() events under a debugger, generate the SIGTRAP signal with the
TRAP_EXEC property. This allows a tracer to distinguish exec() events
easily.
Sponsored by <The NetBSD Foundation>
its timestamps.
As this changes storage structures for data passed between kernel and
userland, welcome to 7.99.55!
XXX Output routines still use microsecond resolution when printf()ing.
XXX Possible future feature would be addition of option to use
XXX getbintime(9) for less time-critical histories.
Increment v_holdcnt to prevent the vnode from disappearing while
vcache_vget() waits for a stable state.
Now v_usecount tracks the number of successful references.
up using the old key until vcache_rekey_exit changes the key to the new one.
Add an assertion that the temporary key is different from the current one.
for averages. Otherwise the decisions can be heavily biased by rounding
errors.
Add sysctl kern.sched_average_weight to change the weight of
historical data, the default is 50%.
to lock this vnode's v_interlock -> vdrain_lock; another vnode sharing
the v_interlock may lock in this order.
While here, restore fstrans_start_nowait arg to FSTRANS_LAZY.
Fixes a deadlock seen recently on some pbulk environments.
Add new ptrace(2) calls:
- PT_COUNT_WATCHPOINTS - count the number of available hardware watchpoints
- PT_READ_WATCHPOINT - read struct ptrace_watchpoint from the kernel state
- PT_WRITE_WATCHPOINT - write new struct ptrace_watchpoint state, this
includes enabling and disabling watchpoints
The ptrace_watchpoint structure contains MI and MD parts:
typedef struct ptrace_watchpoint {
	int		pw_index;	/* HW Watchpoint ID (count from 0) */
	lwpid_t		pw_lwpid;	/* LWP described */
	struct mdpw	pw_md;		/* MD fields */
} ptrace_watchpoint_t;
For example amd64 defines MD as follows:
struct mdpw {
	void	*md_address;
	int	md_condition;
	int	md_length;
};
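A minimal sketch of programming a watchpoint with these calls (the
md_condition and md_length values below are illustrative, not real amd64
encodings; error handling omitted):

/* Sketch: program hardware watchpoint 0 of LWP `lwp` in the stopped
 * tracee `pid` to watch 4 bytes at `addr`. */
#include <sys/types.h>
#include <sys/ptrace.h>
#include <string.h>

static int
set_watchpoint(pid_t pid, lwpid_t lwp, void *addr)
{
	struct ptrace_watchpoint pw;

	memset(&pw, 0, sizeof(pw));
	pw.pw_index = 0;		/* hardware watchpoint 0 */
	pw.pw_lwpid = lwp;		/* LWP to watch */
	pw.pw_md.md_address = addr;	/* MD: watched address */
	pw.pw_md.md_condition = 1;	/* MD: e.g. break on data write */
	pw.pw_md.md_length = 4;		/* MD: watch 4 bytes */
	return ptrace(PT_WRITE_WATCHPOINT, pid, &pw, sizeof(pw));
}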
These calls are protected with the __HAVE_PTRACE_WATCHPOINTS guard.
Tested on amd64, initial support added for i386 and XEN.
Sponsored by <The NetBSD Foundation>
of the lists. Speeds up namei on cached vnodes by ~3 percent.
Merge "vrele_thread" into "vdrain_thread" so we have one thread
working on the lrulists. Adapt vfs_drainvnodes() to always wait
for a complete cycle of vdrain_thread().
always equal to "desiredvnodes" and move its definition
from sys/vnode.h to sys/vnode_impl.h.
Extend vfs_drainvnodes() to also wait for deferred vrele to flush
and replace the call to vrele_flush() with a call to vfs_drainvnodes().
This allows us to return EEXIST instead of EPERM for higher secure levels.
My use case was to stop npfctl complaining that it could not load bpfjit
on ERLITE when it was compiled into the kernel.
It then went on to complain that NPF performance would be degraded,
but this is clearly not the case.
When called from vrecycle() or vgone() there is a window where the refcount
is greater than zero and another thread could get and release a reference
that would miss VOP_INACTIVE() as the refcount doesn't drop to zero.
Adjust test fs/puffs/t_basic: test that the VOP_INACTIVE count is greater than zero.
- Make vrecycle() more robust by checking v_usecount first and preventing
further references across vn_lock(). Fixes a deadlock where one thread
starts unmount, second thread locks a directory and allocates a vnode
and first thread tries to vrecycle() the directory.
First thread holds vfs_busy and wants vnode, second thread holds vnode
and wants vfs_busy.
- With these fixes in place change cleanvnode() to use vget()/vrecycle()
to reclaim the vnode.
xc_lowpri and xc_thread are racy and xc_wait may return during/before
executing all xcall callbacks, resulting in a kernel panic at worst.
xc_lowpri serializes multiple jobs with a mutex and a cv. Once all xcall
callbacks are done, xc_wait returns and xc_lowpri accepts the next job.
The problem is that the counter of finished xcall callbacks is incremented
*before* an xcall callback is actually executed (see xc_tailp++ in
xc_thread), so xc_lowpri accepts the next job before all xcall callbacks
complete and the next job begins to run its xcall callbacks.
Even worse, the counter is global and shared between jobs, so when an
xcall callback of the next job completes, the shared counter is
incremented, which makes xc_wait of the previous job conclude that all of
its xcall callbacks are done; it then returns during/before their
execution.
How to fix: there are actually two counters of finished xcall callbacks
for low-priority xcalls, for (I guess) historical reasons: xc_tailp and
xc_low_pri.xc_donep. xc_low_pri.xc_donep is incremented correctly, while
xc_tailp is incremented wrongly, i.e., before an xcall callback executes.
We can fix the issue by dropping xc_tailp and using only
xc_low_pri.xc_donep.
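A minimal sketch of the corrected ordering (illustrative only, with
simplified state and locking; not the actual subr_xcall.c code):

/* Sketch: the done counter is bumped only after the callback has run,
 * so xc_wait() cannot observe a job as finished before its callbacks
 * actually execute. */
#include <sys/types.h>
#include <sys/mutex.h>
#include <sys/condvar.h>

typedef void (*xcfunc_t)(void *, void *);

struct xc_state {			/* hypothetical per-class state */
	kmutex_t	xc_lock;
	kcondvar_t	xc_busy;
	uint64_t	xc_headp;	/* jobs submitted */
	uint64_t	xc_donep;	/* jobs finished */
	xcfunc_t	xc_func;
	void		*xc_arg1, *xc_arg2;
};

static void
xc_thread_sketch(void *cookie)
{
	struct xc_state *xc = cookie;
	xcfunc_t func;
	void *arg1, *arg2;

	for (;;) {
		mutex_enter(&xc->xc_lock);
		while (xc->xc_headp == xc->xc_donep)
			cv_wait(&xc->xc_busy, &xc->xc_lock);
		func = xc->xc_func;
		arg1 = xc->xc_arg1;
		arg2 = xc->xc_arg2;
		mutex_exit(&xc->xc_lock);

		(*func)(arg1, arg2);	/* run the cross call */

		mutex_enter(&xc->xc_lock);
		xc->xc_donep++;		/* mark it done only now */
		cv_broadcast(&xc->xc_busy);
		mutex_exit(&xc->xc_lock);
	}
}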
PR kern/51632
various sysctl/procfs interfaces that allow it to be interrogated.
(This is rather than the temporary parent's pid when a process is
being traced and has been reparented.)
XXX The ppid in elf32 core files has not been similarly adjusted,
XXX Should it be ?
Only change it when we are being permanently reparented to init. Since
p_ppid is only used as a cached value to retrieve the parent's process id
from userland, this change makes it correct at all times. Idea from kre@
Revert specialized logic from getpid/getppid now that it is not needed.
before recursing into lower blocks, to make sure that it will be removed after
all its referenced blocks are removed
fixes the 'ffs_blkfree_common: freeing free block' panic triggered by
ufs_truncate_retry() when just the upper indirect block registration
failed and the code tried to free the lower blocks again after a wapbl
flush
problem found by hannken@, thank you
Revert 1.264 - that was intended to fix 51600, but didn't, it just
hid the problem, and caused 51606. This fixes 51606.
Handle waiting on a process that has been detached from its parent
because of being ptrace'd by some other process. This fixes 51600.
("handle" here means that the wait() hangs, or with WNOHANG, returns 0;
we cannot actually wait on a process that is not currently an attached
child.)
Note: the detached process waiting is not yet perfect (it fails to
take account of options like WALLSIG and WALTSIG) - support for those
(that is, ignoring a detached child that one of those options will
later cause to be ignored when the process is re-attached) is still
to come.
For now, other than when waiting for a specific process ID, when
a process does a wait() sys call (any of them), has no applicable
children attached that can be returned, and has at least one detached
child, we do a linear search of all processes to look for a
suitable detached child. This is likely to be slow - but very rare.
Eventually it might be better to keep a list of detached children
per process.
- Move _VFS_VNODE_PRIVATE protected operations into vnode_impl.h.
- Move struct vnode_impl definition and operations into vnode_impl.h.
- Include vnode_impl.h where we include vnode.h with _VFS_VNODE_PRIVATE defined.
- Get rid of _VFS_VNODE_PRIVATE.
- Rename struct vcache_node to vnode_impl, start its fields with vi_.
- Rename enum vcache_state to vnode_state, start its elements with VS_.
- Rename macros VN_TO_VP and VP_TO_VN to VIMPL_TO_VNODE and VNODE_TO_VIMPL.
- Add typedef struct vnode_impl vnode_impl_t.
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE
* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).
* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).
* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.
XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.
succeed; change wapbl_register_deallocation() to return EAGAIN
rather than panic when code hits the limit
callers changed to either loop calling ffs_truncate() using the new
utility ufs_truncate_retry() if their semantics require it, or to
just ignore the failure; remove ufs_wapbl_truncate()
this fixes a possible user-triggerable panic during truncate, and
resolves a WAPBL performance issue with truncates of large files
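A minimal sketch of the retry idiom the callers now use (kernel-side,
signatures simplified; not the actual ufs_truncate_retry() code):

/* Sketch: keep truncating while the journal fills up;
 * wapbl_register_deallocation() now returns EAGAIN instead of
 * panicking, and ffs_truncate() propagates that to its caller. */
#include <sys/param.h>
#include <sys/vnode.h>
#include <sys/kauth.h>

int ffs_truncate(struct vnode *, off_t, int, kauth_cred_t);

static int
truncate_retry_sketch(struct vnode *vp, off_t newlen, kauth_cred_t cred)
{
	int error;

	for (;;) {
		error = ffs_truncate(vp, newlen, 0, cred);
		if (error != EAGAIN)
			return error;	/* success or a hard error */
		/* the journal flushed pending deallocations; retry */
	}
}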
PR kern/47146 and kern/49175
Marking up some zeroes with a type suffix, while not marking others in
the very same function, does nothing but place a cognitive burden on the
reader.
Spelling "clear bits" as "&~" is actually not uncommon (and some say
is more readable).
It is unexpected for an unprivileged process to gain privs by
typing to root's tty:
$ cat installer
#!/bin/sh
whoami
/usr/sbin/sti /dev/tty whoami\\n
$ su unprivileged -c ./installer
unprivileged
$ whoami
root
change anything, since kernel processes use the shared kernel map instead
of the one they are given here. For consistency though, it is better to
make sure UVM will not be tempted to access machine-dependent reserved
areas (e.g., the PTE space on x86).
rather than including in kernels with KDTRACE_HOOKS defined. Update
the dtrace_fbt module to depend on the zlib module.
Bump kernel version to avoid module mismatch.
Welcome to 7.99.38 !
zero is hugely flawed. It is easy to demonstrate that one can trick UVM
into choosing a NULL hint after the user_va0_disable check from uvm_map.
Such a bypass allows kernel NULL pointer dereferences to be exploitable on
architectures with a shared userland<->kernel VA, like amd64.
Fix this by increasing the limit of the vm space made available for
userland processes. This way, UVM will never choose a NULL hint, since it
would be outside of the vm space.
The user_va0_disable sysctl still controls this feature.
buffer being written.)
There's some logic here that carefully checks for vp being null, and
other logic that will crash if it is. It appears that it's all
needless paranoia. See tech-kern for more info.
Unless someone sees the assertion go off (in which case a lot more
investigation is needed) I or someone will clean out the logic at some
future point.
Spotted by coypu.
relocating them. The text is allocated as RWX, and then mprotected to RW.
There is a bug that prevents us from doing RW->RX on amd64 and perhaps
sparc64. On x86, the pmap waits for the page to fault before granting it
the X permission. But in the trap handler, such a page is considered as
belonging to kernel_map, while it actually belongs to module_map. The
kernel then finds out the page is not present in kernel_map, and panics.
In all cases, module_map is non-pageable, so even if the trap were handled
properly, it still wouldn't work.
Therefore, there is a small window in which the segment is RWX. But that's
fine enough, for now.
XXX Since everything has (or should have) been switched to dev_t, we
XXX could probably remove the check for
XXX
XXX ca->ca_devsize >= sizeof(struct device)
XXX
XXX But someone ought to check on that first!
Reviewed by riastradh@
kevent validates that ident is a valid fd by getting the file. one sad
quirk: uint64 to int32 truncation can lead to false positives, and then
later in the array sizing code, very big mallocs panic the kernel.
add a check that the ident isn't larger than INT_MAX in the fd case.
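a sketch of the added check (illustrative, not the exact diff):

/* Sketch: in the fd-backed filter case, reject idents that do not
 * fit in an int before treating them as file descriptors. */
#include <sys/event.h>
#include <limits.h>
#include <errno.h>

static int
check_fd_ident(const struct kevent *kev)
{
	switch (kev->filter) {
	case EVFILT_READ:
	case EVFILT_WRITE:
		if (kev->ident > INT_MAX)
			return EBADF;	/* cannot be a valid fd */
		break;
	default:
		break;
	}
	return 0;
}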
reported by Tim Newsham
up the module segments into one big RWX chunk. Split this chunk into two
different text and data+bss+rodata chunks. The latter is made non-
executable. This also provides some kind of ASLR, since the chunks are
not necessarily contiguous.
by Andy Doran. Also document the get/set pshared thread calls as not
implemented, and add a skeleton implementation that is disabled.
XXX: document _sched_protect(2).
module names both in the built-in list and in the list of previously
"pushed" modules.
While here, delay allocating the new 'struct module' until we've passed
the duplicate-name checks.
detaching devices at shutdown time with RB_POWERDOWN.
When detaching wd(4), put the drive in standby before detach
for DETACH_POWEROFF.
Fix PR kern/51252
"pushed" by the boot loader. The boot loader pushes the module
name for the root file system (unless the root file system is ffs)
even if the file system module is built into the kernel. When
this happens, we get a lot of "redefined symbol" error messages.
This fix does not alter the behavior of pushing the file system
name. It simply avoids the redefined symbol errors by detecting
that the module is already built-in to the kernel and not trying
to load another copy.
While here, differentiate the error message text between "failed
to load" and "failed to fetch_info" conditions.
Addresses PR kern/50357
Having a pointer to an interface in an mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed at any time
during packet processing, and accessing such an object via a pointer is
racy. Instead we have to get the object from the interface collection
(ifindex2ifnet) via an interface index (if_index) that is stored in the
mbuf instead of a pointer.
The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API, m_get_rcvif_NOMPSAFE, which is NOT MP-safe and exists only for the
transition period, i.e., it is intended for places that are not planned
to be MP-ified soon.
The change adds some psref overhead to performance-sensitive paths;
however, the overhead is not serious, 2% down at worst.
Proposed on tech-kern and tech-net.
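A minimal sketch of the psref variant (kernel-side, within a packet
processing path; details illustrative):

/* Sketch: resolve the receiving interface from the mbuf's stored
 * if_index inside a psref critical section, then release it. */
#include <sys/mbuf.h>
#include <sys/psref.h>
#include <net/if.h>

static void
input_sketch(struct mbuf *m)
{
	struct psref psref;
	struct ifnet *ifp;

	ifp = m_get_rcvif_psref(m, &psref);
	if (ifp == NULL) {
		m_freem(m);	/* the interface is already gone */
		return;
	}
	/* ... use ifp; it cannot be destroyed while referenced ... */
	m_put_rcvif_psref(ifp, &psref);
}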
The API is used to set (or reset) the received interface of an mbuf.
These functions are the counterpart of m_get_rcvif, which will come in
another commit; they hide the internals of rcvif operations and reduce
the diff of the upcoming change.
No functional change.
as documented in sysctl(7):
0 - ptrace does not affect mprotect
1 - (default) mprotect is disabled for processes that start executing from
the debugger (being traced)
2 - mprotect restrictions are relaxed for traced processes
mprotect settings so that debuggers can write to the text segment of
traced processes in order to insert breakpoints. Turned off by default.
Ok: chuq (for now)
compat32, which we deal with properly). It would be possible to get
those working too, but it is not worth the code complexity.
This makes binaries compiled with -mcmodel=medlow (and ancient binaries)
work again on sparc64, smoothing the upgrade path.
ok: christos
sockets sitting in the accept filter can consume the entire listen queue,
such that the application is never able to handle any connections. Handle
this by simply passing through the oldest queued cxn when the queue is full.
This is fair because the longer a cxn lingers in the queue (stays connected
but does not meet the requirements of the filter for passage) the more likely
it is to be passed through, at which point the application can dispose of it.
Works because none of our accept filters actually allocate private state
per-cxn. If they did, we'd have to fix the API bug that there is presently
no way to tell an accf to finish/deallocate for a single cxn (accf_destroy
kills off the entire filter instance for a given listen socket).
Remainder of fix for PR kern/51135: if there is an entropy source
that can produce arbitrarily much data, as in rump, then nothing
should ever block indefinitely waiting for data.