NetBSD

Author	SHA1	Message	Date
christos	56ae922037	Since pr_lock is now used to wait for two things (PR_GROWING and PR_WANTED) we need to loop for the condition we wanted.	2017-11-09 15:40:23 +00:00
christos	f890274f96	add a "booted_method" string to aid in debugging double boot matches.	2017-11-09 01:02:55 +00:00
christos	0891190b55	hack around namei problem.	2017-11-07 20:58:23 +00:00
christos	0011aa658c	Store full executable path in p->p_path as discussed in tech-kern. This means that the full executable path is always available. - exec_elf.c: use p->path to set AT_SUN_EXECNAME, and since this is always set, do so unconditionally. - kern_exec.c: simplify pathexec, use kmem_strfree where appropriate and set p->p_path - kern_exit.c: free p->p_path - kern_fork.c: set p->p_path for the child. - kern_proc.c: use p->p_path to return the executable pathname; the NULL check for p->p_path, should be a KASSERT? - exec.h: gc ep_path, it is not used anymore - param.h: bump version, 'struct proc' size change TODO: 1. reference count the path string, to save copy at fork and free just before exec? 2. canonicalize the pathname by changing namei() to LOCKPARENT vnode and then using getcwd() on the parent directory?	2017-11-07 19:44:04 +00:00
christos	3afe107bee	Add two utility functions to help use kmem with strings: kmem_strdupsize, kmem_strfree.	2017-11-07 18:35:57 +00:00
christos	cd1c6201df	We computed the length of the string already, so use it...	2017-11-07 15:57:38 +00:00
riastradh	d6585e3401	Assert that pool_get failure happens only with PR_NOWAIT. This would have caught the mistake I made last week leading to null pointer dereferences all over the place, a mistake which I evidently poorly scheduled alongside maxv's change to the panic message on x86 for null pointer dereferences.	2017-11-06 18:41:22 +00:00
mlelstv	cc92bcd96f	pool_grow can now fail even when sleeping is ok. Catch this case in pool_get and retry.	2017-11-05 07:49:45 +00:00
christos	ad97afb146	use Elf_Sym ** instead of casting.	2017-11-04 22:17:55 +00:00
martin	640f0abac6	Make kobj_sym_lookup's result type an Elf_Addr. Fixes the arm builds.	2017-11-04 12:14:41 +00:00
pgoyette	64ae8753d8	Remove the ABI version-and-length check that was recently introduced; sysctl(9) ABIs should be stable across versions. XXX Pull-up to -8	2017-11-03 22:45:14 +00:00
maxv	4e8a8f71db	Handle absolute relocations coming from the kernel: preserve SHN_ABS in the kernel and module symbols, and when relocating a symbol that has SHN_ABS, take its value as-is and don't return an error if it equals zero. Sent on tech-kern@.	2017-11-03 09:59:07 +00:00
riastradh	50a782dc6e	C99ify initialization of dummy_timecounter.	2017-11-02 15:28:23 +00:00
martin	a6bab1a764	Allow architectures to define a macro PROC_MACHINE_ARCH(P) and PROC_MACHINE_ARCH32(P) to override the value for sysctl hw.machine_arch (native and netbsd32 compat resp.). Use these for arm and mips instead of the (not working, noisy, in case of arm) sysctl override and #ifdef __mips__ in architecture neutral code.	2017-10-31 12:37:23 +00:00
riastradh	aca2a29cb6	Allow only one pending call to a pool's backing allocator at a time. Candidate fix for problems with hanging after kva fragmentation related to PR kern/45718. Proposed on tech-kern: https://mail-index.NetBSD.org/tech-kern/2017/10/23/msg022472.html Tested by bouyer@ on i386. This makes one small change to the semantics of pool_prime and pool_setlowat: they may fail with EWOULDBLOCK instead of ENOMEM, if there is a pending call to the backing allocator in another thread but we are not actually out of memory. That is unlikely because nearly always these are used during initialization, when the pool is not in use. XXX pullup-8 XXX pullup-7 XXX pullup-6 (requires tweaking the patch) XXX pullup-5...	2017-10-28 17:06:43 +00:00
pgoyette	cb32a134a5	Update the kernhist(9) kernel history code to address issues identified in PR kern/52639, as well as some general cleaning-up... (As proposed on tech-kern@ with additional changes and enhancements.) Details of changes: * All history arguments are now stored as uintmax_t values[1], both in the kernel and in the structures used for exporting the history data to userland via sysctl(9). This avoids problems on some architectures where passing a 64-bit (or larger) value to printf(3) can cause it to process the value as multiple arguments. (This can be particularly problematic when printf()'s format string is not a literal, since in that case the compiler cannot know how large each argument should be.) * Update the data structures used for exporting kernel history data to include a version number as well as the length of history arguments. * All [2] existing users of kernhist(9) have had their format strings updated. Each format specifier now includes an explicit length modifier 'j' to refer to numeric values of the size of uintmax_t. * All [2] existing users of kernhist(9) have had their format strings updated to replace uses of "%p" with "%#jx", and the pointer arguments are now cast to (uintptr_t) before being subsequently cast to (uintmax_t). This is needed to avoid compiler warnings about casting "pointer to integer of a different size." * All [2] existing users of kernhist(9) have had instances of "%s" or "%c" format strings replaced with numeric formats; several instances of mis-match between format string and argument list have been fixed. * vmstat(1) has been modified to handle the new size of arguments in the history data as exported by sysctl(9). * vmstat(1) now provides a warning message if the history requested with the -u option does not exist (previously, this condition was silently ignored, with only a single blank line being printed). * vmstat(1) now checks the version and argument length included in the data exported via sysctl(9) and exits if they do not match the values with which vmstat was built. * The kernhist(9) man-page has been updated to note the additional requirements imposed on the format strings, along with several other minor changes and enhancements. [1] It would have been possible to use an explicit length (for example, uint64_t) for the history arguments. But that would require another "rototill" of all the users in the future when we add support for an architecture that supports a larger size. Also, the printf(3) format specifiers for explicitly-sized values, such as "%"PRIu64, are much more verbose (and less aesthetically appealing, IMHO) than simply using "%ju". [2] I've tried very hard to find "all [the] existing users of kernhist(9)" but it is possible that I've missed some of them. I would be glad to update any stragglers that anyone identifies.	2017-10-28 00:37:11 +00:00
joerg	e64612f440	Revert printf return value change.	2017-10-27 12:25:14 +00:00
utkarsh009	f11595bab5	[syzkaller] Cast all the printf's to (void *) > as a result of new printf(9) declaration.	2017-10-27 09:59:16 +00:00
maya	18b796d442	Use C99 initializer for filterops Mostly done with spatch with touchups for indentation @@ expression a; identifier b,c,d; identifier p; @@ const struct filterops p = - { a, b, c, d + { + .f_isfd = a, + .f_attach = b, + .f_detach = c, + .f_event = d, };	2017-10-25 08:12:37 +00:00
riastradh	2a7a645aaa	Document lock order and locking rules.	2017-10-25 06:02:40 +00:00
jdolecek	d3e642e387	remove counter for 'journal I/O bufs biowait' - it's (total - async), so superfluous; adjust the description of the other counters a bit to make them more clear	2017-10-23 19:03:40 +00:00
riastradh	f7b8b20d17	Initialize the in/out parameter vmin. vmin is only an optional hint since we're not passing UVM_FLAG_FIXED, but that doesn't mean we should use uninitialized stack garbage as the hint. Noted by chs@.	2017-10-20 19:06:46 +00:00
riastradh	4691bf4bd7	Carve out KVA for execargs on boot from an exec_map like we used to. Candidate fix for PR kern/45718: `processes sometimes get stuck and spin in vm_map', a problem that has been plaguing all our 32-bit ports for years. Since we currently use large (256k) buffers for execargs, and since nobody has stepped up to tackle breaking them into bite-sized (or at least page-sized) chunks, after KVA gets sufficiently fragmented we can't allocate new execargs buffers from kernel_map. Until 2008, we always carved out KVA for execargs on boot with a uvm submap exec_map of kernel_map. Then ad@ found that the uvm_km_free call, to discard them when done, cost about 100us, which a pool avoided: https://mail-index.NetBSD.org/tech-kern/2008/06/25/msg001854.html https://mail-index.NetBSD.org/tech-kern/2008/06/26/msg001859.html ad@ _simultaneously_ introduced a pool _and_ eliminated the reserved KVA in the exec_map submap. This change preserves the pool, but restores exec_map (with less code, by putting it in MI code instead of copying it in every MD initialization routine). Patch proposed on tech-kern: https://mail-index.NetBSD.org/tech-kern/2017/10/19/msg022461.html Patch tested by bouyer@: https://mail-index.NetBSD.org/tech-kern/2017/10/20/msg022465.html I previously discussed the issue on tech-kern before I knew of the history around exec_map: https://mail-index.NetBSD.org/tech-kern/2012/12/09/msg014695.html The candidate workaround I proposed of using pool_setlowat to force preallocation of KVA would also force preallocation of physical RAM, which is a waste not incurred by using exec_map, and which is part of why I never committed it. There may remain a general problem that if thread A calls pool_get and tries to service that request by a uvm_km_alloc call that hangs because KVA is scarce, and thread B does pool_put, the pool_put in thread B will not notify the pool_get in thread A that it doesn't need to wait for KVA, and so thread A may continue to hang in uvm_km_alloc. However, (a) That won't apply here, because there is exactly as much KVA available in exec_map as exec_pool will ever try to use. (b) It is possible that may not even matter in other cases as long as the page daemon eventually tries to shrink the pool, which will cause a uvm_km_free that can unhang the hung uvm_km_alloc. XXX pullup-8 XXX pullup-7 XXX pullup-6 XXX pullup-5, perhaps...	2017-10-20 14:48:43 +00:00
martin	f115d566a4	Make check_exec() errors print the name of the binary that fails to execute.	2017-10-20 12:11:34 +00:00
bouyer	d4ce271380	PR port-arm/52603: There is a race here, as seen on arm with FPU: LWP L is running but not on CPU, has its FPU state on CPU2 which has not been released yet, so fpexc still has VFP_FPEXC_EN set in the PCB copy. LWP L is scheduled on CPU1, CPU1 calls cpu_switchto() for L in mi_switch(). cpu_switchto() will set VFP_FPEXC_EN in the FPU's fpexc register per the PCB fpexc copy. Before CPU1 calls pcu_switchpoint() for L, CPU2 calls pcu_do_op(PCU_CMD_SAVE | PCU_CMD_RELEASE) for L because it still holds its FPU state and wants to load another lwp. This causes VFP_FPEXC_EN to be cleared in the PCB copy, but not in CPU1's register. L's l_pcu_cpu is set to NULL. When CPU1 calls pcu_switchpoint() for L it sees l_pcu_cpu is NULL, and doesn't call the release callback. Now CPU1 has its FPU enabled but with the wrong FPU state. Fix by releasing the PCU even if l_pcu_cpu is NULL.	2017-10-16 15:03:57 +00:00
christos	bb321f6151	Setting AT_BASE on static binaries breaks TLS because they assume that it is 0, will fix it differently.	2017-10-16 01:50:55 +00:00
christos	3df3b581f3	For static PIE set the interpreter address to be the entry offset so we don't lose it.	2017-10-08 15:00:40 +00:00
maxv	252ca9c54a	Remove compat_linux32 from the autoload list and add an enable/disable sysctl, like compat_linux.	2017-09-29 17:47:29 +00:00
maxv	aef145dda9	Remove compat_linux from the autoload list, and add a sysctl to enable or disable it - which defaults to disabled. The following command is now required to use linux binaries: sysctl -w emul.linux.enabled=1 After a discussion on tech-kern@. All the other ideas to reduce the attack surface have drawbacks, and this sysctl seems to be the best option.	2017-09-29 17:08:00 +00:00
joerg	0e5b5aa88a	Fix non-DIAGNOSTICS build by adjusting _vstate_assert here too.	2017-09-22 06:05:20 +00:00
joerg	5db0939512	Change the VSTATE_ASSERT_UNLOCKED code by pushing the potential lock handling into the backend and doing an optimistic (unlocked) check first. Always taking the vnode interlock makes this assertion otherwise very heavy for multi-processor machines. Ride the kernel version bump.	2017-09-21 18:19:44 +00:00
jakllsch	6b34528ad5	Initialize ex_lock and ex_cv only in the not-EX_EARLY case.	2017-09-18 13:22:56 +00:00
christos	e7f0067cbe	more const	2017-09-16 23:55:33 +00:00
christos	c483c7cba9	more debug info	2017-09-16 23:55:16 +00:00
christos	9d349e2adb	add missing const	2017-09-16 23:25:34 +00:00
sevan	684872c792	Remove support for VERIFIED_EXEC_FP_RMD160, VERIFIED_EXEC_FP_SHA1, and VERIFIED_EXEC_FP_MD5 options. These algorithms are either broken or on their way to being broken. Discussed on tech-security http://mail-index.netbsd.org/tech-security/2017/08/21/msg000936.html ok riastradh	2017-09-13 22:24:42 +00:00
joerg	69ab70f077	Fix a race between sysctl_unpcblist and closef.	2017-09-09 14:41:19 +00:00
pgoyette	1c284673f6	When adding a new veriexec_file_entry, if an entry already exists with all the same values (except for the filename) just ignore it. Otherwise report the duplicate-entry error. This allows the user to create a signature file with veriexecgen(8) and not worry about duplicate entries (due to hard-linked files) which will otherwise cause /etc/rc.d/veriexec to report an error. Fixes PR kern/52512 XXX Pull-up for -8	2017-08-31 08:47:19 +00:00
pgoyette	c31e1d979d	Revert previous changes. They are wrong. The intended clean-up is already being handled by the call to veriexec_file_free() in the "out:" path.	2017-08-29 12:48:50 +00:00
pgoyette	f18bf91f4d	One more resource to release - the filename, if we kept it.	2017-08-29 10:23:12 +00:00
pgoyette	8bdb86c1df	Release any allocated resources if we take the error paths. As posted on tech-kern and discussed on IRC.	2017-08-29 10:19:54 +00:00
dholland	cec712d80b	If we go to allocate and find someone else has at the same time, don't trigger a refcount leak of the other guy's object. From mjg@freebsd. While here also remove a bogus use of lbolt on the same path.	2017-08-28 04:57:11 +00:00
kamil	a69b333e73	Remove the filesystem tracing feature. This is a legacy interface from 4.4BSD, and it was introduced to overcome shortcomings of ptrace(2) at that time, which are no longer relevant (performance). Today /proc/#/ctl offers a narrow subset of ptrace(2) commands and is not applicable for modern applications use beyond simplistic tracing scenarios. This removal will simplify kernel internals. Users will still be able to use all the other /proc files. This change affects neither the other procfs files nor the Linux compat features within mount_procfs(8). /proc/#/ctl isn't available on Linux. Remove: - /proc/#/ctl from mount_procfs(8) - P_FSTRACE note from the documentation of ps(1) - /proc/#/ctl and filesystem tracing documentation from mount_procfs(8) - KAUTH_REQ_PROCESS_PROCFS_CTL documentation from kauth(9) - source code file miscfs/procfs/procfs_ctl.c - PFSctl and procfs_doctl() from sys/miscfs/procfs/procfs.h - KAUTH_REQ_PROCESS_PROCFS_CTL from sys/sys/kauth.h - PSL_FSTRACE (0x00010000) from sys/sys/proc.h - P_FSTRACE (0x00010000) from sys/sys/sysctl.h Reduce code complexity after removal of this functionality. Update TODO.ptrace accordingly: remove two entries about /proc tracing. Do not keep legacy notes as comments in the headers about removed PSL_FSTRACE / P_FSTRACE, as this interface had very few users (close or equal to zero). Proposed on tech-kern@. All filesystem tracing utility users are encouraged to switch to ptrace(2). Sponsored by <The NetBSD Foundation>	2017-08-28 00:46:06 +00:00
kre	968e76ebe6	Build fix attempt ... changes affect !KERNEL (ie: userland, rump) version of this file only. Rather than adding meaningless {} around all uses of functions that are #defined to nothing for userland, #define the funcs to something that is functionally equivalent (but which appeases gcc). Also, define KASSERT() to nothing for userland, which avoids the need to add a #define for mutex_owned which would otherwise be needed, and simultaneously stops gcc from complaining about a lack of a prototype.	2017-08-24 17:18:55 +00:00
skrll	78070145bc	Whitespace fix	2017-08-24 11:37:25 +00:00
jmcneill	7de85ed29e	Add EX_EARLY flag for extent_create, which skips locking. Required for using extent subsystem in early bootstrap code, before caches are enabled. From skrll@	2017-08-24 11:33:28 +00:00
hannken	28650af9eb	Change forced unmount to revert open device vnodes to anonymous devices.	2017-08-21 09:00:21 +00:00
hannken	7801661c06	No need to cache anonymous device vnodes, they will never be looked up. Set key to (dead_rootmount, 0, NULL) and add assertions.	2017-08-21 08:56:45 +00:00
maxv	c778810068	Remove compat_svr4, compat_svr4_32 and compat_ibcs2 from the list of autoloaded modules. These options are disabled everywhere (except ibcs2 on Vax, but Vax does not support kernel modules, so doesn't matter), therefore there is no issue in removing them from the list. Interested users will now have to do a 'modload' first, or uncomment the entries in GENERIC.	2017-08-08 16:57:32 +00:00
maxv	1d68b497f2	Remove compat_freebsd from the list of autoloaded modules. Interested users will now have to type 'modload' to use it, or uncomment the entry in GENERIC. I should have removed it when I disabled COMPAT_FREEBSD by default, sorry about that.	2017-08-08 08:12:14 +00:00

9870 Commits