9004 Commits

Author SHA1 Message Date
rtr
ff90c29d04 * sprinkle KASSERT(solocked(so)); in all pr_stat() functions.
* fix remaining inconsistent struct socket parameter names.
2014-07-07 17:13:56 +00:00
maxv
8d909506c6 Remove this (symtabindex == -1) check; it is already handled by (nsym != 1).
Put a KASSERT instead.
2014-07-06 15:35:32 +00:00
maxv
3021bdd8e1 Use a macro instead of always putting __func__ and __LINE__. 2014-07-06 15:22:31 +00:00
maxv
477e684b2e Check .evs_used==0 instead of .evs_cmds==NULL. evs_cmds would not be NULL if
another _makecmds() had allocated and deallocated VMCMDs (not the case
currently).
2014-07-06 07:41:41 +00:00
rtr
a60320ca07 * split PRU_SENSE functionality out of xxx_usrreq() switches and place into
separate xxx_stat(struct socket *, struct stat *) functions.
* replace calls using pr_generic with req == PRU_SENSE with pr_stat().

further change will follow that cleans up the pattern used to extract the
pcb and test for its presence.

reviewed by rmind
2014-07-06 03:33:33 +00:00
hannken
51464fb45f Add vcache operations to support key changes:
vcache_rekey_enter locks the old cache node and creates and locks the
  new cache node.  It is an error if the new cache node exists.

vcache_rekey_exit removes the old cache node and finalizes and
  unlocks the new cache node.

No objections on tech-kern@

Welcome to 6.99.46
2014-07-05 09:33:15 +00:00
maxv
57fd3cffd6 Change the pattern of KMEM_REDZONE so that the first byte is never '\0'.
From me and lars@.
2014-07-03 08:43:49 +00:00
maxv
11febc641e Fix the KMEM_POISON check: it should check the whole buffer, otherwise some
write-after-free's wouldn't be detected (those occurring in the 8 last bytes
of the allocated buffer).

Was here before my changes, spotted by lars@.
2014-07-02 15:00:28 +00:00
maxv
0d191e1f54 1) Define a malloc(9)-like kmem_header structure for KMEM_SIZE. It is in
fact more consistent, and more flexible (eg if we want to add new fields).
2) When I say "page" I actually mean "kmem page". It may not be clear, so
   replace it by "memory chunk" (suggested by lars@).
3) Minor changes for KMEM_REDZONE.
2014-07-01 12:08:33 +00:00
rtr
0dedd9772f fix parameter types in pr_ioctl, called xx_control() functions and remove
abuse of pointer to struct mbuf type.

param2 changed to u_long type and uses parameter name 'cmd' (ioctl command)
param3 changed to void * type and uses parameter name 'data'
param4 changed to struct ifnet * and uses parameter name 'ifp'
param5 has been removed (formerly struct lwp *) and uses of 'l' have been
       replaced with curlwp from curproc(9).

callers have had (now unnecessary) casts to struct mbuf * removed, called
code has had (now unnecessary) casts to u_long, void * and struct ifnet *
respectively removed.

reviewed by rmind@
2014-07-01 05:49:18 +00:00
maxv
374ecba24a This is weird; 'abort' already does all this, so simply use goto abort. 2014-06-30 17:51:31 +00:00
maxv
1d2bb5599c Reorder two variables and fix some comments. 2014-06-30 17:31:15 +00:00
maxv
8bd04c63bd If the interpreter is "", do not keep loading the script (which will later
fail), but return ENOEXEC directly.

ok christos@
2014-06-30 17:22:32 +00:00
dholland
01e782f371 Revert the following changes:
src/sys/sys/quotactl.h 1.37
   src/sys/compat/netbsd32/netbsd32.h 1.101
   src/sys/compat/netbsd32/netbsd32_netbsd.c 1.188, 1.189
   src/sys/kern/vfs_quotactl.c 1.39
   src/sys/kern/vfs_syscalls.c 1.483
   src/sys/ufs/lfs/ulfs_quota.c 1.11
   src/sys/ufs/ufs/ufs_quota.c 1.116
   src/lib/libquota/quota_kernel.c 1.5

and do them correctly.

If you're going to change the name of something, you need to change
the name of *all* the things with the same name, not just a handful,
and you should change it to something similar so it still matches the
rest of the system rather than just picking an arbitrarily different
name.

Hi, Joerg.

To wit, rename the quotactl "delete" operation to "del", because
"delete" is a reserved word in C++ and for some reason Joerg wants to
run internal interfaces used only by C code through his C++ compiler.
Do not rename it to "remove" instead, because this doesn't match
libquota or the rest of the usage throughout the system; and rename
all the related identifiers, not just the ones that blew the mind of
Joerg's C++ compiler.

Because this is not a user-facing API (the only userland consumer of
sys/quotactl.h is libquota) it is sort of ok to make arbitrary
source-incompatible changes; however, by the same token it's completely
unnecessary. If it *were* a user-facing API that someone might have a
semi-rational reason to want to run a C++ compiler on, it would be
incorrect to change it at this point.
2014-06-28 22:27:50 +00:00
christos
9ab46539af Don't initialize the fh pointer to NULL when the allocation functions fail
and allow NULL in the free functions. It just leads to writing sloppy code
for no good reason.
2014-06-26 01:46:03 +00:00
christos
32d87f41a7 Provide a compatibility define for binaries generated before NetBSD 1.5.
These binaries contain multiple notes per section and their NetBSD version
value is 199905. This is enabled via COMPAT_OLDNOTE (default off).
2014-06-25 17:10:39 +00:00
maxv
5b91db99c9 1) Make clear that we want the space allocated for the KMEM_SIZE header to be
aligned, by using kmem_roundup_size(). There's no functional difference with
   the current MAX().

2) If there isn't enough space in the page padding for the red zone, allocate
   one more page, not just 2 bytes. We only poison 1 or 2 bytes in this page,
   depending on the space left in the previous page. That way 'allocsz' is
   properly aligned. Again, there's no functional difference since the shift
   already handles it correctly.
2014-06-25 16:35:12 +00:00
maxv
26b0b0e266 Rephrase some comments and remove whitespaces. No functional change. 2014-06-25 16:05:22 +00:00
maxv
eb92b9efbc Do not hardcode the value. Use KQ_NEVENTS. 2014-06-24 14:42:43 +00:00
maxv
b61ced9fc0 'miliseconds' -> 'milliseconds'. 2014-06-24 10:08:45 +00:00
maxv
fb1a1f54c3 KMEM_REDZONE+KMEM_POISON is supposed to detect buffer overflows. But it only
poisons memory after kmem_roundup_size(), which means that if an overflow
occurs in the page padding, it won't be detected.

Fix this by making KMEM_REDZONE independent from KMEM_POISON and making it
put a 2-byte pattern at the end of each requested buffer, and check it when
freeing memory to ensure the caller hasn't written outside the requested area.

Not enabled on DIAGNOSTIC for the moment.
2014-06-24 07:28:23 +00:00
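The red-zone idea described in fb1a1f54c3 can be sketched in user space as follows. This is only an illustration built on malloc(3): the names rz_alloc(), rz_check() and rz_free() and the pattern bytes are assumptions made for the example, not the kernel's code. The pattern's first byte is non-zero, in the spirit of 57fd3cffd6.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical 2-byte pattern; the first byte is deliberately not '\0'. */
static const unsigned char rz_pattern[2] = { 0xA5, 0x5A };

/* Allocate `size` usable bytes followed immediately by the 2-byte red zone. */
void *rz_alloc(size_t size)
{
	unsigned char *p = malloc(size + sizeof(rz_pattern));

	if (p == NULL)
		return NULL;
	memcpy(p + size, rz_pattern, sizeof(rz_pattern));
	return p;
}

/* Return 1 if the red zone is intact, 0 if something wrote past `size`. */
int rz_check(const void *ptr, size_t size)
{
	const unsigned char *p = ptr;

	return memcmp(p + size, rz_pattern, sizeof(rz_pattern)) == 0;
}

/*
 * Check the red zone, then free. Returns 0 if the buffer was clean and -1
 * if an overflow was detected (a kernel would more likely panic here).
 */
int rz_free(void *ptr, size_t size)
{
	int ok = rz_check(ptr, size);

	free(ptr);
	return ok ? 0 : -1;
}
```

Because the pattern sits directly after the requested size rather than after the rounded-up size, a one-byte overflow into the padding is caught at free time.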
maxv
5fa25b57b4 Use KASSERT() instead of #ifdef(DIAGNOSTIC). Clearer. 2014-06-23 18:06:32 +00:00
maxv
f7c9f4d7c3 Enable KMEM_SIZE on DIAGNOSTIC. It will catch memory corruption bugs due to a
different size given to kmem_alloc() and kmem_free(), with no performance
impact.
2014-06-23 17:43:42 +00:00
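A rough user-space sketch of the size check that f7c9f4d7c3 enables: the requested size is recorded in a small header in front of the buffer and compared against the size passed at free time. The struct and function names (sz_header, sz_alloc(), sz_free()) are made up for this example and are not the kernel's identifiers.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header stored in front of every allocation. */
struct sz_header {
	size_t size;	/* size the caller originally asked for */
};

/*
 * Allocate `size` bytes with a hidden header. Real code would pad the
 * header so that the returned pointer keeps the allocator's alignment.
 */
void *sz_alloc(size_t size)
{
	struct sz_header *h = malloc(sizeof(*h) + size);

	if (h == NULL)
		return NULL;
	h->size = size;
	return h + 1;
}

/*
 * Free a buffer obtained from sz_alloc(). Returns 0 if `size` matches the
 * recorded size and -1 on a mismatch; the memory is released either way.
 */
int sz_free(void *ptr, size_t size)
{
	struct sz_header *h = (struct sz_header *)ptr - 1;
	int ok = (h->size == size);

	free(h);
	return ok ? 0 : -1;
}
```

The check is a single comparison per free, which is consistent with the "no performance impact" remark in the commit message.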
rtr
c5cb349386 where appropriate rename xxx_ioctl() struct mbuf * parameters from
`control' to `ifp' after split from xxx_usrreq().

sys_socket.c
    fix wrapping of arguments to be consistent with other function calls
    in the file after replacing pr_usrreq() call with pr_ioctl() which
    required one less argument.

link_proto.c
    fix indentation of parameters in link_ioctl() prototype to be
    consistent with the rest of the file.

discussed with rmind@
2014-06-23 17:18:45 +00:00
maxv
54e39d64d0 Fix a NULL pointer dereference after a loooong discussion with dholland@,
hannken@, blymn@ and martin@.

This bug would panic the system when veriexec is set to the VERIEXEC_LOCKDOWN
mode (only settable from root).
2014-06-22 18:32:27 +00:00
maxv
742b1eee79 Put the KMEM_GUARD code under #if defined(KMEM_GUARD). No functional change. 2014-06-22 17:36:42 +00:00
maxv
f1911357ef A KASSERT() is better. 2014-06-22 17:23:34 +00:00
rtr
d54d7ab24a * split PRU_CONTROL functionality out of xxx_usrreq() switches and place
into separate xxx_ioctl() functions.
* place KASSERT(req != PRU_CONTROL) inside xxx_usrreq() as it is now
  inappropriate for req = PRU_CONTROL in xxx_usrreq().
* replace calls to pr_generic() with req = PRU_CONTROL with pr_ioctl().
* remove & fixup references to PRU_CONTROL in xxx_usrreq() function comments.
* fix various comments references for xxx_usrreq() that mentioned
  PRU_CONTROL as xxx_usrreq() no longer handles the request.

a further change will follow to fix parameter and naming inconsistencies
retained from original code.

Reviewed by rmind@
2014-06-22 08:10:18 +00:00
joerg
f3c5663df0 Require the actual namecache_lock around cache_lookup_entry.
Add one last case of missing stat locking.
2014-06-16 12:28:10 +00:00
joerg
7913639b38 Make the stat mutex a leaf. XXX Use atomic counters. 2014-06-14 16:12:34 +00:00
njoly
6b27a22c99 Follow OpenGroup online documents for truncate[1] and ftruncate[2].
Fail with EINVAL for negative values of the length argument.

[1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/truncate.html
[2] http://pubs.opengroup.org/onlinepubs/9699919799/functions/ftruncate.html
2014-06-14 11:37:35 +00:00
joerg
d0f3f6896c Add kern.pool for memory pool stats. 2014-06-13 19:09:07 +00:00
joerg
e764e6c8e4 Rename old/new to match syscalls.master. 2014-06-12 22:10:04 +00:00
joerg
16691bd24e Regenerate 2014-06-12 21:42:26 +00:00
joerg
ec22bfc3c9 Avoid using C++ keywords as argument name. 2014-06-12 21:41:33 +00:00
joerg
ef53e37514 Don't use a C++ keyword as field name. 2014-06-12 21:39:45 +00:00
rmind
190a99cba8 Add PCQ_MAXLEN constant. 2014-06-09 12:44:06 +00:00
christos
e703946d2e Handle race where the server closed the socket between us 'connecting' and
sending data.
2014-06-08 02:52:50 +00:00
joerg
94adc2671a Provide sysctl for namecache statistics. 2014-06-03 21:16:15 +00:00
joerg
0ae69ecb1a Don't play loop games, just enumerate the 10 fields explicitly. 2014-06-03 19:42:17 +00:00
joerg
11581dcbbb Introduce two helper functions to centralise the namecache statistics
in vfs_cache.c. Use consistent locking around the per-cpu data.
2014-06-03 19:30:29 +00:00
hannken
fb88097aee vfs_vnode_iterator_next(): if a vnode is reclaiming (VI_XLOCK) skip
the filter.  Vget() will wait until the vnode disappeared.  No more
"dangling vnode" panics on unmount.
2014-05-30 08:46:00 +00:00
rmind
83eb28cbe4 hashinit: replace loop with a formula. 2014-05-29 21:15:55 +00:00
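The commit message for 83eb28cbe4 does not say which formula is used. As an assumption, the sketch below compares a doubling loop with one common closed form for "smallest power of two not less than n". bit_length() is a stand-in written for this example; a kernel would normally use a builtin or an ilog2-style macro instead.

```c
#include <stddef.h>

/* Loop version: double until the size is at least n. */
size_t pow2_loop(size_t n)
{
	size_t s = 1;

	while (s < n)
		s <<= 1;
	return s;
}

/* Number of bits needed to represent v (0 when v == 0). */
static unsigned bit_length(size_t v)
{
	unsigned bits = 0;

	while (v != 0) {
		bits++;
		v >>= 1;
	}
	return bits;
}

/*
 * Closed form: 1 << bit_length(n - 1). Overflow for n larger than the
 * highest representable power of two is not handled in this sketch.
 */
size_t pow2_formula(size_t n)
{
	if (n <= 1)
		return 1;
	return (size_t)1 << bit_length(n - 1);
}
```

Both functions return the same value for every n where the result fits in a size_t, for example 1024 for n = 1000.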
njoly
2e3c0c8e82 In shmrealloc(), add missing condvar initialisations for segments
copied from previous location.
2014-05-27 21:00:46 +00:00
msaitoh
404b63ff2f Move forward read pointer to the next line in the buffer
to prevent corrupting the oldest line.
2014-05-27 05:14:02 +00:00
pooka
bc09db942d Call biodone() in the bdev_strategy() error path via a pointer. Decouples
subr_devsw from VFS -- not that I/O buffers are _VFS_ entities -- and
eliminates the last weak alias from librump, which means things now
fully work on glibc (w/o LD_DYNAMIC_WEAK) and musl.

The whole code path is suspect anyway, since nothing prevents the device
from escaping after the lookup, suggesting that the whole error path
should be handled by the caller, but oh well.
2014-05-25 16:31:51 +00:00
rmind
0132815be0 softint: implement softint_schedule_cpu() to trigger software interrupts
on the remote CPUs and add SOFTINT_RCPU flag to indicate whether this is
going to be used; implemented using asynchronous IPIs.
2014-05-25 15:42:01 +00:00
rmind
3da69dd68c MI IPI interface:
- Implement support for the asynchronous IPI calls.
- Rework synchronous IPI code to reuse the asynchronous mechanism.
- Add ipi(9) manual page; needs wizd(8).

Note: MD code can now provide a low level primitive for the ipi(9) and
reuse this interface instead of open-coding.  Portmasters are encouraged
to convert.  Ride 6.99.43!
2014-05-25 15:34:19 +00:00
rmind
77f33c2aef pcu: replace xcall(9) used for messaging with ipi(9). This provides
a better performance of the PCU (e.g. FPU) state synchronisation.
2014-05-25 14:53:55 +00:00
christos
02cb0c6eaf Introduce a selector function to the vfs vnode iterator so that we don't
need to vget() vnodes that we are not interested in, and optimize locking
a bit. Iterator changes reviewed by Hannken (thanks), the rest of the bugs
are mine.
2014-05-24 16:34:03 +00:00
dholland
39b82eecb9 Use accessor functions for the tty's table of control characters.
(at least from outside the core tty sources)

Move some xon/xoff code from net/ppp_tty.c to kern/tty.c.
2014-05-22 16:31:19 +00:00
dholland
44a93ea590 Define TTY_ALLOW_PRIVATE in tty.c, tty_pty.c, and tty_conf.c.
These modules are the core of the tty code that in the long term needs
access to struct tty. (It may be that in the future this can be cut
back to just tty.c; we'll see. For now I'll settle for keeping drivers
out of struct tty.)
2014-05-22 16:28:06 +00:00
rmind
eb664c40a7 Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.
2014-05-19 23:33:19 +00:00
rmind
8011b285c0 Implement MI IPI interface with cross-call support. 2014-05-19 22:47:53 +00:00
rmind
a6e0a15f58 Constify kcpuset_countset() and cpu_index() parameters. 2014-05-19 20:39:23 +00:00
rmind
4ae03c1815 - Split off PRU_ATTACH and PRU_DETACH logic into separate functions.
- Replace malloc with kmem and eliminate M_PCB while here.
- Sprinkle more asserts.
2014-05-19 02:51:24 +00:00
justin
c922b676b6 Fix prototype of last arg of rump_sys_mknod to dev_t not uint32_t
Discussed with pooka@
See also https://github.com/rumpkernel/buildrump.sh/issues/53
2014-05-18 21:25:44 +00:00
rmind
39bd8dee77 Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw.  Welcome to 6.99.62!
2014-05-18 14:46:15 +00:00
rmind
c3f8d27787 sonewconn: insert the socket into the queue *after* the protocol attach.
This potentially avoids unnecessary race conditions when handling partial
connections.
2014-05-17 23:55:24 +00:00
rmind
26e5a75962 - fsocreate: set SS_NBIO before the file descriptor is affixed as there is
a theoretical race condition (hard to trigger, though); remove the LWP
  parameter and clean up the code a little.
- Sprinkle few comments.
- Remove M_SOOPTS while here.
2014-05-17 23:27:59 +00:00
rmind
250d3c701d - sonewconn: improve the initialisation order and add some asserts.
- Add various comments describing primitive routines operating on sockets,
  clarify connection life-cycle and improve the description of socket queues.
- Sprinkle more asserts.
2014-05-17 22:52:36 +00:00
rmind
e16e8aee89 makesocket: set SS_NBIO slightly earlier. 2014-05-17 21:48:48 +00:00
rmind
3e8fbba831 Remove trailing whitespaces, wrap long lines, minor KNF; no functional changes. 2014-05-17 21:45:02 +00:00
martin
804dc5f91b Get rid of all sysc_init_field uses - initialize fields directly in C99
notation.
2014-05-16 12:22:32 +00:00
rmind
d67ab12c1d pcu(9):
- Remove PCU_KERNEL (hi matt!) and significantly simplify the code.
  This experimental feature was tried on ARM and did not meet the expectations.
  It may be revived one day, but it should be done in a much simpler way.
- Add a message structure for xcall function, pass the LWP owner and thus
  optimise a race condition: if LWP is discarding its state on a remote CPU,
  but another LWP already did it - do not cause an unnecessary re-faulting.
- Reduce the variety of flags for PCU operations (only PCU_VALID and
  PCU_REENABLE are used now), pass them only to the pcu_state_load().
- Rename pcu_used_p() to pcu_valid_p(); hopefully it is less confusing.
- pcu_save_all_on_cpu: SPL ought to be used here.
- Update and improve the pcu(9) man page; it needs wizd(8) though.
2014-05-16 00:48:41 +00:00
christos
7360fa8391 be a bit more verbose about why we think a note is bad. 2014-05-15 19:37:22 +00:00
hannken
42c8d67c49 Add a global vnode cache:
- vcache_get() retrieves a referenced and initialised vnode / fs node pair.
- vcache_remove() removes a vnode / fs node pair from the cache.

On cache miss vcache_get() calls new vfs operation vfs_loadvnode() to
initialise a vnode / fs node pair.  This call is guaranteed exclusive,
no other thread will try to load this vnode / fs node pair.

Convert ufs/ext2fs, ufs/ffs and ufs/mfs to use this interface.

Remove now unused ufs/ufs_ihash

Discussed on tech-kern.

Welcome to 6.99.41
2014-05-08 08:21:53 +00:00
christos
c4c94e150f Free pid for linux processes. Reported by Mark Davies, fix by dsl@
XXX: pullup 6
2014-05-05 15:45:32 +00:00
pooka
4d8864ed4f Eliminate weak symbols from rump kernel syscall handlers, part 5:
regen syscalls to eliminate weak aliases and link-time initialization
2014-04-27 15:11:22 +00:00
pooka
94ebf9ba52 Eliminate weak symbols from rump kernel syscall handlers, part 2:
Generate a file (rump.sysmap) which can be used to autogenerate the
syscall loaders.  The file contains syscall handler names and numbers.

Also store "libc" side syscall names in rump.sysmap to help with
the rumprun build process.
2014-04-27 14:50:23 +00:00
pooka
0b435d6ec6 Eliminate weak symbols from rump kernel syscall handlers, part 1:
Initialize all non-modular syscalls to enosys and expect them to be
filled at boottime.  Do not create the now-unnecessary weak aliases.

Modular syscalls work as before.
2014-04-27 14:29:53 +00:00
abs
6fe75f1616 Ensure pool_head is non static - for "vmstat -i" 2014-04-26 16:30:05 +00:00
pooka
0f54014c8e Decouple sockets linkage from interface code by making ifioctl() a pointer. 2014-04-26 11:16:22 +00:00
pooka
1814443234 It's been > 20years since rtioctl() did something. Let's just
remove that special way of returning EOPNOTSUPP.
2014-04-26 11:10:10 +00:00
riastradh
2191ea5a51 Correct type of i in execve_dovmcmds. Fixes DEBUG_EXEC build. 2014-04-25 18:04:45 +00:00
pooka
2324105436 Remove pollsock(). Since it took only a single socket, it was essentially
a complicated way to call soreceive() with a sb_timeo.  The only user
(netsmb) already did that anyway, so just had to delete the call to
pollsock().
2014-04-25 15:52:45 +00:00
pooka
e1b9adcc58 Make sleepq_wake() type void. The return value hasn't been used in
almost 6 years.  Even if it were, returning an arbitrary lwp is a bit
of a wonky interface and can really work only when expected == 1.
2014-04-24 12:04:28 +00:00
pooka
afbb108620 domains are attached by module(-like) constructors, so no need to
play link_set games with them.
2014-04-23 17:05:18 +00:00
maxv
2056c71da8 Fix a read-beyond-end string read.
coredump_buildname() copies 'pattern' into 'name', and handles special
characters such as "%n". "%n", if present, will be replaced by p->p_comm.

	error = coredump_buildname(p, name, pattern, MAXPATHLEN);

This function handles overflows, and returns an error when 'name' becomes
larger than MAXPATHLEN. However, when coredump() calls it, 'name' is used
before the error check, with:

	lastslash = strrchr(name, '/');

'name' is not guaranteed to be NUL-terminated, because of the *d = *s in
coredump_buildname(). This strrchr will read a string which is not NUL-
terminated (ie. until finding a '\0' in memory).

'pattern' can't be higher than MAXPATHLEN. A user can fill it in via a
PT_DUMPCORE ptrace call, given the input is not longer than MAXPATHLEN.
Since the 2-bytes-sized "%n"s will be replaced by p->p_comm (which is
user-settable, like a 10-bytes-sized "0123456789"), 'name' can become
longer than 'pattern' (and thus longer than MAXPATHLEN). Some 'a's at the
end of the buffer will make sure 'name' is not NUL-terminated.

    pattern: "%n%n%naaaaaaaaaaaaaaaaaaaaaaaaaaaa\0"
              | | | |||||||||||||||||||||||||||||
  ->   name: "012345678901234567890123456789aaaaa" [no \0]
              |         |         |         |||||MAXPATHLEN

Fix it by checking 'error' before calling strrchr.
2014-04-22 19:01:47 +00:00
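The ordering fix in 2056c71da8 can be illustrated with a toy user-space version. build_name() below is a simplified stand-in for coredump_buildname() that only understands "%n"; its name, signature and error code are assumptions made for the example. The point is the caller: the error is checked before strrchr() ever sees the buffer.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `pattern` into `name`, expanding "%n" to `comm`. Returns 0 on
 * success or ENAMETOOLONG if the result (with its NUL) does not fit in
 * `len` bytes. On error `name` is NOT guaranteed to be NUL-terminated.
 */
int build_name(const char *comm, char *name, const char *pattern, size_t len)
{
	const char *s = pattern;
	char *d = name;
	char *end = name + len;

	while (*s != '\0') {
		if (s[0] == '%' && s[1] == 'n') {
			size_t n = strlen(comm);

			if ((size_t)(end - d) < n)
				return ENAMETOOLONG;
			memcpy(d, comm, n);
			d += n;
			s += 2;
		} else {
			if (d == end)
				return ENAMETOOLONG;
			*d++ = *s++;
		}
	}
	if (d == end)
		return ENAMETOOLONG;
	*d = '\0';
	return 0;
}

/*
 * Build the name, then locate the last '/'. The error check comes first,
 * so strrchr() only runs on a buffer known to be NUL-terminated.
 * *idx is the offset of the last slash, or -1 if there is none.
 */
int last_slash_index(const char *comm, const char *pattern,
    char *name, size_t len, long *idx)
{
	const char *slash;
	int error;

	error = build_name(comm, name, pattern, len);
	if (error != 0)
		return error;
	slash = strrchr(name, '/');
	*idx = (slash == NULL) ? -1 : (long)(slash - name);
	return 0;
}
```

With a 16-byte buffer, the pattern "%n%naaaa" and a 10-byte command name, build_name() fails and the buffer is never scanned.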
maxv
6547a55a59 This thing is totally buggy: 'data_len' is modified by the fs, so calling
kmem_free with it while its value has changed since the kmem_alloc is far
from being a good idea.

If the kernel figures out that something mismatches, it will panic
(typically with kernfs).
2014-04-20 21:26:51 +00:00
uebayasi
82d46164cd execve_runproc: Isolate emul specific code into a function. 2014-04-20 00:20:01 +00:00
uebayasi
f207cc4217 copyinargs: Shorten a local var name. 2014-04-19 23:00:27 +00:00
uebayasi
ea85945d7a copyinargs: Plug theoretical memory leak when fakearg is too long.
Pointed out & reviewed by Maxime Villard.
2014-04-19 22:59:08 +00:00
maxv
dc8c3423b2 'error' is not set on failure. This is a true bug: everything is freed
and unlocked while zero is returned. Since there's no error, execve_runproc()
will get called and will try to use those freed things.

PS: This bug was here before uebayasi@'s changes
2014-04-18 11:44:31 +00:00
uebayasi
a969a4cf8a calcargs: Correct the size of "argc" in the stack size calculation.
(The old code has worked because it is compensated by wrong size calculation
of "auxinfo" (multiplied by sizeof(void *)).)
2014-04-18 06:59:32 +00:00
maxv
4a1b3781e1 Memory leak (only triggerable from root).
ok christos@
2014-04-18 05:22:13 +00:00
christos
fa910fdab6 CID/1203196: Don't confuse coverity with out of bounds access 2014-04-17 16:14:22 +00:00
maxv
cf89d4e5af Some fs's - like kernfs - set their vfs_min_mount_data to zero. Add a check
to prevent an (un)privileged user from requesting a zero-sized allocation
(and thus a panic).
2014-04-16 19:25:28 +00:00
maxv
23f76b6d00 An (un)privileged user can easily make the kernel dereference a NULL
pointer.

The kernel allows 'data' to be NULL; it's the fs's responsibility to
ensure that it isn't NULL (if the fs actually needs data).

ok christos@
2014-04-16 18:55:17 +00:00
uebayasi
c3b49b4f57 execve_runproc: Isolate vmcmd execution code into a function. 2014-04-16 02:22:38 +00:00
uebayasi
acaa1e700b execve_runproc: Isolate path / commandname (proc:p_comm) related code into a function. 2014-04-16 01:30:33 +00:00
uebayasi
532bc286ec execve_runproc: Isolate new stack arg filling code into a function. 2014-04-15 17:06:21 +00:00
uebayasi
e4f9e005a5 execve_runproc: Isolate ps_strings filling code into a function. 2014-04-15 16:44:57 +00:00
uebayasi
0244fbfc39 execve_runproc: Simplify &argc address calc. The set of (argc, argv, ...)
is located just "behind" the initial SP.  SHRINK, then ALLOC, and you get
&argc.
2014-04-15 16:13:04 +00:00
uebayasi
9605f3cc61 exec_loadvm: Isolate stack size calc logic into separate functions. 2014-04-15 15:50:16 +00:00
hannken
2f1e07219a Fix a deadlock where one thread exits, enters fstrans_lwp_dtor()
and wants fstrans_lock.  This thread holds the proc_lock.
Another thread holds fstrans_lock and runs pserialize_perform().
As the first thread holds the proc_lock, timeouts are blocked and
the second thread blocks forever in kpause().

Change fstrans_lwp_dtor() to invalidate, but not free its info
structs.  No need to take fstrans_lock.

Change fstrans_get_lwp_info() to reuse invalidated info before
trying to allocate a new one.
2014-04-15 09:50:45 +00:00
maxv
05b3bfa0ba There's no need for this NULL-check. 2014-04-15 06:14:55 +00:00
uebayasi
3d725db397 copyinargs: Redo previous; if given fakearg is longer than arg buf (which is
very unlikely to happen), there's no point to continue with truncated arg.
Just give up and return E2BIG.
2014-04-14 13:14:38 +00:00
uebayasi
dd3e806542 copyinargs: Replace a hand-written string copy loop with strlcpy(3). Carefully
reuse return value of strlcpy(3) to iterate.
2014-04-14 05:39:19 +00:00
uebayasi
4adfcd2c94 Revert braces. 2014-04-13 12:11:01 +00:00
uebayasi
eecddf1604 copyinargs: Refactor. Share code. 2014-04-13 09:19:42 +00:00
uebayasi
35b479ac55 execve_loadvm: Move long code block reading passed arguments() into a function.
This needs further clean up.  (See the XXX comment.)  No functional changes.
2014-04-13 06:03:49 +00:00
uebayasi
04729d8900 execve_runproc: Correct thinko in Rev. 1.386; the new SP always points to
after (higher address) argc/argv/env/aux/strings regardless of stack growing
direction.  Machines with grow-up stack will detect the top of
argc/argv/env/aux/strings by the address of *argv[] via ps_strings:ps_argvstr.

This means that old comments about RTLD_GAP are all obsolete.

With help from Nick Hudson.
2014-04-12 15:08:56 +00:00
uebayasi
c1047adce3 Don't #define DEBUG_EXEC. 2014-04-12 07:38:32 +00:00
uebayasi
d01b6ecafe execve_runproc: Refactor debug code. 2014-04-12 07:33:51 +00:00
uebayasi
93fb83ebaa execve_runproc: Move a long code block handling credential into a separate
function.  No functional changes.
2014-04-12 06:31:27 +00:00
uebayasi
763d7b32d6 execve_runproc: Unbreak __MACHINE_STACK_GROWS_UP machines. Clarify the stack
address allocation code.  Summarize an awful big comment about the _rtld()
"gap".

(The log message in Rev. 1.384 was wrong; the new stack address is passed
not via the 3rd register argument, but via the SP.  The 3rd is for ps_strings.)
2014-04-12 05:25:23 +00:00
uebayasi
7dd91721cc Reorder a few lines. Comments. 2014-04-11 18:02:33 +00:00
uebayasi
4282002059 execve_runproc: The stack address passed to the newly execve()'ed process,
via the 3rd register argument, always points to the stack base address (==
minsaddr (min stack address) + ssize (stack size)).  Clarify that.
2014-04-11 17:28:24 +00:00
uebayasi
8ab74c3b1b execve_runproc: Reorder a few local vars. Avoid reuse. No functional changes. 2014-04-11 17:06:02 +00:00
uebayasi
6770193e9c Clarify stack size calculation in copyargs(). Comments. 2014-04-11 11:49:38 +00:00
uebayasi
8f07d0cf93 Clean up assertions. 2014-04-11 11:32:14 +00:00
uebayasi
5dcee2c64e Protect not only proc::p_flag but also lwp::l_ctxlink and proc::p_acflag with
proc:p_lock.
2014-04-11 11:21:29 +00:00
uebayasi
5ddf7749cf Try to decrypt stack size calculation code in execve_loadvm().
No functional changes.  Two potential miscalculations remain.
2014-04-11 11:11:06 +00:00
uebayasi
11c21c773e Cache struct exec_package * for readability. No functional changes. 2014-04-11 02:27:20 +00:00
pooka
885b424da9 regen 2014-04-09 23:57:26 +00:00
pooka
9f45fed20c rump kernel wrappers for aio syscalls 2014-04-09 23:55:37 +00:00
pooka
0bb4e2ffe3 properly handle forward declarations for pointerpointer arguments 2014-04-09 23:50:45 +00:00
rjs
752c60b211 whitespace. 2014-04-07 17:02:15 +00:00
seanb
6bcc34c970 Fix a case where an erroneous EAGAIN was returned out of recvmmsg.
This occurred when some, but not all of the mmsg array members
were filled with data from a non-blocking socket.
PR kern/48725
2014-04-07 15:35:23 +00:00
christos
6accf143de Kernel portion of the multiple ptyfs mount support. Protocol changed
between kernel and module, so bump. (Ilya Zykov)
2014-04-04 18:11:58 +00:00
maxv
18ff15fb2d Limit check for 'data_len'. Otherwise a (un)privileged user can easily
panic the system by passing a huge size.

ok christos@
2014-04-04 06:47:02 +00:00
para
608fba6393 make condition for ENOMEM consistent with allocation requirement 2014-04-02 18:09:10 +00:00
matt
790372329e If we are writing PN_XNUM or more phdrs, include one section header and
encode the real # of sections in its sh_info.
2014-04-02 17:19:49 +00:00
para
c28aad1c2f bt_refill is and must be called with VM_NOSLEEP set, assert this
fix error path if pool_get returns NULL
2014-04-02 16:14:50 +00:00
seanb
f9c6e7aeaa len argument to strlcpy() was incorrect when copying
out AF_LOCAL sockets in sysctl helper.  The entire
buffer wasn't available since sun_path member is not
at offset 0 in struct sockaddr_un.
2014-04-02 15:35:45 +00:00
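The length problem fixed in f9c6e7aeaa comes down to subtracting the offset of sun_path from the buffer size. The helpers below are a user-space sketch with hypothetical names (sun_path_space(), copy_path()); copy_path() imitates strlcpy() semantics by hand so the example does not depend on strlcpy() being available.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Bytes available for the path when a struct sockaddr_un starts at the
 * beginning of a buffer of `buflen` bytes. sun_path is not at offset 0,
 * so this is less than buflen.
 */
size_t sun_path_space(size_t buflen)
{
	size_t off = offsetof(struct sockaddr_un, sun_path);

	return buflen > off ? buflen - off : 0;
}

/*
 * Bounded copy into sun->sun_path using the corrected length. Like
 * strlcpy(), it NUL-terminates when there is room and returns strlen(src)
 * so the caller can detect truncation.
 */
size_t copy_path(struct sockaddr_un *sun, size_t buflen, const char *src)
{
	size_t space = sun_path_space(buflen);
	size_t n = strlen(src);

	if (space > 0) {
		size_t c = n < space - 1 ? n : space - 1;

		memcpy(sun->sun_path, src, c);
		sun->sun_path[c] = '\0';
	}
	return n;
}
```

Passing the whole buffer size as the length would let the copy run past the end of the buffer by the size of the members that precede sun_path.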
maxv
687880ac6a Style 2014-03-29 09:31:11 +00:00
ozaki-r
6ac95d35b1 Fix unused variable 'mp' 2014-03-28 11:55:09 +00:00
christos
33baebc2e5 explain how a printf might happen (since it has bitten more than one person) 2014-03-27 21:09:33 +00:00
christos
a9253db65e From Ilya Zykov:
- ifdef out some code that is only used for NO_DEV_PTM
- pass the mountpoint instead of the ptm structure to the implementation
  dependent (ptyfs or bsdpty) functions.
- add a function to return the correct ptyfs mountpoint for the current lwp
2014-03-27 17:31:56 +00:00
christos
968c5f53d8 in the bsdpty allocvp flavor, call the bsdpty mkname directly, since it is
the only one possible to be valid (Ilya Zykov)
2014-03-26 21:29:54 +00:00
christos
e9ba8bc5a2 remove {v,}sprintf 2014-03-26 18:03:47 +00:00
macallan
00c16ffd7f snprintf -> vsnprintf in cpu_setmodel()
now this can actually work
hi christos
2014-03-25 12:50:53 +00:00
christos
2788907516 - create cpu_{g,s}etmodel() and hide cpu_model from direct access. 2014-03-24 20:07:40 +00:00
hannken
f3cf481632 - Make VI_XLOCK, VI_CLEAN and VI_LOCKSHARE private to kern/vfs_*.c.
- Make vwait() static.
- Add  vdead_check() to check a vnode for being or becoming dead.

Discussed on tech-kern.

Welcome to 6.99.38
2014-03-24 13:42:40 +00:00
christos
f363da3aa0 fix unused 2014-03-23 02:56:33 +00:00
maxv
2632b9d940 Fix a potential - but very unlikely - NULL pointer dereference.
(it does not introduce a new error code for open(), since
 pathbuf_copyin() is already there and can return ENOMEM)

Found by my code scanner.
2014-03-22 08:15:25 +00:00
maxv
d8a274dfb9 Small changes:
- rename elf_load_file() to elf_load_interp()
 - use the correct type for 'nused'
 - remove useless cases
 - reorder a kmem_alloc

ok christos@
2014-03-22 07:27:21 +00:00
mlelstv
43b8706dc0 Incorrect use of pointer arithmetic.
CID 1193195:  Extra sizeof expression
2014-03-20 06:48:22 +00:00
christos
52813a4e8e fix leak on error from pty_fill_ptmget (Ilya Zykov) 2014-03-19 18:11:17 +00:00
hannken
b349ee43ab Operations vmark(), vunmark() and vismarker() have been replaced by
vfs_vnode_iterator_*(), remove them.

Document vfs_vnode_iterator_*().

Make VI_MARKER private to vfs_vnode.c, vfs_mount.c and unfortunately
to ufs/lfs/lfs_segment.c.

Welcome to 6.99.37
2014-03-18 10:21:47 +00:00
hannken
618ee03549 Change sysctl_kern_vnode() to use vfs_vnode_iterator. 2014-03-17 09:28:37 +00:00
hannken
ed193ed61b Add fstrans_startnowait()/fstrans_done() to vrele_thread(). 2014-03-17 09:27:37 +00:00
maxv
7c09916210 Remove the 'prot' argument from elf_load_psection(). It is not used
outside, and can be declared locally. Clearer.

ok christos@
2014-03-16 07:57:25 +00:00
dholland
a68f9396b6 Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.
2014-03-16 05:20:22 +00:00
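For readers unfamiliar with the notation mentioned in a68f9396b6, here is a small comparison of positional and designated initializers. struct toy_devsw is a cut-down, hypothetical switch table invented for the example; the real cdevsw/bdevsw structures have more members.

```c
#include <stddef.h>

/* Hypothetical, reduced device switch used only for illustration. */
struct toy_devsw {
	int (*d_open)(int);
	int (*d_close)(int);
	int (*d_read)(int);
	int d_flag;
};

static int toy_open(int unit) { return unit; }
static int toy_close(int unit) { (void)unit; return 0; }

/* Positional form: correct only while the member order never changes. */
const struct toy_devsw positional_devsw = {
	toy_open, toy_close, NULL, 0
};

/* Designated form: each value is tied to a member name. */
const struct toy_devsw designated_devsw = {
	.d_open = toy_open,
	.d_close = toy_close,
	.d_read = NULL,
	.d_flag = 0,
};
```

With designated initializers a handler placed in the wrong slot is visible on the line itself, which may be how wrong entries of the kind the commit mentions become easier to spot.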
pooka
c9bffc6f73 regen: time/timer related syscalls for rump kernels 2014-03-14 00:56:37 +00:00
pooka
527bb3b75d Add rump kernel syscall wrapper flag for a bunch of time-related
syscalls (nanosleep, clock_gettime, etc.).  These are mostly intended
for situations where a rump kernel runs on an OS-less host.
2014-03-14 00:52:08 +00:00
pooka
1ac307e403 kill undesirable #ifndef _RUMPKERNEL 2014-03-11 20:32:05 +00:00
pooka
b05633df66 kill undesirable "#ifdef _RUMPKERNEL" 2014-03-11 20:26:08 +00:00
matt
bbe1552068 Tell where the corruption was encountered in the panic message. 2014-03-07 16:36:32 +00:00
matt
dbd8c999e4 Remove spurious . 2014-03-07 01:55:01 +00:00
christos
54b7adb159 c99 initializers for struct execsw 2014-03-07 01:33:43 +00:00
matt
ab77483fb9 add ep_entryoffset to exec_package so one can calculate the relocabase
of an ET_DYN image.
2014-03-06 09:30:37 +00:00
hannken
72439b7dc8 Current support for iterating over mnt_vnodelist is rudimentary. Every
caller has to care about list and vnode mutexes, reference count being zero,
intermediate vnode states like VI_CLEAN, VI_XLOCK, VI_MARKER and so on.

Add an interface to iterate over a vnode list:

void vfs_vnode_iterator_init(struct mount *mp, struct vnode_iterator **marker)
void vfs_vnode_iterator_destroy(struct vnode_iterator *marker)
bool vfs_vnode_iterator_next(struct vnode_iterator *marker, struct vnode **vpp)

vfs_vnode_iterator_next() returns either "false / *vpp == NULL" when done
or "true / *vpp != NULL" to return the next referenced vnode from the list.

To make vrecycle() work in this environment change it to

bool vrecycle(struct vnode *vp)

where "vp" is a referenced vnode to be destroyed if this is the last reference.

Discussed on tech-kern.

Welcome to 6.99.34
2014-03-05 09:37:29 +00:00
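The caller pattern implied by the three prototypes above can be sketched in userland as follows. Only the function names and signatures come from the commit message; the `struct vnode`, `struct mount`, and `struct vnode_iterator` layouts are mock assumptions, and the mock ignores locking, reference counts, and intermediate vnode states entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Mock types: assumptions for illustration, not the kernel definitions. */
struct vnode { int v_id; struct vnode *v_next; };
struct mount { struct vnode *mnt_first; };
struct vnode_iterator { struct vnode *vi_cur; };

void
vfs_vnode_iterator_init(struct mount *mp, struct vnode_iterator **marker)
{
	struct vnode_iterator *vi = malloc(sizeof(*vi));

	if (vi == NULL)
		abort();		/* mock only: no error path */
	vi->vi_cur = mp->mnt_first;
	*marker = vi;
}

void
vfs_vnode_iterator_destroy(struct vnode_iterator *marker)
{
	free(marker);
}

/* Returns false with *vpp == NULL when done, true with *vpp != NULL otherwise. */
bool
vfs_vnode_iterator_next(struct vnode_iterator *marker, struct vnode **vpp)
{
	*vpp = marker->vi_cur;
	if (*vpp == NULL)
		return false;
	marker->vi_cur = (*vpp)->v_next;
	return true;
}

/* Caller side: init, loop on next() until it returns false, destroy. */
int
count_vnodes(struct mount *mp)
{
	struct vnode_iterator *marker;
	struct vnode *vp;
	int n = 0;

	vfs_vnode_iterator_init(mp, &marker);
	while (vfs_vnode_iterator_next(marker, &vp))
		n++;	/* the real interface hands back a referenced vnode here */
	vfs_vnode_iterator_destroy(marker);
	return n;
}
```

Since the real iterator returns a referenced vnode, a kernel caller would presumably have to release that reference inside the loop; the mock has nothing to release.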
dsl
4af555d7e1 When converting out of range 64bit sysctl values to 'int' (because of
an 'int' sized read) don't assume that sizeof (int) is 4.
2014-03-01 17:27:48 +00:00
riastradh
84bbdd5611 Kick on-demand entropy sources in rndsinks_distribute.
Partial workaround for indefinite hangs when entropy is scarce or
buffered up.  We need to do more to handle entropy that has been
buffered up -- see the comment for details -- but this will help for
now.

Problem noted by pooka.
2014-03-01 14:15:15 +00:00
skrll
dd7bb1e0a8 G/C sys/simplelock.h includes 2014-02-28 10:16:51 +00:00
dsl
7b1adb697e Allow CTLTYPE_INT and CTLTYPE_QUAD to be read and written as either 4 or 8
byte values regardless of the type.
64bit writes to 32bit variables must be valid (signed) values.
32bit reads of large values return -1.
Amongst other things this should fix libm's code that reads machdep.sse
  as a 32bit int, but I'd changed it to 64bit (to common up some code).
2014-02-27 22:50:52 +00:00
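The narrowing rule described above (an 'int' sized read of a value that does not fit returns -1) can be sketched like this. `quad_to_int` is a hypothetical helper, not the kernel's sysctl code; it compares against `INT_MIN` and `INT_MAX` so that nothing depends on `sizeof(int)` being 4.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: narrow a 64-bit value for an 'int' sized read.
 * Values outside the range of int are reported as -1.  The bounds come
 * from <limits.h>, so no particular width of int is assumed.
 */
int
quad_to_int(int64_t v)
{
	if (v < INT_MIN || v > INT_MAX)
		return -1;
	return (int)v;
}
```

A value that fits is returned unchanged, so callers cannot distinguish a genuine -1 from an out-of-range value; that ambiguity is inherent in the rule as stated in the message.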
hannken
2b6ec89863 The current implementation of vn_lock() is racy. Modification of
the vnode operations vector for active vnodes is unsafe because it
is not known whether deadfs or the original file system will be
called.

- Pass down LK_RETRY to the lock operation (hint for deadfs only).

- Change deadfs lock operation to return ENOENT if LK_RETRY is unset.

- Change all other lock operations to check for dead vnode once
  the vnode is locked and unlock and return ENOENT in this case.

With these changes in place vnode lock operations will never succeed
after vclean() has marked the vnode as VI_XLOCK and before vclean()
has changed the operations vector.

Addresses PR kern/37706 (Forced unmount of file systems is unsafe)

Discussed on tech-kern.

Welcome to 6.99.33
2014-02-27 16:51:37 +00:00
hannken
d940ddcc62 Currently dead vnodes still reside on the vnodelist of the file system
they have been removed from.

Create a "dead mount" that takes dead vnodes until they get freed.

Discussed on tech-kern.
2014-02-27 13:00:06 +00:00
maxv
ff3f3d5c44 We have to ensure the string is NUL-terminated and of the expected
length to avoid copying uninitialized data.

ok christos@
2014-02-27 09:58:05 +00:00
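One way to express the kind of check this message describes is sketched below. `string_is_sane` and its parameters are hypothetical; the commit does not say which string or which length is involved, so this only shows the general technique of finding the terminator within the buffer bounds before trusting the contents.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical check: the buffer must contain a NUL within its first
 * buflen bytes, and the string must have exactly the expected length.
 * memchr() never reads past buflen, unlike strlen() on untrusted data.
 */
bool
string_is_sane(const char *buf, size_t buflen, size_t expected_len)
{
	const char *nul = memchr(buf, '\0', buflen);

	return nul != NULL && (size_t)(nul - buf) == expected_len;
}
```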
riastradh
98ff99631b Fix bits/bytes mixup in rnd_getmore.
Remove some needless casts and fix format directives while here.

Bit/byte mixup noticed by pooka.
2014-02-25 23:15:43 +00:00
pooka
4f6fb3bf35 Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before
the sysctl link sets are processed, and remove redundancy.

Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate
lines of code.
2014-02-25 18:30:08 +00:00
justin
69dd91d347 Add kern.{ostype,osrelease,osrevision,version} kern.domainname,
kern.rawpartition sysctl support to rump kernel.
To avoid code duplication, moved the sysctl support that is shared
between rump and normal kernels to init_sysctl_base.c, as rump cannot
use init_sysctl.c. Agreed with pooka@.
2014-02-25 01:02:42 +00:00
mlelstv
5d1221e5bf ttioctl always gets a valid lwp reference. Replace attempt to handle a NULL
reference in only one place with a regular assertion.
2014-02-23 07:54:43 +00:00
maxv
33cfa4fef0 Simplify error path.
ok christos@
2014-02-22 07:53:16 +00:00
maxv
0ff9025533 Revert rev1.38. The header already begins with EXEC_SCRIPT_MAGIC="#!".
So it can't be ELFMAG="\177ELF" at the same time.

ok christos@
2014-02-21 08:11:59 +00:00
maxv
c14dea48b0 Properly check the section size to avoid out-of-bound reads. The
computed size must be the exact same size that is indicated in
sh_size.

ok agc@ christos@
2014-02-21 07:47:02 +00:00
maxv
c22b5e2a12 We need VMCMDs for a binary and its interpreter, so make sure we have
at least one VMCMD. This also prevents the kernel from using an
uninitialized pointer as entry point for the execution.

From me and Christos

ok christos@
2014-02-19 15:23:20 +00:00
para
e3e2479f22 replace vmem(9) custom boundary tag allocation with a pool(9) 2014-02-17 20:40:06 +00:00
maxv
113995d235 Cosmetic; just replace whitespaces by tabs 2014-02-17 19:29:46 +00:00
maxv
03cdabd0dd Small cleanup:
- make elf_load_file() and elf_load_psection() static
- make loops consistent
- 'nload' is not used - see rev1.24
- 'ap' is not used in elf_load_file()

ok agc@ christos@
2014-02-16 17:46:36 +00:00
njoly
9f120b8d09 Remove argument name from prototype. 2014-02-15 22:32:16 +00:00
christos
6f9879ba7d initialize offset to 0 (Maxime Villard) 2014-02-15 17:39:03 +00:00
maxv
c11747d060 Remove the last argument of elf_check_header(). It is easier - and faster - to
check the e_type field in the calling function. Other BSDs already do this.

ok christos@
2014-02-15 16:17:01 +00:00
christos
df9581b1ee explain why the innocent sigaction1 call now works. 2014-02-14 16:35:40 +00:00
christos
b9e9a610e4 Don't check trampolines for SIG_DFL or SIG_IGN since they are not used.
From gimpy.
2014-02-14 16:35:11 +00:00
maxv
1a33eb9d1c Fix memory leak.
ok christos@ agc@
2014-02-14 07:30:07 +00:00
martin
47869c118a Unlock correct mutex in an error path.
PR kern/48592 from Kengo NAKAHARA.
2014-02-12 20:20:15 +00:00
maxv
52673c8d59 Reorder code to avoid using an uninitialized variable: if
sysctl_copyin fails, 'tmp' is not initialized. This bug is
harmless since only the return value will be different;
it does not expose kernel memory unless diagnostic is enabled.

ok agc@ martin@
2014-02-09 14:51:13 +00:00
hannken
97834f7ba0 Change vnode operation lookup to return the resulting vnode *vpp unlocked.
Change cache_lookup() to return an unlocked vnode.

Discussed on tech-kern@

Welcome to 6.99.31
2014-02-07 15:29:20 +00:00
hannken
f106eaceb6 Change vnode operation lookup to return the resulting vnode *vpp unlocked.
Change cache_lookup() to return an unlocked vnode.

Discussed on tech-kern@
2014-02-07 15:26:42 +00:00
msaitoh
62342f9d4d s/mesage/message/ 2014-02-07 11:51:00 +00:00
riastradh
6c0ad81464 __HAVE_ATOMIC_AS_MEMBAR is spelled with two leading underscores.
This underscores the need to replace this error-prone cpp API by
unconditionally defined {pre,post}atomic_membar_*.

This change should only remove unnecessary membar_producers on x86.
2014-02-06 03:47:16 +00:00
martin
65095476e3 Cosmetics: return is an operator, not a function: remove (). 2014-02-02 14:50:46 +00:00
martin
2934fa70dc Limit the amount of kernel memory a posix_spawn syscall can use (for handling
the file action list) by limiting the maximum number of file actions to
twice the current file descriptor limit.
Fix a few bugs in the support functions and document the new limit.
From Maxime Villard.
2014-02-02 14:48:57 +00:00
dogcow
437b1ce30d Delete duplicate symbol definition introduced in 1.371. Now builds again. 2014-02-02 08:25:23 +00:00
manu
70aead41ff Add EMUL_NATIVEROOT so that native binaries can be told to search an
"emulation" directory before the real root. This makes it easier to test
an amd64 kernel on top of an i386 root filesystem prior to a full
migration.
2014-02-02 04:28:42 +00:00
yamt
57688c9a9e tty_pty: add CTASSERTs to document assumptions 2014-01-29 02:38:48 +00:00
martin
30a98d4423 Mark a diagnostic only variable 2014-01-28 12:50:54 +00:00
christos
9477bafa18 kill the topdown flag only if we succeed. 2014-01-25 23:58:41 +00:00
christos
f5fe8e85e2 fix unused 2014-01-25 21:11:20 +00:00
christos
840bc63029 __USING_TOPDOWN_VM is no more, __USE_TOPDOWN_VM... 2014-01-25 19:44:11 +00:00
christos
cee146c035 Add compat_10, open NULL == open "." 2014-01-25 17:24:45 +00:00
christos
f4956d9c6a a.out binaries can't handle topdown. 2014-01-25 05:15:05 +00:00
christos
1525b564a7 expose do_open 2014-01-25 02:28:31 +00:00
skrll
c92b6b82d2 Pass PCU_LOADED to pcu_state_load in the "this CPU already has our PCU
state loaded" case of pcu_load.

ok, gimpy@ and rmind@
2014-01-23 17:32:03 +00:00
hannken
04c776e5c8 Change vnode operations create, mknod, mkdir and symlink to return
the resulting vnode *vpp unlocked.

Discussed on tech-kern@

Welcome to 6.99.30
2014-01-23 10:13:55 +00:00
hannken
ac59f9acc5 Change vnode operations create, mknod, mkdir and symlink to return
the resulting vnode *vpp unlocked.

Discussed on tech-kern@
2014-01-23 10:11:55 +00:00
hannken
0fa0d339bd Change cache_prune() to test for end-of-list before testing for an
invalid entry.  Prevents a livelock when the end-of-list marker
becomes invalid while scanning the list and all entries are recent.
2014-01-20 07:47:22 +00:00