8490 Commits

Author SHA1 Message Date
dholland b496a9dc26 Rename the new ni_startdir (the slot used to hold the starting point
for openat() and friends) to ni_atdir to avoid confusion with a
previously existing (and, alas, still documented) ni_startdir field
that meant something else entirely.
2012-11-05 19:06:26 +00:00
dholland 35ed690545 Excise struct componentname from the namecache.
This uglifies the interface, because several operations need to be
passed the namei flags and cache_lookup also needs for the time being
to be passed cnp->cn_nameiop. Nonetheless, it's a net benefit.

The glop should be able to go away eventually but requires structural
cleanup elsewhere first.

This change requires a kernel bump.
2012-11-05 17:27:37 +00:00
dholland 1617a81dd1 Disentangle the namecache from the internals of namei.
- Move the namecache's hash computation to inside the namecache code,
instead of being spread out all over the place. Remove cn_hash from
struct componentname and delete all uses of it.

 - It is no longer necessary (if it ever was) for cache_lookup and
cache_lookup_raw to clear MAKEENTRY from cnp->cn_flags for the cases
that cache_enter already checks for.

 - Rearrange the interface of cache_lookup (and cache_lookup_raw) to
make it somewhat simpler, to exclude certain nonexistent error
conditions, and (most importantly) to make it not require write access
to cnp->cn_flags.

This change requires a kernel bump.
2012-11-05 17:24:09 +00:00
njoly 1ebbf7b605 Move rusage computation to a new getrusage1() function. Adjust all
compat/emulations to make use of it.
2012-11-03 23:22:21 +00:00
matt 1f0bc47f9c Use kmem_intr_alloc/kmem_intr_free 2012-10-27 17:34:07 +00:00
chs cbab9cadce split device_t/softc for all remaining drivers.
replace "struct device *" with "device_t".
use device_xname(), device_unit(), etc.
2012-10-27 17:17:22 +00:00
tls 9d8dce6eca Fix hardware RNGs -- accept their entropy estimates *rather than* using
timestamps to estimate the entropy of their input.  I'd accidentally
made it so no entropy was ever counted from them at all.
2012-10-27 01:29:02 +00:00
apb b6366e94c8 Set tp->t_dev to the correct dev_t value in both ptmopen and ptsopen.
Depending on how the pty had been opened, t_dev could previously have
been set to NODEV.  This was probably harmless before, but it caused the
compatibility handler for the COMPAT_60_TIOCPTSNAME ioctl to fail for
ptys that were allocated by screen(1), but only if this was the first
time that the pty had ever been used.
2012-10-20 00:21:10 +00:00
apb f6297d7676 Add COMPAT_60 versions of the TIOCPTMGET and TIOCPTSNAME ioctls. 2012-10-19 16:55:22 +00:00
riastradh 8db30059ca No, we can't elide the fs-wide rename lock for same-directory rename.
rename("a/b", "a/c") and rename("a/c/x", "a/b/y") will deadlock.

Darn.
2012-10-19 02:07:22 +00:00
para fc7d559bb7 bring comment up to reality
kmem_map => kmem_arena
2012-10-18 19:33:38 +00:00
drochner 035939be53 put binary compatibility support for the old AMD-only CPU microcode
update API inside COMPAT_60
2012-10-17 20:19:55 +00:00
christos eac5a1b990 remove KERN_USRSTACK 2012-10-14 20:56:55 +00:00
dholland ebc30f9e8b Replace hack implementation of NDAT() for "nameiat" with a proper one.
(This change requires a kernel bump.)
2012-10-13 17:46:50 +00:00
christos c6f0835de6 add KERN_USRSTACK (this is not dynamically defined for FreeBSD compatibility) 2012-10-13 15:35:55 +00:00
rmind 2440dfcd19 Update comment on vnode life-cycle a little. 2012-10-12 21:10:55 +00:00
riastradh a807072402 Disentangle do_sys_rename.
Elide the fs-wide rename lock for single-directory renames.  This
required changing the order of lookups, so that we know what the
directories are before we lock the nodes.

Clean up error branches, explain why various nonsense happens and
what it does and doesn't do, and note some of what needs to change.
2012-10-12 02:37:20 +00:00
dholland 03f1fbd862 In layer_lookup(), clear *vpp before returning EROFS, as otherwise a
stale value can be returned and this causes a diagnostic panic in
namei.

In relookup(), clear *vpp before calling VOP_LOOKUP, as is done in
lookup_once(), as an additional precautionary measure.

(in theory both of these fixes are not required together)

Should fix PR 47040.
2012-10-10 06:55:25 +00:00
dholland 4fc6b20089 Add namei-level support for openat() and friends. The way you do it is
by calling NDAT(&nd, dirvp) after NDINIT().

Right now the implementation is vile and unspeakable to avoid changing
the kernel ABI; this way we can get openat() and friends into 6.1. I
will rectify the mess and bump the kernel once things are working.
2012-10-08 23:43:33 +00:00
dholland eee667d033 Tidy up namei internals to allow openat() and friends without getting
tangled in nfsd's special cases.
2012-10-08 23:41:39 +00:00
pooka e30ea15ccf put all kern socket sysctls in the same place 2012-10-08 19:20:45 +00:00
matt 320d4922ba If the workqueue is using a prio less than PRI_KERNEL, make sure KTHREAD_TS
is used when creating the kthread.
2012-10-07 22:16:21 +00:00
christos f2a172afbf Avoid crash dereferencing a NULL fp in fd_affix() in unp_externalize
caused by the sequence of passing two fd's with two sendmsg()'s,
then doing a read() and a recvmsg(). The read() calls dom_dispose()
which discards both messages in the mbuf, and sets the fp's in the
array to NULL. Linux dequeues only one message per read() so the
second recvmsg() gets the fd from the second message.  This fix
just avoids the NULL pointer de-reference, making the second
recvmsg() fail. It is dubious to pass fd's with stream sockets
and expect mixing read() and recvmsg() to work. Plus processing
one control message per read() changes the current semantics and
should be examined before applied. In addition there is a race between
dom_externalize() and dom_dispose(): what happens in a multi-threaded
network stack when one thread disposes where the other externalizes
the same array?

NB: Pullup to 6.
2012-10-06 22:58:08 +00:00
mlelstv 582d3a41a2 Add sanity check to sysctl_kern_maxvnodes. 2012-10-03 07:22:59 +00:00
mlelstv aac856dfae No longer determine availability of ISO and UDF partitions, we default
to allow access to both. Only use a found ISO header to access the
correct session.
2012-10-03 07:05:51 +00:00
mlelstv 2897603d95 Don't call ureadc() with a spinlock held because ureadc() may fault when
writing to userspace.
2012-10-02 23:10:34 +00:00
christos 8e4cb02016 regen 2012-10-02 01:46:20 +00:00
christos 1ec743232e kernel portion of clock_nanosleep() 2012-10-02 01:44:27 +00:00
mlelstv cfb7ebba73 Provide consistent locking around getc() in ttread(). This is necessary
to prevent crashes in MPSAFE tty drivers like ucom.
2012-09-30 11:49:44 +00:00
rmind ea775f7598 exit_lwps, lwp_wait: fix a race condition by re-trying if p_lock was dropped
in a case of process exit.  Necessary to re-flag all LWPs for exit, as their
state might have changed or new LWPs spawned.

Should fix PR/46168 and PR/46402.
2012-09-27 20:43:15 +00:00
pooka c189ace80c alias rump_sysent to sysent, since the linux compat code wants to
access it (it calls ptrace, so 0 practical impact here, though).
2012-09-20 17:46:21 +00:00
rmind 4dc5d07777 Rename kcpuset_copybits() to kcpuset_export_u32() and thus be more specific
about the interface.
2012-09-16 22:09:33 +00:00
christos fae2443c7b PR/46973: Dr. Wolfgang Stukenbrock: kauth_authorize_action_internal() returns
non-macro value as it should do
2012-09-16 14:35:26 +00:00
joerg ed602fb487 Don't use const foo const as type, one const is enough. 2012-09-13 21:44:49 +00:00
msaitoh a3568bcd83 Fix a bug that kmem_alloc() is called from the interrupt context. 2012-09-08 02:58:13 +00:00
tls 68ad6e1d4b Fix kern/46911: note that we rekeyed the cprng so we don't keep doing so. 2012-09-07 02:42:13 +00:00
tls a003f4459f Don't wait until the pool *fills* to rekey anything that was keyed with
insufficient entropy at boot: key it as soon as it makes any request after
we hit the minimum entropy threshold.

This too should help avoid predictable output at boot time.
2012-09-05 18:57:33 +00:00
tls c39440e8e4 Try to help embedded systems a _little_ bit: stir in the system boot time
as early as possible.  On systems with no cycle counter (or very very
predictable cycle counts early in boot) at least this will cause some
difference across boots.
2012-09-05 18:06:52 +00:00
mlelstv a73007ba40 The field ci_curlwp is only defined for MULTIPROCESSOR kernels. 2012-09-02 16:00:00 +00:00
para 192ae8787d rework boundary-tag reserve calculation, make it more precise.
add comment about the rationale behind the sizing of certain vars
used by allocation and bootstrap.
as requested by yamt@
2012-09-01 12:28:58 +00:00
matt a462d18984 Add a __HAVE_CPU_UAREA_IDLELWP hook so that the MD code can allocate
special UAREAs for idle lwp's.
2012-09-01 00:26:37 +00:00
matt 584846fa01 Add a kcpuset_t which just includes ourself.
Add a ci_cpuname for convenience
2012-09-01 00:24:43 +00:00
matt 99af3c507e Add a few more KASSERT/KASSERTMSG 2012-08-30 02:26:38 +00:00
matt 2411d236db A few more KASSERT/KASSERTMSG. 2012-08-30 02:26:02 +00:00
matt b0d1f89948 Change KASSERT to KASSERTMSG 2012-08-30 02:25:35 +00:00
matt a91e719fd0 KASSERT -> KASSERTMSG 2012-08-30 02:24:48 +00:00
matt dfb595098c Give config thread more descriptive names. 2012-08-30 02:24:20 +00:00
matt ed51f1779c Use __cacheline_aligned 2012-08-30 02:23:14 +00:00
dholland 523ec9d4d8 Add missing newline to printf (in the disabled code for $ORIGIN). 2012-08-29 18:56:39 +00:00
drochner 312c339026 Extend the CPU microcode update framework to support Intel x86 CPUs.
Contrary to the AMD implementation, it doesn't use xcalls to distribute
the update to all CPUs but relies on cpuctl(8) to bind itself to the
right CPU -- to keep it simple and avoid possible problems with
hyperthreading.
Also, it doesn't parse the vendor supplied file to pick the right
part for the present CPU model but relies on userland to prepare
files with specific filenames. I'll commit a pkg for this in a minute
(pkgsrc/sysutils/intel-microcode).
The ioctl interface changed; compatibility is provided (should be
limited to COMPAT_NETBSD6 as soon as this is available).
2012-08-29 17:13:21 +00:00
christos b490177227 proper locking for DEBUG 2012-08-28 15:52:19 +00:00
dholland 6da788cccc don't truncate size_t to int 2012-08-24 05:52:17 +00:00
rmind f5661bef75 kcpuset_copybits: fix potential endianness problem. Spotted by matt@. 2012-08-20 22:01:29 +00:00
christos 8547430828 PR/46811: Tetsuya Isaki: Don't handle cpu limits when runtime is negative. 2012-08-18 08:54:06 +00:00
christos 0cd88accfc Better (not racy) fix from Paul Goyette. 2012-08-17 16:21:19 +00:00
christos 33b27c368d Use the queue of the tty not garbage from the stack (Paul Goyette) 2012-08-17 16:14:31 +00:00
christos eb2f61c4df PR/46780: Dennis Ferguson: Take the easy way out and return EBUSY when changing
the queue size if the output queue is not empty. Other solutions seemed too
complex/fragile.
2012-08-12 14:45:44 +00:00
jnemeth 942121f54c Add -A, -a, and -e options to modstat(8) along with kernel
changes required to support these options.  The -e option was
requested by martin@ in private chat in order to make writing tests
easier (i.e. don't bother testing MODULAR functionality if it
doesn't exist).  While there, I added -A and -a since those were
quite similar.

     -A      Tells you whether or not modules can be autoloaded at the moment.
             This option does take into consideration the sysctl
             kern.module.autoload.

     -a      Tells you whether or not modules can be autoloaded at the moment.
             This option does not take into consideration the sysctl
             kern.module.autoload.

     -e      Tells you whether or not you may load a module at the moment.
2012-08-07 01:19:05 +00:00
riastradh bb195042a2 Use separate names for the multitudinous uses of `q' in exit1.
Now I can follow which process is which in this routine.

If I jiggle the whitespace so line numbers don't change, there is no
change in the output of `objdump -d kern_exit.o' for amd64.

ok abp
2012-08-05 14:53:25 +00:00
riastradh 0b891b69f1 Force sys_close never to request a restart by returning ERESTART.
Print a diagnostic message if we ever get ERESTART out of fd_close
and convert it to EINTR instead.

Even if fd_close fails, it has already closed the file descriptor, so
restarting the system call is a mistake, with dangerous consequences
for multithreaded programs.

Should probably turn the message into a kassert eventually, and maybe
add one deeper in fd_close in order to more easily debug it before
all the data structures are destroyed.
2012-08-05 04:26:10 +00:00
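The ERESTART-to-EINTR conversion described in this commit can be sketched in user space roughly as follows. This is an illustration under stated assumptions, not the kernel code: `SKETCH_ERESTART` and `sketch_close_errno()` are hypothetical names, and the numeric value used for the restart code is a placeholder.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel-internal "restart the syscall" code. */
#define SKETCH_ERESTART (-3)

/*
 * Map the error from a close-like helper to what the caller should see.
 * The descriptor is already gone by this point, so a restart must never
 * be requested: print a diagnostic and report an interrupted call instead.
 */
int
sketch_close_errno(int error)
{
	if (error == SKETCH_ERESTART) {
		fprintf(stderr, "close: unexpected restart request, using EINTR\n");
		return EINTR;
	}
	return error;
}
```

The point of the design is that the descriptor number may already have been reused by another thread, so a transparent restart could close an unrelated file.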
matt 3e95365cba back out elf note changes and use EF_ARM_ABIVERS to determine EABI usage. 2012-08-05 01:43:58 +00:00
christos d9ddb5220c - fix typo in comment
- Don't call abort1 with NULL ld, after panic(9).
2012-08-04 12:38:20 +00:00
matt 09c4ddf597 If any argument of a syscall is a pointer, set SYCALL_ARG_PTR as a flag. 2012-08-03 18:08:01 +00:00
pooka 04c76d7616 reregen 2012-08-03 12:42:10 +00:00
pooka 3c09f0a002 Forgot this one from previous commit. It too is needed for syscallargs.h
on rumpclient on !NetBSD.
2012-08-03 12:41:13 +00:00
pooka 0dbefb0fc2 regen 2012-08-03 11:32:55 +00:00
pooka fa3922be63 Make librumpclient compile and work on Linux. This is accomplished by:
1) avoid "NetBSD'isms" in the rumpclient sources
2) do not require the knowledge of unnecessary weird_t's in syscallargs.h
   for rumpclient
2012-08-03 11:31:33 +00:00
matt 2051fb7586 Add an ELF note to describe the ARM ABI in use. If encountered on arm,
set EXEC_ARM_AAPCS bit in exec_package's ep_flags.
XXX kind of gross, but there isn't an MD hook for notes so ...
2012-08-03 07:54:14 +00:00
njoly 05acbbfb30 Remove final ';' from CONDVAR_DECL macro. The caller already adds its
own.
2012-07-30 17:49:24 +00:00
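Why a declaration macro should not carry its own trailing ';' can be sketched as below. The macros `SKETCH_DECL_BAD` and `SKETCH_DECL` are hypothetical and only illustrate the pattern; they are not the kernel's CONDVAR_DECL.

```c
/*
 * Hypothetical declaration macros, for illustration only; these are not
 * the kernel's CONDVAR_DECL.
 */
#define SKETCH_DECL_BAD(name)	int name = 0;	/* trailing ';' baked in */
#define SKETCH_DECL(name)	int name = 0	/* caller supplies the ';' */

/*
 * With the trailing ';' in the macro, the natural spelling
 *     SKETCH_DECL_BAD(a);
 * expands to "int a = 0;;", leaving a stray empty declaration at file
 * scope that pedantic compilers warn about.  So the bad macro has to be
 * used without a semicolon, unlike every other declaration:
 */
SKETCH_DECL_BAD(sketch_a)

/* Without it, the use site reads like ordinary C. */
SKETCH_DECL(sketch_b);

int
sketch_sum(void)
{
	return sketch_a + sketch_b;
}
```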
christos 7828bed40d remove infinite loop on error, extra parens on return. 2012-07-30 10:45:03 +00:00
christos c10ba96d02 simplify unp_externalize(), some from gimpy, some from me. 2012-07-30 10:42:24 +00:00
mlelstv 8ce4433821 Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device, booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.
2012-07-29 18:05:39 +00:00
matt b1afbb311c Fix -fno-common found by building i386/conf/ALL 2012-07-28 00:43:22 +00:00
christos 72e4156b86 revert racy vfork() parent-blocking-before-child-execs-or-exits code.
ok rmind
2012-07-27 20:52:49 +00:00
matt f96ef7b3c5 Remove safepri and use IPL_SAFEPRI instead. This may be defined in a MD
header file (if not, a value of 0 is assumed).
2012-07-27 05:36:09 +00:00
rmind 6d7c79596a fork1: fix use-after-free problems. Addresses PR/46128 from Andrew Doran.
Note: PL_PPWAIT should be fully replaced and modification of l_pflag by
other LWP is undesirable, but this is enough for netbsd-6.
2012-07-22 22:40:18 +00:00
rmind d65753d972 Move some of the tests for MAKEENTRY into cache_enter(9). Make some
variables in vfs_cache.c static, __read_mostly, etc.

No objection on tech-kern@.
2012-07-22 00:53:18 +00:00
para 7e85a4e1d1 split allocation lookup table to decrease overall memory used
making allocator more flexible for allocations larger than 4kb
move the encoded "size" under DEBUG back to the beginning of allocated chunk

no objections on tech-kern@
2012-07-21 11:45:04 +00:00
pooka 4a66f2d3c1 pretty pretty print 2012-07-20 18:19:09 +00:00
pooka dbb439cc26 unrevert part of 1.119 which should not have been reverted (sys/socket.h) 2012-07-20 18:17:26 +00:00
pooka 1aec8d1196 regen 2012-07-20 16:49:45 +00:00
pooka 0a5a22d26a revert 1.119. theoretically there should be no issue, and i couldn't
find one in practice either, except including rump_syscalls.h from
non-NetBSD now works.

ok christos
2012-07-20 16:44:33 +00:00
christos c296146af8 From Roger Pau Monne: kill(2) called for a zombie process should return 0,
according to:
    http://pubs.opengroup.org/onlinepubs/9699919799/functions/kill.html
2012-07-18 20:30:07 +00:00
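The behaviour this commit refers to can be observed from user space with a sketch like the one below. It is a hedged illustration, not the kernel change: `sketch_kill_zombie()` is a hypothetical helper, and it assumes a POSIX system that provides waitid() with WNOWAIT. On a system that follows the cited kill() specification, the function is expected to return 0.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits at once, wait until it is a zombie without
 * reaping it, and return what kill(pid, 0) reports for it.
 * Returns -2 if the setup itself fails.
 */
int
sketch_kill_zombie(void)
{
	siginfo_t info;
	pid_t pid = fork();
	int rv;

	if (pid == -1)
		return -2;
	if (pid == 0)
		_exit(0);

	/* WNOWAIT leaves the child as a zombie so that kill() can see it. */
	if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) == -1)
		return -2;

	rv = kill(pid, 0);		/* signal 0: existence check only */
	(void)waitpid(pid, NULL, 0);	/* now reap the zombie */
	return rv;
}
```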
njoly e6d2db5a59 Avoid kmem_alloc KASSERT for 0 byte allocation, when tracing processes
that use empty messages with sendmsg/recvmsg.
2012-07-17 14:22:42 +00:00
christos 1250ff8f6a revert previous; the problem was off by one in the bios device comparison
in x86_autoconf.c
2012-07-13 16:15:48 +00:00
chs 3493b990bb in soreceive(), handle uios larger than 31 bits.
fixes the remaining problem in PR 43240.
2012-07-09 04:35:13 +00:00
tsutsui 71a1f51386 unsigned -> unsigned int 2012-07-07 16:15:20 +00:00
tsutsui 50365d7376 Check if secsize and numsec returned from ioctl's are sane values
and add DIAGNOSTIC messages in getdisksize().

Discussed on source-changes-d@:
http://mail-index.NetBSD.org/source-changes-d/2012/07/02/msg004989.html
and patch is reviewed by christos@ and pgoyette@.
2012-07-07 16:10:23 +00:00
christos a105182ca5 Don't kill the iso partition at 'a' when we have a udf raw partition.
Makes cd0a mountable again. Should be pulled up to 6 (after people
verify that it works in the broken cases)
http://mail-index.netbsd.org/current-users/2012/06/14/msg020415.html
2012-07-03 13:03:47 +00:00
cheusov b6b59f4935 Add new action KAUTH_CRED_CHROOT for kauth(9)'s credential scope.
Reviewed and approved by elad@.
2012-06-27 12:28:28 +00:00
cheusov af4f78f198 KNF fix. space vs. tab 2012-06-27 10:06:55 +00:00
cheusov 06aa70f732 Fix a typo. s/seperate/separate/ 2012-06-27 10:02:02 +00:00
christos e77423410d regen 2012-06-22 18:27:25 +00:00
christos 7bee3146e4 Add {send,recv}mmsg from Linux 2012-06-22 18:26:35 +00:00
yamt f562b34504 comments and assertions.
no functional changes.
2012-06-15 13:51:40 +00:00
martin 18118f60c8 Do not try to find the wedge we booted from if opendisk(booted_device)
failed.
2012-06-14 20:18:16 +00:00
joerg 110cea35a1 Kill conditionals that are always true. Drop a dead assignment. 2012-06-13 23:00:05 +00:00
mlelstv 5741661f64 Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.
2012-06-10 17:05:18 +00:00
christos 847d309682 Remove debugging. 2012-06-09 02:55:32 +00:00
christos 0461089547 Add a new resource to limit the number of lwps per user, RLIMIT_NTHR. There
is a global sysctl kern.maxlwp to control this, which is by default 2048.
The first lwp of each process or kernel threads are not counted against the
limit. To show the current resource usage per user, I added a new sysctl
that dumps the uidinfo structure fields.
2012-06-09 02:31:14 +00:00
rmind e75fa0930a Few fixes for Xen:
- cpu_load_pmap: use atomic kcpuset(9) operations; fixes rare crashes.
- Add kcpuset_copybits(9) and replace xen_kcpuset2bits().  Avoids incorrect
  ncpu problem in early boot.  Also, micro-optimises xen_mcast_invlpg() and
  xen_mcast_tlbflush() routines.

Tested by chs@.
2012-06-06 22:22:41 +00:00
martin ba2b54cf0d Henning Petersen in PR kern/46552: include cosmetics 2012-06-06 11:20:21 +00:00
matt a9e4a2ff57 Make sure va_end is used even when errors are encountered. 2012-06-06 05:10:54 +00:00
jym 57d7988f76 Now that pool_cache_invalidate() is synchronous and can handle per-CPU
caches, merge together pool_drain_start() and pool_drain_end() into

bool pool_drain(struct pool **ppp);

"bool" value indicates whether reclaiming was fully done (true) or not (false)
"ppp" will contain a pointer to the pool that was drained (optional).

See http://mail-index.netbsd.org/tech-kern/2012/06/04/msg013287.html
2012-06-05 22:51:47 +00:00
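The calling convention of the merged function (a bool result plus an optional out-parameter) can be sketched with a user-space mock. Everything here is an assumption made for illustration: `struct sketch_pool`, its fields, and `sketch_pool_drain()` are hypothetical, and unlike the kernel function the mock takes the pool to drain as an explicit argument.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-simplified pool; the kernel's struct pool differs. */
struct sketch_pool {
	size_t idle_items;	/* items that could be released */
	bool busy;		/* pretend the pool cannot be reclaimed now */
};

/*
 * Mirror the shape of "bool pool_drain(struct pool **ppp)":
 * return true if reclaiming was fully done, false otherwise, and report
 * the drained pool through ppp when the caller asked for it.
 */
bool
sketch_pool_drain(struct sketch_pool *target, struct sketch_pool **ppp)
{
	bool done = !target->busy;

	if (done)
		target->idle_items = 0;	/* everything reclaimable was released */
	if (ppp != NULL)
		*ppp = target;		/* optional out-parameter */
	return done;
}
```

A caller that does not care which pool was drained passes NULL for the out-parameter and looks only at the boolean.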
jym ca40366292 As pool reclaiming is unlikely to happen at interrupt or softint
context, re-enable the portion of code that allows invalidation of CPU-bound
pool caches.

Two reasons:
- CPU cached objects being invalidated, the probability of fetching an
obsolete object from the pool_cache(9) is greatly reduced. This speeds up
pool_cache_get() quite a bit as it does not have to keep destroying objects
until it finds an updated one when an invalidation is in progress.

- for situations where we have to ensure that no obsolete object remains
after a state transition (canonical example: pmap mappings between Xen VM
restoration), invalidating all pool_cache(9) is the safest way to go.

As it uses xcall(9) to broadcast the execution of pool_cache_transfer(),
pool_cache_invalidate() cannot be called from interrupt or softint context
(scheduling a xcall(9) can put a LWP to sleep).

pool_cache_xcall() => pool_cache_transfer() to reflect its use.

Invalidation being a costly process (1000s of objects may be destroyed),
all places where pool_cache_invalidate() may be called from
interrupt/softint context will now get caught by the proper KASSERT(), and
fixed. Ping me when you see one.

Tested under i386 and amd64 by running ATF suite within 64MiB HVM
domains (tried triggering pgdaemon a few times).

No objection on tech-kern@.

XXX a similar fix has to be pulled up to NetBSD-6, but with a more
conservative approach.

See http://mail-index.netbsd.org/tech-kern/2012/05/29/msg013245.html
2012-06-05 22:28:11 +00:00
rmind 533522e2c8 Add hash_list_size() and simplify slightly. 2012-06-05 20:51:36 +00:00
martin f8fdd418df Measure kinfo_proc2::p_vm_vsize in pages, as it was always documented.
This value seems to never have been used anywhere.
This makes it consistent with its cousin p_vm_msize (which is in pages as
well and has several uses).
2012-06-05 08:23:05 +00:00
dsl 5d8067f580 Use separate temporaries for the 'int' percentage and the 'long'
water marks.
Previously panicked on sparc64 due to a misaligned copy.
2012-06-03 16:23:44 +00:00
dsl 0334c2fd52 Fix processing of vm.bufmem_lowater and vm.bufmem_hiwater on 64bit systems.
The values are 'u_long' so copying them into an 'int' temporary
(to avoid writing an out of range value into the actual variable)
doesn't work too well at all.
Shows up on amd64 now that the sysctl values are marked as 64bit.
sparc64 must have been badly broken for ages.
2012-06-03 11:37:44 +00:00
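The underlying problem, staging an unsigned long value in an int temporary, can be sketched as follows. This is an illustration only: `sketch_fits_in_int()` and `sketch_set_watermark()` are hypothetical helpers and not the sysctl handler from this commit.

```c
#include <limits.h>
#include <stdbool.h>

/* True if an unsigned long value survives being stored in an int. */
bool
sketch_fits_in_int(unsigned long value)
{
	return value <= (unsigned long)INT_MAX;
}

/*
 * Validate a new water mark using a temporary of the variable's own
 * type, so that no bits are lost on LP64 platforms.
 * Returns true and updates *mark when the value is accepted.
 */
bool
sketch_set_watermark(unsigned long *mark, unsigned long requested,
    unsigned long limit)
{
	unsigned long tmp = requested;	/* same width as the real variable */

	if (tmp > limit)
		return false;		/* out of range: leave *mark untouched */
	*mark = tmp;
	return true;
}
```

The temporary still protects the real variable from out-of-range writes; it just has to be as wide as the variable it stands in for.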
dsl e21a34c25e Add some pre-processor magic to verify that the type of the data item
passed to sysctl_createv() actually matches the declared type for
  the item itself.
In the places where the caller specifies a function and a structure
  address (typically the 'softc') an explicit (void *) cast is now needed.
Fixes bugs in sys/dev/acpi/asus_acpi.c sys/dev/bluetooth/bcsp.c
  sys/kern/vfs_bio.c sys/miscfs/syncfs/sync_subr.c and setting
  AcpiGbl_EnableAmlDebugObject.
(mostly passing the address of a uint64_t when typed as CTLTYPE_INT).
I've test built quite a few kernels, but there may be some unfixed MD
  fallout. Most likely passing &char[] to char *.
Also add CTLFLAG_UNSIGNED for unsigned decimals - not set yet.
2012-06-02 21:36:41 +00:00
christos 632a99a18c put back missing break; 2012-06-02 18:32:27 +00:00
christos cf50f3a20b the gnu tag name is valid for both type 1 (abi) and type 3 (build id) 2012-06-02 16:48:13 +00:00
martin 10212e900c Stopgap fix for PR kern/46463: disallow passing of kqueue descriptors
via SCM_RIGHTS ancillary socket messages.
2012-06-02 16:16:16 +00:00
martin 7849de7539 Remove an unused variable 2012-06-02 15:54:02 +00:00
christos 95a363c914 - Recognize the SuSE ABI note.
- Restructure the code to do the checking in the appropriate note type,
and harmonize all the checks to be positive.
- Print only the tag data being careful not to overrun the allocated buffer.
2012-05-22 02:40:05 +00:00
martin 6c3cc552c2 Calling _lwp_create() with a bogus ucontext could trigger a kernel
assertion failure (and thus a crash in DIAGNOSTIC kernels). Independently
discovered by YAMAMOTO Takashi and Joel Sing.

To avoid this, introduce a cpu_mcontext_validate() function and move all
sanity checks from cpu_setmcontext() there. Also untangle the netbsd32
compat mess slightly and add a cpu_mcontext32_validate() cousin there.

Add an exhaustive atf test case, based partly on code from Joel Sing.

Should finally fix the remaining open part of PR kern/43903.
2012-05-21 14:15:16 +00:00
tls a918f11452 Fix two problems that could cause /dev/random to not wake up readers when entropy became available. 2012-05-19 16:00:41 +00:00
martin 9d342c6506 Make sure we can deliver two file descriptors for pipe2() before we set
up anything special (like close on exec).
Fixes PR kern/46457.
2012-05-16 09:41:11 +00:00
chs 61d4721d48 remove a bogus optimization introduced in the previous change.
fixes hangs in the rump/rumpvfs/t_etfs test.
2012-05-12 18:42:08 +00:00
gson 425e23f1fe Move VFS_EXTATTRCTL to mount_domount(). This makes the
fs/puffs/t_fuzz:mountfuzz7, fs/puffs/t_fuzz:mountfuzz8,
and fs/zfs/t_zpool:create tests pass again.  Patch from
manu, discussed on tech-kern and committed at his request.
2012-05-08 08:44:49 +00:00
christos 5cebef3f28 regen 2012-05-05 19:49:13 +00:00
christos 8692f82623 use sy_call() so that l->l_sysent gets set, so that we can autoload modules
that define new syscalls properly.
2012-05-05 19:44:02 +00:00
christos fae991ec2f Add a new type of syscall "EXTERN" which is meant for modules that live
outside the tree (in pkgsrc). Use it to define afssys (210) which has
been reserved for years, and make it autoload the "openafs" module.
2012-05-05 19:37:37 +00:00
rmind 269014127a G/C POOL_DIAGNOSTIC option. No objection on tech-kern@. 2012-05-05 19:15:10 +00:00
rmind b10bf4690c Revert posix_spawn() clean up for now, there are some bugs. 2012-05-02 23:33:11 +00:00
rmind d9290bb010 do_open: move pathbuf destruction to the callers, thus simplify and fix a
memory leak on error path.
2012-05-02 20:48:29 +00:00
manu 7f8940a8ce Return ENODATA when no attribute is found, like Linux does. After
all we decided to adopt the Linux API, therefore there is rationale
to stick to it.

No standard tells us what to do, and our extended attribute API has not
been used in a release, therefore we do not break anything, and we get
more easily compatible with programs using the Linux extended attribute
API.

Note that FreeBSD and MacOS X return ENOATTR. FreeBSD has its own API
and MacOS X has a Linux-like API. How did the world get so complicated?
2012-05-01 07:48:25 +00:00
rmind f1d428af19 - Replace some malloc(9) uses with kmem(9).
- G/C M_IPMOPTS, M_IPMADDR and M_BWMETER.
2012-04-30 22:51:27 +00:00
rmind 0c217aec3a posix_spawn:
- Remove copy-pasting in error paths, use execve_free_{vmspace,data}().
- Move some code (both in the init and exit paths) out of the locks.
- Slightly simplify do_posix_spawn() callers.
- Add few asserts and comments.
2012-04-30 21:19:58 +00:00
manu 8658637414 Fix the extattr start fix. Looking up the filesystem root vnode again
does not seem to be reliable. Instead save it before mount_domount()
sets it to NULL.
2012-04-30 10:05:12 +00:00
manu 74a73d8b5c Fix mount -o extattr : previous patch fixed a panic but caused operation
to happen on the mount point instead of the mounted filesystem.
2012-04-30 03:51:10 +00:00
chs 67b37d586b mark all wapbl I/O as BPRIO_TIMECRITICAL.
this is the second part of addressing PR 46325.
2012-04-29 22:55:11 +00:00
chs 8306a9eddf change vflushbuf() to take the full FSYNC_* flags.
translate FSYNC_LAZY into PGO_LAZY for VOP_PUTPAGES() so that
genfs_do_io() can set the appropriate io priority for the I/O.
this is the first part of addressing PR 46325.
2012-04-29 22:53:59 +00:00
dsl e05eb71de5 Remove everything to do with 'struct malloc_type' and the malloc link_set.
To make code in 'external' (etc) still compile, MALLOC_DECLARE() still
  has to generate something of type 'struct malloc_type *', with
  normal optimisation gcc generates a compile-time 0.
MALLOC_DEFINE() and friends have no effect.
Fix one or two places where the code would no longer compile.
2012-04-29 20:27:31 +00:00
dsl dbd0815551 Remove the unused 'struct malloc_type' args to kern_malloc/realloc/free
The M_xxx arg is left on the calls to malloc() and free(),
  maybe they could be converted to an enumeration and just saved in
  the malloc header (for deep diag use).
Remove the malloc_type from mbuf extension.
Fixes rump build as well.
Welcome to 6.99.6
2012-04-29 16:36:53 +00:00
rmind 4b760398c3 Remove MALLOC_DEBUG and MALLOCLOG, which is dead code after malloc(9) move
to kmem(9).  Note: kmem(9) has debugging facilities under DEBUG/DIAGNOSTIC.
However, expensive kmguard and debug_freecheck have to be enabled manually.
2012-04-28 23:03:39 +00:00
manu 57f4d08bde Do not use vp after mount_domount() call as it sets it to NULL on success.
This fixes a panic when starting extended attributes.
2012-04-28 17:30:19 +00:00
drochner d4145bf15d minor mostly cosmetic fixes: use designated type for device major
numbers, typo in comment, misuse of minor()
(the latter one is not cosmetic, but would only affect systems
with more than 256 disk wedges)
2012-04-27 18:15:55 +00:00
rmind 4b8ea8ed96 Improve the assert message. 2012-04-21 22:38:25 +00:00
rmind 0c79472223 - Convert x86 MD code, mainly pmap(9) e.g. TLB shootdown code, to use
kcpuset(9) and thus replace hardcoded CPU bitmasks.  This removes the
  limitation of maximum CPUs.

- Support up to 256 CPUs on amd64 architecture by default.

Bug fixes, improvements, completion of Xen part and testing on 64-core
AMD Opteron(tm) Processor 6282 SE (also, as Xen HVM domU with 128 CPUs)
by Manuel Bouyer.
2012-04-20 22:23:24 +00:00
yamt 673662ef94 comment 2012-04-18 13:42:11 +00:00
christos ec252a38db it is not an error if the kernel needs to clear the setuid/
setgid bit on write/chown/chgrp
2012-04-17 19:15:15 +00:00
tls 8e1a1c9f45 Address multiple problems with rnd(4)/cprng(9):
1) Add a per-cpu CPRNG to handle short reads from /dev/urandom so that
   programs like perl don't drain the entropy pool dry by repeatedly
   opening, reading 4 bytes, closing.

2) Really fix the locking around reseeds and destroys.

3) Fix the opportunistic-reseed strategy so it actually works, reseeding
   existing RNGs once each (as they are used, so idle RNGs don't get
   reseeded) until the pool is half empty or newly full again.
2012-04-17 02:50:38 +00:00
martin 1fb5ae697f We don't support KMEM_GUARD nor FREECHECK yet with rump, so disable them
in debug builds of the rump kernel.
2012-04-15 19:07:40 +00:00
martin 0ed1ffcc64 Fix leak in a posix_spawn error path, from Greg Oster. 2012-04-15 15:35:00 +00:00
yamt ea84519110 comment 2012-04-13 15:32:15 +00:00
yamt 6e7d55c554 - do_sched_getparam: release locks earlier.
- add comments
2012-04-13 15:27:13 +00:00
mrg 91a679a45c allow kmem_guard_depth to be set in the config file. 2012-04-13 06:27:02 +00:00
tls 65e7fe9a53 Fix LOCKDEBUG problems pointed out by drochner@
1) Lock ordering in cprng_strong_destroy had us take a spin mutex then
   an adaptive mutex.  Can't do that.  Reordering this requires changing
   cprng_strong_reseed to tryenter the cprng's own mutex and skip the
   reseed on failure, or we could deadlock.

2) Can't free memory with a valid mutex in it.
2012-04-10 15:12:40 +00:00
tls 2b09c6c851 Add a spin mutex to the rndsink structure; it is used to avoid lock
ordering and sleep-holding-locks problems when rekeying, and thus
to avoid a nasty race between cprng destruction and reseeding.
2012-04-10 14:02:27 +00:00
martin 4e00857f25 Fix asynchronous posix_spawn child exit status (and test for it). 2012-04-09 19:42:06 +00:00
martin 94b761b6aa Rework posix_spawn locking and memory management:
- always provide a vmspace for the new proc, initially borrowing from proc0
   (this part fixes PR 46286)
 - increase parallelism between parent and child if arguments allow this,
   avoiding a potential deadlock on exec_lock
 - add a new flag for userland to request old (lockstepped) behaviour for
   better error reporting
 - adapt test cases to the previous two and add a new variant to test the
   diagnostics flag
 - fix a few memory (and lock) leaks
 - provide netbsd32 compat
2012-04-08 11:27:44 +00:00
christos 23fc2b12e7 remove bogus check. 2012-04-07 05:38:49 +00:00
christos db09922d6f make this bitch less when we have wedges (EBUSY for the underlying disks) 2012-04-07 05:38:07 +00:00
hannken ab3e9955f7 Fix vn_lock() to return an invalid (dead, clean) vnode
only if the caller requested it by setting LK_RETRY.

Should fix PR #46221: Kernel panic in NFS server code
2012-04-05 07:26:36 +00:00
para a2b8e32362 don't overallocate once we leave the caches 2012-04-01 17:02:46 +00:00
dholland 98573cd6ad Misplaced parenthesis; fixes PR 44927 2012-03-22 17:46:07 +00:00
martin 834e2aaa79 Fix query of IMMEDIATE bool values (copy & pasto). 2012-03-21 14:51:36 +00:00
matt 50c1fb641e No need to take the address of an array (&array) since an array is already a
pointer.
2012-03-19 06:04:19 +00:00
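As a small illustration of the point in 50c1fb641e: strictly speaking an array decays to a pointer to its first element, and `&array` yields the same address with a different type. The function names below are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Returns nonzero if buf and &buf refer to the same address. */
static int
array_and_address_match(void)
{
	char buf[16];

	/* buf decays to char *; &buf has type char (*)[16]. */
	return (void *)buf == (void *)&buf;
}

/* Size of what &buf points to: the whole array, not one element. */
static size_t
pointee_size_of_array_address(void)
{
	char buf[16];

	return sizeof(*&buf);
}
```

Passing `buf` rather than `&buf` to a function taking `char *` or `void *` gives the same address without the type mismatch.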
matt d35b5e4f3f Fix PR/49150.
Make listen(2) match the opengroup specification for what errno to
return if the socket is connected when a listen(2) is attempted.
2012-03-16 06:47:37 +00:00
elad 0c9d8d15c9 Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

    http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
    http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
    http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.
2012-03-13 18:40:26 +00:00
elad f6ea09d026 Remove TNF license. 2012-03-13 18:36:49 +00:00
dholland 48a4d49b28 Repeated typo/varargs anachronism in comments. 2012-03-12 19:21:07 +00:00
christos ae15dfb84e PR/41673: Stathis Kamperis: tcsetpgrp returns EINVAL, but should return EPERM. 2012-03-12 18:27:08 +00:00
joerg 99c3eea80c P1003_1B_SEMAPHORE is no longer optional. 2012-03-10 21:51:48 +00:00
martin aec05724e5 Remove a KPREEMPT_ENABLE() in an error path I overlooked in the previous
change - pointed out by Manuel Bouyer.
While there, add a KASSERT() to make sure we have preemption enabled in
the success case.
2012-03-10 14:35:05 +00:00
martin f9619b6218 Make sure the child of a posix_spawn operation is not preempted during
the times when it does not have any vm_space.
Should fix PR kern/46153.
2012-03-10 08:46:45 +00:00
joerg f36ba6b5ad sem_open and friends should return EINVAL if the semaphore is not valid. 2012-03-09 21:03:46 +00:00
joerg 4acff4c01b Implement sem_timedwait. 2012-03-08 21:59:24 +00:00
joerg 3bd1fd2afe Add entry for _ksem_timedwait. 2012-03-08 21:55:45 +00:00
para abfefc8c5f make accounting for vm_inuse sane
while here don't statically allocate for more caches than required
2012-03-04 14:28:49 +00:00
matt 0fb436b313 If IPL_SAFEPRI is defined, use it to initialize safepri. 2012-03-03 00:22:24 +00:00
rmind 1f1468fdc5 - Add __cacheline_aligned for nprocs, make fork_tfmrate static.
- Fix indentation, remove whitespaces and redundant brackets.
2012-03-02 21:23:05 +00:00
rmind 251f7169ed {mutex,rw}_vector_enter: use macro versions to disable/enable preemption. 2012-02-25 22:32:44 +00:00
para 05f35f5342 change sched_upreempt_pri default to 0 as discussed on tech-kern@
should improve interactive performance on SMP machines
as user preemption happens immediately in x-cpu wakeup case now
2012-02-23 12:24:05 +00:00
martin 1e37a35ef0 Make time_second and time_uptime volatile, so the compiler knows they
may change during loops. Fixes the macppc build, which previously
died with:
src/sys/arch/macppc/dev/dbdma.c:62:6: error: assuming signed overflow does not occur when assuming that (X + c) < X is always false
2012-02-21 15:41:24 +00:00
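A minimal userland sketch of the idea in 1e37a35ef0: a counter that is updated outside the current flow of control is declared `volatile`, so the compiler reloads it on every access. `fake_time_second` and `deadline_passed()` are hypothetical stand-ins, and writing the test as a difference rather than as `(X + c) < X` is an assumption of this sketch, not something the commit states.

```c
#include <assert.h>
#include <time.h>

/* Stand-in for a seconds counter updated elsewhere (e.g. by a timer). */
static volatile time_t fake_time_second;

/*
 * Elapsed-time check.  Because the counter is volatile, each call
 * reads its current value instead of reusing a cached one.
 */
static int
deadline_passed(time_t start, time_t timeout)
{
	return fake_time_second - start >= timeout;
}
```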
christos 4ab990718f keep track of the original array length so we can pass it to kmem_free, from
enami
2012-02-21 04:13:22 +00:00
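The bookkeeping described in 4ab990718f can be sketched as follows. With an allocator whose free routine takes the size, the length used at allocation time has to be remembered even if fewer entries end up in use. `sized_alloc()`, `sized_free()` and `struct str_array` are hypothetical userland stand-ins, not the kernel's kmem interface.

```c
#include <assert.h>
#include <stdlib.h>

/* Bytes currently allocated through the sized interface. */
static size_t bytes_outstanding;

static void *
sized_alloc(size_t len)
{
	void *p = malloc(len);

	if (p != NULL)
		bytes_outstanding += len;
	return p;
}

/* The caller must pass the same length it gave to sized_alloc(). */
static void
sized_free(void *p, size_t len)
{
	if (p == NULL)
		return;
	free(p);
	bytes_outstanding -= len;
}

struct str_array {
	char **items;
	size_t used;		/* entries actually filled in */
	size_t alloc_len;	/* entries originally allocated */
};
```

Freeing with `used` instead of `alloc_len` would leave the accounting wrong whenever the array was only partly filled.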
christos 222b58ad16 fix fae free'ing, from enami. 2012-02-21 03:44:54 +00:00
christos ca439b8516 Posix spawn fixes:
- split the file actions allocation and freeing into separate functions
- use pnbuf
- don't play games with pointers (partially freeing stuff etc), only check
  fa and sa and free as needed using the same code.
- use copyinstr properly
- KM_SLEEP allocation can't fail
- if path allocation failed midway, we would possibly be freeing
  userland strings.
- use sizeof(*var) instead of sizeof(type)
2012-02-20 18:18:30 +00:00
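The last item in ca439b8516, preferring `sizeof(*var)` to `sizeof(type)`, looks like this in a userland sketch. `struct file_action` and `alloc_actions()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record type, for illustration only. */
struct file_action {
	int fd;
	int op;
};

/* Allocate n zeroed records; the element size follows the pointer. */
static struct file_action *
alloc_actions(size_t n)
{
	struct file_action *fa;

	/* sizeof(*fa) stays correct even if the type of fa changes. */
	fa = calloc(n, sizeof(*fa));
	return fa;
}
```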
martin 6bde504952 More posix_spawn fallout:
Fix a kmem_alloc() call with zero size (PR kern/46038), allow file actions
to be passed, even if empty.
Rearrange p_reflock locking for the child, avoid a double free in an
error case, avoid a memory leak in another error case - all pointed out
by yamt.
During blocking operations early in the child borrow the kernel vmspace
(as suggested by yamt).
2012-02-20 12:19:55 +00:00
rmind 25c2f01c4f itimerfire: fix a regression, check if timer is already queued. 2012-02-20 01:12:42 +00:00
rmind ad12c77015 Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.
2012-02-19 21:05:51 +00:00
matt 931370c9c5 Regen. 2012-02-19 17:50:50 +00:00
matt a9cc3edeb9 Use void * instead of sa_upcall_t for sa_register (since sa_upcall_t is
going away).
2012-02-19 17:50:28 +00:00
matt f6e1352a6c Regen. 2012-02-19 17:23:04 +00:00
matt d1b4215da1 Mark SA syscalls as COMPAT_60 2012-02-19 17:22:16 +00:00
rmind 510eed8079 Make SA calls obsolete (use stubs in kern case, as libc needs them for now). 2012-02-19 17:08:02 +00:00
matt e541fdb425 Add compat_60 2012-02-19 17:00:57 +00:00
mrg d9a1d7a11e add an XXX comment i meant to include with the original change. 2012-02-18 06:29:10 +00:00
yamt 31e27eaca4 BUFQ_PRIOCSCAN:
- to reduce cpu consumption for a long queue, maintain the sorted lists of
  buffers with rbtree instead of TAILQ.  i vaguely remember that the problem
  was pointed out by someone on a public mailing list a while ago.  sorry, i
  can't remember who and where.

- add some #ifdef'ed out experimental code.

- bump kernel version for struct buf change.
2012-02-17 08:45:11 +00:00
martin d178e64fee Fix fallout from the new tests exercising all error paths: do not deactivate
the pmap of a vmspace-less child of a posix spawn operation that never
made it to userland.
2012-02-15 11:59:30 +00:00
martin f4db024f0d Fix SDT_PROBE macro argument overlooked in argument renaming, noted by <chs> 2012-02-12 20:11:03 +00:00
martin c6a7db15e9 Minor tweaks to posix_spawn error handling.
The standard allows "open" file actions for descriptors that are already
open, add support for that.
2012-02-12 13:14:37 +00:00
martin 0b454a86a3 fd_open(): fix confusion between userland and kernel encoding of open flags 2012-02-12 13:12:45 +00:00
martin 8cd221e226 Regen for posix_spawn 2012-02-11 23:18:13 +00:00
martin f8c7c04bbe Add a posix_spawn syscall, as discussed on tech-kern.
Based on the summer of code project by Charles Zhang, heavily reworked
later by me - all bugs are likely mine.
Ok: core, releng.
2012-02-11 23:16:15 +00:00
para 4c23b30cff proper sizing of kmem_arena on different ports
PR port-i386/45946: Kernel locks up in VMEM system
2012-02-10 17:35:47 +00:00
drochner 76ee9eff38 align allocations >=pagesize at a page boundary, to preserve traditional
malloc(9) semantics
fixes dri mappings shared per mmap (at least on i945)
approved by releng
2012-02-06 12:13:44 +00:00
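The policy in 76ee9eff38, page-aligning any allocation of at least one page, can be sketched in userland with POSIX calls. `alloc_page_aligned_if_large()` is a hypothetical helper; the kernel allocator works differently.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Allocate size bytes.  Requests of a page or more are aligned to a
 * page boundary; smaller ones use the ordinary allocator.
 */
static void *
alloc_page_aligned_if_large(size_t size)
{
	long pagesize = sysconf(_SC_PAGESIZE);
	void *p = NULL;

	if (pagesize > 0 && size >= (size_t)pagesize) {
		if (posix_memalign(&p, (size_t)pagesize, size) != 0)
			return NULL;
		return p;
	}
	return malloc(size);
}
```

Memory obtained either way can be released with `free()`.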
rmind 02bf188b03 - Make KMGUARD interrupt-safe.
- kmem_intr_{alloc,free}: remove workaround.

Changes affect KMGUARD-enabled debug kernels only.
2012-02-05 03:40:07 +00:00
para fa6083dc6c make acorn26 compile by fixing up subpage pool allocations
ok: riz@
2012-02-04 22:11:42 +00:00