7044 Commits

Author SHA1 Message Date
yamt
805df27570 rw_vector_exit: remove a redundant condition. 2009-05-16 08:36:32 +00:00
yamt
513f4955a7 put a flag bit into v_usecount to prevent vtryget during getcleanvnode.
this fixes the following deadlock.

	a thread doing getcleanvnode:
	pick a vnode
	acquire v_interlock
	v_usecount++
	call vclean

		now, another thread doing cache_lookup:
		picks the vnode
		vtryget succeeds
		vn_lock succeeds

	now in vclean:
	set VI_XLOCK (too late to be noticed by the competing thread)
	wait on the vnode lock (this might violate locking order)

the use of a flag bit was suggested by Andrew Doran.  PR/41374.
2009-05-16 08:29:53 +00:00
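The flag-bit idea in the commit above can be illustrated with a small user-space sketch. This is not the kernel code: the names `vnode_sketch`, `VC_XLOCK`, `vtryget_sketch` and `vclean_begin_sketch` are hypothetical, C11 atomics stand in for the kernel primitives, and setting the flag together with the reference in a single atomic add is an assumption made for brevity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout: the top bit of the use count means "being cleaned". */
#define VC_XLOCK 0x80000000u
#define VC_MASK  (~VC_XLOCK)

struct vnode_sketch {
	atomic_uint v_usecount;
};

/*
 * Lock-free "try get": refuse when there is no existing reference or
 * when the cleaning flag is set, otherwise take one more reference.
 */
static bool
vtryget_sketch(struct vnode_sketch *vp)
{
	unsigned int use = atomic_load(&vp->v_usecount);

	for (;;) {
		if (use == 0 || (use & VC_XLOCK) != 0)
			return false;
		/* On failure, 'use' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&vp->v_usecount, &use, use + 1))
			return true;
	}
}

/*
 * The cleaning thread takes its reference and sets the flag in one
 * atomic step, so a racing vtryget_sketch() sees either the old count
 * or the flagged count, never a window in between.
 */
static void
vclean_begin_sketch(struct vnode_sketch *vp)
{
	unsigned int old = atomic_fetch_add(&vp->v_usecount, VC_XLOCK + 1u);

	assert((old & VC_XLOCK) == 0);	/* the flag must not already be set */
	(void)old;
}
```

Because the flag lives in the same word as the count, the competing lookup cannot succeed after cleaning has begun, which is the window described in the deadlock scenario.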
pooka
e8f5dfa79e regen: pad -> PAD 2009-05-15 15:52:39 +00:00
pooka
6c68c84345 Use argname PAD to signal that an argument is used only for padding
and not part of the C interface.  Use this information for rump
syscalls to generate syscall interfaces without the extra parameter.
2009-05-15 15:51:27 +00:00
pooka
500fdd36a7 In addition to off_t alignment, check for dev_t and time_t too
(we don't currently have any syscalls passing time_t, though)
2009-05-15 14:52:47 +00:00
yamt
d4da6c3d2e don't forget to skip marker processes. 2009-05-12 11:42:12 +00:00
yamt
bed2400e59 lockdebug fixes for rw_tryupgrade/rw_downgrade. 2009-05-09 03:33:10 +00:00
yamt
9031548af6 exit1: fix a race with do_sys_wait/proc_free. 2009-05-08 13:32:59 +00:00
bouyer
f48b5c49cc Declare sh_flags volatile.
Without it, on ports where splhigh() is inline, the compiler will optimise
away the second SOFTINT_PENDING test in softint_schedule(). A disassembly
of softint_schedule() with and without the volatile sh_flags confirms this
on sparc.
Because of this there is a race that could lead to the softhand_t
being enqueued twice on si_q, leading to a corrupted queue and
some handler being SOFTINT_PENDING but never called.

Should fix PR kern/38637
2009-05-05 20:26:36 +00:00
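A rough sketch of the check/recheck pattern the commit above describes, under the assumption that the function tests the pending bit once cheaply and once more after raising the interrupt level. The names `softhand_sketch`, `SOFTINT_PENDING_SKETCH` and `softint_schedule_sketch` are hypothetical, and the queue is reduced to a counter.

```c
#include <assert.h>

#define SOFTINT_PENDING_SKETCH 0x1000u

struct softhand_sketch {
	/* volatile: every test below must re-read the field from memory */
	volatile unsigned int sh_flags;
	unsigned int enqueued;	/* stands in for insertion on a queue */
};

static void
softint_schedule_sketch(struct softhand_sketch *sh)
{
	/* First, cheap check without any protection. */
	if ((sh->sh_flags & SOFTINT_PENDING_SKETCH) != 0)
		return;

	/* A real kernel would raise the interrupt level here (splhigh()). */

	/*
	 * Second check.  Without volatile, a compiler that sees no
	 * intervening call may reuse the value read above and drop
	 * this test, allowing a double enqueue.
	 */
	if ((sh->sh_flags & SOFTINT_PENDING_SKETCH) == 0) {
		sh->sh_flags |= SOFTINT_PENDING_SKETCH;
		sh->enqueued++;
	}

	/* ...and restore the interrupt level here (splx()). */
}
```

In a single-threaded program both variants behave the same; the qualifier only matters when an interrupt can change the flag between the two reads.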
yamt
183ff8793d sysctl_doeproc: fix a bug in rev.1.135.
don't forget to mark our marker process PK_MARKER.
this fixes crashes in sched_pstats, etc.
2009-05-04 14:52:33 +00:00
yamt
6f0983460b when freeing cn_pnbuf, make it NULL if DIAGNOSTIC. 2009-05-04 06:05:19 +00:00
yamt
706e6928e0 tweak some assertions on so_head to make them more meaningful. 2009-05-04 06:02:40 +00:00
elad
414eb0a314 Move dovfsusermount to secmodel_bsd44, where it really belongs.
The secmodel code now creates the same knob in two places: both under the
secmodel itself, as well as the widely known location.

Mailing list references:

    http://mail-index.netbsd.org/source-changes/2009/05/02/msg220641.html
    http://mail-index.netbsd.org/tech-kern/2009/05/03/msg005015.html
2009-05-03 21:25:44 +00:00
pooka
ec3ee0abf9 Include some debug print routines if DEBUGPRINT is defined. This
way they can be included without having to include DDB.
(arguably all print routines should be behind #ifdef DEBUGPRINT
and options DDB should define that macro, but I'll tackle that later)
2009-05-03 16:52:54 +00:00
elad
b1bd59c577 Fix locking around mountlist usage, as pointed out by ad@ in:
http://mail-index.netbsd.org/source-changes-d/2009/04/22/msg000322.html
http://mail-index.netbsd.org/tech-kern/2009/04/22/msg004897.html

Use vfs_busy() and vfs_unbusy(), and properly iterate the mountlist.
2009-05-02 21:47:12 +00:00
pooka
fb42667d02 Move dovfsusermount from vfs_syscalls.c to param.c: secmodel bsd44
depends on it and we can't isolate it in vfs.
(no, it doesn't really belong in param.c, but I couldn't figure out
a better place for it)
2009-05-02 14:13:28 +00:00
cegger
f2c0b025dc remove useless parenthesis 2009-05-01 08:27:41 +00:00
ad
922436b4c6 PR kern/41311: Mutex error: mutex_vector_enter: locking against myself 2009-04-30 20:41:33 +00:00
dyoung
662437b2c3 Remove extraneous parentheses. Fix spelling/grammar. No functional
change intended.
2009-04-30 20:39:08 +00:00
nonaka
3c6a17f5ff include sys/lwp.h for curlwp. 2009-04-30 05:15:36 +00:00
dyoung
dfec23a174 Extract vfs_unmountall1() from vfs_unmountall() for reuse. 2009-04-29 15:44:55 +00:00
dyoung
3e0a641f96 Extract common code from vfs_rootmountalloc(9) and mount_domount() into
a new struct mount-allocation routine, vfs_mountalloc(9).  Documentation
updates will follow.

Attention: Synchronization Oversight Committee!  In mount_domount(),
I postpone the call mutex_enter(&mp->mnt_updating) until right before
the VFS_MOUNT(9) call because (1) that looks to me like the earliest
possible opportunity for mp to become visible to any other LWP, because
it was just kmem_zalloc(9)'d and (2) it made extracting the common code
much easier.  Tell me if my reasoning is faulty.
2009-04-29 01:03:43 +00:00
dyoung
5c35d469c9 Cosmetic: remove unnecessary parentheses. 2009-04-28 20:56:40 +00:00
dyoung
9777865b42 Extract sockaddr_any_by_family() from sockaddr_any() for looking up a
wildcard ("any") address by protocol family instead of by sockaddr.
2009-04-28 20:54:50 +00:00
skrll
15c5c4397e copyin the modctl_load_t for the non-x86 world. Fixes PR/41294. 2009-04-28 17:57:00 +00:00
yamt
a6f64ec082 do_sys_utimes: fix a bug introduced by rev.1.367.
VA_UTIMES_NULL is in va_vaflags, not va_flags.
2009-04-28 03:01:15 +00:00
elad
bab57db991 Replace a NULL check that can never fire with a KASSERT().
Okay ad@.

(this change was originally part of the following commit:
    http://mail-index.netbsd.org/source-changes/2009/04/25/msg220346.html)
2009-04-25 18:53:43 +00:00
rmind
440e5485e0 - Rearrange pg_delete() and pg_remove() (renamed pg_free), thus
  proc_enterpgrp() with proc_leavepgrp() to free process group and/or
  session without proc_lock held.
- Rename SESSHOLD() and SESSRELE() to proc_sesshold() and
  proc_sessrele().  The latter releases proc_lock now.

Quick OK by <ad>.
2009-04-25 15:06:31 +00:00
ad
9b7896c50a A workaround for a bug with some Opteron revisions where locked operations
sometimes do not serve as memory barriers, allowing memory references to
bleed outside of critical sections.  It's possible that this is the
reason for pkgbuild's longstanding crashiness.

For rwlocks, always enable the explicit membars. They were disabled only
on x86, and since they are not in the fast-path it's not a big deal.
TODO: convert these to an atomic_membar_foo() or similar that does ordering
between regular data references and atomic references.
2009-04-24 17:53:06 +00:00
elad
f68b0219b0 Per discussion on tech-kern@:
- Replace use of label/goto with returns

- Rename, change prototype of, and move functions from vfs_subr.c to
  genfs_vnops.c
2009-04-22 22:57:08 +00:00
yamt
091b54f602 fix an indentation error. no functional change. 2009-04-21 00:02:37 +00:00
elad
d4cc1b437c PR/41251: YAMAMOTO Takashi: veriexec locking seems broken
Part 1: Take the mountlist_lock before traversing the mount list.
2009-04-20 22:09:54 +00:00
elad
386808d4a0 Refactor some duplicated file-system code.
Proposed and received no objections on tech-kern@:

	http://mail-index.netbsd.org/tech-kern/2009/04/18/msg004843.html
2009-04-20 18:06:26 +00:00
ad
4d8f47ae2f cpuctl:
- Add interrupt shielding (direct hardware interrupts away from the
  specified CPUs). Not documented just yet but will be soon.

- Redo /dev/cpu time_t compat so no kernel changes are needed.

x86:

- Make intr_establish, intr_disestablish safe to use when !cold.

- Distribute hardware interrupts among the CPUs, instead of directing
  everything to the boot CPU.

- Add MD code for interrupt shielding. This works in most cases but there is
  a bug where delivery is not accepted by an LAPIC after redistribution. It
  also needs re-balancing to make things fair after interrupts are turned
  back on for a CPU.
2009-04-19 14:11:36 +00:00
ad
d857e7b19e call rw_obj_init() 2009-04-19 14:04:51 +00:00
ad
a8bd3c39aa Add rw_obj_*() functions to mirror the existing mutex functions.
Proposed on tech-kern quite some time ago.
2009-04-19 08:36:04 +00:00
tsutsui
d779b85d3e Remove extra whitespace added by a stupid tool.
XXX: more in src/sys/arch
2009-04-18 14:58:02 +00:00
dyoung
d2a5799226 Do not interleave device detachment with device shutdown. Instead, try
over and over to detach all of the devices.  Stop when we cannot detach
even a single device in a cycle.  Call shutdown hooks on all of the
devices that remain attached.

This is another step toward the detach/unmount cycle that will help us
tear down arbitrary stacks of filesystems, ccd(4), raid(4), and vnd(4).
2009-04-17 20:45:09 +00:00
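The "try over and over until a cycle detaches nothing" loop described in the commit above can be sketched as follows. This is only an illustration under stated assumptions: `dev_sketch`, its `detachable` field, and the rule that a device with attached children cannot detach are hypothetical stand-ins for the autoconf machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct dev_sketch {
	int  parent;		/* index of the parent, or -1 for a root */
	bool attached;
	bool detachable;	/* hypothetical: the driver agrees to detach */
};

static bool
has_attached_child(const struct dev_sketch *devs, size_t n, size_t self)
{
	for (size_t i = 0; i < n; i++)
		if (devs[i].attached && devs[i].parent == (int)self)
			return true;
	return false;
}

/* One cycle over all devices; returns true if anything was detached. */
static bool
detach_pass(struct dev_sketch *devs, size_t n)
{
	bool progress = false;

	for (size_t i = 0; i < n; i++) {
		if (!devs[i].attached || !devs[i].detachable)
			continue;
		if (has_attached_child(devs, n, i))
			continue;	/* children must go first */
		devs[i].attached = false;
		progress = true;
	}
	return progress;
}

/*
 * Repeat cycles until one makes no progress, then report how many
 * devices remain attached (those would get their shutdown hooks).
 */
static size_t
detach_all_sketch(struct dev_sketch *devs, size_t n)
{
	size_t remaining = 0;

	while (detach_pass(devs, n))
		continue;
	for (size_t i = 0; i < n; i++)
		if (devs[i].attached)
			remaining++;
	return remaining;
}
```

The loop terminates because every productive cycle detaches at least one device, and it needs no ordering of the device list: a parent skipped in one cycle is simply retried in the next.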
dyoung
b29e491b07 Make vfs_unmountall() return true if it was able to unmount any
filesystem at all, false otherwise.  This will support tearing down
stacks of filesystems, ccd(4), raid(4), and vnd(4).

Change the misleading variable name 'allerror' to 'any_error'.  Make it
a bool.
2009-04-17 20:22:52 +00:00
ad
acf9701a7e kpreempt: fix another bug, uintptr_t -> bool truncation. 2009-04-16 21:19:23 +00:00
rmind
d062d5d72b - Manage pid_table with kmem(9).
- Remove M_PROC and unused M_SESSION.
2009-04-16 14:56:41 +00:00
rmind
71923f262a Replace malloc with kmem(9). 2009-04-16 14:55:44 +00:00
skrll
cc65188940 0 -> NULL 2009-04-16 07:47:16 +00:00
rmind
523acc7d68 Avoid a few #ifdef KSTACK_CHECK_MAGIC. 2009-04-16 00:17:19 +00:00
elad
2d1c968399 Remove a few KAUTH_GENERIC_ISSUSER in favor of more descriptive
alternatives.

Discussed on tech-kern:

	http://mail-index.netbsd.org/tech-kern/2009/04/11/msg004798.html

Input from ad@, christos@, dyoung@, tsutsui@.

Okay ad@.
2009-04-15 20:44:24 +00:00
yamt
87984ef060 pool_cache_put_paddr: add an assertion. 2009-04-15 11:45:18 +00:00
yamt
f0cdb5ac8d kpreempt: report a failure of cpu_kpreempt_enter. otherwise x86 trap()
loops infinitely.  PR/41202.
2009-04-15 11:44:20 +00:00
christos
86ba58fd64 Fix locking as Andy explained. Also fill in uid and gid like sys_pipe did. 2009-04-11 23:05:26 +00:00
christos
b859fbe7cb Fix PR/37878 and PR/37550: Provide stat(2) for all devices and don't use
fbadop_stat.
2009-04-11 15:47:33 +00:00
christos
bb55634e9d rename ctime to btime for consistency. 2009-04-11 15:46:18 +00:00
christos
bb2d65e097 - maintain timespec internally.
- set birthtime too.
2009-04-11 14:42:28 +00:00
yamt
bdeb01233f 0 -> NULL 2009-04-09 00:57:15 +00:00
yamt
e29a551d79 remove an unnecessary cast. 2009-04-09 00:44:32 +00:00
yamt
a227a1194c sonewconn: add an assertion. 2009-04-09 00:43:38 +00:00
yamt
2c68552273 0 -> NULL where appropriate 2009-04-09 00:37:32 +00:00
ad
48320d4cd3 soo_ioctl:
- cosmetic change after merge of socket locking patch.
- add a comment.
2009-04-08 21:02:09 +00:00
ad
9635fa6829 Patch out soo_drain until I fix it to work correctly. 2009-04-08 20:58:40 +00:00
dyoung
0873e7f203 Cosmetic: join lines. 2009-04-07 18:16:28 +00:00
tsutsui
3d4412216c Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.
2009-04-07 10:49:54 +00:00
dyoung
ac5f968c78 Fix spelling. 2009-04-06 21:22:47 +00:00
bouyer
f8059f7e67 m_split0(): If the newly allocated mbuf holds only the header,
don't forget to set m_len to 0. Otherwise whatever will compute the size
of this chain (including s_split() itself if called again on this chain)
will get it wrong, leading to various issues.

Bug exposed by the NFS server code with linux clients using TCP mounts.
2009-04-05 16:31:21 +00:00
lukem
2b2f4703f2 fix sign-compare issues 2009-04-05 11:48:02 +00:00
ad
ddf65d893c Update the big comment block. 2009-04-04 22:34:03 +00:00
joerg
3cb902383f Allow querying for root devices in the tree by specifying an empty
device name. Ensure that l_devname is NUL-terminated and fail otherwise.
OK cube@
2009-04-04 21:49:05 +00:00
ad
c6367674d6 Add fileops::fo_drain(), to be called from fd_close() when there is more
than one active reference to a file descriptor. It should dislodge threads
sleeping while holding a reference to the descriptor. Implemented only for
sockets but should be extended to pipes, fifos, etc.

Fixes the case of a multithreaded process doing something like the
following, which would have hung until the process got a signal.

thr0	accept(fd, ...)
thr1	close(fd)
2009-04-04 10:12:51 +00:00
ad
4307e10df2 Add disk_isbusy(), iostat_isbusy(). 2009-04-04 07:30:09 +00:00
ad
117500b9b2 workqueue_finiqueue: our stack could be swapped out while enqueued to
a worker thread.
2009-04-03 19:34:19 +00:00
dyoung
e5320b0a17 Take out a noisy debug statement that slipped in with device-detachment
at shutdown.
2009-04-02 22:19:48 +00:00
ad
bbb900dedf banner: fix a minor bug. 2009-04-02 19:43:11 +00:00
drochner
eb4f9278bc In humanize_number(), avoid an integer overflow if the buffer
provided is "too large" (log10(2^64) = 19).
(It can still overflow if the input value is close to 2^64 but I don't
consider this a problem.)
fixes nonsense displayed as "total memory" on boot
2009-04-02 17:25:24 +00:00
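The overflow mentioned in the commit above arises when a power of ten is derived from the buffer length. A minimal sketch of the clamp, assuming the limit is computed by repeated multiplication; `pow10_limit` and `MAX_POW10_DIGITS` are hypothetical names, not taken from humanize_number() itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 10^19 < 2^64 < 10^20, so 19 is the largest safe exponent. */
#define MAX_POW10_DIGITS 19

/*
 * Return 10^digits, clamping the exponent so that an overly large
 * buffer length cannot overflow the 64-bit result.
 */
static uint64_t
pow10_limit(size_t digits)
{
	uint64_t max = 1;

	if (digits > MAX_POW10_DIGITS)
		digits = MAX_POW10_DIGITS;
	while (digits-- > 0)
		max *= 10;
	return max;
}
```

As the commit message notes, clamping the exponent does not help when the value being formatted is itself close to 2^64; the sketch only covers the buffer-length case.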
dyoung
0d1ba3e899 During shutdown, detach devices in an orderly fashion.
Call the detach routine for every device in the device tree, starting
with the leaves and moving toward the root, expecting that each
(pseudo-)device driver will use the opportunity to gracefully commit
outstanding transactions to the underlying (pseudo-)device and to
relinquish control of the hardware to the system BIOS.

Detaching devices is not suitable for every shutdown: in an emergency,
or if the system state is inconsistent, we should resort to a fast,
simple shutdown that uses only the pmf(9) shutdown hooks and the
(deprecated) shutdownhooks.  For now, if the flag RB_NOSYNC is set in
boothowto, opt for the fast, simple shutdown.

Add a device flag, DVF_DETACH_SHUTDOWN, that indicates by its presence
that it is safe to detach a device during shutdown.  Introduce macros
CFATTACH_DECL3() and CFATTACH_DECL3_NEW() for creating autoconf
attachments with default device flags.  Add DVF_DETACH_SHUTDOWN
to configuration attachments for atabus(4), atw(4) at cardbus(4),
cardbus(4), cardslot(4), com(4) at isa(4), elanpar(4), elanpex(4),
elansc(4), gpio(4), npx(4) at isa(4), nsphyter(4), pci(4), pcib(4),
pcmcia(4), ppb(4), sip(4), wd(4), and wdc(4) at isa(4).

Add a device-detachment "reason" flag, DETACH_SHUTDOWN, that tells the
autoconf code and a device driver that the reason for detachment is
system shutdown.

Add a sysctl, kern.detachall, that tells the system to try to detach
every device at shutdown, regardless of any device's DVF_DETACH_SHUTDOWN
flag.  The default for kern.detachall is 0.  SET IT TO 1, PLEASE, TO
HELP TEST AND DEBUG DEVICE DETACHMENT AT SHUTDOWN.

This is a work in progress.  In future work, I aim to treat
pseudo-devices more thoroughly, and to gracefully tear down a stack of
(pseudo-)disk drivers and filesystems, including cgd(4), vnd(4), and
raid(4) instances at shutdown.

Also commit some changes that are not easily untangled from the rest:

(1) begin to simplify device_t locking: rename struct pmf_private to
device_lock, and incorporate device_lock into struct device.

(2) #include <sys/device.h> in sys/pmf.h in order to get some
definitions that it needs.  Stop unnecessarily #including <sys/device.h>
in sys/arch/x86/include/pic.h to keep the amd64, xen, and i386 releases
building.
2009-04-02 00:09:32 +00:00
christos
b5c4aec40f fix erroneously deleted assignment. 2009-03-30 22:22:44 +00:00
yamt
197e2d1b30 ARRAY_PRINT: 0 is a valid index. 2009-03-30 16:38:05 +00:00
christos
2b1b4bc6ef Move the internal poll/select related API's to use timespec instead
of timeval (rides the uvm bump).
2009-03-29 19:21:19 +00:00
pooka
ad23953a5a protect allevents list with a mutex 2009-03-29 18:21:06 +00:00
christos
6e1db6cea1 - use itimespecfix to detect invalid timespecs
- use tstohz instead of mstohz to prevent overflow.
2009-03-29 17:54:12 +00:00
christos
94f59b4d43 PR/41094: Matteo Beccati: sigtimedwait returns EAGAIN instead of EINVAL if
timeout is invalid
2009-03-29 16:23:23 +00:00
ad
5d4e966fd0 Remove debug code from previous. 2009-03-29 15:23:54 +00:00
ad
f07d0eaa2d Add a central banner() function to print the copyright and version info. 2009-03-29 10:58:28 +00:00
ad
f51a17bccf kernel memory guard for DEBUG kernels, proposed on tech-kern.
See kmem_alloc(9) for details.
2009-03-29 10:51:53 +00:00
ad
ead83a47c8 _lwp_setprivate: provide the value to MD code if a hook is present.
This will be used to support TLS. The MD method must match the ELF TLS spec
for that CPU architecture (if there is a spec).

At this time it is only implemented for i386, where it means setting the
per-thread base address for %gs. Please implement this for your platform!
2009-03-29 09:24:52 +00:00
pooka
2941a7af6e Include some headers to make rump_syscalls.h self-contained.
(not strictly correct wrt portability, but there are bigger fish
to saltgrill in that area)
2009-03-29 07:56:19 +00:00
rmind
be5c9950c6 kpsignal2: do not start process (when it is stopped) for all termination
signals (i.e. SA_KILL), just if SIGKILL (or SIGCONT).  Improve comments.

Make some functions static, remove unused sigrealloc() prototype.

Fixes PR/39814.  Similar patch reviewed by <ad>.
2009-03-29 05:02:46 +00:00
rmind
6b0e9f0301 fownsignal: pre-check for zero pgid, avoids locking of proc_lock. 2009-03-29 04:40:01 +00:00
mrg
fcc023545e - add new RLIMIT_AS (aka RLIMIT_VMEM) resource that limits the total
address space available to processes.  this limit exists in most other
modern unix variants, and like most of them, our defaults are unlimited.
remove the old mmap / rlimit.datasize hack.

- adds the VMCMD_STACK flag to all the stack-creation vmcmd callers.
it is currently unused, but was added a few years ago.

- add a pair of new process size values to kinfo_proc2{}. one is the
total size of the process memory map, and the other is the total size
adjusted for unused stack space (since most processes have a lot of
this...)

- patch sh, and csh to notice RLIMIT_AS.  (in some cases, the alias
RLIMIT_VMEM was already present and used if available.)

- patch ps, top and systat to notice the new k_vm_vsize member of
kinfo_proc2{}.

- update irix, svr4, svr4_32, linux and osf1 emulations to support
this information.  (freebsd could be done, but that it's best left
as part of the full-update of compat/freebsd.)


this addresses PR 7897.  it also gives correct memory usage values,
which have never been entirely correct (since mmap), and have been
very incorrect since jemalloc() was enabled.

tested on i386 and sparc64, build tested on several other platforms.

thanks to many folks for feedback and testing but most especially
chuq and yamt for critical suggestions that led to this patch not
having a special ugliness i wasn't happy with anyway :-)
2009-03-29 01:02:48 +00:00
rmind
f70325ee02 - kpreempt_disabled: constify l.
- Few predictions.
- KNF.
2009-03-28 21:43:16 +00:00
rmind
4aca8329b0 sys_fcntl: use FD_CLOEXEC, instead of magic number '1'. 2009-03-28 21:42:19 +00:00
rmind
9c724ebaca Make inferior() function static, rename to p_inferior(), return bool. 2009-03-28 21:41:05 +00:00
rmind
ea3400a4b7 - proc_free(): no need assign 'p->p_pptr' to 'parent' many times,
re-use it where appropriate (proc_lock is held across usages).
- Undefine DEBUG_EXIT.
2009-03-28 21:38:55 +00:00
christos
0ca4fa8e5b revert previous; ctags has been fixed. 2009-03-28 18:43:20 +00:00
pooka
5e0d2571db mark a bunch of syscalls as RUMP 2009-03-28 16:33:40 +00:00
drochner
083fa0419a In sigput(), save the siginfo no matter whether SA_SIGINFO is set or not.
There are also sigtimedwait(2) et al. to catch signals without invoking
a signal handler. Fixes PR kern/41076 by Matteo Beccati (the first
test case, where the signal is sent before sigwaitinfo(2) gets called).
2009-03-27 10:58:38 +00:00
dyoung
6506e7a17d ctags(1) gets confused by 'typedef struct X { } X_t', so break 'typedef
struct pmf_private { ... } pmf_private_t' into a struct definition and a
typedef definition.
2009-03-25 21:48:36 +00:00
dyoung
5a65a2f318 DVF_ACTIVE is unconditionally set when we attach a device, so
unconditionally clear it after we give a device's deactivate() routine a
chance.
2009-03-25 21:43:42 +00:00
dyoung
c42425328d When we attach a pseudo-device, set its cfdata_t's cf_fstate to
FSTATE_FOUND, as we do in config_attach_loc(), in order to avoid a
DIAGNOSTIC panic in config_detach() if we detach the device.
2009-03-25 21:28:50 +00:00
christos
c29e9578af use kauth instead of uid != 0 2009-03-24 21:00:05 +00:00
ad
7c4a91a3e5 uid_init: maxproc -> maxcpus 2009-03-22 00:49:13 +00:00
ad
3c11640e0d Fix 'boot -z' bogons. 2009-03-21 15:01:56 +00:00
ad
d16d704d62 PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash
Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
   is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.
2009-03-21 14:41:29 +00:00
ad
7364cd36a3 Allocate sleep queue locks with mutex_obj_alloc. Reduces memory usage
on !MP kernels, and reduces false sharing on MP ones.
2009-03-21 13:11:14 +00:00
ad
912b4160fd Make 'show event', 'dmesg' work with crash(8).
XXX dmesg fails exactly the same way as /sbin/dmesg.
2009-03-21 13:06:39 +00:00
pooka
226a234960 make mount() a rump call 2009-03-19 09:08:35 +00:00
pooka
ddf9eb29c5 kqueue and kevent for rump 2009-03-18 17:51:17 +00:00
pooka
0ef29cbdba Rename rump argument marshalling structure variable to "callarg" to
avoid collision with system calls which use "arg".
2009-03-18 17:27:04 +00:00
cegger
e2cb85904d bcopy -> memcpy 2009-03-18 17:06:41 +00:00
cegger
c363a9cb62 bzero -> memset 2009-03-18 16:00:08 +00:00
cegger
df7f595ecd Ansify function definitions w/o arguments. Generated with sed. 2009-03-18 10:22:21 +00:00
cegger
b8817e4aed ansify function definitions 2009-03-15 17:14:40 +00:00
dsl
454af1c0e8 Change about 4500 of the K&R function definitions to ANSI ones.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
2009-03-14 15:35:58 +00:00
dsl
02cdf4d2c8 Remove all the __P() from sys (excluding sys/dist)
Diff checked with grep and MK1 eyeball.
i386 and amd64 GENERIC and sys still build.
2009-03-14 14:45:51 +00:00
ad
0fa70e9b6f 'boot -z' bogons 2009-03-14 11:08:28 +00:00
yamt
6fb9967219 do_sys_unlink: remove an unused credential. 2009-03-13 11:05:26 +00:00
mrg
9ba87b8cc3 completely rework the way that orphaned sockets that are being fdpassed
via SCM_RIGHTS messages are dealt with:

1. unp_gc: make this a kthread.

2. unp_detach: do not call unp_gc directly. instead, wake up unp_gc kthread.

3. unp_scan: do not close files here. instead, put them on a global list
   for unp_gc to close, along with a per-file "deferred close count". if
   file is already enqueued for close, just increment deferred close count.
   this eliminates the recursive calls.

3. unp_gc: scan files on global deferred close list. close each file N
   times, as specified by deferred close count in file. continue processing
   list until it becomes empty (closing may cause additional files to be
   queued for close).

4. unp_gc: add additional bit to mark files we are scanning. set during
   initial scan of global file list that currently clears FMARK/FDEFER.
   during later scans, never examine / garbage collect descriptors that
   we have not marked during the earlier scan. do not proceed with this
   initial scan until all deferred closes have been processed. be careful
   with locking to ensure no races are introduced between deferred close
   and file scan.

5. unp_gc: use dummy file_t to mark position in list when scanning. allow
   us to drop filelist_lock. in turn allows us to eliminate kmem_alloc()
   and safely close files, etc.

6. prohibit transfer of descriptors within SCM_RIGHTS messages if
   (num_files_in_transit > maxfiles / unp_rights_ratio)

7. fd_allocfile: ensure recycled files don't get scanned.


this is 97% work done by andrew doran, with a couple of minor bug fixes
and a lot of testing by yours truly.
2009-03-11 06:05:29 +00:00
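The deferred-close bookkeeping from the commit above (queue a file once, count how many closes are owed, drain until the list is empty) might look roughly like this. All names (`file_sketch`, `defer_list`, `defer_close`, `drain_deferred`) are hypothetical, locking is omitted, and an actual close is replaced by a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct file_sketch {
	struct file_sketch *next;	/* link on the deferred-close list */
	bool     queued;		/* already on the list? */
	unsigned defer_count;		/* closes still owed */
	unsigned closes_done;		/* stands in for real close work */
};

struct defer_list {
	struct file_sketch *head;
};

/* Scanner side: never close here, only record that a close is owed. */
static void
defer_close(struct defer_list *dl, struct file_sketch *fp)
{
	fp->defer_count++;
	if (!fp->queued) {
		fp->queued = true;
		fp->next = dl->head;
		dl->head = fp;
	}
}

/*
 * Collector side: close each file as many times as was recorded and
 * keep going until the list is empty.  In the real design a close may
 * queue further files, which this loop would pick up as well.
 */
static unsigned
drain_deferred(struct defer_list *dl)
{
	unsigned total = 0;

	while (dl->head != NULL) {
		struct file_sketch *fp = dl->head;
		unsigned n = fp->defer_count;

		dl->head = fp->next;
		fp->next = NULL;
		fp->queued = false;
		fp->defer_count = 0;
		while (n-- > 0) {
			fp->closes_done++;
			total++;
		}
	}
	return total;
}
```

Moving the closes out of the scan is what removes the recursion: the scanner only appends to a list, and a single loop does all the closing.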
mrg
ce98775552 like KERN_FILE2: *do* update "needed" when there is no count. we want
userland to know what sort of size to provide..

while here, slightly normalise the previous to init_sysctl.c.
2009-03-11 05:55:22 +00:00
mrg
47fb2b7401 always calculate "needed" for KERN_FILE2 calls. this allows a caller
to get an estimate of the needed space, like the intention is.
2009-03-11 01:30:27 +00:00
mlelstv
046c06b035 Make curlwp accesses conditional on whether the data structures
have been initialized. Fixes PR kern/38599.
2009-03-10 10:48:09 +00:00
uebayasi
7c002f258b KNF. ANSI'fy. 2009-03-09 16:19:22 +00:00
ad
69f9e17075 Don't bother with file_t::f_iflags any more, as it's not used.
Noted by mrg@.
2009-03-08 12:52:08 +00:00
christos
4dbda5da97 don't enforce maxproc resource limits for root. 2009-03-07 19:23:02 +00:00
joerg
f5b0fec0e0 Remove SHMMAXPGS from all kernel configs. Dynamically compute the
initial limit as 1/4 of the physical memory. Ensure the limit is at
least 1024 pages, the old default on most platforms.
2009-03-06 20:31:46 +00:00
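The rule stated in the commit above (one quarter of physical memory, but never below 1024 pages) reduces to a one-line computation. A sketch with the hypothetical name `shm_initial_maxpgs`, taking the physical page count as a plain parameter:

```c
#include <assert.h>
#include <stdint.h>

#define SHM_MIN_PAGES 1024	/* the old default on most platforms */

/* Initial shared-memory page limit: physpages / 4, floored at 1024. */
static uint64_t
shm_initial_maxpgs(uint64_t physpages)
{
	uint64_t limit = physpages / 4;

	return limit < SHM_MIN_PAGES ? SHM_MIN_PAGES : limit;
}
```

With 4 KiB pages, for example, a machine with 1048576 physical pages (4 GiB) would start with a limit of 262144 pages, while very small machines keep the previous 1024-page default.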
uebayasi
c8f1efab53 xc_lowpri: don't truncate `where' from uint64_t to u_int. 2009-03-05 13:18:51 +00:00
yamt
c75123c99b main: disable kernel preemption early during boot. namely, between
configure() and configure2().  some kernel threads are not expected
to be run before "cold = 0".  fixes cache_thread() busy-loop.
2009-03-05 06:37:03 +00:00
skrll
5ccc095a3e Fix the posix_fadvise return value... finally.
Tested by martin on sparc64/m68k and me on hppa.
2009-03-04 18:11:24 +00:00
rmind
4f1720c349 lwp_create: fix the locking bugs on affinity inheritance path (mea culpa).
pset_assign: traverse the list of LWPs safely.
sched_setaffinity: free cpuset (unused path) outside the lock.

Reviewed (with feedback) by <ad>.
2009-03-03 21:55:06 +00:00
ad
822f68cc07 If DEBUG is enabled, drop kpreempt_pri to zero. It means that every
wakeup will cause a kernel preemption, simulating massive concurrency.

Proposed on tech-kern@.
2009-03-02 21:17:29 +00:00
rmind
4bd0e7cebc fd_copy: fix off-by-one bug in a race condition path and assert.
Should fix PR/40625.  OK by <ad>.
2009-03-02 19:28:08 +00:00
kenh
dc5d469510 If sys/param.h is not included, the kernel compile fails on some platforms
with SOFTINT_COUNT undefined (I noticed it on some evbarm kernels)
2009-02-26 05:50:54 +00:00
ad
b2dec392e0 Fix some comments. 2009-02-23 20:33:30 +00:00
ad
59fcf21389 PR kern/26878 FFSv2 + softdep = livelock (no free ram)
PR kern/16942 panic with softdep and quotas
PR kern/19565 panic: softdep_write_inodeblock: indirect pointer #1 mismatch
PR kern/26274 softdep panic: allocdirect_merge: ...
PR kern/26374 Long delay before non-root users can write to softdep partitions
PR kern/28621 1.6.x "vp != NULL" panic in ffs_softdep.c:4653 while unmounting a softdep (+quota) filesystem
PR kern/29513 FFS+Softdep panic with unfsck-able file-corruption
PR kern/31544 The ffs softdep code appears to fail to write dirty bits to disk
PR kern/31981 stopping scsi disk can cause panic (softdep)
PR kern/32116 kernel panic in softdep (assertion failure)
PR kern/32532 softdep_trackbufs deadlock
PR kern/37191 softdep: locking against myself
PR kern/40474 Kernel panic after remounting raid root with softdep

Retire softdep, pass 2. As discussed and later formally announced on the
mailing lists.
2009-02-22 20:28:05 +00:00
ad
430f67aa17 PR kern/39564 wapbl performance issues with disk cache flushing
PR kern/40361 WAPBL locking panic in -current
PR kern/40470 WAPBL corrupts ext2fs
PR kern/40562 busy loop in ffs_sync when unmounting a file system
PR kern/40525 panic: ffs_valloc: dup alloc

- A fix for an issue that can lead to "ffs_valloc: dup" due to dirty cg
  buffers being invalidated. Problem discovered and patch by dholland@.

- If the syncer fails to lazily sync a vnode due to lock contention,
  retry 1 second later instead of 30 seconds later.

- Flush inode atime updates every ~10 seconds (this makes most sense with
  logging). Presently they didn't hit the disk for read-only files or
  devices until the file system was unmounted. It would be better to trickle
  the updates out but that would require more extensive changes.

- Fix issues with file system corruption, busy looping and other nasty
  problems when logging and non-logging file systems are intermixed,
  with one being the root file system.

- For logging, do not flush metadata on an inode-at-a-time basis if the sync
  has been requested by ioflush. Previously, we could try hundreds of log
  sync operations a second due to inode update activity, causing the syncer
  to fall behind and metadata updates to be serialized across the entire
  file system. Instead, burst out metadata and log flushes at a minimum
  interval of every 10 seconds on an active file system (happens more often
  if the log becomes full). Note this does not change the operation of
  fsync() etc.

- With the flush issue fixed, re-enable concurrent metadata updates in
  vfs_wapbl.c.
2009-02-22 20:10:25 +00:00
pooka
b50bd8632e Instead of linking rump system call entry points directly to the
backend, perform all calls through a syscall table.  This makes it
possible to make system calls to non-local rump kernels.
(requires a bit of support code.  it's written but quite messy currently)
2009-02-20 17:56:36 +00:00
yamt
777ded00ac cache_lookup_entry: add an assertion. 2009-02-18 13:36:11 +00:00
yamt
c69852d701 vmem_rehash_all: remove a debug printf slipped in with the previous changes. 2009-02-18 13:33:46 +00:00
yamt
f68e4571e5 - fix vmem unittest. rename VMEM_DEBUG so that it won't be abused again.
- reimplement vmem sanity checks with less code duplication.
- reimplement ddb vmem-related commands in a more consistent way.
  remove automatic whatis.
2009-02-18 13:31:59 +00:00
yamt
947efbeab9 cache_purge1: consistently unlock ncp a little earlier. 2009-02-18 13:24:18 +00:00
yamt
4c5a0bb384 redo rev.1.19 correctly. 2009-02-18 13:22:10 +00:00
yamt
a13bb3bef4 whitespace 2009-02-18 13:12:00 +00:00
yamt
feff5384df use %zu for size_t 2009-02-18 13:04:59 +00:00
rmind
3de401ae19 Make sched_getrq() inline (gcc does not optimize it), avoids call. 2009-02-17 22:00:14 +00:00
ad
81525af92d Fix min/max confusion that causes a problem with DEBUG on some
architectures. Independently spotted by yamt@. /brick ad
2009-02-17 21:54:30 +00:00
enami
fb8633d4a9 Simplify the code; we already have a hint to decide which string to copy.
(And at least gcc generates better code.)
2009-02-15 03:52:49 +00:00
enami
60ebbc4e81 The knote objects attached by peer will still be linked in our list
if we are closed before the peer.  So, remove them.  It didn't matter
when pipe objects are directly returned to pool, but nowadays they
are cached.
2009-02-15 00:07:54 +00:00
christos
24587463c9 remove 2038 comment 2009-02-14 20:45:29 +00:00
christos
bd1260c5ff from enami: Only apply rootdir changes if the chroot dir != / 2009-02-14 17:06:35 +00:00
christos
78a45c73e9 PR/40634: Christoph Badura: "chroot / /sbin/mount" shows only / as mounted 2009-02-14 16:55:25 +00:00
pooka
3e656734b8 cosmetic: don't print empty line at end of init_sysent.c 2009-02-14 16:21:23 +00:00
apb
0cc72e51ac Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.
2009-02-13 22:41:00 +00:00
christos
160a37667a Unbreak ssp kernels. The issue here that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.
2009-02-12 18:24:18 +00:00
enami
d5b0f6df4c s/NOFOLLOW/FOLLOW/ in NDINIT so that it matches actual behavior
which is controlled by NO_FOLLOW bit passed in 2nd arg of vn_open().
2009-02-11 00:32:45 +00:00
enami
7efd6dbe73 Make module (auto)loading under chroot environment actually work:
- NOCHROOT flag must be assigned to different bit from TRYEMULROOT
  since the code expected to be executed is in the else clause of
  if (flags & TRYEMULROOT).
- Necessary variables aren't set.
2009-02-11 00:19:11 +00:00