Commit Graph

48 Commits

Author SHA1 Message Date
jdolecek 4d49760268 use the new NOTE_SUBMIT to flag if the locking is necessary
for EVFILT_READ/EVFILT_WRITE knotes

fixes PR kern/23915 by Martin Husemann (pipes), and similar locking problem
in tty code
2004-02-22 17:51:25 +00:00
atatat 13f8d2ce5f Dynamic sysctl.
Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al.  Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.

Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded.  Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.

All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.

PS - I'm sorry, but there's a distinct lack of documentation at the
moment.  I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
2003-12-04 19:38:21 +00:00
yamt 5ee0718f8f plug memory leak on error. 2003-11-13 11:59:46 +00:00
christos cb02efca51 fix uninitialized variable 2003-10-25 09:06:51 +00:00
christos 6edc0e184e - pass signo to fownsignal [ok by jd]
- make urg signal handling use fownsignal
- remove out of band detection in sowakeup
2003-09-22 12:59:55 +00:00
jdolecek 7cea8a1389 cleanup & uniform descriptor owner handling:
* introduce fsetown(), fgetown(), fownsignal() - this sets/retrieves/signals
  the owner of descriptor, according to appropriate sematics
  of TIOCSPGRP/FIOSETOWN/SIOCSPGRP/TIOCGPGRP/FIOGETOWN/SIOCGPGRP ioctl; use
  these routines instead of custom code where appropriate
* make every place handling TIOCSPGRP/TIOCGPGRP handle also FIOSETOWN/FIOGETOWN
  properly, and remove the translation of FIO[SG]OWN to TIOC[SG]PGRP
  in sys_ioctl() & sys_fcntl()
* also remove the socket-specific hack in sys_ioctl()/sys_fcntl() and
  pass the ioctls down to soo_ioctl() as any other ioctl

change discussed on tech-kern@
2003-09-21 19:16:48 +00:00
christos e8ae1a44db ksiginfo_t support. 2003-09-14 23:47:09 +00:00
pk 9f987d0ac0 Workaround to prevent a lockup in pipelock() in the case that signals are
pending while we must wait for the lock.
2003-08-11 10:24:41 +00:00
fvdl d5aece61d6 Back out the lwp/ktrace changes. They contained a lot of colateral damage,
and need to be examined and discussed more.
2003-06-29 22:28:00 +00:00
darrenr 960df3c8d1 Pass lwp pointers throughtout the kernel, as required, so that the lwpid can
be inserted into ktrace records.  The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
2003-06-28 14:20:43 +00:00
dsl 60418b39b7 Change 'data' argument to fo_ioctl and fo_fcntl from 'caddr_t' to 'void *'.
Avoids a lot of casting and removes the need for some line breaks.
Removed a load of (caddr_t) casts from calls to copyin/copyout as well.
(approved by christos - he has a plan to remove caddr_t...)
2003-03-21 21:13:50 +00:00
dsl 741870b7ef Validate that pgid argument to TIOCSPGRP in part of current session.
Treat +ve numbers as process group ids and -ve as pids (see tcsetpgrp() in part of current session.
Treat +ve numbers as process group ids and -ve as pids - see tcsetpgrp(3).
(approved by christos)
2003-03-12 23:00:03 +00:00
pk d2b7e43b18 On pipe reads, check for EOF before FNONBLOCK to avoid spurious EAGAIN errors. 2003-02-14 13:16:44 +00:00
pk 3cba4a9f1b Make the pipe code mostly MP-safe. There are a few unaddressed race
conditions at points where it's necessary to access both the up-stream
and down-stream parts of the bi-directional pipe data structure. These
are marked `XXXSMP' in the code.

Also, since the changes are pretty invasive, there little point in keeping
all the "#ifdef FreeBSD" code around; so all of that has been stripped out.
2003-02-12 21:54:15 +00:00
thorpej b193480908 Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant.  Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.
2003-02-01 06:23:35 +00:00
thorpej e0d8d366df Merge the nathanw_sa branch. 2003-01-18 10:06:22 +00:00
jdolecek 4c7e9f23da pipe_stat(): add S_IRUSR and S_IWUSR to mode; this is what Linux does,
and seems like generally sensible (more sensible than not doing so), so done
in generic code rather than compat glue only

Change proposed in PR kern/18767 by Emmanuel Dreyfus.
2002-12-05 16:30:55 +00:00
christos e22906f6d0 si_ -> sel_ to avoid conflicts with siginfo. 2002-11-26 18:44:34 +00:00
perry 6858187df6 /*CONTCOND*/ while (0)'ed macros 2002-11-02 07:20:42 +00:00
kristerw 85b746f61a ISO C requires a statement after a label. 2002-11-01 21:46:51 +00:00
jdolecek 6a40f5edcb pipe_read(): initialize ocnt before pipelock() call; it might have been
used unitialized when the pipelock() call would fail
bug found by Krister Walfridsson
2002-11-01 21:34:30 +00:00
jdolecek e0cc03a09b merge kqueue branch into -current
kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
2002-10-23 09:10:23 +00:00
thorpej 88e741999d Fix signed/unsigned comparison warnings from GCC 3.3. 2002-08-25 23:16:39 +00:00
atatat 31144d9976 Convert ioctl code to use EPASSTHROUGH instead of -1 or ENOTTY for
indicating an unhandled "command".  ERESTART is -1, which can lead to
confusion.  ERESTART has been moved to -3 and EPASSTHROUGH has been
placed at -4.  No ioctl code should now return -1 anywhere.  The
ioctl() system call is now properly restartable.
2002-03-17 19:40:26 +00:00
jdolecek fcc4c4d402 Merge the update to FreeBSD rev 1.95.
Changes:
* MP locking changes (mostly FreeBSD specific)
  XXXSMP the MP locking macros are noops on NetBSD for now
* kevent fix (FreeBSD rev. 1.87): when the last reader/writer
  disconnects, ensure that anybody who is waiting for the kevent
  on the other end of the pipe gets EV_EOF
* kill __P
2002-03-13 21:50:24 +00:00
thorpej a180cee23b Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map).  Try to deal with this:

* Group all information about the backend allocator for a pool in a
  separate structure.  The pool references this structure, rather than
  the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
  to become available, but will still fail if it cannot callocate KVA
  space for the pages.  If this happens, carefully drain all pools using
  the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
  some pages, and use that information to make draining easier and more
  efficient.
* Get rid of PR_URGENT.  There was only one use of it, and it could be
  dealt with by the caller.

From art@openbsd.org.
2002-03-08 20:48:27 +00:00
thorpej 92eb54d5a5 Don't assign NULL to non-pointer variables. 2002-02-28 04:43:16 +00:00
chs a8b519c880 unmap loaned pages before dropping the loan. some pmaps aren't
expecting pmap_kenter_pa() to be used to replace an existing mapping,
plus it just seems like a bad idea to keep around mappings of pages
that may be freed and reused.
2001-12-18 08:49:40 +00:00
jdolecek d7129f9255 fix typo in #ifdef __FreeBSD__
Pointed out by Chris Jepeway in private e-mail, thanks!
2001-12-11 18:15:09 +00:00
lukem adc783d537 add RCSIDs 2001-11-12 15:25:01 +00:00
chs 6fbca7d0fc use pmap_kenter_pa() instead of pmap_enter(), this is required for
pages loaned to the kernel.  this implies that we also need to
call pmap_kremove() before uvm_km_free().

other general cleanup:  remove argument names from prototypes,
rename some variables, etc.
2001-11-06 07:30:14 +00:00
jdolecek 24ba90929c Avoid using microtime(9) for atime/mtime, we don't need to have it
THAT accurate and microtime(9) is painlessly slow on i386 currently.
This speeds up small transfers much. The gain for large transfers
is less significant, but notable too.
Bottleneck was found by Andreas Persson (Re: kern/14246).

Performance improvement with PIII on 661 Mhz according to hbench (with
PIPE_MINDIRECT=8192):

buffersize     before    after
512            17        49
1024           33        110
2048           52        143
4096           77        163
8192           142       190
64K            577       662
128K           372       392
2001-10-28 20:47:15 +00:00
mycroft cbd7c4d140 When a pipe was grown to BIG_PIPE_SIZE, we could get in a select()/write() loop
because pipe_poll() and pipe_write() did not agree on when it was okay to write
more data.  Fix pipe_write(), since it seems to be the broken one.
2001-10-08 07:50:17 +00:00
jdolecek 18c0643bfb Update the uio resid counts appropriately when any error occurs
(not just EPIPE), so that the higher-level code would note partial
write has happened and DTRT if the write was interrupted due to
e.g. delivery of signal.

This fixes kern/14087 by Frank van der Linden.
Much thanks to Frank for extensive help with debugging this, and review
of the fix.

Note: EPIPE/SIGPIPE delivery behaviour was retained - they're delivered
even if the write was partially successful.
2001-09-29 13:48:11 +00:00
jdolecek 25bef3c837 Take care to transfer whole buffer passed via write(2); write(2) should
not do short writes unless when using non-blocking I/O.
This fixes kern/13744 by Geoff C. Wing.

Note this partially undoes rev. 1.5 change. Upon closer examination,
it's been apparent that hbench-OS expectations were not actually justified.
2001-09-25 19:01:21 +00:00
jdolecek 8573719e3d add new UVM_LOAN_WIRED flag - the memory pages loaned in TOPAGE case
are only wired if this flag is present (i.e. they are not wired by default now)
loaned pages are unloaned via new uvm_unloan(), uvm_unloananon() and
uvm_unloanpage() are no longer exported
adjust uvm_unloanpage() to unwire the pages if UVM_LOAN_WIRED is specified
mark uvm_loanuobj() and uvm_loanzero() static also in function implementation

kern/sys_pipe.c: uvm_unloanpage() --> uvm_unloan()
2001-09-22 05:58:04 +00:00
jdolecek 1d161cb2d4 call pmap_update() after pmap_enter()s
ALWAYS call uvm_unloanpage() in cleanup - it's necessary even
in pipe_loan_free() case, since uvm_km_free() doesn't seem
to implicitly unloan the loaned pages
2001-09-20 19:09:13 +00:00
jdolecek 875b784599 pipe_create(): explicitly zero whole memory returned from pool_get(), instead
of some selective pieces. This fixes problem with NEW_PIPE in kernels
with DEBUG option, reported via e-mail by Chuck Silvers.

sys_pipe(): g/c fdp, provide it at the chunk of FreeBSD code where it's used
2001-07-26 14:14:28 +00:00
thorpej 1071f796f4 bcopy -> memcpy 2001-07-18 06:51:38 +00:00
thorpej f2f13262df bzero -> memset 2001-07-18 06:48:27 +00:00
jdolecek f9f0d49b94 comment police 2001-07-17 18:21:59 +00:00
jdolecek db3510e6f8 fix bogus uio->uio_offset check introduced in rev. 1.5, which effectively
disabled loans for writes (a.k.a "direct write"), oops; use uio->uio_resid
for the check instead

don't bother updating uio->uio_offset in pipe_direct_write(), it's not used
by upper layers anyway
2001-07-17 18:18:52 +00:00
jdolecek 37d12500d5 only allocate buffer kva for the end which needs it 2001-07-17 06:05:28 +00:00
jdolecek 12aa43b8b1 Don't try to be too smart about chunking - if the data size is bigger
than PIPE_CHUNK_SIZE, just transfer first PIPE_CHUNK_SIZE and return short
write, expecting the caller to call us again later (if they need). Previous
behaviour (besides being wrong for O_NONBLOCK reads) hung hbench under some
circumstances and other applications may have similar expectations as hbench.
This might also fix port-vax/13333 by Manuel Bowyer.

Other changes to pipe_direct_write() include:
* return short write (and success) on EOF if any data were already read;
  we return EPIPE on next write(2) call
* simplify error handling, actually handle uvm_loan() failure correctly,
  call pipe_loan_free() on error explicitly and only call uvm_unloan()
  if the address space was _not_ already freed by pipe_loan_free()
  Thanks Chuck Silvers for uvm_unloan() hints :)

Fallthough to common write in pipe_write() if pipe_direct_write()
returns ENOMEM, otherwise always break out immediatelly.
Use uvm_km_valloc_wait() instead uvm_km_valloc() in pipe_loan_alloc().
2001-07-02 20:43:39 +00:00
jdolecek 82ce96aaec Don't include opt_new_pipe.h, it's not needed here 2001-06-21 18:59:51 +00:00
jdolecek ad2b5880f0 Oops, fell into rpipe/wpipe trap:
The end we want to do selwakeup() on is not necessarily same as the one
we send SIGIO to. Make pipeselwakeup() accept two parameters and update
callers accordingly. This change fixes behaviour for code, which does
select(2)s on the write end waiting for reader (watched on gv, the problem
manifestated itself as a too long delay before the document was displayed).

Clearly separate the resource free code for FreeBSD
and NetBSD case in pipeclose(), so that it's a bit clearer what's going on.
Also LK_DRAIN the lock before the memory is returned to pipe_pool.

Add missing wakeup() in pipe_write() for PIPE_WANTCLOSE case.
2001-06-21 18:46:22 +00:00
jdolecek ee882e3a09 Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pagable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
	for now only, the code would be updated to take advantage of it
	eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
  is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.
2001-06-16 12:00:02 +00:00
jdolecek 664cf935c7 Import FreeBSD sys_pipe.c rev 1.82 for reference (this was used as a base
for the NetBSD port).
2001-06-16 09:21:34 +00:00