Commit Graph

133 Commits

Author SHA1 Message Date
fvdl
d5aece61d6 Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.
2003-06-29 22:28:00 +00:00
darrenr
960df3c8d1 Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records.  The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
2003-06-28 14:20:43 +00:00
martin
d505b18964 Make sure to include opt_foo.h if a defflag option FOO is used. 2003-06-23 11:00:59 +00:00
yamt
b2479414be export some of sosend loan routines for nfsd. 2003-05-03 17:53:17 +00:00
thorpej
f775de9f61 * Use a pool_cache constructor to record the physical address of mbufs
in the mbuf header.
* Use the new cached paddr feature of the pool_cache API to record
  the physical address of mbuf clusters.  (We cannot use a ctor for
  clusters, since clusters have no constructed form; they are merely
  buffers).

Bus_dma back-ends may use the cached physical addresses to save having to
extract the physical address from virtual.

* Provide space in m_ext recording the vm_page *'s for an SOSEND_LOAN_CHUNK-
  sized non-cluster external buffer.  Use this in the sosend_loan code to
  save having to extract the physical address from virtual and then look
  up the vm_page *'s.

* Provide an indication that an external buffer is mapped read-only at
  the MMU.  Set this flag for the external buffer in the sosend_loan
  case, since loaned pages are always mapped read-only.  Bus_dma back-ends
  may use this information to save cache flushing, since a cache flush of
  a read-only mapping is redundant on some architectures (the cache would
  have already been flushed when making the mapping read-only).

Part 2 in a series of simple patches contributed by Wasabi Systems
to improve network performance.
2003-04-09 18:38:01 +00:00
matt
65e5548a17 Add MBUFTRACE kernel option.
Do a little mbuf rework while here.  Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *).  These are not performance critical and making them
call m_get saves considerable space.  Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
2003-02-26 06:31:08 +00:00
thorpej
b193480908 Add extensible malloc types, adapted from FreeBSD. This turns
malloc types into a structure, a pointer to which is passed around,
instead of an int constant.  Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.
2003-02-01 06:23:35 +00:00
thorpej
515d52e9e7 Change ext_size to a size_t, and update the signature of ext_free. 2003-01-31 05:00:24 +00:00
itojun
ae1b88aa21 "tv->tv_sec * hz" could overflow a long. millert@openbsd 2002-11-27 04:07:42 +00:00
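The overflow named in this entry can be illustrated with a minimal sketch. The helper name `secs_to_ticks` and its return convention are assumptions made for illustration, not the code that was committed; the point is only that the product is checked against `LONG_MAX / hz` before it is computed.

```c
#include <limits.h>

/*
 * Hypothetical helper (not the kernel's code): convert whole seconds to
 * clock ticks. Returns -1 instead of overflowing a long.
 */
long
secs_to_ticks(long sec, long hz)
{
	if (sec < 0 || hz <= 0)
		return -1;		/* reject nonsensical input */
	if (sec > LONG_MAX / hz)
		return -1;		/* sec * hz would not fit in a long */
	return sec * hz;
}
```

A caller would treat the -1 result as an out-of-range timeout and report an error rather than sleep for a wrapped-around tick count.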
itojun
dfd721e53e small SO_RCVTIMEO values are mistakenly taken to be zero. FreeBSD PR kern/32827. 2002-11-27 03:36:04 +00:00
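A sketch of the rounding problem, under the assumption that the timeout is converted to ticks by integer division: a timeout shorter than one tick divides down to zero, and zero conventionally means "no timeout". The function `timeout_to_ticks` is hypothetical and is not the committed fix.

```c
/*
 * Hypothetical helper: convert a (sec, usec) timeout to clock ticks.
 * Assumes 0 < hz <= 1000000, sec and usec are non-negative, and the
 * product sec * hz fits in a long.
 */
long
timeout_to_ticks(long sec, long usec, long hz)
{
	long usec_per_tick = 1000000 / hz;
	long ticks = sec * hz + usec / usec_per_tick;

	/* A small non-zero timeout must not collapse to "wait forever". */
	if (ticks == 0 && (sec != 0 || usec != 0))
		ticks = 1;
	return ticks;
}
```

With `hz` set to 100, a 500 microsecond timeout becomes one tick instead of zero.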
christos
e22906f6d0 si_ -> sel_ to avoid conflicts with siginfo. 2002-11-26 18:44:34 +00:00
jdolecek
e0cc03a09b merge kqueue branch into -current
kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
2002-10-23 09:10:23 +00:00
thorpej
b2cc5a4c03 Make use of page loaning for large socket writes the default. The
SOSEND_NO_LOAN option can be used to go back to the old behavior.
2002-08-21 05:13:36 +00:00
thorpej
2807c6789c Rename SB_UPDATE_TAIL() to SB_EMPTY_FIXUP(), per suggestion from
Jonathan Stone.
2002-07-03 21:39:40 +00:00
thorpej
0585ce1489 Make insertion of data into socket buffers O(C):
* Keep pointers to the first and last mbufs of the last record in the
  socket buffer.
* Use the sb_lastrecord pointer in the sbappend*() family of functions
  to avoid traversing the packet chain to find the last record.
* Add a new sbappend_stream() function for stream protocols which
  guarantee that there will never be more than one record in the
  socket buffer.  This function uses the sb_mbtail pointer to perform
  the data insertion.  Make TCP use sbappend_stream().

On a profiling run, this makes sbappend of a TCP transmission using
a 1M socket buffer go from 50% of the time to .02% of the time.

Thanks to Bill Sommerfeld and YAMAMOTO Takashi for their debugging
assistance!
2002-07-03 19:06:47 +00:00
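The constant-time insertion described in this entry rests on keeping a pointer to the tail. The sketch below shows that general technique on a plain singly linked list; `struct node`, `struct queue` and `queue_append` are illustrative names and do not reproduce the socket buffer structures or the `sbappend*()` functions.

```c
#include <stddef.h>

/* Illustrative stand-ins for an mbuf chain and its owning buffer. */
struct node {
	struct node *next;
	int len;
};

struct queue {
	struct node *head;	/* first element, NULL when empty */
	struct node *tail;	/* last element, NULL when empty */
};

/* Append in constant time: no walk from head to find the end. */
void
queue_append(struct queue *q, struct node *n)
{
	n->next = NULL;
	if (q->tail == NULL)
		q->head = n;		/* empty queue: n is also the head */
	else
		q->tail->next = n;	/* link after the current tail */
	q->tail = n;
}
```

The cost of the approach is that every operation that removes elements must keep `tail` consistent, which is the role the entry two rows up assigns to the renamed `SB_EMPTY_FIXUP()` macro.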
matt
91650be524 Fix 2 bugs with MSG_WAITALL. The first is to not block forever if one is
trying to MSG_PEEK for more than the socket can hold.  The second is that
before sleeping waiting for more data, upcall the protocol telling it you
have just received data so it can kick itself to re-fill the just drained
socket buffer.
2002-06-11 00:21:33 +00:00
he
a8c83879a2 In soreceive(), if any part of a received record has been freed,
and an error occurs, make sure the socket doesn't retain a partial
copy by dropping the rest of the record.

This would otherwise trigger a panic("receive 1a") under DIAGNOSTIC.

Fixes PR#16990, suggested fix adapted.

Reviewed by Matt Thomas.
2002-06-10 20:43:16 +00:00
enami
b42b2c8323 In soreceive(), don't call sopendfree() if MSG_DONTWAIT is set
since it may sleep.  nfsrv_rcv() tries to do its jobs in softintr
handler as far as possible.
2002-05-07 08:06:35 +00:00
thorpej
654768f185 Let the sosend_loan() path be selected at run-time; patch the variable
use_sosend_loan to enable/disable it.  The SOSEND_LOAN kernel option
now causes it to default to 1.
2002-05-03 00:35:14 +00:00
thorpej
7a49fee765 Add some experimental page-loaning for writes on sockets. It is disabled
by default, and can be enabled by adding the SOSEND_LOAN option to your
kernel config.  The SOSEND_COUNTERS option can be used to provide some
instrumentation.

Use of this option, combined with an application that does large enough
writes, gets us zero-copy on the TCP and UDP transmit path.
2002-05-02 17:55:48 +00:00
matt
2bf9358fc0 Don't use the tqh_ field names, instead use the corresponding TAILQ_* macros. 2002-04-06 08:04:17 +00:00
thorpej
a180cee23b Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map).  Try to deal with this:

* Group all information about the backend allocator for a pool in a
  separate structure.  The pool references this structure, rather than
  the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
  to become available, but will still fail if it cannot allocate KVA
  space for the pages.  If this happens, carefully drain all pools using
  the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
  some pages, and use that information to make draining easier and more
  efficient.
* Get rid of PR_URGENT.  There was only one use of it, and it could be
  dealt with by the caller.

From art@openbsd.org.
2002-03-08 20:48:27 +00:00
mrg
d6328a8778 fix previous: actually remove the COMPAT_SUNOS code, not just #if 0 it. 2002-01-03 01:16:02 +00:00
mrg
af640de164 move the COMPAT_SUNOS SO_BROADCAST hack out of uipc_socket.c into the
compat/sunos code.  besides being cleaner this allows the sunos LKM
to properly work without any special kernel hacks.
2002-01-03 00:59:00 +00:00
lukem
adc783d537 add RCSIDs 2001-11-12 15:25:01 +00:00
jdolecek
560e3c342e Use lmin() instead of min(), and long for mlen & clen, to avoid integer
overflow on LP64 architectures. This fixes kern/10070 by Juergen Weiss.

Fix tested on NetBSD/alpha by Bernd Ernesti, on NetBSD/sparc64
by David Brownlee and Eduardo Horvath.
2001-09-29 14:16:19 +00:00
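The truncation behind this fix can be sketched with fixed-width types. Here `uint32_t` stands in for `unsigned int` and `int64_t` for `long` on an LP64 machine; `min32` and `min64` are hypothetical stand-ins, not the kernel's `min()` and `lmin()`.

```c
#include <stdint.h>

/* Stand-in for a min() that takes 32-bit unsigned arguments. */
uint32_t
min32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Stand-in for lmin(): compares at full 64-bit width. */
int64_t
min64(int64_t a, int64_t b)
{
	return a < b ? a : b;
}

/* Clamp a residual byte count to a buffer size, the wrong way. */
int64_t
clamp_truncating(int64_t resid, int64_t bufsize)
{
	/* The casts model the implicit narrowing at the call site. */
	return (int64_t)min32((uint32_t)resid, (uint32_t)bufsize);
}

/* Clamp a residual byte count to a buffer size without narrowing. */
int64_t
clamp_wide(int64_t resid, int64_t bufsize)
{
	return min64(resid, bufsize);
}
```

For a residual of 2^32 + 1 bytes and a 2048-byte buffer, the narrowing version yields 1 because the upper 32 bits are discarded, while the wide version yields 2048.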
jdolecek
a7357fecf4 soreceive(): do not ignore uiomove() error
Problem reported and fix provided by Aaro Koskinen in kern/11692.
2001-09-17 18:59:29 +00:00
thorpej
bf2dcec4f5 Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.
2001-04-13 23:29:55 +00:00
thorpej
20fe4e2d96 Add a protosw flag, PR_ABRTACPTDIS (Abort on Accept of Disconnected
Socket), and add it to the protocols that use that behavior (all
PR_LISTEN protocols except for PF_LOCAL stream sockets).
2001-03-21 19:22:27 +00:00
lukem
de1c2690b6 convert to ANSI KNF 2001-02-27 05:19:13 +00:00
itojun
d1507261c4 return ECONNABORTED, if the socket (tcp connection for example)
is disconnected by RST right before accept(2).  fixes PR 10698/12027.
checked with SUSv2, XNET 5.2, and Stevens (unix network programming
vol 1 2nd ed) section 5.11.
2001-02-07 12:20:43 +00:00
itojun
6e24d735f0 when the peer is disconnected before accept(2) is issued,
do not return junk data in mbuf (= sockaddr on accept(2)'s 2nd arg).
set the length zero.

behavior checked with bsdi and freebsd.
partial solution to PR 12027 and 10698 (need more investigation).
2001-01-22 18:14:11 +00:00
fvdl
405b695086 Make sobind() take a struct proc *. It already took curproc and
passed it down to the appropriate usrreq function, and this
allows usage for contexts that need to be explicitly different
from curproc (like in the NFS code when binding to a reserved port).
2000-12-10 23:16:28 +00:00
augustss
264f1d27c6 Get rid of register declarations. 2000-03-30 09:27:11 +00:00
jonathan
b19c0fbb0a Make kernel SOMAXCONN patchable. Will add sysctl once we
decide on namespace.
2000-02-07 18:43:26 +00:00
thorpej
84380f9fbe In sosend(), if so_error is set, clear it before returning the error to
the process (i.e. pre-Reno behavior).  The 4.4BSD behavior (introduced
in Reno) caused transient errors to stick incorrectly.

This is from PR #7640 (Havard Eidnes), cross-checked w/ FreeBSD, where
Bill Fenner committed the same fix (as described in a comment in the
Vat sources, by Van Jacobson).
1999-06-08 02:39:57 +00:00
sommerfeld
6c63af182f Delete test code. 1999-05-15 22:37:22 +00:00
sommerfeld
c01c0d9453 Revise previous fix:
1) protect socket flags under splsoftnet()
2) avoid leaking memory on an error
1999-05-15 22:36:34 +00:00
tv
fc3f28c6bd Wow, that was much easier than I originally thought. Fix PR kern/7583:
serious race condition in sosend().  Upon closer inspection, the appropriate
flags are checked within splsoftnet() for soreceive(), so no change needed
there.  Also a little KNFing in sosend().
1999-05-15 16:42:48 +00:00
lukem
8a931fcdd8 Ensure that you can only bind a more specific address when it is done by the
same uid or by root.

This code is from FreeBSD. (Whilst it was originally obtained from OpenBSD,
FreeBSD fixed it to work with multicast. To quote the commit message:
    - Don't bother checking for conflicting sockets if we're binding to a
      multicast address.
    - Don't return an error if we're binding to INADDR_ANY, the conflicting
      socket is bound to INADDR_ANY, and the conflicting socket has
      SO_REUSEPORT set.
)
1999-03-23 10:45:37 +00:00
mycroft
808496666c Do remove sockets on so_q0, since select(2) and accept(2) do not (currently?)
return them.
1999-01-21 22:09:10 +00:00
mycroft
0fb75f560a Oops; previous was slightly broken. 1999-01-20 20:24:12 +00:00
mycroft
430ecf369d Do not remove sockets from the accept(2) queue on close. 1999-01-20 09:15:41 +00:00
thorpej
2ef3bcfbb8 In the sosend() loop, if the residual count is > 0 before calling PRU_SEND,
set SS_MORETOCOME as a hint to the lower layer that more data is coming
on the next iteration of the loop.  Clear the flag after the PRU_SEND
call.

Suggested by Justin Walker <justin@apple.com> on the freebsd-net
mailing list.
1998-12-16 00:26:10 +00:00
matt
f0071e56cf Fix spl problem in socreate (which led to the corruption of the
socket pool).
1998-09-25 23:32:27 +00:00
perry
275d1554aa Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) ->  memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
   bcmp(x, y, z) ->  memcmp(x, y, z)
  bzero(x, y)    ->  memset(x, 0, y)
1998-08-04 04:03:10 +00:00
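The mapping listed in this entry swaps the source and destination arguments for the copy routines. As a sketch, the wrappers below keep the old argument order and forward to the standard C functions; the `compat_` names are invented for illustration and are not part of the commit.

```c
#include <string.h>

/* Old order: (src, dst, len). memcpy takes (dst, src, len). */
void
compat_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* Overlap-safe copy: ovbcopy(src, dst, len) -> memmove(dst, src, len). */
void
compat_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* Comparison keeps its argument order. */
int
compat_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* bzero(buf, len) -> memset(buf, 0, len). */
void
compat_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```

Only the `memmove` variant is safe when the two regions overlap, which is why `ovbcopy` does not map to `memcpy`.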
thorpej
a4c7bab10e Use the pool allocator for sockets. 1998-08-02 04:53:11 +00:00
perry
730baa7431 fix sizeofs so they comply with the KNF style guide. yes, it is pedantic. 1998-07-31 22:50:48 +00:00
thorpej
8aee7782f5 defopt COMPAT_SUNOS 1998-06-25 23:40:33 +00:00
kleink
82eb51ee05 In soshutdown(), decouple the evaluation of the `how' argument from FREAD
and FWRITE; use SHUT_{RD,WR,RDWR} instead.
Also, return EINVAL if `how' is invalid.
1998-04-27 13:31:45 +00:00
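A sketch of the decoupling this entry describes, assuming a POSIX `<sys/socket.h>`: the `how` argument is matched against the `SHUT_*` constants explicitly and anything else is rejected with `EINVAL`. The `DIR_*` flags and `shutdown_dirs` are hypothetical and do not reproduce `soshutdown()`.

```c
#include <errno.h>
#include <sys/socket.h>

/* Illustrative direction flags, unrelated to FREAD/FWRITE values. */
#define DIR_READ	0x1
#define DIR_WRITE	0x2

/*
 * Translate shutdown(2)'s `how' into direction flags.
 * Returns 0 on success or EINVAL for an unknown value.
 */
int
shutdown_dirs(int how, int *dirs)
{
	switch (how) {
	case SHUT_RD:
		*dirs = DIR_READ;
		return 0;
	case SHUT_WR:
		*dirs = DIR_WRITE;
		return 0;
	case SHUT_RDWR:
		*dirs = DIR_READ | DIR_WRITE;
		return 0;
	default:
		return EINVAL;
	}
}
```

Using a `switch` on the named constants avoids any reliance on arithmetic between the `how` value and internal flag bits.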
matt
754c43dcfc Hook for 0-copy (or other optimized) sends and receives 1998-04-25 17:35:18 +00:00
fvdl
e5bc90f40c Merge with Lite2 + local changes 1998-03-01 02:20:01 +00:00
thorpej
cbf3cc6bb8 Make insertion and removal of sockets from the partial and incoming
connections queues O(C) rather than O(N).
1998-01-07 23:47:08 +00:00
thorpej
2e85747e9e From 4.4BSD-Lite2 (noted by Frank van der Linden):
so_linger is used as an argument to tsleep(), so was stuffed with
clockticks for the TCP linger time.  However, so_linger is set directly from
l_linger if the linger time is specified, and l_linger is seconds (although
this is not currently documented anywhere).  Fix this to set the TCP
linger time in seconds, and multiply so_linger by hz when tsleep() is
called to actually perform the linger.
1998-01-05 09:12:29 +00:00
mycroft
f31ed493f7 Fix a mbuf leak in sosend() when we have a negative residual count. 1997-08-27 07:10:01 +00:00
thorpej
cd730bdd50 In sosetopt():
- Disallow < 1 values for SO_SNDBUF, SO_RCVBUF, SO_SNDLOWAT, and
  SO_RCVLOWAT; return EINVAL if the user attempts to set <= 0.
  Inspired by PR #3770, from Havard Eidnes <he@vader.runit.sintef.no>.
- For SO_SNDLOWAT and SO_RCVLOWAT, don't let the low-water mark get
  set above the high-water mark.  Behavior is now consistent with
  BSD/OS: If such an attempt is made, silently truncate to the high-water
  value.
1997-06-24 20:04:45 +00:00
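The two rules in this entry can be sketched as a single validation step: values below 1 are rejected, and a low-water mark above the high-water mark is silently truncated. `set_lowat` is a hypothetical helper, not the code in `sosetopt()`.

```c
#include <errno.h>

/*
 * Hypothetical helper: validate and store a low-water mark.
 * Returns EINVAL for values below 1; otherwise stores the value,
 * truncated to the high-water mark, and returns 0.
 */
int
set_lowat(long hiwat, long val, long *lowat)
{
	if (val < 1)
		return EINVAL;
	*lowat = (val > hiwat) ? hiwat : val;
	return 0;
}
```

The caller sees success in the truncation case, which matches the "silently truncate" wording of the entry.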
kleink
372bfc7c08 Calculate returned timeval correctly when using SO_SNDTIMEO/SO_RCVTIMEO;
from Koji Imada <koji@math.human.nagoya-u.ac.jp> in PR/3682.
1997-06-11 10:04:09 +00:00
thorpej
62dcdcfb89 Implement SO_TIMESTAMP socket option: receive a timeval timestamp
as a control message with a datagram.
1997-01-11 05:15:01 +00:00
explorer
a9ef8aef84 This fixes a nasty little bug where traceroute (and other raw-ip sending
programs which attach their own header) can crash the machine.  The problem
in this case was:
	a variable "space" was set to the total data to copy,
	len was used to remember how much to copy in this chunk (mbuf),
	in one case, len = min(MCLBYTES - max_hdr, resid) but
		size -= MCLBYTES;
	 instead of
		size -= len;

Note that userland programs can still crash the machine by providing
bogus data in the ip->ip_len field I suspect.  I haven't verified this,
but will soon be doing so and applying a fix of some sort.  Probably
clamping the ip->ip_len value to the true packet size will be ok.
1996-08-14 05:53:18 +00:00
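The class of bug in this entry, decrementing a remaining-byte counter by the maximum chunk size rather than by the amount actually copied, can be shown with a simplified loop. `chunk_total` is an illustration only; it does not reproduce the `sosend()` code or its variable names.

```c
#include <stddef.h>

/*
 * Walk `resid' bytes in chunks of at most `chunk_max' and return the
 * number of bytes accounted for. A correct loop returns `resid'.
 */
size_t
chunk_total(size_t resid, size_t chunk_max)
{
	size_t copied = 0;

	if (chunk_max == 0)
		return 0;	/* avoid looping forever on a zero chunk */

	while (resid > 0) {
		size_t len = resid < chunk_max ? resid : chunk_max;

		copied += len;
		/*
		 * Subtract what was copied. Subtracting chunk_max here
		 * would wrap the unsigned counter on the final short chunk.
		 */
		resid -= len;
	}
	return copied;
}
```

With 5000 bytes and a 2048-byte chunk, the last iteration handles 904 bytes; subtracting 2048 at that point is the kind of accounting error the entry describes.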
mycroft
08cc6b486f And PRU_SEND. 1996-05-22 19:06:07 +00:00
mycroft
b85e5d8f5e PRU_CONNECT also needs a proc pointer. 1996-05-22 19:00:52 +00:00
mycroft
49d52c9b1c Pass a proc pointer down to the usrreq and pcbbind functions for PRU_ATTACH, PRU_BIND and
PRU_CONTROL.  The usrreq interface really needs to be split up, but this will have to wait.
Remove SS_PRIV completely.
1996-05-22 13:54:55 +00:00
christos
e630447d8c First pass at prototyping 1996-02-04 02:17:43 +00:00
mycroft
5482957905 splnet --> splsoftnet 1995-08-12 23:59:09 +00:00
cgd
7e68171a95 properly determine if send/rcv timeout values are out of range. 1995-05-23 00:19:30 +00:00
christos
3d1b06ab09 - new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.
1995-04-22 19:42:47 +00:00
cgd
6ac2bbfc35 be more careful with types, also pull in headers where necessary. 1994-10-30 21:43:03 +00:00
cgd
cf92afd66e New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD' 1994-06-29 06:29:24 +00:00
mycroft
d361acde18 Update to 4.4-Lite networking code, with a few local changes. 1994-05-13 06:01:27 +00:00
mycroft
f8a6ac17b2 More return types... 1994-05-04 11:24:06 +00:00
mycroft
f43af3a756 Remove another bit of that. 1994-04-25 08:47:50 +00:00
mycroft
ccb0412b7c Remove a piece of the previous patch. 1994-04-25 08:41:03 +00:00
mycroft
e4af8f69a5 Minor cleanup. 1994-04-25 08:22:07 +00:00
deraadt
abf6a6bfdd more COMPAT_SUNOS changes. 1994-01-23 06:06:21 +00:00
mycroft
7f50bd1829 Canonicalize all #includes. 1993-12-18 04:21:37 +00:00
cgd
8068dd9add fix from david greenman, davidg@freefall.cdrom.com:
sosend was attempting to reserve space in an mbuf cluster for a datagram
header and because of bugs in the sosend's mbuf allocation algorithm,
sosend was calling uiomove twice as many times as was necessary. It turns
out that PREPEND does the right thing when a cluster is associated with
an mbuf header, so the datagram header allocation can be deferred. This
also ends up additionally consuming one less mbuf for the TCP protocol
because TCP always allocates another header mbuf regardless if space is
available to prepend the protocol header. The net result of this fix is
that unix domain and pipe throughput is increased by a measured 10%.
1993-11-05 23:00:27 +00:00
cgd
299ff91b14 BSDI official patch #14:
SUMMARY:
    Here is a patch for a kernel hang that can be provoked with a write
    or send of a negative amount.  The talk program is capable of exercising
    this bug.  This patch also includes a fix for a bug that caused data
    to be delivered to TCP in smaller chunks than desired, and which caused
    TCP to send a short packet when starting up.  Finally, there is a bug
    fix for MSG_PEEK with an oobmark pending.
1993-10-26 22:36:25 +00:00
mycroft
64540d3533 Patch from David Greenman to reduce CPU usage during network transmit. 1993-09-08 21:12:49 +00:00
mycroft
bbc8c11fd5 Nuke an extra `||' Chris inserted. 1993-08-03 02:45:20 +00:00
cgd
d6b4910ac2 fix from Garrett Wollman <wollman@emba.uvm.edu> to return EPROTONOTSUPP
if user tries to get a socket for a protocol with no usrreq function
1993-08-03 01:36:10 +00:00
andrew
d46fb2c3fb * ansifications
* Yuval Yarom's socket recv(2) fixes, to prevent incorrect blocking and
  lack thereof with recv(2) and MSG_WAITALL.  Fixes a sbdrop() panic during
  some MSG_WAITALL recv(2) sleeps.  Access rights fix (also in
  uipc_syscalls.c) too.  A test program which shows these problems is
  available.
1993-06-27 06:08:15 +00:00
cgd
8d6c77881c make kernel select interface be one-stop shopping & clean it all up. 1993-05-18 18:18:40 +00:00
cgd
61f282557f initial import of 386bsd-0.1 sources 1993-03-21 09:45:37 +00:00