Commit Graph

71 Commits

Author SHA1 Message Date
thorpej b2cc5a4c03 Make use of page loaning for large socket writes the default. The
SOSEND_NO_LOAN option can be used to go back to the old behavior.
2002-08-21 05:13:36 +00:00
thorpej 2807c6789c Rename SB_UPDATE_TAIL() to SB_EMPTY_FIXUP(), per suggestion from
Jonathan Stone.
2002-07-03 21:39:40 +00:00
thorpej 0585ce1489 Make insertion of data into socket buffers O(C):
* Keep pointers to the first and last mbufs of the last record in the
  socket buffer.
* Use the sb_lastrecord pointer in the sbappend*() family of functions
  to avoid traversing the packet chain to find the last record.
* Add a new sbappend_stream() function for stream protocols which
  guarantee that there will never be more than one record in the
  socket buffer.  This function uses the sb_mbtail pointer to perform
  the data insertion.  Make TCP use sbappend_stream().

On a profiling run, this makes sbappend of a TCP transmission using
a 1M socket buffer go from 50% of the time to .02% of the time.

Thanks to Bill Sommerfeld and YAMAMOTO Takashi for their debugging
assistance!
2002-07-03 19:06:47 +00:00
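The tail-pointer idea in the commit above can be sketched as follows. This is a minimal illustration, not the kernel code: struct demo_buf, struct demo_sockbuf and demo_append_stream() are hypothetical stand-ins for the mbuf chain, the socket buffer and sbappend_stream(), and the tail field plays the role the message gives to sb_mbtail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an mbuf. */
struct demo_buf {
    struct demo_buf *next;  /* next buffer in the same record */
    size_t len;             /* bytes held by this buffer */
};

/* Hypothetical stand-in for a stream socket buffer (a single record). */
struct demo_sockbuf {
    struct demo_buf *head;  /* first buffer of the record */
    struct demo_buf *tail;  /* last buffer, kept so appends need no walk */
    size_t cc;              /* total bytes queued */
};

/*
 * Append the chain m.  The cost depends only on the length of the new
 * chain, never on how much data is already queued.
 */
static void
demo_append_stream(struct demo_sockbuf *sb, struct demo_buf *m)
{
    struct demo_buf *last;

    assert(m != NULL);
    if (sb->tail == NULL)
        sb->head = m;           /* buffer was empty */
    else
        sb->tail->next = m;     /* link directly after the old tail */

    /* Walk only the new chain to count bytes and find its last buffer. */
    for (last = m; ; last = last->next) {
        sb->cc += last->len;
        if (last->next == NULL)
            break;
    }
    sb->tail = last;
}
```

Without the tail pointer, every append would have to walk from head to the end of the queue, which is the traversal the commit removes.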
matt 91650be524 Fix 2 bugs with MSG_WAITALL. The first is to not block forever if one is
trying to MSG_PEEK for more than the socket can hold.  The second is that
before sleeping waiting for more data, upcall the protocol telling it you
have just received data so it can kick itself to re-fill the just drained
socket buffer.
2002-06-11 00:21:33 +00:00
he a8c83879a2 In soreceive(), if any part of a received record has been freed,
and an error occurs, make sure the socket doesn't retain a partial
copy by dropping the rest of the record.

This would otherwise trigger a panic("receive 1a") under DIAGNOSTIC.

Fixes PR#16990, suggested fix adapted.

Reviewed by Matt Thomas.
2002-06-10 20:43:16 +00:00
enami b42b2c8323 In soreceive(), don't call sopendfree() if MSG_DONTWAIT is set
since it may sleep.  nfsrv_rcv() tries to do its job in the softintr
handler as far as possible.
2002-05-07 08:06:35 +00:00
thorpej 654768f185 Let the sosend_loan() path be selected at run-time; patch the variable
use_sosend_loan to enable/disable it.  The SOSEND_LOAN kernel option
now causes it to default to 1.
2002-05-03 00:35:14 +00:00
thorpej 7a49fee765 Add some experimental page-loaning for writes on sockets. It is disabled
by default, and can be enabled by adding the SOSEND_LOAN option to your
kernel config.  The SOSEND_COUNTERS option can be used to provide some
instrumentation.

Use of this option, combined with an application that does large enough
writes, gets us zero-copy on the TCP and UDP transmit path.
2002-05-02 17:55:48 +00:00
matt 2bf9358fc0 Don't use the tqh_ field names; instead use the corresponding TAILQ_* macros. 2002-04-06 08:04:17 +00:00
thorpej a180cee23b Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map).  Try to deal with this:

* Group all information about the backend allocator for a pool in a
  separate structure.  The pool references this structure, rather than
  the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
  to become available, but will still fail if it cannot allocate KVA
  space for the pages.  If this happens, carefully drain all pools using
  the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
  some pages, and use that information to make draining easier and more
  efficient.
* Get rid of PR_URGENT.  There was only one use of it, and it could be
  dealt with by the caller.

From art@openbsd.org.
2002-03-08 20:48:27 +00:00
mrg d6328a8778 fix previous: actually remove the COMPAT_SUNOS code, not just #if 0 it. 2002-01-03 01:16:02 +00:00
mrg af640de164 move the COMPAT_SUNOS SO_BROADCAST hack out of uipc_socket.c into the
compat/sunos code.  besides being cleaner this allows the sunos LKM
to properly work without any special kernel hacks.
2002-01-03 00:59:00 +00:00
lukem adc783d537 add RCSIDs 2001-11-12 15:25:01 +00:00
jdolecek 560e3c342e Use lmin() instead of min(), and long for mlen & clen, to avoid integer
overflow on LP64 architectures. This fixes kern/10070 by Juergen Weiss.

Fix tested on NetBSD/alpha by Bernd Ernesti, on NetBSD/sparc64
by David Brownlee and Eduardo Horvath.
2001-09-29 14:16:19 +00:00
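The overflow fixed above can be illustrated with a small sketch. It assumes that the narrow min() takes 32-bit unsigned arguments; demo_min_u32(), demo_lmin() and the two chunk helpers are hypothetical names written for this example, with int64_t standing in for long on an LP64 machine.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a min() that takes 32-bit unsigned arguments (assumption). */
static uint32_t
demo_min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* Stand-in for lmin(): compares at the full 64-bit width of long on LP64. */
static int64_t
demo_lmin(int64_t a, int64_t b)
{
    return a < b ? a : b;
}

/* Chunk length picked through the narrow helper: resid is truncated. */
static int64_t
demo_chunk_narrow(int64_t resid, int64_t space)
{
    assert(resid >= 0 && space >= 0);
    return (int64_t)demo_min_u32((uint32_t)resid, (uint32_t)space);
}

/* Chunk length picked at full width: no truncation. */
static int64_t
demo_chunk_wide(int64_t resid, int64_t space)
{
    assert(resid >= 0 && space >= 0);
    return demo_lmin(resid, space);
}
```

With a residual count of exactly 4 GiB and 2048 bytes of space, the narrow version sees a residual of 0 after truncation and returns 0, while the wide version returns 2048.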
jdolecek a7357fecf4 soreceive(): do not ignore uiomove() error
Problem reported and fix provided by Aaro Koskinen in kern/11692.
2001-09-17 18:59:29 +00:00
thorpej bf2dcec4f5 Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.
2001-04-13 23:29:55 +00:00
thorpej 20fe4e2d96 Add a protosw flag, PR_ABRTACPTDIS (Abort on Accept of Disconnected
Socket), and add it to the protocols that use that behavior (all
PR_LISTEN protocols except for PF_LOCAL stream sockets).
2001-03-21 19:22:27 +00:00
lukem de1c2690b6 convert to ANSI KNF 2001-02-27 05:19:13 +00:00
itojun d1507261c4 return ECONNABORTED, if the socket (tcp connection for example)
is disconnected by RST right before accept(2).  fixes PR 10698/12027.
checked with SUSv2, XNET 5.2, and Stevens (unix network programming
vol 1 2nd ed) section 5.11.
2001-02-07 12:20:43 +00:00
itojun 6e24d735f0 when the peer is disconnected before accept(2) is issued,
do not return junk data in mbuf (= sockaddr on accept(2)'s 2nd arg).
set the length zero.

behavior checked with bsdi and freebsd.
partial solution to PR 12027 and 10698 (need more investigation).
2001-01-22 18:14:11 +00:00
fvdl 405b695086 Make sobind() take a struct proc *. It already took curproc and
passed it down to the appropriate usrreq function, and this
allows usage for contexts that need to be explicitly different
from curproc (like in the NFS code when binding to a reserved port).
2000-12-10 23:16:28 +00:00
augustss 264f1d27c6 Get rid of register declarations. 2000-03-30 09:27:11 +00:00
jonathan b19c0fbb0a Make kernel SOMAXCONN patchable. Will add sysctl once we
decide on namespace.
2000-02-07 18:43:26 +00:00
thorpej 84380f9fbe In sosend(), if so_error is set, clear it before returning the error to
the process (i.e. pre-Reno behavior).  The 4.4BSD behavior (introduced
in Reno) caused transient errors to stick incorrectly.

This is from PR #7640 (Havard Eidnes), cross-checked w/ FreeBSD, where
Bill Fenner committed the same fix (as described in a comment in the
Vat sources, by Van Jacobson).
1999-06-08 02:39:57 +00:00
sommerfeld 6c63af182f Delete test code. 1999-05-15 22:37:22 +00:00
sommerfeld c01c0d9453 Revise previous fix:
1) protect socket flags under splsoftnet()
2) avoid leaking memory on an error
1999-05-15 22:36:34 +00:00
tv fc3f28c6bd Wow, that was much easier than I originally thought. Fix PR kern/7583:
serious race condition in sosend().  Upon closer inspection, the appropriate
flags are checked within splsoftnet() for soreceive(), so no change needed
there.  Also a little KNFing in sosend().
1999-05-15 16:42:48 +00:00
lukem 8a931fcdd8 Ensure that you can only bind a more specific address when it is done by the
same uid or by root.

This code is from FreeBSD. (Whilst it was originally obtained from OpenBSD,
FreeBSD fixed it to work with multicast. To quote the commit message:
    - Don't bother checking for conflicting sockets if we're binding to a
      multicast address.
    - Don't return an error if we're binding to INADDR_ANY, the conflicting
      socket is bound to INADDR_ANY, and the conflicting socket has
      SO_REUSEPORT set.
)
1999-03-23 10:45:37 +00:00
mycroft 808496666c Do remove sockets on so_q0, since select(2) and accept(2) do not (currently?)
return them.
1999-01-21 22:09:10 +00:00
mycroft 0fb75f560a Oops; previous was slightly broken. 1999-01-20 20:24:12 +00:00
mycroft 430ecf369d Do not remove sockets from the accept(2) queue on close. 1999-01-20 09:15:41 +00:00
thorpej 2ef3bcfbb8 In the sosend() loop, if the residual count is > 0 before calling PRU_SEND,
set SS_MORETOCOME as a hint to the lower layer that more data is coming
on the next iteration of the loop.  Clear the flag after the PRU_SEND
call.

Suggested by Justin Walker <justin@apple.com> on the freebsd-net
mailing list.
1998-12-16 00:26:10 +00:00
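The hint described above might look roughly like this in a send loop. This is a sketch under assumptions: DEMO_MORETOCOME, struct demo_sock, demo_pru_send() and demo_sosend() are hypothetical stand-ins for SS_MORETOCOME, the socket, the PRU_SEND call and sosend(), and the counters exist only so the behavior can be observed.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MORETOCOME 0x1     /* stands in for SS_MORETOCOME */

/* Hypothetical socket: state flags plus counters for observation. */
struct demo_sock {
    int state;
    int sends;      /* number of lower-layer send calls */
    int hinted;     /* calls made while the hint flag was set */
};

/* Stand-in for the PRU_SEND call into the lower layer. */
static void
demo_pru_send(struct demo_sock *so, size_t len)
{
    (void)len;
    so->sends++;
    if (so->state & DEMO_MORETOCOME)
        so->hinted++;
}

/* Send resid bytes in pieces of at most chunk bytes. */
static void
demo_sosend(struct demo_sock *so, size_t resid, size_t chunk)
{
    assert(chunk > 0);
    while (resid > 0) {
        size_t len = resid < chunk ? resid : chunk;

        resid -= len;
        if (resid > 0)
            so->state |= DEMO_MORETOCOME;   /* more follows next pass */
        demo_pru_send(so, len);
        so->state &= ~DEMO_MORETOCOME;      /* clear after the call */
    }
}
```

The last piece is always sent with the flag clear, so a lower layer that delays transmission while the hint is set still flushes at the end of the write.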
matt f0071e56cf Fix spl problem in socreate (which led to the corruption of the
socket pool).
1998-09-25 23:32:27 +00:00
perry 275d1554aa Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) ->  memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
   bcmp(x, y, z) ->  memcmp(x, y, z)
  bzero(x, y)    ->  memset(x, 0, y)
1998-08-04 04:03:10 +00:00
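The argument-order changes in the table above are easy to get wrong, so here is a small sketch of each mapping. The compat_* wrappers are hypothetical names chosen to avoid clashing with any libc versions of the old functions; they are not part of the commit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* bcopy(src, dst, n) -> memcpy(dst, src, n): source and destination swap. */
static void
compat_bcopy(const void *src, void *dst, size_t n)
{
    uintptr_t s = (uintptr_t)src, d = (uintptr_t)dst;

    assert(s + n <= d || d + n <= s);   /* memcpy() needs disjoint regions */
    memcpy(dst, src, n);
}

/* ovbcopy(src, dst, n) -> memmove(dst, src, n): overlap is allowed. */
static void
compat_ovbcopy(const void *src, void *dst, size_t n)
{
    memmove(dst, src, n);
}

/* bcmp(a, b, n) -> memcmp(a, b, n): same argument order. */
static int
compat_bcmp(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n);
}

/* bzero(p, n) -> memset(p, 0, n): the fill value is inserted in the middle. */
static void
compat_bzero(void *p, size_t n)
{
    memset(p, 0, n);
}
```

Note that memcmp() reports ordering as well as equality, so the bcmp mapping is only a drop-in replacement where the result is tested against zero.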
thorpej a4c7bab10e Use the pool allocator for sockets. 1998-08-02 04:53:11 +00:00
perry 730baa7431 fix sizeofs so they comply with the KNF style guide. yes, it is pedantic. 1998-07-31 22:50:48 +00:00
thorpej 8aee7782f5 defopt COMPAT_SUNOS 1998-06-25 23:40:33 +00:00
kleink 82eb51ee05 In soshutdown(), decouple the evaluation of the `how' argument from FREAD
and FWRITE; use SHUT_{RD,WR,RDWR} instead.
Also, return EINVAL if `how' is invalid.
1998-04-27 13:31:45 +00:00
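A minimal sketch of the argument check described above, assuming the standard SHUT_* constants from <sys/socket.h>. DEMO_RD, DEMO_WR and demo_shutdown_dirs() are hypothetical names for this example and do not reflect how soshutdown() records the result.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>     /* SHUT_RD, SHUT_WR, SHUT_RDWR */

/* Hypothetical flags naming the directions to shut down. */
#define DEMO_RD 0x1
#define DEMO_WR 0x2

/*
 * Translate `how' into direction flags.  Returns 0 on success, or
 * EINVAL if `how' is not one of the SHUT_* values.
 */
static int
demo_shutdown_dirs(int how, int *dirs)
{
    assert(dirs != NULL);
    switch (how) {
    case SHUT_RD:
        *dirs = DEMO_RD;
        return 0;
    case SHUT_WR:
        *dirs = DEMO_WR;
        return 0;
    case SHUT_RDWR:
        *dirs = DEMO_RD | DEMO_WR;
        return 0;
    default:
        return EINVAL;      /* invalid `how' is rejected */
    }
}
```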
matt 754c43dcfc Hook for 0-copy (or other optimized) sends and receives 1998-04-25 17:35:18 +00:00
fvdl e5bc90f40c Merge with Lite2 + local changes 1998-03-01 02:20:01 +00:00
thorpej cbf3cc6bb8 Make insertion and removal of sockets from the partial and incoming
connections queues O(C) rather than O(N).
1998-01-07 23:47:08 +00:00
thorpej 2e85747e9e From 4.4BSD-Lite2 (noted by Frank van der Linden):
so_linger is used as an argument to tsleep(), so was stuffed with
clockticks for the TCP linger time.  However, so_linger is set directly from
l_linger if the linger time is specified, and l_linger is seconds (although
this is not currently documented anywhere).  Fix this to set the TCP
linger time in seconds, and multiply so_linger by hz when tsleep() is
called to actually perform the linger.
1998-01-05 09:12:29 +00:00
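The unit fix above amounts to storing the linger time in seconds and converting to clock ticks only at the point of sleeping. A sketch, where struct demo_socket and demo_linger_ticks() are hypothetical and hz is passed in as a parameter rather than read from a kernel global:

```c
#include <assert.h>

/* Hypothetical stand-in for the socket; so_linger is held in seconds. */
struct demo_socket {
    int so_linger;
};

/* Timeout in clock ticks to hand to a tsleep()-style call. */
static int
demo_linger_ticks(const struct demo_socket *so, int hz)
{
    assert(hz > 0);
    return so->so_linger * hz;
}
```

With a linger time of 5 seconds and hz of 100, the sleep timeout is 500 ticks.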
mycroft f31ed493f7 Fix a mbuf leak in sosend() when we have a negative residual count. 1997-08-27 07:10:01 +00:00
thorpej cd730bdd50 In sosetopt():
- Disallow < 1 values for SO_SNDBUF, SO_RCVBUF, SO_SNDLOWAT, and
  SO_RCVLOWAT; return EINVAL if the user attempts to set <= 0.
  Inspired by PR #3770, from Havard Eidnes <he@vader.runit.sintef.no>.
- For SO_SNDLOWAT and SO_RCVLOWAT, don't let the low-water mark get
  set above the high-water mark.  Behavior is now consistent with
  BSD/OS: If such an attempt is made, silently truncate to the high-water
  value.
1997-06-24 20:04:45 +00:00
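The two rules above (reject values below 1, silently truncate a low-water mark to the high-water mark) can be sketched as follows. struct demo_sb and demo_set_lowat() are hypothetical simplifications for this example, not the sosetopt() code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the relevant socket buffer limits. */
struct demo_sb {
    long hiwat;     /* high-water mark */
    long lowat;     /* low-water mark */
};

/*
 * Set the low-water mark.  Values below 1 are rejected with EINVAL;
 * values above the high-water mark are silently truncated to it.
 */
static int
demo_set_lowat(struct demo_sb *sb, long val)
{
    assert(sb->hiwat >= 1);
    if (val < 1)
        return EINVAL;
    sb->lowat = val > sb->hiwat ? sb->hiwat : val;
    return 0;
}
```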
kleink 372bfc7c08 Calculate returned timeval correctly when using SO_SNDTIMEO/SO_RCVTIMEO;
from Koji Imada <koji@math.human.nagoya-u.ac.jp> in PR/3682.
1997-06-11 10:04:09 +00:00
thorpej 62dcdcfb89 Implement SO_TIMESTAMP socket option: receive a timeval timestamp
as a control message with a datagram.
1997-01-11 05:15:01 +00:00
explorer a9ef8aef84 This fixes a nasty little bug where traceroute (and other raw-ip sending
programs which attach their own header) can crash the machine.  The problem
in this case was:
	a variable "space" was set to the total data to copy,
	len was used to remember how much to copy in this chunk (mbuf),
	in one case, len = min(MCLBYTES - max_hdr, resid) but
		space -= MCLBYTES;
	instead of
		space -= len;

Note that userland programs can still crash the machine by providing
bogus data in the ip->ip_len field, I suspect.  I haven't verified this,
but will soon be doing so and applying a fix of some sort.  Probably
clamping the ip->ip_len value to the true packet size will be ok.
1996-08-14 05:53:18 +00:00
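The accounting error described in the commit above can be shown in isolation. This sketch only compares the two ways of charging the remaining space; it does not reproduce the crash. DEMO_MCLBYTES (with an assumed value of 2048), demo_min() and the two helpers are hypothetical names for this example.

```c
#include <assert.h>

#define DEMO_MCLBYTES 2048L     /* stands in for MCLBYTES; value assumed */

static long
demo_min(long a, long b)
{
    return a < b ? a : b;
}

/* Old accounting: a full cluster is charged whatever was copied. */
static long
demo_space_left_old(long space, long resid, long max_hdr)
{
    long len;

    assert(max_hdr >= 0 && max_hdr < DEMO_MCLBYTES);
    len = demo_min(DEMO_MCLBYTES - max_hdr, resid);
    (void)len;                      /* len bytes are copied, not charged */
    return space - DEMO_MCLBYTES;
}

/* Fixed accounting: only the bytes actually copied are charged. */
static long
demo_space_left_fixed(long space, long resid, long max_hdr)
{
    long len;

    assert(max_hdr >= 0 && max_hdr < DEMO_MCLBYTES);
    len = demo_min(DEMO_MCLBYTES - max_hdr, resid);
    return space - len;
}
```

For a 100-byte write with 4096 bytes of space and a 16-byte header reserve, the old accounting leaves 2048 bytes of space while the fixed accounting leaves 3996, so the two disagree whenever len is smaller than a full cluster.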
mycroft 08cc6b486f And PRU_SEND. 1996-05-22 19:06:07 +00:00
mycroft b85e5d8f5e PRU_CONNECT also needs a proc pointer. 1996-05-22 19:00:52 +00:00
mycroft 49d52c9b1c Pass a proc pointer down to the usrreq and pcbbind functions for PRU_ATTACH, PRU_BIND and
PRU_CONTROL.  The usrreq interface really needs to be split up, but this will have to wait.
Remove SS_PRIV completely.
1996-05-22 13:54:55 +00:00