Commit Graph

112 Commits

Author SHA1 Message Date
rtr ce6a5ff64f revert the removal of struct lwp * parameter from bind, listen and connect
user requests.

this should resolve the issue relating to nfs client hangs presented
recently by wiz on current-users@
2014-08-05 05:24:26 +00:00
rtr 892163b8e9 split PRU_DISCONNECT, PRU_SHUTDOWN and PRU_ABORT function out of
pr_generic() usrreq switches and put into separate functions

   xxx_disconnect(struct socket *)
   xxx_shutdown(struct socket *)
   xxx_abort(struct socket *)

   - always KASSERT(solocked(so)) even if not implemented
   - replace calls to pr_generic() with req =
PRU_{DISCONNECT,SHUTDOWN,ABORT}
     with calls to pr_{disconnect,shutdown,abort}() respectively

rename existing internal functions used to implement above functionality
to permit use of the names for xxx_{disconnect,shutdown,abort}().

   - {l2cap,sco,rfcomm}_disconnect() ->
{l2cap,sco,rfcomm}_disconnect_pcb()
   - {unp,rip,tcp}_disconnect() -> {unp,rip,tcp}_disconnect1()
   - unp_shutdown() -> unp_shutdown1()

patch reviewed by rmind
2014-07-31 03:39:35 +00:00
rtr ad6ae402db split PRU_CONNECT function out of pr_generic() usrreq switches and put
into seaparate functions

  xxx_listen(struct socket *, struct mbuf *)

  - always KASSERT(solocked(so)) and KASSERT(nam != NULL)
  - replace calls to pr_generic() with req = PRU_CONNECT with
    pr_connect()
  - rename existin {l2cap,sco,rfcomm}_connect() to
    {l2cap,sco,rfcomm}_connect_pcb() respectively to permit
    naming consistency with other protocols functions.
  - drop struct lwp * parameter from unp_connect() and at_pcbconnect()
    and use curlwp instead where appropriate.

patch reviewed by rmind
2014-07-30 10:04:25 +00:00
rtr 6dd8eef044 split PRU_BIND and PRU_LISTEN function out of pr_generic() usrreq
switches and put into separate functions
  xxx_bind(struct socket *, struct mbuf *)
  xxx_listen(struct socket *)

  - always KASSERT(solocked(so)) even if not implemented

  - replace calls to pr_generic() with req = PRU_BIND with call to
    pr_bind()

  - replace calls to pr_generic() with req = PRU_LISTEN with call to
    pr_listen()

  - drop struct lwp * parameter from at_pcbsetaddr(), in_pcbbind() and
    unp_bind() and always use curlwp.

rename existing functions that operate on PCB for consistency (and to
free up their names for xxx_{bind,listen}() PRUs

  - l2cap_{bind,listen}() -> l2cap_{bind,listen}_pcb()
  - sco_{bind,listen}() -> sco_{bind,listen}_pcb()
  - rfcomm_{bind,listen}() -> rfcomm_{bind,listen}_pcb()

patch reviewed by rmind

welcome to netbsd 6.99.48
2014-07-24 15:12:03 +00:00
rtr 35b22fa96a split PRU_SENDOOB and PRU_RCVOOB function out of pr_generic() usrreq
switches and put into separate functions
  xxx_sendoob(struct socket *, struct mbuf *, struct mbuf *)
  xxx_recvoob(struct socket *, struct mbuf *, int)

  - always KASSERT(solocked(so)) even if request is not implemented

  - replace calls to pr_generic() with req = PRU_{SEND,RCV}OOB with
    calls to pr_{send,recv}oob() respectively.

there is still some tweaking of m_freem(m) and m_freem(control) to come
for consistency.  not performed with this commit for clarity.

reviewed by rmind
2014-07-23 13:17:18 +00:00
rtr d27b133d27 * split PRU_ACCEPT function out of pr_generic() usrreq switches and put
into a separate function xxx_accept(struct socket *, struct mbuf *)

note: future cleanup will take place to remove struct mbuf parameter
type and replace it with a more appropriate type.

patch reviewed by rmind
2014-07-09 14:41:42 +00:00
rtr d575eb5454 * split PRU_PEERADDR and PRU_SOCKADDR function out of pr_generic()
usrreq switches and put into separate functions
  xxx_{peer,sock}addr(struct socket *, struct mbuf *).

    - KASSERT(solocked(so)) always in new functions even if request
      is not implemented

    - KASSERT(pcb != NULL) and KASSERT(nam) if the request is
      implemented and not for tcp.

* for tcp roll #ifdef KPROF and #ifdef DEBUG code from tcp_usrreq() into
  easier to cut & paste functions tcp_debug_capture() and
tcp_debug_trace()

    - functions provided by rmind
    - remaining use of PRU_{PEER,SOCK}ADDR #define to be removed in a
      future commit.

* rename netbt functions to permit consistency of pru function names
  (as has been done with other requests already split out).

    - l2cap_{peer,sock}addr()  -> l2cap_{peer,sock}_addr_pcb()
    - rfcomm_{peer,sock}addr() -> rfcomm_{peer,sock}_addr_pcb()
    - sco_{peer,sock}addr()    -> sco_{peer,sock}_addr_pcb()

* split/refactor do_sys_getsockname(lwp, fd, which, nam) into
  two functions do_sys_get{peer,sock}name(fd, nam).

    - move PRU_PEERADDR handling into do_sys_getpeername() from
      do_sys_getsockname()
    - have svr4_stream directly call do_sys_get{sock,peer}name()
      respectively instead of providing `which' & fix a DPRINTF string
      that incorrectly wrote "getpeername" when it meant "getsockname"
    - fix sys_getpeername() and sys_getsockname() to call
      do_sys_get{sock,peer}name() without `which' and `lwp' & adjust
      comments
    - bump kernel version for removal of lwp & which parameters from
      do_sys_getsockname()

note: future cleanup to remove struct mbuf * abuse in
xxx_{peer,sock}name()
still to come, not done in this commit since it is easier to do post
split.

patch reviewed by rmind

welcome to 6.99.47
2014-07-09 04:54:03 +00:00
rtr ff90c29d04 * sprinkle KASSERT(solocked(so)); in all pr_stat() functions.
* fix remaining inconsistent struct socket parameter names.
2014-07-07 17:13:56 +00:00
rtr 909a1fc699 backout change that made pr_stat return EOPNOTSUPP for protocols that
were not filling in struct stat.

decision made after further discussion with rmind and investigation of
how other operating systems behave.  soo_stat() is doing just enough to
be able to call what gets returned valid and thus justifys a return of
success.

additional review will be done to determine of the pr_stat functions
that were already returning EOPNOTSUPP can be considered successful with
what soo_stat() is doing.
2014-07-07 15:13:21 +00:00
rtr 183fc9ab77 * have pr_stat return EOPNOTSUPP consistently for all protocols that do
not fill in struct stat instead of returning success.

* in pr_stat remove all checks for non-NULL so->so_pcb except where the
  pcb is actually used (i.e. cases where we don't return EOPNOTSUPP).

proposed on tech-net@
2014-07-07 07:09:58 +00:00
rtr a60320ca07 * split PRU_SENSE functionality out of xxx_usrreq() switches and place into
separate xxx_stat(struct socket *, struct stat *) functions.
* replace calls using pr_generic with req == PRU_SENSE with pr_stat().

further change will follow that cleans up the pattern used to extract the
pcb and test for its presence.

reviewed by rmind
2014-07-06 03:33:33 +00:00
rtr 0dedd9772f fix parameter types in pr_ioctl, called xx_control() functions and remove
abuse of pointer to struct mbuf type.

param2 changed to u_long type and uses parameter name 'cmd' (ioctl command)
param3 changed to void * type and uses parameter name 'data'
param4 changed to struct ifnet * and uses parameter name 'ifp'
param5 has been removed (formerly struct lwp *) and uses of 'l' have been
       replaced with curlwp from curproc(9).

callers have had (now unnecessary) casts to struct mbuf * removed, called
code has had (now unnecessary) casts to u_long, void * and struct ifnet *
respectively removed.

reviewed by rmind@
2014-07-01 05:49:18 +00:00
rtr c5cb349386 where appropriate rename xxx_ioctl() struct mbuf * parameters from
`control' to `ifp' after split from xxx_usrreq().

sys_socket.c
    fix wrapping of arguments to be consistent with other function calls
    in the file after replacing pr_usrreq() call with pr_ioctl() which
    required one less argument.

link_proto.c
    fix indentation of parameters in link_ioctl() prototype to be
    consistent with the rest of the file.

discussed with rmind@
2014-06-23 17:18:45 +00:00
rtr d54d7ab24a * split PRU_CONTROL functionality out of xxx_userreq() switches and place
into separate xxx_ioctl() functions.
* place KASSERT(req != PRU_CONTROL) inside xxx_userreq() as it is now
  inappropriate for req = PRU_CONTROL in xxx_userreq().
* replace calls to pr_generic() with req = PRU_CONTROL with pr_ioctl().
* remove & fixup references to PRU_CONTROL xxx_userreq() function comments.
* fix various comments references for xxx_userreq() that mentioned
  PRU_CONTROL as xxx_userreq() no longer handles the request.

a further change will follow to fix parameter and naming inconsistencies
retained from original code.

Reviewed by rmind@
2014-06-22 08:10:18 +00:00
christos 5d61e6c015 Introduce 2 new variables: ipsec_enabled and ipsec_used.
Ipsec enabled is controlled by sysctl and determines if is allowed.
ipsec_used is set automatically based on ipsec being enabled, and
rules existing.
2014-05-30 01:39:03 +00:00
rmind 9a6de984e5 Move udp6_input(), udp6_sendup(), udp6_realinput() and udp6_input_checksum()
from udp_usrreq.c to udp6_usrreq.c where they belong.  No functional change.
2014-05-22 22:56:53 +00:00
rmind e401453f3f Adjust PR_WRAP_USRREQS() to include the attach/detach functions.
We still need the kernel-lock for some corner cases.
2014-05-20 19:04:00 +00:00
rmind 4ae03c1815 - Split off PRU_ATTACH and PRU_DETACH logic into separate functions.
- Replace malloc with kmem and eliminate M_PCB while here.
- Sprinkle more asserts.
2014-05-19 02:51:24 +00:00
rmind 39bd8dee77 Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw.  Welcome to 6.99.62!
2014-05-18 14:46:15 +00:00
pooka 4f6fb3bf35 Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before
the sysctl link sets are processed, and remove redundancy.

Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate
lines of code.
2014-02-25 18:30:08 +00:00
pooka acb676442c Allow kernels compiled with INET+INET6 to be booted as IPv4-only or IPv6-only. 2014-01-02 18:29:01 +00:00
christos 40114b997c PR/46602: Move the rfc6056 port randomization to the IP layer. 2012-06-22 14:54:34 +00:00
christos 5ec72efbaa Add inet6 part of the rfc6056 code contributed by Vlad Balan as part of
Google SoC-2011
2011-09-24 17:22:14 +00:00
dyoung c2e43be1c5 Reduces the resources demanded by TCP sessions in TIME_WAIT-state using
methods called Vestigial Time-Wait (VTW) and Maximum Segment Lifetime
Truncation (MSLT).

MSLT and VTW were contributed by Coyote Point Systems, Inc.

Even after a TCP session enters the TIME_WAIT state, its corresponding
socket and protocol control blocks (PCBs) stick around until the TCP
Maximum Segment Lifetime (MSL) expires.  On a host whose workload
necessarily creates and closes down many TCP sockets, the sockets & PCBs
for TCP sessions in TIME_WAIT state amount to many megabytes of dead
weight in RAM.

Maximum Segment Lifetimes Truncation (MSLT) assigns each TCP session to
a class based on the nearness of the peer.  Corresponding to each class
is an MSL, and a session uses the MSL of its class.  The classes are
loopback (local host equals remote host), local (local host and remote
host are on the same link/subnet), and remote (local host and remote
host communicate via one or more gateways).  Classes corresponding to
nearer peers have lower MSLs by default: 2 seconds for loopback, 10
seconds for local, 60 seconds for remote.  Loopback and local sessions
expire more quickly when MSLT is used.

Vestigial Time-Wait (VTW) replaces a TIME_WAIT session's PCB/socket
dead weight with a compact representation of the session, called a
"vestigial PCB".  VTW data structures are designed to be very fast and
memory-efficient: for fast insertion and lookup of vestigial PCBs,
the PCBs are stored in a hash table that is designed to minimize the
number of cacheline visits per lookup/insertion.  The memory both
for vestigial PCBs and for elements of the PCB hashtable come from
fixed-size pools, and linked data structures exploit this to conserve
memory by representing references with a narrow index/offset from the
start of a pool instead of a pointer.  When space for new vestigial PCBs
runs out, VTW makes room by discarding old vestigial PCBs, oldest first.
VTW cooperates with MSLT.

It may help to think of VTW as a "FIN cache" by analogy to the SYN
cache.

A 2.8-GHz Pentium 4 running a test workload that creates TIME_WAIT
sessions as fast as it can is approximately 17% idle when VTW is active
versus 0% idle when VTW is inactive.  It has 103 megabytes more free RAM
when VTW is active (approximately 64k vestigial PCBs are created) than
when it is inactive.
2011-05-03 18:28:44 +00:00
pooka 11281f01a0 Replace a large number of link set based sysctl node creations with
calls from subsystem constructors.  Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL
2009-09-16 15:23:04 +00:00
cegger c363a9cb62 bzero -> memset 2009-03-18 16:00:08 +00:00
thorpej b129a80c20 Simplify the interface to netstat_sysctl() and allocate space for
the collated counters using kmem_alloc().

PR kern/38577
2008-05-04 07:22:14 +00:00
yamt fb7535aecb udp6_init: fix a comment. 2008-04-28 15:01:39 +00:00
ad 15e29e981b Merge the socket locking patch:
- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
2008-04-24 11:38:36 +00:00
thorpej 33326077b1 Use <net/net_stats.h> / netstat_sysctl(). 2008-04-23 05:26:50 +00:00
thorpej c2da059bc6 Make udp6 stats per-cpu. 2008-04-15 04:43:25 +00:00
matt 58bb9f6508 Convert to ansi definitions from old-style definitons.
Remember that func() is not ansi, func(void) is.
2008-02-27 19:54:27 +00:00
dyoung c03a973fd0 KNF. Remove superfluous parentheses. In the switch-statement,
consolidate all of the 'error = EOPNOTSUPP;' cases.  No functional
change intended.
2007-11-14 22:58:27 +00:00
dyoung 2bc0ce1e9b Take a clue from udp_usrreq(): block IPL_SOFTNET in udp6_usrreq(),
both while we purge an interface, and while we call udp6_output().

XXX udp6_usrreq() needs more attention.
2007-11-06 23:47:08 +00:00
dyoung 122b86e247 De-__P(). 2007-11-01 20:33:56 +00:00
christos 53524e44ef Kill caddr_t; there will be some MI fallout, but it will be fixed shortly. 2007-03-04 05:59:00 +00:00
dyoung 5493f188c7 KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
   in6_src.c, avoid casts by changing several route_in6 pointers
   to struct route pointers.  Remove unnecessary casts to caddr_t
   elsewhere.

Pave the way for eliminating address family-specific route caches:
   soon, struct route will not embed a sockaddr, but it will hold
   a reference to an external sockaddr, instead.  We will set the
   destination sockaddr using rtcache_setdst().  (I created a stub
   for it, but it isn't used anywhere, yet.)  rtcache_free() will
   free the sockaddr.  I have extracted from rtcache_free() a helper
   subroutine, rtcache_clear().  rtcache_clear() will "forget" a
   cached route, but it will not forget the destination by releasing
   the sockaddr.  I use rtcache_clear() instead of rtcache_free()
   in rtcache_update(), because rtcache_update() is not supposed
   to forget the destination.

Constify:

   1 Introduce const accessor for route->ro_dst, rtcache_getdst().

   2 Constify the 'dst' argument to ifnet->if_output().  This
     led me to constify a lot of code called by output routines.

   3 Constify the sockaddr argument to protosw->pr_ctlinput.  This
     led me to constify a lot of code called by ctlinput routines.

   4 Introduce const macros for converting from a generic sockaddr
     to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
     satocsin, et cetera.
2007-02-17 22:34:07 +00:00
ad f474dceb13 Use the LWP cached credentials where sane. 2006-07-23 22:06:03 +00:00
rpaulo de8db47547 Add support for RFC 3542 Adv. Socket API for IPv6 (which obsoletes 2292).
* RFC 3542 isn't binary compatible with RFC 2292.
* RFC 2292 support is on by default but can be disabled.
* update ping6, telnet and traceroute6 to the new API.

From the KAME project (www.kame.net).
Reviewed by core.
2006-05-05 00:03:21 +00:00
rpaulo 78678b130a Better support of IPv6 scoped addresses.
- most of the kernel code will not care about the actual encoding of
  scope zone IDs and won't touch "s6_addr16[1]" directly.
- similarly, most of the kernel code will not care about link-local
  scoped addresses as a special case.
- scope boundary check will be stricter.  For example, the current
  *BSD code allows a packet with src=::1 and dst=(some global IPv6
  address) to be sent outside of the node, if the application do:
    s = socket(AF_INET6);
    bind(s, "::1");
    sendto(s, some_global_IPv6_addr);
  This is clearly wrong, since ::1 is only meaningful within a single
  node, but the current implementation of the *BSD kernel cannot
  reject this attempt.
- and, while there, don't try to remove the ff02::/32 interface route
  entry in in6_ifdetach() as it's already gone.

This also includes some level of support for the standard source
address selection algorithm defined in RFC3484, which will be
completed on in the future.

From the KAME project via JINMEI Tatuya.
Approved by core@.
2006-01-21 00:15:35 +00:00
christos 95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
dsl c24781af04 Pass the current process structure to in_pcbconnect() so that it can
pass it to in_pcbbind() so that can allocate a low numbered port
if setsockopt() has been used to set IP_PORTRANGE to IP_PORTRANGE_LOW.
While there, fail in_pcbconnect() if the in_pcbbind() fails - rather
than sending the request out from a port of zero.
This has been largely broken since the socket option was added in 1998.
2005-11-15 18:39:46 +00:00
rpaulo 151760f5d2 Implement net.inet6.udp6.stats.
Reviewed by Elad Efrat.
2005-08-28 21:01:02 +00:00
yamt f02551ec2d move {tcp,udp}_do_loopback_cksum back to tcp/udp
so that they can be referenced by ipv6.
2005-08-10 13:06:49 +00:00
christos 2ab31527e2 - avoid shadowed variables
- sprinkle const.
2005-05-29 21:43:51 +00:00
atatat 5b8a6c916d Revert the change that made kern.file2 and net.*.*.pcblist into nodes
instead of structs.  It had other deleterious side-effects that are
rather nasty.  Another solution must be found.
2005-03-11 06:16:15 +00:00
atatat ca63da437a Change types of kern.file2 and net.*.*.pcblist to NODE 2005-03-10 05:43:25 +00:00
atatat 7c62c74d09 Add the following nodes to the sysctl tree:
net.local.stream.pcblist
	net.local.dgram.pcblist
	net.inet.tcp.pcblist
	net.inet.udp.pcblist
	net.inet.raw.pcblist
	net.inet6.tcp6.pcblist
	net.inet6.udp6.pcblist
	net.inet6.raw6.pcblist

which allow retrieval of the pcbs in use for those protocols.  The
struct involved is 32/64 bit clean and incorporates parts of struct
inpcb, struct unpcb, a bit of struct tcpcb, and two socket addresses.
2005-03-09 05:07:19 +00:00
thorpej 7994b6f95e Don't perform checksums on loopback interfaces. They can be reenabled with
the net.inet.*.do_loopback_cksum sysctl.

Approved by: groo
2004-12-15 04:25:19 +00:00
atatat 4de3747b89 Sysctl descriptions under net subtree (net.key not done) 2004-05-25 04:33:59 +00:00