Commit Graph

1778 Commits

Author SHA1 Message Date
ad a2a3828545 machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h 2007-10-19 11:59:34 +00:00
dyoung 60149b1ce8 Work in progress: use a raw socket for GRE in IP encapsulation
instead of adding/subtracting our own IPv4 header.

There are many benefits:  gre(4) needn't grok the outer encapsulation
header any longer, so this simplifies the gre(4) code.  The IP
stack needn't grok GRE, so it is simplified, too.  gre(4) will
benefit from optimizations in the socket code.  Eventually, gre(4)
will gain an IPv6 encapsulation with very few new lines of code.

There is a small performance loss.  A 133 MHz, 486-class AMD Elan
sinks/sources a TCP stream over GRE with about 93% the throughput
of the old code.  TCP throughput on a 266 MHz, 586-class AMD Geode
is about 96% the throughput of the old code.  A 175-MHz ADM5120
(MIPS) only sinks a TCP stream over GRE at about 90% of the old
code; I am still investigating that.

I produced stripped-down versions of sosend() and soreceive() for
gre(4) to use.  They are guaranteed not to block, so they can be
called from a software interrupt and from a socket upcall,
respectively.

A kernel thread is no longer necessary for socket transmit/receive,
but I didn't get around to removing it, yet.

Thanks to Matt Thomas for suggesting the use of stripped-down socket
code and software interrupts, and to Andrew Doran for advice and
answers concerning software interrupts, threads, and performance.
2007-10-05 03:28:12 +00:00
dyoung d07b0a69f6 Delete the unused second argument to ip_stripoptions(), move it
closer to its single caller in if_eon.c, try to move fewer bytes
by moving the IP header forward instead of moving the tail of the
mbuf backward, and use m_adj(9) instead of fiddling directly with
mbuf data members.
2007-10-02 20:35:04 +00:00
dyoung 3cdf25631c Don't use INADDR_ANY to initialize a const struct, because INADDR_ANY
is not necessarily const.
2007-09-19 18:52:55 +00:00
dyoung 43390716bc Constify sockaddr argument to ether_multiaddr(). Change struct
ifreq * arguments to ether_addmulti() and ether_delmulti() to const
struct sockaddr *, since ether_{add,del}multi() only ever read the
sockaddr ifreq member, ifr_addr.  Update uses in carp(4) and in
vlan(4).
2007-09-19 05:25:33 +00:00
dyoung 4c9b6756a5 1) Introduce a new socket option, (SOL_SOCKET, SO_NOHEADER), that
tells a socket that it should both add a protocol header to tx'd
   datagrams and remove the header from rx'd datagrams:

        int onoff = 1, s = socket(...);
        setsockopt(s, SOL_SOCKET, SO_NOHEADER, &onoff);

2) Add an implementation of (SOL_SOCKET, SO_NOHEADER) for raw IPv4
   sockets.

3) Reorganize the protocols' pr_ctloutput implementations a bit.
   Consistently return ENOPROTOOPT when an option is unsupported,
   and EINVAL if a supported option's arguments are incorrect.
   Reorganize the flow of code so that it's more clear how/when
   options are passed down the stack until they are handled.

   Shorten some pr_ctloutput staircases for readability.

4) Extract common mbuf code into subroutines, add new sockaddr
   methods, and introduce a new subroutine, fsocreate(), for reuse
   later; use it first in sys_socket():

struct mbuf *m_getsombuf(struct socket *so)

        Create an mbuf and make its owner the socket `so'.

struct mbuf *m_intopt(struct socket *so, int val)

        Create an mbuf, make its owner the socket `so', put the
        int `val' into it, and set its length to sizeof(int).


int fsocreate(..., int *fd)

        Create a socket, a la socreate(9), put the socket into the
        given LWP's descriptor table, return the descriptor at `fd'
        on success.

void *sockaddr_addr(struct sockaddr *sa, socklen_t *slenp)
const void *sockaddr_const_addr(const struct sockaddr *sa, socklen_t *slenp)

        Extract a pointer to the address part of a sockaddr.  Write
        the length of the address  part at `slenp', if `slenp' is
        not NULL.

socklen_t sockaddr_getlen(const struct sockaddr *sa)

        Return the length of a sockaddr.  This just evaluates to
        sa->sa_len.  I only add this for consistency with code that
        appears in a portable userland library that I am going to
        import.

const struct sockaddr *sockaddr_any(const struct sockaddr *sa)

        Return the "don't care" sockaddr in the same family as
        `sa'.  This is the address a client should sobind(9) if it
        does not care the source address and, if applicable, the
        port et cetera that it uses.

const void *sockaddr_anyaddr(const struct sockaddr *sa, socklen_t *slenp)

        Return the "don't care" sockaddr in the same family as
        `sa'.  This is the address a client should sobind(9) if it
        does not care the source address and, if applicable, the
        port et cetera that it uses.
2007-09-19 04:33:42 +00:00
degroote 640e23d7c9 In some FAST_IPSEC, spl level is not restored correctly. Fix that.
Spotted by Wolfgang Stukenbrock in pr/36800
2007-09-11 14:18:09 +00:00
dyoung 99975917cd We cannot sleep in a software interrupt, so do not sockaddr_dl_alloc(...,
M_WAITOK).  Instead, sockaddr_dl_init() a sockaddr_dl on the stack.
2007-09-05 05:29:35 +00:00
dyoung 88399b6877 We cannot sleep in a software interrupt, so do not sockaddr_dl_alloc(...,
M_WAITOK).  Instead, sockaddr_dl_init() a sockaddr_dl on the stack.
2007-09-02 19:42:21 +00:00
dyoung db10b0d586 m_copym(..., 0, M_COPYALL, ...) -> m_copypacket(..., ...). 2007-09-02 07:18:55 +00:00
dyoung 6173a47677 m_copy() was deprecated, apparently, long ago. m_copy(...) ->
m_copym(..., M_DONTWAIT).
2007-09-02 03:12:23 +00:00
dyoung 0af5ef16d6 Be consistent: use the prefix sc_ for all members of the gre_softc. 2007-09-02 01:49:49 +00:00
dyoung 2fc102750d Use ifreq_setaddr(), ifreq_getaddr(), sockaddr_in_init(), and
sockaddr_copy().  Constify.  Compare pointers with NULL, not 0.
Don't "test truth" of pointers, but compare with NULL.
2007-09-01 04:32:50 +00:00
dyoung f06b9f6f72 Fix bug in last: add missing ampersand. 2007-08-31 23:40:08 +00:00
dyoung 353d6b2744 Stop sharing a sockaddr_in template among multicast routines,
because that's just going to cause problems down the road.  (Suppose
we can have two CPUs in the network stack someday?)  Instead, use
sockaddr_in_init() to initialize a sockaddr_in on the stack.

Use ifreq_setaddr() to initialize ifreq.ifr_addr.
2007-08-31 21:56:43 +00:00
dyoung b3fc296326 Use malloc(9) for sockaddrs instead of pool(9), and remove dom_sa_pool
and dom_sa_len members from struct domain.  Pools of fixed-size
objects are too rigid for sockaddr_dls, whose size can vary over
a wide range.

Return sockaddr_dl to its "historical" size.  Now that I'm using
malloc(9) instead of pool(9) to allocate sockaddr_dl, I can create
a sockaddr_dl of any size in the kernel, so expanding sockaddr_dl
is useless.

Avoid using sizeof(struct sockaddr_dl) in the kernel.

Introduce sockaddr_dl_alloc() for allocating & initializing an
arbitrary sockaddr_dl on the heap.

Add an argument, the sockaddr length, to sockaddr_alloc(),
sockaddr_copy(), and sockaddr_dl_setaddr().

Constify: LLADDR() -> CLLADDR().

Where the kernel overwrites LLADDR(), use sockaddr_dl_setaddr(),
instead.  Used properly, sockaddr_dl_setaddr() will not overrun
the end of the sockaddr.
2007-08-30 02:17:34 +00:00
cube 2eca33e853 Fix ipv4 multicast that could sometimes send packets with the wrong
Ethernet multicast address.

Reported by jmcneill@, fix discussed with dyoung@, _very_ light testing by
myself, some more money for my dealer of anxiolytics after reading
ip_output()'s twisted code maze.
2007-08-28 23:45:39 +00:00
dyoung 64bfe92e2b Cosmetic: 0 -> NULL. Remove unnecessary cast. 2007-08-27 05:39:44 +00:00
dyoung 7caec74f02 Reorganize and extract arplookup1() for code-sharing. Share
null_sdl.  Introduce arp_setgate() for initializing a link-layer
nexthop, and use it to fulfill RTM_SETGATE requests.
2007-08-27 01:13:09 +00:00
dyoung 5204966a96 Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.
2007-08-26 22:59:08 +00:00
dyoung 473d5fc042 Use sockaddr_in_init(). 2007-08-21 08:34:33 +00:00
dyoung bd98464c6f Don't call rtcache_check() from the fast-forward code, which runs
at IPL_NET, because rtcache_check() may read the forwarding table.
Elsewhere, the kernel only blocks interrupts at priority IPL_SOFTNET
and below while it modifies the forwarding table, so rtcache_check()
could be reading the table in an inconsistent state.  Use
rtcache_done(), instead.

XXX netinet/ip_flow.c and netinet6/ip6_flow.c are virtually identical.
XXX They should share code.
2007-08-20 19:42:34 +00:00
dyoung b40a86e49c Use sockaddr_dl_init(). 2007-08-10 22:46:16 +00:00
dyoung 0640a03023 Use satocsdl() et cetera instead of SDL(). Constify. 2007-08-07 04:37:04 +00:00
yamt 7431e54c17 make rfbuf_ts a tcp timestamp so that calculations in tcp_input make sense. 2007-08-02 13:12:35 +00:00
yamt e74ee454c1 our tcp timestamps are in PR_SLOWHZ, not HZ. 2007-08-02 13:06:30 +00:00
rmind 4175f8693b TCP socket buffers automatic sizing - ported from FreeBSD.
http://mail-index.netbsd.org/tech-net/2007/02/04/0006.html

! Disabled by default, marked as experimental. Testers are very needed.
! Someone should thoroughly test this, and improve if possible.

Discussed on <tech-net>:
http://mail-index.netbsd.org/tech-net/2007/07/12/0002.html
Thanks Greg Troxel for comments.

OK by the long silence on <tech-net>.
2007-08-02 02:42:40 +00:00
dyoung 08e6f22226 Take steps to hide the radix_node implementation of the forwarding table
from the forwarding table's users:

        Introduce rt_walktree() for walking the routing table and
        applying a function to each rtentry.  Replace most
        rn_walktree() calls with it.

        Use rt_getkey()/rt_setkey() to get/set a route's destination.
        Keep a pointer to the sockaddr key in the rtentry, so that
        rtentry users do not have to grovel in the radix_node for
        the key.

        Add a RTM_GET method to rtrequest.  Use that instead of
        radix_node lookups in, e.g., carp(4).

Add sys/net/link_proto.c, which supplies sockaddr routines for
link-layer socket addresses (sockaddr_dl).

Cosmetic:

        Constify.  KNF.  Stop open-coding LIST_FOREACH, TAILQ_FOREACH,
        et cetera.  Use NULL instead of 0 for null pointers.  Use
        __arraycount().  Reduce gratuitous parenthesization.

        Stop using variadic arguments for rip6_output(), it is
        unnecessary.

        Remove the unnecessary rtentry member rt_genmask and the
        code to maintain it, since nothing actually used it.

        Make rt_maskedcopy() easier to read by using meaningful variable
        names.

        Extract a subroutine intern_netmask() for looking up a netmask in
        the masks table.

        Start converting backslash-ridden IPv6 macros in
        sys/netinet6/in6_var.h into inline subroutines that one
        can read without special eyeglasses.

One functional change: when the kernel serves an RTM_GET, RTM_LOCK,
or RTM_CHANGE request, it applies the netmask (if supplied) to a
destination before searching for it in the forwarding table.

I have changed sys/netinet/ip_carp.c, carp_setroute(), to remove
the unlawful radix_node knowledge.

Apart from the changes to carp(4), netiso, ATM, and strip(4), I
have run the changes on three nodes in my wireless routing testbed,
which involves IPv4 + IPv6 dynamic routing acrobatics, and it's
working beautifully so far.
2007-07-19 20:48:52 +00:00
xtraeme 48e23b4a25 Replace a simple lock with a mutex and make it static. 2007-07-11 21:34:16 +00:00
ad 88ab7da936 Merge some of the less invasive changes from the vmlocking branch:
- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
2007-07-09 20:51:58 +00:00
christos d1ffad0af7 Handle mapped and scoped ipv6 addresses. From Anon Ymous. 2007-06-28 21:11:12 +00:00
degroote 4ddfe916ff Add support for options IPSEC_NAT_T (RFC 3947 and 3948) for fast_ipsec(4).
No objection on tech-net@
2007-06-27 20:38:32 +00:00
xtraeme c18664aad7 Protect inet6_ident_core() with #ifdef INET6, fixes building without
options INET6.
2007-06-26 09:19:36 +00:00
christos 0a36551606 tcpdrop kernel bits (from anon ymous) 2007-06-25 23:35:12 +00:00
christos eeff189533 - per socket keepalive settings
- settable connection establishment timeout
2007-06-20 15:29:17 +00:00
christos fc506e028c PR/36484: Pavlin Radoslavov: PIM Register in-kernel encapsulation IP_DF
setting is incorrect
2007-06-13 23:09:59 +00:00
dyoung ae302fd15c Use __arraycount(). 2007-06-13 21:08:29 +00:00
dyoung 46bc24d79c Use LIST_FOREACH(). 2007-06-13 04:55:25 +00:00
dyoung 779264d3a2 Complete removal of radix_node knowledge. 2007-06-12 22:55:44 +00:00
dyoung 95edb940c2 Get rid of radix_node_head.rnh_walktree, because it is only ever
set to rn_walktree.

Introduce rt_walktree(), which applies a subroutine to every route
in a particular address family.  Use it instead of rn_walktree()
virtually everywhere.  This helps to hide the routing table
implementation.
2007-06-09 03:07:21 +00:00
riz 711b142f07 Fix compilation in the TCP_SIGNATURE case:
- don't use void * for pointer arithmetic
	- don't try to modify const parameters

A kernel with 'options TCP_SIGNATURE' works as well as it ever did, now.
(ie, clunky, but passable)
2007-05-18 21:48:43 +00:00
riz 89c9ca415d Revert a small part of revision 1.254 - remove const qualifier from
the struct tcphdr * argument of tcp_dooptions().  RFC2385 support
(options TCP_SIGNATURE) needs to modify the header during options
processing, and this revision broke it.

OK yamt@.
2007-05-18 21:31:16 +00:00
dyoung c7cb104b6b KNF. Use sockaddr_in_init(). Shorten staircases. No functional
changes intended.
2007-05-12 02:10:25 +00:00
dyoung 9552c98d25 Use sockaddr_in_init(). 2007-05-12 02:03:15 +00:00
dyoung e1d4e2922e In AppleTalk, IPv4, and IPv6 routing domains, help sockaddr_cmp()
avoid an indirect function call by comparing the family, length,
and bytes [dom->dom_sa_cmpofs, dom->dom_sa_cmpofs + dom->dom_sa_cmplen),
corresponding to the the sockaddrs' "address" members.

For ISO, actually use sockaddr_iso_cmp, for a change.  Thanks to
yamt@ for pointing out my error.
2007-05-06 02:56:37 +00:00
dyoung fc86b519e8 Oops, commit this straggler from the last change to net/if_gre.[ch]. 2007-05-06 02:48:38 +00:00
dyoung 8b646d9bb9 Remove obsolete files netinet/in_route.[ch]. 2007-05-02 22:39:03 +00:00
dyoung 38175939e7 Remove unused option. 2007-05-02 20:43:47 +00:00
dyoung 72f0a6dfb0 Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing.  Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously.  Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs.  I have
  introduced routines for allocating, copying, and duplicating,
  and freeing sockaddrs:

        struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
        struct sockaddr *sockaddr_copy(struct sockaddr *dst,
                                       const struct sockaddr *src);
        struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
        void sockaddr_free(struct sockaddr *sa);

  sockaddr_alloc() returns either a sockaddr from the pool belonging
  to the specified family, or NULL if the pool is exhausted.  The
  returned sockaddr has the right size for that family; sa_family
  and sa_len fields are initialized to the family and sockaddr
  length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
  sockaddr_in).  sockaddr_free() puts the given sockaddr back into
  its family's pool.

  sockaddr_dup() and sockaddr_copy() work analogously to strdup()
  and strcpy(), respectively.  sockaddr_copy() KASSERTs that the
  family of the destination and source sockaddrs are alike.

  The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
  passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
  family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
  etc.  They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more.  All protocol families
  use struct route.  I have changed the route cache, 'struct route',
  so that it does not contain storage space for a sockaddr.  Instead,
  struct route points to a sockaddr coming from the pool the sockaddr
  belongs to.  I added a new method to struct route, rtcache_setdst(),
  for setting the cache destination:

        int rtcache_setdst(struct route *, const struct sockaddr *);

  rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
  available to create the sockaddr storage.

  It is now possible for rtcache_getdst() to return NULL if, say,
  rtcache_setdst() failed.  I check the return value for NULL
  everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
  caches, dom_rtcache.  rtflushall(sa_family_t af) looks up the
  domain indicated by 'af', walks the domain's list of route caches
  and invalidates each one.
2007-05-02 20:40:22 +00:00
dyoung d43d3ae5b8 Get rid of some gratuitous casts and join some lines. 2007-04-25 00:11:18 +00:00