1888 Commits

Author SHA1 Message Date
dyoung 73b0c685df Use IFADDR_FOREACH(). 2007-12-04 10:31:14 +00:00
dyoung 79d53b3100 Move IN_NEED_CHECKSUM() to in_offload.h for re-use. 2007-11-28 04:14:11 +00:00
christos a9c710744b require that the options argument is the right size, not that it is greater
than or equal to the requested size. Suggested by Matt Thomas.
2007-11-27 22:45:29 +00:00
yamt 8ed07fbf78 inetctlerrmap: use designated initializer. 2007-11-26 08:40:46 +00:00
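For readers unfamiliar with the idiom, here is a minimal sketch of a control-code-to-errno table built with C99 designated initializers. The enum `xprc`, the table `xctlerrmap`, and the helper `xctlerr()` are hypothetical names invented for illustration; the kernel's real PRC_* codes and inetctlerrmap contents are not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical control-input codes; the kernel's PRC_* values differ. */
enum xprc {
    XPRC_IFDOWN,
    XPRC_MSGSIZE,
    XPRC_HOSTDEAD,
    XPRC_UNREACH_NET,
    XPRC_UNREACH_PORT,
    XPRC_NCMDS
};

/*
 * Each entry names the index it initializes, so reordering the enum
 * cannot silently shift the table.  Indices that are not listed
 * (here XPRC_IFDOWN) are implicitly zero.
 */
static const int xctlerrmap[XPRC_NCMDS] = {
    [XPRC_MSGSIZE]      = EMSGSIZE,
    [XPRC_HOSTDEAD]     = EHOSTDOWN,
    [XPRC_UNREACH_NET]  = EHOSTUNREACH,
    [XPRC_UNREACH_PORT] = ECONNREFUSED,
};

/* Look up the errno for a control code; 0 means "no error to report". */
static int
xctlerr(enum xprc cmd)
{
    assert((int)cmd >= 0 && (int)cmd < XPRC_NCMDS);
    return xctlerrmap[cmd];
}
```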
cube cb1f63b2dc Follow up on arc -> arcnet renaming. Pointed out by joerg@. 2007-11-14 01:11:14 +00:00
dyoung 94b72f0f97 Change macros SYN_CACHE_PUT() and SYN_CACHE_RM() into inline
subroutines syn_cache_put() and syn_cache_rm().
2007-11-09 23:55:58 +00:00
dyoung 9250821580 KNF. Remove superfluous casts and parentheses. 2007-11-09 23:53:13 +00:00
dyoung e54fbb261f Use sockaddr_in_init(). KNF. No functional change intended. 2007-11-09 23:42:56 +00:00
kefren 9536f25523 Don't MCLAIM in ipintr() because we do it anyway in ip_input() 2007-11-09 06:59:33 +00:00
rmind d63e75f696 Pick the smallest possible TCP window scaling factor that will still allow
us to scale up to sb_max.  This might fix the problems with some firewalls.

Taken from FreeBSD (silby).
OK by <dyoung>.
2007-11-04 11:04:26 +00:00
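The scale selection described in the commit above can be sketched as follows. This is an illustration only, not the committed code: `pick_window_scale()` is a hypothetical name, and TCP_MAXWIN and TCP_MAX_WINSHIFT are defined locally to mirror the usual BSD values (65535 and 14).

```c
#include <assert.h>
#include <stdint.h>

#define TCP_MAXWIN        65535u  /* largest unscaled window (16-bit field) */
#define TCP_MAX_WINSHIFT  14      /* largest window-scale shift TCP allows */

/*
 * Return the smallest shift such that the 16-bit window field, scaled
 * by that shift, can still advertise `sb_max' bytes.  A small shift
 * keeps window granularity fine, which is the point of the change.
 */
static unsigned
pick_window_scale(uint32_t sb_max)
{
    unsigned scale = 0;

    while (scale < TCP_MAX_WINSHIFT &&
        ((uint32_t)TCP_MAXWIN << scale) < sb_max)
        scale++;
    assert(scale <= TCP_MAX_WINSHIFT);
    return scale;
}
```

With a 256 KB sb_max, for example, shifts 0 through 2 top out below the limit, so the sketch settles on 3 rather than jumping to the maximum.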
ad a2a3828545 machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h 2007-10-19 11:59:34 +00:00
dyoung 60149b1ce8 Work in progress: use a raw socket for GRE in IP encapsulation
instead of adding/subtracting our own IPv4 header.

There are many benefits:  gre(4) needn't grok the outer encapsulation
header any longer, so this simplifies the gre(4) code.  The IP
stack needn't grok GRE, so it is simplified, too.  gre(4) will
benefit from optimizations in the socket code.  Eventually, gre(4)
will gain an IPv6 encapsulation with very few new lines of code.

There is a small performance loss.  A 133 MHz, 486-class AMD Elan
sinks/sources a TCP stream over GRE with about 93% the throughput
of the old code.  TCP throughput on a 266 MHz, 586-class AMD Geode
is about 96% the throughput of the old code.  A 175-MHz ADM5120
(MIPS) only sinks a TCP stream over GRE at about 90% of the old
code; I am still investigating that.

I produced stripped-down versions of sosend() and soreceive() for
gre(4) to use.  They are guaranteed not to block, so they can be
called from a software interrupt and from a socket upcall,
respectively.

A kernel thread is no longer necessary for socket transmit/receive,
but I didn't get around to removing it, yet.

Thanks to Matt Thomas for suggesting the use of stripped-down socket
code and software interrupts, and to Andrew Doran for advice and
answers concerning software interrupts, threads, and performance.
2007-10-05 03:28:12 +00:00
dyoung d07b0a69f6 Delete the unused second argument to ip_stripoptions(), move it
closer to its single caller in if_eon.c, try to move fewer bytes
by moving the IP header forward instead of moving the tail of the
mbuf backward, and use m_adj(9) instead of fiddling directly with
mbuf data members.
2007-10-02 20:35:04 +00:00
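The "move the header forward" idea in the commit above can be sketched on a flat byte buffer. This is a userland illustration under stated assumptions, not the kernel routine: `strip_ip_options()` is a hypothetical name, the packet is contiguous rather than an mbuf chain, advancing the returned pointer stands in for m_adj(9), and the header checksum is deliberately left untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IP_BASE_HLEN 20  /* IPv4 header length without options */

/*
 * Strip IPv4 options from the packet at `pkt' (length *lenp).
 * Rather than sliding the whole payload back over the options, slide
 * the 20-byte base header forward so that it abuts the payload; at
 * most 20 bytes move, however large the payload is.
 * Returns the new start of the packet and updates *lenp.
 * The header checksum is NOT recomputed in this sketch.
 */
static uint8_t *
strip_ip_options(uint8_t *pkt, size_t *lenp)
{
    size_t hlen = (size_t)(pkt[0] & 0x0f) << 2;  /* IHL in bytes */
    size_t optlen;
    uint16_t totlen;

    assert(hlen >= IP_BASE_HLEN && hlen <= *lenp);
    optlen = hlen - IP_BASE_HLEN;
    if (optlen == 0)
        return pkt;

    memmove(pkt + optlen, pkt, IP_BASE_HLEN);
    pkt += optlen;       /* the kernel would trim the front with m_adj(9) */
    *lenp -= optlen;

    /* Fix up the header: IHL becomes 5 words, total length shrinks. */
    pkt[0] = (uint8_t)((pkt[0] & 0xf0) | (IP_BASE_HLEN >> 2));
    totlen = (uint16_t)((pkt[2] << 8) | pkt[3]);
    totlen = (uint16_t)(totlen - optlen);
    pkt[2] = (uint8_t)(totlen >> 8);
    pkt[3] = (uint8_t)(totlen & 0xff);
    return pkt;
}
```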
dyoung 3cdf25631c Don't use INADDR_ANY to initialize a const struct, because INADDR_ANY
is not necessarily const.
2007-09-19 18:52:55 +00:00
dyoung 43390716bc Constify sockaddr argument to ether_multiaddr(). Change struct
ifreq * arguments to ether_addmulti() and ether_delmulti() to const
struct sockaddr *, since ether_{add,del}multi() only ever read the
sockaddr ifreq member, ifr_addr.  Update uses in carp(4) and in
vlan(4).
2007-09-19 05:25:33 +00:00
dyoung 4c9b6756a5 1) Introduce a new socket option, (SOL_SOCKET, SO_NOHEADER), that
tells a socket that it should both add a protocol header to tx'd
   datagrams and remove the header from rx'd datagrams:

        int onoff = 1, s = socket(...);
        setsockopt(s, SOL_SOCKET, SO_NOHEADER, &onoff, sizeof(onoff));

2) Add an implementation of (SOL_SOCKET, SO_NOHEADER) for raw IPv4
   sockets.

3) Reorganize the protocols' pr_ctloutput implementations a bit.
   Consistently return ENOPROTOOPT when an option is unsupported,
   and EINVAL if a supported option's arguments are incorrect.
   Reorganize the flow of code so that it's more clear how/when
   options are passed down the stack until they are handled.

   Shorten some pr_ctloutput staircases for readability.

4) Extract common mbuf code into subroutines, add new sockaddr
   methods, and introduce a new subroutine, fsocreate(), for reuse
   later; use it first in sys_socket():

struct mbuf *m_getsombuf(struct socket *so)

        Create an mbuf and make its owner the socket `so'.

struct mbuf *m_intopt(struct socket *so, int val)

        Create an mbuf, make its owner the socket `so', put the
        int `val' into it, and set its length to sizeof(int).


int fsocreate(..., int *fd)

        Create a socket, a la socreate(9), put the socket into the
        given LWP's descriptor table, return the descriptor at `fd'
        on success.

void *sockaddr_addr(struct sockaddr *sa, socklen_t *slenp)
const void *sockaddr_const_addr(const struct sockaddr *sa, socklen_t *slenp)

        Extract a pointer to the address part of a sockaddr.  Write
        the length of the address part at `slenp', if `slenp' is
        not NULL.

socklen_t sockaddr_getlen(const struct sockaddr *sa)

        Return the length of a sockaddr.  This just evaluates to
        sa->sa_len.  I only add this for consistency with code that
        appears in a portable userland library that I am going to
        import.

const struct sockaddr *sockaddr_any(const struct sockaddr *sa)

        Return the "don't care" sockaddr in the same family as
        `sa'.  This is the address a client should sobind(9) if it
        does not care about the source address and, if applicable, the
        port et cetera that it uses.

const void *sockaddr_anyaddr(const struct sockaddr *sa, socklen_t *slenp)

        Return the "don't care" sockaddr in the same family as
        `sa'.  This is the address a client should sobind(9) if it
        does not care about the source address and, if applicable, the
        port et cetera that it uses.
2007-09-19 04:33:42 +00:00
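To make the sockaddr_const_addr() description above concrete, here is a userland approximation. It is a sketch under assumptions: `sa_const_addr()` is a hypothetical name, only AF_INET and AF_INET6 are handled, and the family is dispatched with a switch, which is not necessarily how the kernel routine is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Return a pointer to the address bytes inside `sa' and, if `slenp'
 * is not NULL, store the length of the address part there.
 * Unknown families yield NULL and leave *slenp untouched.
 */
static const void *
sa_const_addr(const struct sockaddr *sa, socklen_t *slenp)
{
    assert(sa != NULL);
    switch (sa->sa_family) {
    case AF_INET: {
        const struct sockaddr_in *sin = (const struct sockaddr_in *)sa;

        if (slenp != NULL)
            *slenp = sizeof(sin->sin_addr);
        return &sin->sin_addr;
    }
    case AF_INET6: {
        const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)sa;

        if (slenp != NULL)
            *slenp = sizeof(sin6->sin6_addr);
        return &sin6->sin6_addr;
    }
    default:
        return NULL;
    }
}
```

A caller that only needs the pointer can pass NULL for `slenp', matching the "if `slenp' is not NULL" wording in the commit message.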
degroote 640e23d7c9 In some FAST_IPSEC, spl level is not restored correctly. Fix that.
Spotted by Wolfgang Stukenbrock in pr/36800
2007-09-11 14:18:09 +00:00
dyoung 99975917cd We cannot sleep in a software interrupt, so do not sockaddr_dl_alloc(...,
M_WAITOK).  Instead, sockaddr_dl_init() a sockaddr_dl on the stack.
2007-09-05 05:29:35 +00:00
dyoung 88399b6877 We cannot sleep in a software interrupt, so do not sockaddr_dl_alloc(...,
M_WAITOK).  Instead, sockaddr_dl_init() a sockaddr_dl on the stack.
2007-09-02 19:42:21 +00:00
dyoung db10b0d586 m_copym(..., 0, M_COPYALL, ...) -> m_copypacket(..., ...). 2007-09-02 07:18:55 +00:00
dyoung 6173a47677 m_copy() was deprecated, apparently, long ago. m_copy(...) ->
m_copym(..., M_DONTWAIT).
2007-09-02 03:12:23 +00:00
dyoung 0af5ef16d6 Be consistent: use the prefix sc_ for all members of the gre_softc. 2007-09-02 01:49:49 +00:00
dyoung 2fc102750d Use ifreq_setaddr(), ifreq_getaddr(), sockaddr_in_init(), and
sockaddr_copy().  Constify.  Compare pointers with NULL, not 0.
Don't "test truth" of pointers, but compare with NULL.
2007-09-01 04:32:50 +00:00
dyoung f06b9f6f72 Fix bug in last: add missing ampersand. 2007-08-31 23:40:08 +00:00
dyoung 353d6b2744 Stop sharing a sockaddr_in template among multicast routines,
because that's just going to cause problems down the road.  (Suppose
we can have two CPUs in the network stack someday?)  Instead, use
sockaddr_in_init() to initialize a sockaddr_in on the stack.

Use ifreq_setaddr() to initialize ifreq.ifr_addr.
2007-08-31 21:56:43 +00:00
dyoung b3fc296326 Use malloc(9) for sockaddrs instead of pool(9), and remove dom_sa_pool
and dom_sa_len members from struct domain.  Pools of fixed-size
objects are too rigid for sockaddr_dls, whose size can vary over
a wide range.

Return sockaddr_dl to its "historical" size.  Now that I'm using
malloc(9) instead of pool(9) to allocate sockaddr_dl, I can create
a sockaddr_dl of any size in the kernel, so expanding sockaddr_dl
is useless.

Avoid using sizeof(struct sockaddr_dl) in the kernel.

Introduce sockaddr_dl_alloc() for allocating & initializing an
arbitrary sockaddr_dl on the heap.

Add an argument, the sockaddr length, to sockaddr_alloc(),
sockaddr_copy(), and sockaddr_dl_setaddr().

Constify: LLADDR() -> CLLADDR().

Where the kernel overwrites LLADDR(), use sockaddr_dl_setaddr(),
instead.  Used properly, sockaddr_dl_setaddr() will not overrun
the end of the sockaddr.
2007-08-30 02:17:34 +00:00
cube 2eca33e853 Fix ipv4 multicast that could sometimes send packets with the wrong
Ethernet multicast address.

Reported by jmcneill@, fix discussed with dyoung@, _very_ light testing by
myself, some more money for my dealer of anxiolytics after reading
ip_output()'s twisted code maze.
2007-08-28 23:45:39 +00:00
dyoung 64bfe92e2b Cosmetic: 0 -> NULL. Remove unnecessary cast. 2007-08-27 05:39:44 +00:00
dyoung 7caec74f02 Reorganize and extract arplookup1() for code-sharing. Share
null_sdl.  Introduce arp_setgate() for initializing a link-layer
nexthop, and use it to fulfill RTM_SETGATE requests.
2007-08-27 01:13:09 +00:00
dyoung 5204966a96 Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.
2007-08-26 22:59:08 +00:00
dyoung 473d5fc042 Use sockaddr_in_init(). 2007-08-21 08:34:33 +00:00
dyoung bd98464c6f Don't call rtcache_check() from the fast-forward code, which runs
at IPL_NET, because rtcache_check() may read the forwarding table.
Elsewhere, the kernel only blocks interrupts at priority IPL_SOFTNET
and below while it modifies the forwarding table, so rtcache_check()
could be reading the table in an inconsistent state.  Use
rtcache_done(), instead.

XXX netinet/ip_flow.c and netinet6/ip6_flow.c are virtually identical.
XXX They should share code.
2007-08-20 19:42:34 +00:00
dyoung b40a86e49c Use sockaddr_dl_init(). 2007-08-10 22:46:16 +00:00
dyoung 0640a03023 Use satocsdl() et cetera instead of SDL(). Constify. 2007-08-07 04:37:04 +00:00
yamt 7431e54c17 make rfbuf_ts a tcp timestamp so that calculations in tcp_input make sense. 2007-08-02 13:12:35 +00:00
yamt e74ee454c1 our tcp timestamps are in PR_SLOWHZ, not HZ. 2007-08-02 13:06:30 +00:00
rmind 4175f8693b TCP socket buffers automatic sizing - ported from FreeBSD.
http://mail-index.netbsd.org/tech-net/2007/02/04/0006.html

! Disabled by default, marked as experimental. Testers are very needed.
! Someone should thoroughly test this, and improve if possible.

Discussed on <tech-net>:
http://mail-index.netbsd.org/tech-net/2007/07/12/0002.html
Thanks Greg Troxel for comments.

OK by the long silence on <tech-net>.
2007-08-02 02:42:40 +00:00
dyoung 08e6f22226 Take steps to hide the radix_node implementation of the forwarding table
from the forwarding table's users:

        Introduce rt_walktree() for walking the routing table and
        applying a function to each rtentry.  Replace most
        rn_walktree() calls with it.

        Use rt_getkey()/rt_setkey() to get/set a route's destination.
        Keep a pointer to the sockaddr key in the rtentry, so that
        rtentry users do not have to grovel in the radix_node for
        the key.

        Add a RTM_GET method to rtrequest.  Use that instead of
        radix_node lookups in, e.g., carp(4).

Add sys/net/link_proto.c, which supplies sockaddr routines for
link-layer socket addresses (sockaddr_dl).

Cosmetic:

        Constify.  KNF.  Stop open-coding LIST_FOREACH, TAILQ_FOREACH,
        et cetera.  Use NULL instead of 0 for null pointers.  Use
        __arraycount().  Reduce gratuitous parenthesization.

        Stop using variadic arguments for rip6_output(), it is
        unnecessary.

        Remove the unnecessary rtentry member rt_genmask and the
        code to maintain it, since nothing actually used it.

        Make rt_maskedcopy() easier to read by using meaningful variable
        names.

        Extract a subroutine intern_netmask() for looking up a netmask in
        the masks table.

        Start converting backslash-ridden IPv6 macros in
        sys/netinet6/in6_var.h into inline subroutines that one
        can read without special eyeglasses.

One functional change: when the kernel serves an RTM_GET, RTM_LOCK,
or RTM_CHANGE request, it applies the netmask (if supplied) to a
destination before searching for it in the forwarding table.

I have changed sys/netinet/ip_carp.c, carp_setroute(), to remove
the unlawful radix_node knowledge.

Apart from the changes to carp(4), netiso, ATM, and strip(4), I
have run the changes on three nodes in my wireless routing testbed,
which involves IPv4 + IPv6 dynamic routing acrobatics, and it's
working beautifully so far.
2007-07-19 20:48:52 +00:00
xtraeme 48e23b4a25 Replace a simple lock with a mutex and make it static. 2007-07-11 21:34:16 +00:00
ad 88ab7da936 Merge some of the less invasive changes from the vmlocking branch:
- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
2007-07-09 20:51:58 +00:00
christos d1ffad0af7 Handle mapped and scoped ipv6 addresses. From Anon Ymous. 2007-06-28 21:11:12 +00:00
degroote 4ddfe916ff Add support for options IPSEC_NAT_T (RFC 3947 and 3948) for fast_ipsec(4).
No objection on tech-net@
2007-06-27 20:38:32 +00:00
xtraeme c18664aad7 Protect inet6_ident_core() with #ifdef INET6, fixes building without
options INET6.
2007-06-26 09:19:36 +00:00
christos 0a36551606 tcpdrop kernel bits (from anon ymous) 2007-06-25 23:35:12 +00:00
christos eeff189533 - per socket keepalive settings
- settable connection establishment timeout
2007-06-20 15:29:17 +00:00
christos fc506e028c PR/36484: Pavlin Radoslavov: PIM Register in-kernel encapsulation IP_DF
setting is incorrect
2007-06-13 23:09:59 +00:00
dyoung ae302fd15c Use __arraycount(). 2007-06-13 21:08:29 +00:00
dyoung 46bc24d79c Use LIST_FOREACH(). 2007-06-13 04:55:25 +00:00
dyoung 779264d3a2 Complete removal of radix_node knowledge. 2007-06-12 22:55:44 +00:00
dyoung 95edb940c2 Get rid of radix_node_head.rnh_walktree, because it is only ever
set to rn_walktree.

Introduce rt_walktree(), which applies a subroutine to every route
in a particular address family.  Use it instead of rn_walktree()
virtually everywhere.  This helps to hide the routing table
implementation.
2007-06-09 03:07:21 +00:00
riz 711b142f07 Fix compilation in the TCP_SIGNATURE case:
- don't use void * for pointer arithmetic
- don't try to modify const parameters

A kernel with 'options TCP_SIGNATURE' works as well as it ever did, now.
(i.e., clunky, but passable)
2007-05-18 21:48:43 +00:00
riz 89c9ca415d Revert a small part of revision 1.254 - remove const qualifier from
the struct tcphdr * argument of tcp_dooptions().  RFC2385 support
(options TCP_SIGNATURE) needs to modify the header during options
processing, and this revision broke it.

OK yamt@.
2007-05-18 21:31:16 +00:00
dyoung c7cb104b6b KNF. Use sockaddr_in_init(). Shorten staircases. No functional
changes intended.
2007-05-12 02:10:25 +00:00
dyoung 9552c98d25 Use sockaddr_in_init(). 2007-05-12 02:03:15 +00:00
dyoung e1d4e2922e In AppleTalk, IPv4, and IPv6 routing domains, help sockaddr_cmp()
avoid an indirect function call by comparing the family, length,
and bytes [dom->dom_sa_cmpofs, dom->dom_sa_cmpofs + dom->dom_sa_cmplen),
corresponding to the sockaddrs' "address" members.

For ISO, actually use sockaddr_iso_cmp, for a change.  Thanks to
yamt@ for pointing out my error.
2007-05-06 02:56:37 +00:00
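The comparison described in the commit above (family, length, then the byte range [dom_sa_cmpofs, dom_sa_cmpofs + dom_sa_cmplen)) can be sketched like this. The types `xsockaddr` and `cmp_domain` and the function `xsockaddr_cmp()` are hypothetical stand-ins, and the order of the three checks is an assumption taken from the wording of the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical BSD-style sockaddr: length byte, family byte, payload. */
struct xsockaddr {
    uint8_t sa_len;
    uint8_t sa_family;
    uint8_t sa_data[14];
};

/* Hypothetical per-domain description of where the address bytes live. */
struct cmp_domain {
    size_t dom_sa_cmpofs;  /* offset of the address within the sockaddr */
    size_t dom_sa_cmplen;  /* number of address bytes to compare */
};

/*
 * Compare two sockaddrs without an indirect call: family first, then
 * length, then only the address bytes.  Fields outside the byte range,
 * such as a port number, do not take part in the comparison.
 */
static int
xsockaddr_cmp(const struct cmp_domain *dom, const struct xsockaddr *a,
    const struct xsockaddr *b)
{
    assert(dom->dom_sa_cmpofs + dom->dom_sa_cmplen <= sizeof(*a));
    if (a->sa_family != b->sa_family)
        return (int)a->sa_family - (int)b->sa_family;
    if (a->sa_len != b->sa_len)
        return (int)a->sa_len - (int)b->sa_len;
    return memcmp((const uint8_t *)a + dom->dom_sa_cmpofs,
        (const uint8_t *)b + dom->dom_sa_cmpofs, dom->dom_sa_cmplen);
}
```

For an IPv4-shaped sockaddr the domain would use offset 4 and length 4, which skips the length, family, and port bytes.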
dyoung fc86b519e8 Oops, commit this straggler from the last change to net/if_gre.[ch]. 2007-05-06 02:48:38 +00:00
dyoung 8b646d9bb9 Remove obsolete files netinet/in_route.[ch]. 2007-05-02 22:39:03 +00:00
dyoung 38175939e7 Remove unused option. 2007-05-02 20:43:47 +00:00
dyoung 72f0a6dfb0 Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principal benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing.  Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously.  Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs.  I have
  introduced routines for allocating, copying, duplicating, and
  freeing sockaddrs:

        struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
        struct sockaddr *sockaddr_copy(struct sockaddr *dst,
                                       const struct sockaddr *src);
        struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
        void sockaddr_free(struct sockaddr *sa);

  sockaddr_alloc() returns either a sockaddr from the pool belonging
  to the specified family, or NULL if the pool is exhausted.  The
  returned sockaddr has the right size for that family; sa_family
  and sa_len fields are initialized to the family and sockaddr
  length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
  sockaddr_in).  sockaddr_free() puts the given sockaddr back into
  its family's pool.

  sockaddr_dup() and sockaddr_copy() work analogously to strdup()
  and strcpy(), respectively.  sockaddr_copy() KASSERTs that the
  family of the destination and source sockaddrs are alike.

  The 'flags' argument for sockaddr_alloc() and sockaddr_dup() is
  passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
  family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
  etc.  They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more.  All protocol families
  use struct route.  I have changed the route cache, 'struct route',
  so that it does not contain storage space for a sockaddr.  Instead,
  struct route points to a sockaddr coming from the pool the sockaddr
  belongs to.  I added a new method to struct route, rtcache_setdst(),
  for setting the cache destination:

        int rtcache_setdst(struct route *, const struct sockaddr *);

  rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
  available to create the sockaddr storage.

  It is now possible for rtcache_getdst() to return NULL if, say,
  rtcache_setdst() failed.  I check the return value for NULL
  everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
  caches, dom_rtcache.  rtflushall(sa_family_t af) looks up the
  domain indicated by 'af', walks the domain's list of route caches
  and invalidates each one.
2007-05-02 20:40:22 +00:00
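Item 4 of the commit above (a per-domain list of live route caches that rtflushall() walks and invalidates) can be sketched with a plain singly linked list. Everything here is a simplified, hypothetical stand-in: `rc_route`, `rc_domain`, `rc_cache()`, and `rc_flushall()` are illustration names, the cached route is an opaque pointer, and no reference counting or locking is modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct route and struct domain. */
struct rc_route {
    void            *ro_rt;    /* cached route, NULL when invalid */
    struct rc_route *ro_next;  /* next live cache in the same domain */
};

struct rc_domain {
    struct rc_route *dom_rtcache;  /* caches that currently hold a route */
};

/* Record that `ro' now caches `rt' and link it into the domain's list. */
static void
rc_cache(struct rc_domain *dom, struct rc_route *ro, void *rt)
{
    assert(rt != NULL && ro->ro_rt == NULL);
    ro->ro_rt = rt;
    ro->ro_next = dom->dom_rtcache;
    dom->dom_rtcache = ro;
}

/*
 * Invalidate every live cache in the domain.  Each cache is unlinked
 * and its route pointer set to NULL, so users must be prepared to see
 * ro_rt turn to NULL and to look the route up again.
 * Returns the number of caches flushed.
 */
static int
rc_flushall(struct rc_domain *dom)
{
    struct rc_route *ro;
    int n = 0;

    while ((ro = dom->dom_rtcache) != NULL) {
        dom->dom_rtcache = ro->ro_next;
        ro->ro_rt = NULL;
        ro->ro_next = NULL;
        n++;
    }
    return n;
}
```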
dyoung d43d3ae5b8 Get rid of some gratuitous casts and join some lines. 2007-04-25 00:11:18 +00:00
dyoung 2fe02c923a Constify. 2007-04-24 23:43:50 +00:00
dyoung 1c9313a294 In in_rtflushall(), clear the route caches using rtcache_clear()
instead of rtcache_free().  It is not desirable to clear the cached
destination as well as the route, however, rtcache_free() will
eventually release all resources held by the cache, including the
destination.

Add some additional diagnostic assertions.
2007-04-22 06:01:57 +00:00
dyoung d8fb0f4dac Add optimization hint for compiler. In a debug printf,
s/freeing/flushing/.
2007-04-18 23:22:26 +00:00
dyoung d60552baa5 Cosmetic: shorten a staircase. bzero -> memset. KNF. 2007-04-15 06:15:58 +00:00
liamjfoy 39b3c7f047 use size_t for indexes
just pass a *ip to ipflow_hash instead of members

ok christos@
2007-04-05 18:11:47 +00:00
liamjfoy 68880dffbf Add a small note regarding further commented code in netinet6/ip6_flow.c 2007-03-26 00:29:15 +00:00
liamjfoy b8ef59d720 Add net.inet.ip.hashsize to control the IPv4 fast forward hash table size. 2007-03-25 20:12:20 +00:00
liamjfoy ac43382f1f Don't call ip*flow_reap if we're just looking up maxflows 2007-03-24 00:27:58 +00:00
dyoung 271d77fa58 If we do not recognize the protocol of a received packet, then
increase ifi_noproto.  If the GRE header contains routing options,
increase the input-error count, ifi_ierrors.

While I am here, make some cosmetic changes: remove unnecessary
'proto' argument from gre_input3().  Shorten some staircases.
2007-03-21 01:56:05 +00:00
ad 59d979c5f1 Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.
2007-03-12 18:18:22 +00:00
liamjfoy 5aa6f5addf Move ipflow_slowtimo from ip_slowtimo and into in_proto.c
ok matt@
2007-03-05 00:50:53 +00:00
liamjfoy f84185c912 inet6domain -> inetdomain
thanks simon
2007-03-04 23:53:36 +00:00
liamjfoy a461422cd5 Initialize protocol switch with structure initializers.
ok christos@
2007-03-04 20:17:05 +00:00
tsutsui 6f8d4c537b Pass (char *) to mtod(9) on address calculation. 2007-03-04 10:53:32 +00:00
christos 53524e44ef Kill caddr_t; there will be some MI fallout, but it will be fixed shortly. 2007-03-04 05:59:00 +00:00
dyoung bc99546f43 Reverse sense of preference numbers: prefer source addresses with
higher preference numbers.  Thanks to Mihai Chelaru for pointing
out my mistake.
2007-02-22 08:08:40 +00:00
dyoung 9111c8b6e3 Add net.inet.ip.selectsrc.default even if GETIFA_DEBUG is not
#define'd.
2007-02-22 07:33:48 +00:00
thorpej 7cc07e11dc TRUE -> true, FALSE -> false 2007-02-22 06:16:03 +00:00
matt 93feeb1203 Fix lossage from boolean_t -> bool and updated x86 bus_dma. 2007-02-22 04:38:02 +00:00
thorpej 712239e366 Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
2007-02-21 22:59:35 +00:00
dyoung 5493f188c7 KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
   in6_src.c, avoid casts by changing several route_in6 pointers
   to struct route pointers.  Remove unnecessary casts to caddr_t
   elsewhere.

Pave the way for eliminating address family-specific route caches:
   soon, struct route will not embed a sockaddr, but it will hold
   a reference to an external sockaddr, instead.  We will set the
   destination sockaddr using rtcache_setdst().  (I created a stub
   for it, but it isn't used anywhere, yet.)  rtcache_free() will
   free the sockaddr.  I have extracted from rtcache_free() a helper
   subroutine, rtcache_clear().  rtcache_clear() will "forget" a
   cached route, but it will not forget the destination by releasing
   the sockaddr.  I use rtcache_clear() instead of rtcache_free()
   in rtcache_update(), because rtcache_update() is not supposed
   to forget the destination.

Constify:

   1 Introduce const accessor for route->ro_dst, rtcache_getdst().

   2 Constify the 'dst' argument to ifnet->if_output().  This
     led me to constify a lot of code called by output routines.

   3 Constify the sockaddr argument to protosw->pr_ctlinput.  This
     led me to constify a lot of code called by ctlinput routines.

   4 Introduce const macros for converting from a generic sockaddr
     to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
     satocsin, et cetera.
2007-02-17 22:34:07 +00:00
dyoung c80b247b25 Join lines. 2007-02-17 05:36:29 +00:00
dyoung 7ed406393a s/in_rtflush/in_rtcache/g 2007-02-17 05:35:50 +00:00
dyoung f272db0899 bzero -> memset 2007-02-17 05:31:39 +00:00
dyoung 08f386424b bcopy -> memcpy
Use NULL instead of (struct rtentry *)0.
2007-02-17 05:31:15 +00:00
degroote e2211411a4 Commit my SoC work
Add ipv6 support for fast_ipsec
Note that currently, packets with extension headers are not correctly
supported
Change the ipcomp logic
2007-02-10 09:43:05 +00:00
dyoung ad4f290a37 bzero -> memset. 2007-01-29 06:00:11 +00:00
dyoung 24c98aa46f In ip_setmoptions(), don't leave a route cache (struct route) on
the stack if we exit with EADDRNOTAVAIL.
2007-01-29 05:59:30 +00:00
dyoung 0468886560 Cosmetic: remove extraneous, non-KNF parentheses. Change a
sizeof(type) to a sizeof(*ptr) so the correctness of the statement
is evident "at a glance" (or so I hope).
2007-01-29 05:48:56 +00:00
dyoung 4921da146d bzero -> memset 2007-01-29 05:46:33 +00:00
dyoung d8316ce94e KNF: bzero -> memset, change (struct in_ifaddr *)0 to NULL. 2007-01-26 19:15:26 +00:00
dyoung 3cd4307b24 bzero -> memset 2007-01-26 19:12:21 +00:00
joerg 7645663790 Unconditionally zero and free iproute. Previously, IPsec tunnel packets, e.g.
from ICMP, could end up leaking the reference in iproute, as
ipsec4_output would overwrite the ro pointer in state.

Tested by Juraj Hercek and supposed to fix PR kern/35273 and kern/35318.
2007-01-13 23:13:46 +00:00
yamt 48bbcc400d ip_output: reload ip_len after running pfil_run_hooks.
pf "fragment reassemble" rule can change it, at least.
2007-01-08 04:14:54 +00:00
joerg fbd2dfee02 Use rtcache_free for consistency. 2007-01-05 15:47:33 +00:00
elad b2eb9a5389 Consistent usage of KAUTH_GENERIC_ISSUSER. 2007-01-04 19:07:03 +00:00
ad dd85fd121f ipintr(): check if the queue is empty before looping. Hardly a giant
win, but removed 30% of splnet() calls in one local test.
2006-12-22 05:34:02 +00:00
christos ae91f9ec0a According to ANSI C the only portably defined bitfields are unsigned int ones. 2006-12-17 20:07:36 +00:00
joerg eb04733c4e Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cached and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either valid again
or NULL.
rtcache_copy is used internally.

Adjust the callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is semantically equivalent though.
etherip doesn't need sc_route_expire, similar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.
2006-12-15 21:18:52 +00:00
dyoung c308b1c661 Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route).  Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL.  Provide
in_rtcache() for adding a route to the chain.  Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches.  In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain.  In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.
2006-12-09 05:33:04 +00:00
joerg c882b2cbc1 When a dynamic route is deleted in in_losing and in6_losing, rtrequest
is called, but the current reference via the PCB is not removed. This
is effectively a leaked reference. Call rtfree unconditionally.
2006-12-08 16:06:22 +00:00
jdc 6d7a98c7bc Explicitly include <sys/device.h>, which we need for `struct device'.
This allows us to compile on !i386.  (On i386, <machine/cpu.h> pulled
in <sys/device.h> for us, thus hiding the compilation problem.)

OK by rpaulo@.
2006-12-06 21:42:38 +00:00
yamt 8836e5995d add some more tcp mowners. 2006-12-06 09:10:45 +00:00
yamt f5830ee995 - make tcp_reass static.
- constify.
2006-12-06 09:08:27 +00:00
dyoung 2bbeb90e43 Remove stray curly brace. Thanks, yamt! 2006-12-06 04:29:09 +00:00
dyoung d7a8741d84 KNF. 2006-12-06 00:39:56 +00:00
dyoung 0394fe1e42 KNF. 2006-12-06 00:38:16 +00:00
yamt 401e606d0d move tso-by-software code to their own files. no functional changes. 2006-11-25 18:41:36 +00:00
christos 3d98aa3f4b fix spelling of accidentally; from Zapher 2006-11-24 19:37:02 +00:00
martin 54b769f306 Make it compile on IPv4-only kernels 2006-11-23 23:12:59 +00:00
yamt 809ec70bcf implement ipv6 TSO.
partly from Matthias Scheler.  tested by him.
2006-11-23 19:41:58 +00:00
tron 9506122aab Backout accidental commit which broke kernel builds. 2006-11-23 09:43:56 +00:00
rpaulo 5423539f94 New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
	* Fixes PR 34268.
	* Separates the code from gif(4) (which is cleaner).
	* Allows the usage of STP (Spanning Tree Protocol).
	* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.
2006-11-23 04:07:07 +00:00
dyoung 8cfa750e0f Use LIST_FOREACH(). 2006-11-16 22:54:14 +00:00
dyoung 641edc65f1 Cosmetic: s/g_proto/sc_proto/. Remove superfluous parentheses and
curly braces.
2006-11-16 22:26:35 +00:00
christos 168cd830d2 __unused removal on arguments; approved by core. 2006-11-16 01:32:37 +00:00
rpaulo 7c10983a54 Remove ifndef COMPAT_42. No objections in tech-net. 2006-11-14 12:05:55 +00:00
dyoung 2d1d707101 Plug memory leak. 2006-11-13 05:48:00 +00:00
dyoung a25eaede91 Add a source-address selection policy mechanism to the kernel.
Also, add ioctls SIOCGIFADDRPREF/SIOCSIFADDRPREF to get/set preference
numbers for addresses.  Make ifconfig(8) set/display preference
numbers.

To activate source-address selection policies in your kernel, add
'options IPSELSRC' to your kernel configuration.

Miscellaneous changes in support of source-address selection:

        1 Factor out some common code, producing rt_replace_ifa().

        2 Abbreviate a for-loop with TAILQ_FOREACH().

        3 Add the predicates on IPv4 addresses IN_LINKLOCAL() and
          IN_PRIVATE(), that are true for link-local unicast
          (169.254/16) and RFC1918 private addresses, respectively.
          Add the predicate IN_ANY_LOCAL() that is true for link-local
          unicast and multicast.

        4 Add IPv4-specific interface attach/detach routines,
          in_domifattach and in_domifdetach, which build #ifdef
          IPSELSRC.

See in_getifa(9) for a more thorough description of source-address
selection policy.
2006-11-13 05:13:38 +00:00
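The address predicates in item 3 of the commit above can be sketched as plain functions. This is an illustration, not the macros from the tree: the lowercase names and the helper `ipv4_host()` are hypothetical, all addresses are taken in host byte order, and "link-local multicast" is assumed to mean 224.0.0.0/24.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a host-byte-order IPv4 address from a dotted quad. */
static uint32_t
ipv4_host(unsigned a, unsigned b, unsigned c, unsigned d)
{
    assert(a <= 255 && b <= 255 && c <= 255 && d <= 255);
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
        ((uint32_t)c << 8) | (uint32_t)d;
}

/* Link-local unicast, 169.254.0.0/16. */
static bool
in_linklocal(uint32_t a)
{
    return (a & 0xffff0000u) == 0xa9fe0000u;
}

/* RFC 1918 private space: 10/8, 172.16/12, 192.168/16. */
static bool
in_private(uint32_t a)
{
    return (a & 0xff000000u) == 0x0a000000u ||
        (a & 0xfff00000u) == 0xac100000u ||
        (a & 0xffff0000u) == 0xc0a80000u;
}

/* Link-local unicast or (assumed) link-local multicast, 224.0.0.0/24. */
static bool
in_any_local(uint32_t a)
{
    return in_linklocal(a) || (a & 0xffffff00u) == 0xe0000000u;
}
```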
yamt d4d55c3dc9 tcp_ctloutput: when called for a socket which is not AF_INET or AF_INET6,
panic rather than returning and possibly leaking an mbuf.
2006-11-10 13:19:16 +00:00
yamt 22ffb8ee31 udp_ctloutput: plug a memory leak. 2006-11-10 13:02:32 +00:00
yamt 850e08319b remove some __unused in function parameters. 2006-11-10 13:01:55 +00:00
yamt d547c3b722 udp_ctloutput: remove unnecessary goto and break. 2006-11-10 13:00:23 +00:00
yamt 511f1a8ff8 udp_ctloutput: ansify. 2006-11-10 12:59:59 +00:00
christos 9217ff877d Fix typo (hi Elad) 2006-10-30 00:58:21 +00:00
elad adf8d7aab2 Introduce KAUTH_REQ_NETWORK_SOCKET_OPEN, to check if opening a socket is
allowed. It takes three int * arguments indicating domain, type, and
protocol. Replace previous KAUTH_REQ_NETWORK_SOCKET_RAWSOCK with it (but
keep it still).

Places that used to explicitly check for privileged context now don't
need it anymore, so I replaced these with an XXX comment indicating it for
future reference.

Documented and updated examples as well.
2006-10-25 22:49:22 +00:00
elad f2ce4f0704 Kill some KAUTH_GENERIC_ISSUSER. 2006-10-25 18:11:22 +00:00
elad 75939147ff Kill some KAUTH_GENERIC_ISSUSER. 2006-10-25 12:48:44 +00:00
yamt 80e1bbb713 add sack_dump(), a function to dump sack holes, if defined(DDB). 2006-10-21 10:26:21 +00:00
yamt 7253aad93f constify. 2006-10-21 10:24:47 +00:00
yamt c31e22237d - constify.
- make tcp_dooptions and tcpipqent_pool static.
2006-10-21 10:08:54 +00:00
liamjfoy cd64dacbef Remove some dead code - From OpenBSD Rev. 1.129 2006-10-20 19:13:02 +00:00
reinoud 78f5b5f9d5 Fix alignment problems causing regular panics in tcp_sack_option on
NetBSD/alpha and NetBSD/sparc. This fixes PR#34751.

The problem most likely started to show with gcc4 and is caused by a cast
to a uint32_t pointer that is later copied from using memcpy.
Gcc detects the copying of 4 bytes from a uint32_t pointer and decides to
just replace it with an aligned copy, causing the trap.

Fix provided by Izumi Tsutsui and ok'd by Martin.
2006-10-20 13:11:09 +00:00
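The alignment issue described in commit 78f5b5f9d5 can be illustrated with a minimal sketch. It is not the actual fix: load_u32() is a hypothetical helper, and the point is only that the source pointer keeps a byte type, so the compiler has no grounds to assume 4-byte alignment when it expands the memcpy.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read a 32-bit value from a buffer that may not be
 * 4-byte aligned. The source stays a byte pointer, so memcpy cannot be
 * turned into an aligned word load on strict-alignment machines.
 */
static uint32_t load_u32(const uint8_t *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

/*
 * The risky pattern, shown only as a comment:
 *
 *     const uint32_t *lp = (const uint32_t *)p;   // p may be unaligned
 *     memcpy(&v, lp, sizeof(v));                  // may become an aligned load
 */
```

The value returned is in whatever byte order the buffer holds; converting from network order is a separate step.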
rpaulo 8106a506d3 Use a better way to create sysctl subtrees for ECN and Congctl.
Inspired by the ABC subtree.
2006-10-19 14:14:34 +00:00
yamt c549acefec tcp_reno_newack: remove an __unused because it's now used. 2006-10-19 11:42:32 +00:00
yamt df8e5bddfa tcp_reno_newack: regardless of sysctl setting, use L=1*SMSS when
we are doing retransmission.
2006-10-19 11:42:02 +00:00
yamt 81463c93c7 implement RFC3465 appropriate byte counting.
from Kentaro A. Kurahone, with minor adjustments by me.
the ack prediction part of the original patch was omitted because
it's a separate change.  reviewed by Rui Paulo.
2006-10-19 11:40:51 +00:00
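The appropriate byte counting of RFC 3465 referred to in commits 81463c93c7 and df8e5bddfa can be sketched roughly as follows. This is an illustration of the RFC's rules, not the NetBSD implementation: struct cc_state, abc_on_ack() and the limit_l parameter (the RFC's L) are hypothetical names. Passing limit_l = 1 corresponds to the L=1*SMSS case used while retransmitting.

```c
#include <stdint.h>

/* Hypothetical congestion-control state; all values are in bytes. */
struct cc_state {
    uint32_t cwnd;        /* congestion window */
    uint32_t ssthresh;    /* slow-start threshold */
    uint32_t smss;        /* sender maximum segment size */
    uint32_t bytes_acked; /* bytes acked since the last cwnd increase */
};

static uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* Process an ACK that newly acknowledges 'acked' bytes. */
static void abc_on_ack(struct cc_state *s, uint32_t acked, uint32_t limit_l)
{
    if (s->cwnd < s->ssthresh) {
        /* Slow start: grow by the bytes acked, capped at L*SMSS. */
        s->cwnd += min_u32(acked, limit_l * s->smss);
    } else {
        /* Congestion avoidance: one SMSS per cwnd's worth of acked bytes. */
        s->bytes_acked += acked;
        if (s->bytes_acked >= s->cwnd) {
            s->bytes_acked -= s->cwnd;
            s->cwnd += s->smss;
        }
    }
}
```

With cwnd = 1000, smss = 1000 and an ACK covering 3000 bytes in slow start, limit_l = 2 raises cwnd to 3000, while limit_l = 1 raises it only to 2000.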
dogcow 372e6ef309 now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.
2006-10-17 18:21:29 +00:00
yamt 389488e9b6 tcp_output: as a comment in tcp_sack_newack says, actually send
one or two segments on partial acks.  even if sack_bytes_rxmt==0,
if we are in fast recovery with sack, snd_cwnd has somewhat special
meaning here.  PR/34749.
2006-10-17 11:11:40 +00:00
yamt e1c6fffb40 tcp_input: if we have SACK, don't enter fastrecovery on three dupacks.
otherwise, we can enter fastrecovery due to DSACKs, which we treat
as dupacks here.  PR/34748.  reviewed by Rui Paulo.
2006-10-17 09:31:17 +00:00
rpaulo 21df8206df Export the tcp_do_rfc1948 variable to userland via sysctl.
The code to generate an ISS via an MD5 hash has been present in the
NetBSD kernel since 2001, but it wasn't even exported to userland at
that time. It was agreed on tech-net with the original author <thorpej>
that we should let the user decide if he wants to enable it or not.
Not enabled by default.
2006-10-16 18:13:56 +00:00
rpaulo 1c1f230e81 Move comments to proper places. 2006-10-15 17:53:30 +00:00
rpaulo a70594d346 Add a new tcp_congctl(9) structure member for congestion experienced callback.
Needed by HSTCP.
2006-10-15 17:45:06 +00:00
dogcow 44603cac1f more unused variable fallout. 2006-10-13 18:28:06 +00:00
elad 8c494ca741 Introduce KAUTH_REQ_NETWORK_SOCKET_CANSEE. Since we're not gonna be having
credentials on sockets, at least not anytime soon, this is a way to check
if we can "look" at a socket. Later on when (and if) we do have socket
credentials, the interface usage remains the same because we pass the
socket.

This also fixes sysctl for inet/inet6 pcblist.
2006-10-13 15:39:18 +00:00
rpaulo c1fc16d084 PR 34776: don't accept TCP connections to broadcast addresses.
Move the multicast/broadcast check above (before creating a syn_cache entry)
By Yasuoka Yasuoka.
2006-10-12 11:46:30 +00:00
christos 4d595fd7b1 - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
2006-10-12 01:30:41 +00:00
dogcow 55ddfc9aae change the MOWNER_INIT define to take two args; fix extant struct mowner
decls to use it. Makes options MBUFTRACE compile again and not whinge about
missing structure declarations. (Also makes initialization consistent.)
2006-10-10 21:49:14 +00:00
rpaulo a6762e54d7 Revert previous. The check is now done in tcp_congctl. 2006-10-10 11:13:02 +00:00
rpaulo e1b1f65f6b tcp_reno_newack(): bring the exact original code.
tcp_newreno_newack(): call tcp_reno_newack() if partialacks < 0.
2006-10-10 11:12:39 +00:00