Commit Graph

79 Commits

Author SHA1 Message Date
dyoung b3fc296326 Use malloc(9) for sockaddrs instead of pool(9), and remove dom_sa_pool
and dom_sa_len members from struct domain.  Pools of fixed-size
objects are too rigid for sockaddr_dls, whose size can vary over
a wide range.

Return sockaddr_dl to its "historical" size.  Now that I'm using
malloc(9) instead of pool(9) to allocate sockaddr_dl, I can create
a sockaddr_dl of any size in the kernel, so expanding sockaddr_dl
is useless.

Avoid using sizeof(struct sockaddr_dl) in the kernel.

Introduce sockaddr_dl_alloc() for allocating & initializing an
arbitrary sockaddr_dl on the heap.

Add an argument, the sockaddr length, to sockaddr_alloc(),
sockaddr_copy(), and sockaddr_dl_setaddr().

Constify: LLADDR() -> CLLADDR().

Where the kernel overwrites LLADDR(), use sockaddr_dl_setaddr(),
instead.  Used properly, sockaddr_dl_setaddr() will not overrun
the end of the sockaddr.
2007-08-30 02:17:34 +00:00
dyoung 5204966a96 Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.
2007-08-26 22:59:08 +00:00
dyoung 27de48611a Avoid writing past the end of the buffer [lldst, lldst + dstsize)
in nd6_storelladdr().

Use sockaddr_dl_setaddr().  Constify some sockaddr_dl's.  Constify
a sockaddr argument to nd6_na_output().  Change SDL() to "standard"
satocsdl() or satosdl().  Change SIN6() to satocsin6() or satosin6().

bcmp -> memcmp, bcopy -> memcpy.
2007-08-07 04:35:42 +00:00
dyoung 08e6f22226 Take steps to hide the radix_node implementation of the forwarding table
from the forwarding table's users:

        Introduce rt_walktree() for walking the routing table and
        applying a function to each rtentry.  Replace most
        rn_walktree() calls with it.

        Use rt_getkey()/rt_setkey() to get/set a route's destination.
        Keep a pointer to the sockaddr key in the rtentry, so that
        rtentry users do not have to grovel in the radix_node for
        the key.

        Add a RTM_GET method to rtrequest.  Use that instead of
        radix_node lookups in, e.g., carp(4).

Add sys/net/link_proto.c, which supplies sockaddr routines for
link-layer socket addresses (sockaddr_dl).

Cosmetic:

        Constify.  KNF.  Stop open-coding LIST_FOREACH, TAILQ_FOREACH,
        et cetera.  Use NULL instead of 0 for null pointers.  Use
        __arraycount().  Reduce gratuitous parenthesization.

        Stop using variadic arguments for rip6_output(), it is
        unnecessary.

        Remove the unnecessary rtentry member rt_genmask and the
        code to maintain it, since nothing actually used it.

        Make rt_maskedcopy() easier to read by using meaningful variable
        names.

        Extract a subroutine intern_netmask() for looking up a netmask in
        the masks table.

        Start converting backslash-ridden IPv6 macros in
        sys/netinet6/in6_var.h into inline subroutines that one
        can read without special eyeglasses.

One functional change: when the kernel serves an RTM_GET, RTM_LOCK,
or RTM_CHANGE request, it applies the netmask (if supplied) to a
destination before searching for it in the forwarding table.

I have changed sys/netinet/ip_carp.c, carp_setroute(), to remove
the unlawful radix_node knowledge.

Apart from the changes to carp(4), netiso, ATM, and strip(4), I
have run the changes on three nodes in my wireless routing testbed,
which involves IPv4 + IPv6 dynamic routing acrobatics, and it's
working beautifully so far.
2007-07-19 20:48:52 +00:00
ad 88ab7da936 Merge some of the less invasive changes from the vmlocking branch:
- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
2007-07-09 20:51:58 +00:00
christos 72cfe7327b Ansify + add a few comments, from Karl Sjödahl 2007-05-23 17:14:59 +00:00
dyoung 1db31a59af Fix the memory leak reported in kern/36337. Thanks Matthias Scheler
for the heads-up.  My fix is based on the following patches from
FreeBSD, however, I extracted the code into a subroutine,
nd6_llinfo_release_pkts():

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet6/nd6.c.diff?r1=1.48.2.18;r2=1.48.2.19
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet6/nd6_nbr.c.diff?r1=1.29.2.8;r2=1.29.2.9
2007-05-17 00:53:26 +00:00
dyoung 72f0a6dfb0 Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing.  Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously.  Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs.  I have
  introduced routines for allocating, copying, and duplicating,
  and freeing sockaddrs:

        struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
        struct sockaddr *sockaddr_copy(struct sockaddr *dst,
                                       const struct sockaddr *src);
        struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
        void sockaddr_free(struct sockaddr *sa);

  sockaddr_alloc() returns either a sockaddr from the pool belonging
  to the specified family, or NULL if the pool is exhausted.  The
  returned sockaddr has the right size for that family; sa_family
  and sa_len fields are initialized to the family and sockaddr
  length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
  sockaddr_in).  sockaddr_free() puts the given sockaddr back into
  its family's pool.

  sockaddr_dup() and sockaddr_copy() work analogously to strdup()
  and strcpy(), respectively.  sockaddr_copy() KASSERTs that the
  family of the destination and source sockaddrs are alike.

  The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
  passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
  family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
  etc.  They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more.  All protocol families
  use struct route.  I have changed the route cache, 'struct route',
  so that it does not contain storage space for a sockaddr.  Instead,
  struct route points to a sockaddr coming from the pool the sockaddr
  belongs to.  I added a new method to struct route, rtcache_setdst(),
  for setting the cache destination:

        int rtcache_setdst(struct route *, const struct sockaddr *);

  rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
  available to create the sockaddr storage.

  It is now possible for rtcache_getdst() to return NULL if, say,
  rtcache_setdst() failed.  I check the return value for NULL
  everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
  caches, dom_rtcache.  rtflushall(sa_family_t af) looks up the
  domain indicated by 'af', walks the domain's list of route caches
  and invalidates each one.
2007-05-02 20:40:22 +00:00
dyoung ab751193cc Don't open-code TAILQ_FOREACH(). KNF: Fix K&R prototypes and
parameter-type declarations.
2007-03-15 23:39:51 +00:00
christos 53524e44ef Kill caddr_t; there will be some MI fallout, but it will be fixed shortly. 2007-03-04 05:59:00 +00:00
dyoung 5493f188c7 KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
   in6_src.c, avoid casts by changing several route_in6 pointers
   to struct route pointers.  Remove unnecessary casts to caddr_t
   elsewhere.

Pave the way for eliminating address family-specific route caches:
   soon, struct route will not embed a sockaddr, but it will hold
   a reference to an external sockaddr, instead.  We will set the
   destination sockaddr using rtcache_setdst().  (I created a stub
   for it, but it isn't used anywhere, yet.)  rtcache_free() will
   free the sockaddr.  I have extracted from rtcache_free() a helper
   subroutine, rtcache_clear().  rtcache_clear() will "forget" a
   cached route, but it will not forget the destination by releasing
   the sockaddr.  I use rtcache_clear() instead of rtcache_free()
   in rtcache_update(), because rtcache_update() is not supposed
   to forget the destination.

Constify:

   1 Introduce const accessor for route->ro_dst, rtcache_getdst().

   2 Constify the 'dst' argument to ifnet->if_output().  This
     led me to constify a lot of code called by output routines.

   3 Constify the sockaddr argument to protosw->pr_ctlinput.  This
     led me to constify a lot of code called by ctlinput routines.

   4 Introduce const macros for converting from a generic sockaddr
     to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
     satocsin, et cetera.
2007-02-17 22:34:07 +00:00
dyoung 741e438b04 Cosmetic: bzero -> memset. Change a bcopy() to a struct assignment. 2007-01-29 06:20:43 +00:00
joerg eb04733c4e Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.
2006-12-15 21:18:52 +00:00
dyoung c308b1c661 Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route).  Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL.  Provide
in_rtcache() for adding a route to the chain.  Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches.  In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain.  In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.
2006-12-09 05:33:04 +00:00
dyoung 3b46d8b708 Use the queue(3) macros instead of open-coding them. Shorten
staircases.  Remove unnecessary casts.  Where appropriate, s/8/NBBY/.
De-__P().  KNF.

No functional changes intended.
2006-12-02 18:59:17 +00:00
drochner 7d0c55ee34 fix the dad_count logic: if we send a packet successfully, reset the counter
for sent tries -- otherwise it gets confused if dad_count is set to >15
by the sysctl, and addresses get stuck in "tentative" state forever
2006-06-28 16:43:43 +00:00
liamjfoy 4876c304b1 Integrate Common Address Redundancy Procotol (CARP) from OpenBSD
'pseudo-device	carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@
2006-05-18 09:05:49 +00:00
rpaulo 941ce91614 Rename local variables called delay that shadow the delay() decl.
Pointed out by Robert Swindells.
2006-03-06 20:33:52 +00:00
rpaulo 8c2379fd97 NDP-related improvements:
RFC4191
	- supports host-side router-preference

	RFC3542
	- if DAD fails on a interface, disables IPv6 operation on the
          interface
	- don't advertise MLD report before DAD finishes

	Others
	- fixes integer overflow for valid and preferred lifetimes
	- improves timer granularity for MLD, using callout-timer.
	- reflects rtadvd's IPv6 host variable information into kernel
	  (router only)
	- adds a sysctl option to enable/disable pMTUd for multicast
          packets
	- performs NUD on PPP/GRE interface by default
	- Redirect works regardless of ip6_accept_rtadv
	- removes RFC1885-related code

From the KAME project via SUZUKI Shinsuke.
Reviewed by core.
2006-03-05 23:47:08 +00:00
rpaulo eb35daf5b2 Fix typos in comments.
From: the KAME project via SUZUKI Shinsuke.
2006-03-03 14:07:06 +00:00
wiz 1ad8067cb3 Fix typos, reported by Alexey Dobriyan ("Gathered from Linux"),
forwarded by jmc@openbsd.
2006-02-25 00:58:34 +00:00
rpaulo 78678b130a Better support of IPv6 scoped addresses.
- most of the kernel code will not care about the actual encoding of
  scope zone IDs and won't touch "s6_addr16[1]" directly.
- similarly, most of the kernel code will not care about link-local
  scoped addresses as a special case.
- scope boundary check will be stricter.  For example, the current
  *BSD code allows a packet with src=::1 and dst=(some global IPv6
  address) to be sent outside of the node, if the application do:
    s = socket(AF_INET6);
    bind(s, "::1");
    sendto(s, some_global_IPv6_addr);
  This is clearly wrong, since ::1 is only meaningful within a single
  node, but the current implementation of the *BSD kernel cannot
  reject this attempt.
- and, while there, don't try to remove the ff02::/32 interface route
  entry in in6_ifdetach() as it's already gone.

This also includes some level of support for the standard source
address selection algorithm defined in RFC3484, which will be
completed on in the future.

From the KAME project via JINMEI Tatuya.
Approved by core@.
2006-01-21 00:15:35 +00:00
christos 95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
christos 2ab31527e2 - avoid shadowed variables
- sprinkle const.
2005-05-29 21:43:51 +00:00
perry f07677dd81 nuke trailing whitespace 2005-02-26 22:45:09 +00:00
itojun 692c601c25 backout 1.54. heurestic code should never be used. if you experience DAD
failure, suspect your driver, not ND code.
2005-02-10 02:57:17 +00:00
drochner e1e8770b32 Give DAD a chance to succeed even if the network is "slightly broken"
(in my case it as a switch set to "monitor" mode):
If we see an NS request for the address we are just probing for, for
three times the number of DAD packets we are supposed to send (the
"ip6.dad_count" sysctl variable), assume that these are our own packets
and let DAD succeed.
The code for this was mostly there, commented out. Just needed some fixes.
The "three times" is heuristic of course.
Being here, reset the "dad_ns_tcount" variable on a successful send;
otherwise we get strange interdependencies with user-settable variables
(ever tried to set ip6.dad_count to something >15?).
2005-02-02 20:56:27 +00:00
itojun e2d302c40d reduce useless variables 2004-02-10 20:57:20 +00:00
simonb a2facef339 Remove some assigned-to but otherwise unused variables. 2003-10-30 01:43:08 +00:00
itojun a245b3dc6d u_short -> u_int16_t. sync w/ kame.
don't set ip6_plen where unneeded (i.e. before calling ip6_output)
2003-09-05 23:20:48 +00:00
itojun 11ede1ed88 remove ipsec_set/getsocket. now we explicitly pass socket * to ip{,6}_output. 2003-08-22 22:00:36 +00:00
itojun 82eb4ce914 change the additional arg to be passed to ip{,6}_output to struct socket *.
this fixes KAME policy lookup which was broken by the previous commit.
2003-08-22 21:53:01 +00:00
jonathan 902669955f Replace the set_socket() method of passing an extra struct socket*
argument to ip6_output() with a new explicit struct in6pcb* argument.
(The underlying socket can be obtained via in6pcb->inp6_socket.)

In preparation for fast-ipsec.  Reviewed by itojun.
2003-08-22 20:20:09 +00:00
itojun 2cadb8ca7a split ND6 cache timer management to per-entry. increased accuracy,
no O(N) loop.   sync w/ kame
2003-06-27 08:41:08 +00:00
itojun 6d4a3c4191 remove unneeded checks of accept_rtadv. from kame 2003-06-24 07:54:47 +00:00
itojun 194f048bd9 use time.tv_sec directly 2003-06-24 07:39:24 +00:00
itojun 346e0198f0 always use PULLDOWN_TEST codepath. 2003-05-14 06:47:33 +00:00
simonb 4e3613273b Remove breaks after returns, unreachable returns and returns after
returns(!).
2002-09-23 05:51:10 +00:00
itojun b05ff066a7 whitespace cleanup 2002-06-09 14:43:10 +00:00
itojun 7316bc595b KNF 2002-06-08 21:29:26 +00:00
itojun 2495e99fc7 gc 2002-06-08 21:28:18 +00:00
itojun 6d8d0d63d8 sync with latest KAME in6_ifaddr/prefix/default router manipulation.
behavior changes:
- two iocts used by ndp(8) are now obsolete (backward compat provided).
  use sysctl path instead.
- lo0 does not get ::1 automatically.  it will get ::1 when lo0 comes up.
2002-06-08 21:22:29 +00:00
itojun a11e34efc5 whitespace 2002-06-07 07:38:51 +00:00
itojun e2ce1896bd whitespace 2002-06-07 07:35:39 +00:00
itojun 5c1df51d53 attach nd_ifinfo structure into if_afdata.
split IPv6 link MTU (advertised by RA) from real link MTU.
sync with kame
2002-05-29 07:53:39 +00:00
itojun d208a22daa use arc4random() where possible.
XXX is it necessary to do microtime() on tcp syn cache?
2002-05-28 10:11:49 +00:00
itojun 3faedc3f92 s/0/NULL/ as ln_hold is a pointer. sync w/ kame 2002-03-15 09:36:27 +00:00
lukem 4f2ad95259 add RCSIDs 2001-11-13 00:56:55 +00:00
itojun ae5499819c reduce diffs with kame (mostly cosmetic).
move IPV6_CHECKSUM processing to sys/netinet6/raw_ip6.c.
constify a couple of places.
2001-10-18 07:44:33 +00:00
itojun 1990d680c4 do not change neighbor cache state on entry timeout,
if the cache entry is for outgoing router.

perform on-linkness check before default router (re-)seletion.

do not play with interface direct route on nd6_rtrequest.

sync a lot of cosmetic changes.  sync with kame
2001-10-17 10:55:09 +00:00