Commit Graph

57 Commits

Author SHA1 Message Date
dyoung
08e6f22226 Take steps to hide the radix_node implementation of the forwarding table
from the forwarding table's users:

        Introduce rt_walktree() for walking the routing table and
        applying a function to each rtentry.  Replace most
        rn_walktree() calls with it.

        Use rt_getkey()/rt_setkey() to get/set a route's destination.
        Keep a pointer to the sockaddr key in the rtentry, so that
        rtentry users do not have to grovel in the radix_node for
        the key.

        Add a RTM_GET method to rtrequest.  Use that instead of
        radix_node lookups in, e.g., carp(4).

Add sys/net/link_proto.c, which supplies sockaddr routines for
link-layer socket addresses (sockaddr_dl).

Cosmetic:

        Constify.  KNF.  Stop open-coding LIST_FOREACH, TAILQ_FOREACH,
        et cetera.  Use NULL instead of 0 for null pointers.  Use
        __arraycount().  Reduce gratuitous parenthesization.

        Stop using variadic arguments for rip6_output(), it is
        unnecessary.

        Remove the unnecessary rtentry member rt_genmask and the
        code to maintain it, since nothing actually used it.

        Make rt_maskedcopy() easier to read by using meaningful variable
        names.

        Extract a subroutine intern_netmask() for looking up a netmask in
        the masks table.

        Start converting backslash-ridden IPv6 macros in
        sys/netinet6/in6_var.h into inline subroutines that one
        can read without special eyeglasses.

One functional change: when the kernel serves an RTM_GET, RTM_LOCK,
or RTM_CHANGE request, it applies the netmask (if supplied) to a
destination before searching for it in the forwarding table.

I have changed sys/netinet/ip_carp.c, carp_setroute(), to remove
the unlawful radix_node knowledge.

Apart from the changes to carp(4), netiso, ATM, and strip(4), I
have run the changes on three nodes in my wireless routing testbed,
which involves IPv4 + IPv6 dynamic routing acrobatics, and it's
working beautifully so far.
2007-07-19 20:48:52 +00:00
dyoung
95edb940c2 Get rid of radix_node_head.rnh_walktree, because it is only ever
set to rn_walktree.

Introduce rt_walktree(), which applies a subroutine to every route
in a particular address family.  Use it instead of rn_walktree()
virtually everywhere.  This helps to hide the routing table
implementation.
2007-06-09 03:07:21 +00:00
dyoung
680f00f8b5 Factor rtcache_lookup2() out of rtcache_lookup1(), for re-use in
the IPv6 stack.  rtcache_lookup2() takes an int * argument that it
writes with 1 if we had a cache 'hit', 0 if there was a cache
'miss'.
2007-05-06 02:17:54 +00:00
dyoung
72f0a6dfb0 Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing.  Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously.  Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs.  I have
  introduced routines for allocating, copying, and duplicating,
  and freeing sockaddrs:

        struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
        struct sockaddr *sockaddr_copy(struct sockaddr *dst,
                                       const struct sockaddr *src);
        struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
        void sockaddr_free(struct sockaddr *sa);

  sockaddr_alloc() returns either a sockaddr from the pool belonging
  to the specified family, or NULL if the pool is exhausted.  The
  returned sockaddr has the right size for that family; sa_family
  and sa_len fields are initialized to the family and sockaddr
  length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
  sockaddr_in).  sockaddr_free() puts the given sockaddr back into
  its family's pool.

  sockaddr_dup() and sockaddr_copy() work analogously to strdup()
  and strcpy(), respectively.  sockaddr_copy() KASSERTs that the
  family of the destination and source sockaddrs are alike.

  The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
  passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
  family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
  etc.  They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more.  All protocol families
  use struct route.  I have changed the route cache, 'struct route',
  so that it does not contain storage space for a sockaddr.  Instead,
  struct route points to a sockaddr coming from the pool the sockaddr
  belongs to.  I added a new method to struct route, rtcache_setdst(),
  for setting the cache destination:

        int rtcache_setdst(struct route *, const struct sockaddr *);

  rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
  available to create the sockaddr storage.

  It is now possible for rtcache_getdst() to return NULL if, say,
  rtcache_setdst() failed.  I check the return value for NULL
  everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
  caches, dom_rtcache.  rtflushall(sa_family_t af) looks up the
  domain indicated by 'af', walks the domain's list of route caches
  and invalidates each one.
2007-05-02 20:40:22 +00:00
xtraeme
df40909241 rtcache_clear is defined as static void in route.c, but it's used
in netinet/in_route.c. Move the prototype into route.h to fix
the build.
2007-04-22 13:05:21 +00:00
christos
53524e44ef Kill caddr_t; there will be some MI fallout, but it will be fixed shortly. 2007-03-04 05:59:00 +00:00
dyoung
5493f188c7 KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
   in6_src.c, avoid casts by changing several route_in6 pointers
   to struct route pointers.  Remove unnecessary casts to caddr_t
   elsewhere.

Pave the way for eliminating address family-specific route caches:
   soon, struct route will not embed a sockaddr, but it will hold
   a reference to an external sockaddr, instead.  We will set the
   destination sockaddr using rtcache_setdst().  (I created a stub
   for it, but it isn't used anywhere, yet.)  rtcache_free() will
   free the sockaddr.  I have extracted from rtcache_free() a helper
   subroutine, rtcache_clear().  rtcache_clear() will "forget" a
   cached route, but it will not forget the destination by releasing
   the sockaddr.  I use rtcache_clear() instead of rtcache_free()
   in rtcache_update(), because rtcache_update() is not supposed
   to forget the destination.

Constify:

   1 Introduce const accessor for route->ro_dst, rtcache_getdst().

   2 Constify the 'dst' argument to ifnet->if_output().  This
     led me to constify a lot of code called by output routines.

   3 Constify the sockaddr argument to protosw->pr_ctlinput.  This
     led me to constify a lot of code called by ctlinput routines.

   4 Introduce const macros for converting from a generic sockaddr
     to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
     satocsin, et cetera.
2007-02-17 22:34:07 +00:00
joerg
8632294e2e Add a debug option for the route cache to help tracing down issues
like PR 35272 and 35318. When the kernel is compiled with
-DRTCACHE_DEBUG, all rtcache entries are logged to a list with the place
they got initialised. This allows overwrites, double inits and other
manual messing to be detected.
2007-01-05 16:40:08 +00:00
joerg
eb04733c4e Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.
2006-12-15 21:18:52 +00:00
dyoung
c308b1c661 Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route).  Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL.  Provide
in_rtcache() for adding a route to the chain.  Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches.  In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain.  In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.
2006-12-09 05:33:04 +00:00
joerg
b49bdf49d7 Deinline rt_get_ifa. Keep it in route.c as it is part of the routing
API, even though rtsock.c is the only user right now.
2006-12-07 19:37:08 +00:00
joerg
d87b42b41f Deinline rt_replace_ifa and move rt_set_ifa and rt_set_ifa1 to
route.c as they are not used outside that file.
2006-12-07 19:20:14 +00:00
dyoung
00aa0b8d95 Fix bugs in rt_get_ifa() and put aside the sequence number stuff,
which isn't ready for primetime yet.
2006-11-13 19:14:30 +00:00
dyoung
a25eaede91 Add a source-address selection policy mechanism to the kernel.
Also, add ioctls SIOCGIFADDRPREF/SIOCSIFADDRPREF to get/set preference
numbers for addresses.  Make ifconfig(8) set/display preference
numbers.

To activate source-address selection policies in your kernel, add
'options IPSELSRC' to your kernel configuration.

Miscellaneous changes in support of source-address selection:

        1 Factor out some common code, producing rt_replace_ifa().

        2 Abbreviate a for-loop with TAILQ_FOREACH().

        3 Add the predicates on IPv4 addresses IN_LINKLOCAL() and
          IN_PRIVATE(), that are true for link-local unicast
          (169.254/16) and RFC1918 private addresses, respectively.
          Add the predicate IN_ANY_LOCAL() that is true for link-local
          unicast and multicast.

        4 Add IPv4-specific interface attach/detach routines,
          in_domifattach and in_domifdetach, which build #ifdef
          IPSELSRC.

See in_getifa(9) for a more thorough description of source-address
selection policy.
2006-11-13 05:13:38 +00:00
christos
95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
elad
976bf6cfdd Multiple inclusion protection, as suggested by christos@ on tech-kern@
few days ago.
2005-12-10 23:21:38 +00:00
dyoung
9063402978 Resolve conflicts in importation of 18-May-2005 ath(4) / net80211(9)
from FreeBSD.  Introduce compatibility shims (sys/dev/ic/ath_netbsd.[ch],
sys/net80211/ieee80211_netbsd.[ch]).  Update drivers (an, atu, atw,
awi, ipw, iwi, rtw, wi) for the new net80211(9) API.
2005-06-22 06:14:51 +00:00
christos
333e176687 - sprinkle const
- remove unneeded casts
- use more mem*() instead of b*() funcs.
2005-05-29 21:22:52 +00:00
perry
f07677dd81 nuke trailing whitespace 2005-02-26 22:45:09 +00:00
matt
7cf8938ddd ANSI-fy and some additional de-__P and constification. 2004-04-21 21:03:43 +00:00
matt
e3b919c754 Constify if.c radix.c and route.c (and fix related fallout). 2004-04-21 04:17:28 +00:00
agc
aad01611e7 Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22364, verified by myself.
2003-08-07 16:26:28 +00:00
fvdl
d5aece61d6 Back out the lwp/ktrace changes. They contained a lot of colateral damage,
and need to be examined and discussed more.
2003-06-29 22:28:00 +00:00
darrenr
960df3c8d1 Pass lwp pointers throughtout the kernel, as required, so that the lwpid can
be inserted into ktrace records.  The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V
2003-06-28 14:20:43 +00:00
wiz
f585195db2 bandwidth, not bandwith. 2003-01-18 12:02:40 +00:00
itojun
50a545a34b remove all entries in rt timer queue on ip_mtudisc change, instead of
destroying the queue.
2002-11-12 02:10:13 +00:00
itojun
96910acf99 add an argument to rt_timer_remove_all(), to specify if we need to call
timeout routine on removal.
2002-11-12 01:37:30 +00:00
perry
6858187df6 /*CONTCOND*/ while (0)'ed macros 2002-11-02 07:20:42 +00:00
matt
2d83d27dfa Eliminate more commons. 2002-05-12 20:40:11 +00:00
enami
d189c89a19 - lineup comment.
- fix typo in comment.
2001-03-08 03:22:28 +00:00
itojun
e79a9123a3 use u_quad_t for rtstat.
not sure if it really matters, but short (32K) looks way too small given
recent fat pipes connecting *BSD boxes, and our great uptime :-).
2001-02-21 05:45:11 +00:00
itojun
02adaaf197 cleanup cloned route when parent route (RTF_CLONING) goes away.
adds rt_parent to link parent from child (like NRL did, ours do refcnt
rt_refcnt properly).

bsdi rt_walkbranch would speedup the processing, but since the code will not
be visited too frequently, the current code (with rt_walktree) should be okay.
2001-01-27 10:39:33 +00:00
itojun
fee00b1a78 mark cloned routes with RTF_CLONED. present it with netstat -r by "c".
let static routes overwrite cloned routes, as cloned routes can come back again
if necessary.  behavior same as freebsd/bsdi, code partially from bsdi42.
(NRL rt->rt_parent was not added)
should fix PR 11916 and maybe some other PRs with ARP behavior.

recompilation of usr.sbin/route6d is suggested.
2001-01-27 04:49:31 +00:00
itojun
df9784d749 pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).
have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost noone is using it anyways).

benefit: the follwoing command now works.  previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *.  it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.
2001-01-17 04:05:41 +00:00
itojun
5eae50d991 update icmp6 too big validation. the change is necessary since pmtud is
mandatory for IPv6 (so we can't just validate by using connected pcb - we need
to allow traffic from unconnected pcb to do pmtud).
- if the traffic is validated by xx_ctlinput, allow up to "hiwat" pmtud
  route entries.
- if the traffic was not validated by xx_ctlinput, allow up to "lowat" pmtud
  route entries (there's upper limit, so bad guys cannot blow up our routing
  table).
sync with kame

XXX need to think again about default hiwat/lowat value.
XXX victim selection to help starvation case
2000-12-09 01:29:45 +00:00
ragge
8a9c114515 Change rt_refcnt from short to int, to allow more than 32k routes thru
one interface without unexpected side effects.
2000-05-04 17:33:03 +00:00
thorpej
0f5c059d1f - Add link status to if_data, so that routing daemons and other interested
parties can easily know the state of a link.
- Define an interface announcement message for the routing socket so that
  routing daemons and other interested parties know when an interface
  is attached/detached.
2000-03-06 20:49:00 +00:00
bouyer
f86517a031 Update protocoles and interfaces stats counters to 64bit.
RTM_IFINFO is now 0xf, 0xe is RTM_OIFINFO which returns the old (if_msghdr14)
struct with 32bit counters (binary compat, conditioned on COMPAT_14).
Same for sysctl: node 3 is renamed NET_RT_OIFLIST, NET_RT_IFLIST is now node 4.
Change rt_msg1() to add an mbuf to the mbuf chain instead of just panic()
when the message is larger than MHLEN.
1999-11-19 10:41:41 +00:00
itojun
06c350054d remove reference to in6_systm.h (file itself will be removed afterwords) 1999-07-30 10:35:34 +00:00
itojun
118d2b1d4f IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
  data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
  package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
  file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.
1999-07-01 08:12:45 +00:00
thorpej
f079e8d39c Simplify the rttimer code somewhat; use TAILQs instead of CIRCLEQs (we
didn't really need to traverse the queues backwards anyhow), and other
minor code simplification.
1998-12-27 18:27:48 +00:00
christos
13d58281de IPX counters and centralize statistics routine. 1998-12-10 15:52:39 +00:00
thorpej
2cffea3962 Use do { ... } while (0) in RTFREE(). 1998-08-25 04:22:33 +00:00
thorpej
5e7c21d896 Need <sys/socket.h> to stand alone. 1998-05-02 21:19:03 +00:00
thorpej
93b075a492 Oops, we depend on <sys/queue.h>. 1998-04-29 17:49:58 +00:00
kml
8cdafd0efb Add generic route timeout functionality; used by path MTU discovery code 1998-04-29 03:41:49 +00:00
christos
964633009c Sync with Lite2. 1997-04-02 21:17:28 +00:00
mycroft
49d52c9b1c Pass a proc pointer down to the usrreq and pcbbind functions for PRU_ATTACH, PRU_BIND and
PRU_CONTROL.  The usrreq interface really needs to be split up, but this will have to wait.
Remove SS_PRIV completely.
1996-05-22 13:54:55 +00:00
christos
206e75c6f1 Net prototypes 1996-02-13 21:59:53 +00:00
jtc
7c04233887 KERNEL -> _KERNEL 1995-03-26 20:23:52 +00:00