Commit Graph

130 Commits

Author SHA1 Message Date
dyoung
de87fe677d *** Summary ***
When a link-layer address changes (e.g., ifconfig ex0 link
02🇩🇪ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address.  (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior.  Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability.  KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR.  In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr.  That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR.  In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR.  For example, pull ..._init() out of any switch
statement that looks like this:

        switch (...->sa_family) {
        case ...:
                ..._init();
                ...
                break;
        ...
        default:
                ..._init();
                ...
                break;
        }

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

        switch (x & (IFF_UP|IFF_RUNNING)) {
        case 0:
                ...
                break;
        case IFF_RUNNING:
                ...
                break;
        case IFF_UP:
                ...
                break;
        case IFF_UP|IFF_RUNNING:
                ...
                break;
        }

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure.  Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls.  In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source.  In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively.  Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset.  Delete unnecessary casts to void *.  Use
sockaddr_in_init() and sockaddr_in6_init().  Compare pointers with
NULL instead of "testing truth".  Replace some instances of (type
*)0 with NULL.  Change some K&R prototypes to ANSI C, and join
lines.
2008-11-07 00:20:01 +00:00
dyoung
cf969cfa5a Constify the rt_addrinfo argument to the ifa_rtrequest member
function of struct ifaddr.
2008-10-24 17:07:33 +00:00
dyoung
ee1bfcb3e8 bzero -> memset. Do not "test truth" of pointers, but compare with
NULL, instead.  Do not gratuitously cast to void *.  Use NULL
instead of (type *)0.

No functional changes intended.
2008-10-24 16:54:18 +00:00
dyoung
9e7ef562d2 Simplify RT_DPRINTF() calls. 2008-05-15 01:33:28 +00:00
dyoung
323b0fda0c Compare route with NULL instead of testing truth. Where applicable,
s/0/NULL/.  s/u_char/uint8_t/.  Remove superfluous curly braces.
2008-05-11 20:19:44 +00:00
ad
15e29e981b Merge the socket locking patch:
- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
2008-04-24 11:38:36 +00:00
thorpej
0dd41b37de Make ip6 and icmp6 stats per-cpu. 2008-04-15 03:57:04 +00:00
thorpej
aa8724ff7b Change ICMP6 stats from a structure to an array of uint64_t's.
Note: This is ABI-compatible with the old icmp6stat structure; old netstat
binaries will continue to work properly.
2008-04-08 15:04:35 +00:00
dyoung
5bbde3d775 Use IFNET_FOREACH() and IFADDR_FOREACH(). 2007-12-04 10:27:33 +00:00
dyoung
c44fbd164e Use sockaddr_in6_init(). Use a static initializer for all1_sa.
Constify a cast (may as well).  No functional change intended.
2007-11-10 00:07:57 +00:00
dyoung
122b86e247 De-__P(). 2007-11-01 20:33:56 +00:00
dyoung
88399b6877 We cannot sleep in a software interrupt, so do not sockaddr_dl_alloc(...,
M_WAITOK).  Instead, sockaddr_dl_init() a sockaddr_dl on the stack.
2007-09-02 19:42:21 +00:00
dyoung
b3fc296326 Use malloc(9) for sockaddrs instead of pool(9), and remove dom_sa_pool
and dom_sa_len members from struct domain.  Pools of fixed-size
objects are too rigid for sockaddr_dls, whose size can vary over
a wide range.

Return sockaddr_dl to its "historical" size.  Now that I'm using
malloc(9) instead of pool(9) to allocate sockaddr_dl, I can create
a sockaddr_dl of any size in the kernel, so expanding sockaddr_dl
is useless.

Avoid using sizeof(struct sockaddr_dl) in the kernel.

Introduce sockaddr_dl_alloc() for allocating & initializing an
arbitrary sockaddr_dl on the heap.

Add an argument, the sockaddr length, to sockaddr_alloc(),
sockaddr_copy(), and sockaddr_dl_setaddr().

Constify: LLADDR() -> CLLADDR().

Where the kernel overwrites LLADDR(), use sockaddr_dl_setaddr(),
instead.  Used properly, sockaddr_dl_setaddr() will not overrun
the end of the sockaddr.
2007-08-30 02:17:34 +00:00
dyoung
27de48611a Avoid writing past the end of the buffer [lldst, lldst + dstsize)
in nd6_storelladdr().

Use sockaddr_dl_setaddr().  Constify some sockaddr_dl's.  Constify
a sockaddr argument to nd6_na_output().  Change SDL() to "standard"
satocsdl() or satosdl().  Change SIN6() to satocsin6() or satosin6().

bcmp -> memcmp, bcopy -> memcpy.
2007-08-07 04:35:42 +00:00
dyoung
08e6f22226 Take steps to hide the radix_node implementation of the forwarding table
from the forwarding table's users:

        Introduce rt_walktree() for walking the routing table and
        applying a function to each rtentry.  Replace most
        rn_walktree() calls with it.

        Use rt_getkey()/rt_setkey() to get/set a route's destination.
        Keep a pointer to the sockaddr key in the rtentry, so that
        rtentry users do not have to grovel in the radix_node for
        the key.

        Add a RTM_GET method to rtrequest.  Use that instead of
        radix_node lookups in, e.g., carp(4).

Add sys/net/link_proto.c, which supplies sockaddr routines for
link-layer socket addresses (sockaddr_dl).

Cosmetic:

        Constify.  KNF.  Stop open-coding LIST_FOREACH, TAILQ_FOREACH,
        et cetera.  Use NULL instead of 0 for null pointers.  Use
        __arraycount().  Reduce gratuitous parenthesization.

        Stop using variadic arguments for rip6_output(), it is
        unnecessary.

        Remove the unnecessary rtentry member rt_genmask and the
        code to maintain it, since nothing actually used it.

        Make rt_maskedcopy() easier to read by using meaningful variable
        names.

        Extract a subroutine intern_netmask() for looking up a netmask in
        the masks table.

        Start converting backslash-ridden IPv6 macros in
        sys/netinet6/in6_var.h into inline subroutines that one
        can read without special eyeglasses.

One functional change: when the kernel serves an RTM_GET, RTM_LOCK,
or RTM_CHANGE request, it applies the netmask (if supplied) to a
destination before searching for it in the forwarding table.

I have changed sys/netinet/ip_carp.c, carp_setroute(), to remove
the unlawful radix_node knowledge.

Apart from the changes to carp(4), netiso, ATM, and strip(4), I
have run the changes on three nodes in my wireless routing testbed,
which involves IPv4 + IPv6 dynamic routing acrobatics, and it's
working beautifully so far.
2007-07-19 20:48:52 +00:00
ad
88ab7da936 Merge some of the less invasive changes from the vmlocking branch:
- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
2007-07-09 20:51:58 +00:00
dyoung
1db31a59af Fix the memory leak reported in kern/36337. Thanks Matthias Scheler
for the heads-up.  My fix is based on the following patches from
FreeBSD, however, I extracted the code into a subroutine,
nd6_llinfo_release_pkts():

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet6/nd6.c.diff?r1=1.48.2.18;r2=1.48.2.19
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet6/nd6_nbr.c.diff?r1=1.29.2.8;r2=1.29.2.9
2007-05-17 00:53:26 +00:00
dyoung
72f0a6dfb0 Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing.  Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously.  Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs.  I have
  introduced routines for allocating, copying, and duplicating,
  and freeing sockaddrs:

        struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
        struct sockaddr *sockaddr_copy(struct sockaddr *dst,
                                       const struct sockaddr *src);
        struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
        void sockaddr_free(struct sockaddr *sa);

  sockaddr_alloc() returns either a sockaddr from the pool belonging
  to the specified family, or NULL if the pool is exhausted.  The
  returned sockaddr has the right size for that family; sa_family
  and sa_len fields are initialized to the family and sockaddr
  length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
  sockaddr_in).  sockaddr_free() puts the given sockaddr back into
  its family's pool.

  sockaddr_dup() and sockaddr_copy() work analogously to strdup()
  and strcpy(), respectively.  sockaddr_copy() KASSERTs that the
  family of the destination and source sockaddrs are alike.

  The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
  passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
  family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
  etc.  They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more.  All protocol families
  use struct route.  I have changed the route cache, 'struct route',
  so that it does not contain storage space for a sockaddr.  Instead,
  struct route points to a sockaddr coming from the pool the sockaddr
  belongs to.  I added a new method to struct route, rtcache_setdst(),
  for setting the cache destination:

        int rtcache_setdst(struct route *, const struct sockaddr *);

  rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
  available to create the sockaddr storage.

  It is now possible for rtcache_getdst() to return NULL if, say,
  rtcache_setdst() failed.  I check the return value for NULL
  everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
  caches, dom_rtcache.  rtflushall(sa_family_t af) looks up the
  domain indicated by 'af', walks the domain's list of route caches
  and invalidates each one.
2007-05-02 20:40:22 +00:00
dyoung
95b277379f In nd6_rtrequest(), when we lookup/create a route whose destination
is equal to one of the host's IPv6 addresses, do not stop at setting
the route's interface to lo0, but also clear the route's RTF_CLONED
flag, if it is present, so that ip6_input() will accept packets
sent to that destination.  This is necessary because ip6_input()
will not accept a packet if it looks up the packet's destination
and finds a route with RTF_CLONED set.

I believe this will help IPv6 networking survive '/etc/rc.d/network
restart'.  See the problem report, kern/33279.
2007-03-17 06:32:46 +00:00
dyoung
833cc39940 In nd6_lookup, shorten a staircase. KNF: change return (expr); to
return expr; throughout.  Fix K&R prototypes and parameter type
declarations.
2007-03-15 23:35:25 +00:00
christos
53524e44ef Kill caddr_t; there will be some MI fallout, but it will be fixed shortly. 2007-03-04 05:59:00 +00:00
dyoung
5493f188c7 KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
   in6_src.c, avoid casts by changing several route_in6 pointers
   to struct route pointers.  Remove unnecessary casts to caddr_t
   elsewhere.

Pave the way for eliminating address family-specific route caches:
   soon, struct route will not embed a sockaddr, but it will hold
   a reference to an external sockaddr, instead.  We will set the
   destination sockaddr using rtcache_setdst().  (I created a stub
   for it, but it isn't used anywhere, yet.)  rtcache_free() will
   free the sockaddr.  I have extracted from rtcache_free() a helper
   subroutine, rtcache_clear().  rtcache_clear() will "forget" a
   cached route, but it will not forget the destination by releasing
   the sockaddr.  I use rtcache_clear() instead of rtcache_free()
   in rtcache_update(), because rtcache_update() is not supposed
   to forget the destination.

Constify:

   1 Introduce const accessor for route->ro_dst, rtcache_getdst().

   2 Constify the 'dst' argument to ifnet->if_output().  This
     led me to constify a lot of code called by output routines.

   3 Constify the sockaddr argument to protosw->pr_ctlinput.  This
     led me to constify a lot of code called by ctlinput routines.

   4 Introduce const macros for converting from a generic sockaddr
     to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
     satocsin, et cetera.
2007-02-17 22:34:07 +00:00
christos
1665d5e960 fix spelling of accommodate; from Zapher. 2006-11-24 19:46:58 +00:00
dyoung
810f665a21 Use LIST_/TAILQ_ macros, esp. LIST_FOREACH() and TAILQ_FOREACH().
Use the usual idiom for iterating over a list where we might
_REMOVE() entries,

        for (x = TAILQ_FIRST(...); x != NULL; x = nx) {
                nx = TAILQ_NEXT(x, ...);
                ...
        }
2006-11-20 04:34:16 +00:00
christos
168cd830d2 __unused removal on arguments; approved by core. 2006-11-16 01:32:37 +00:00
dyoung
a25eaede91 Add a source-address selection policy mechanism to the kernel.
Also, add ioctls SIOCGIFADDRPREF/SIOCSIFADDRPREF to get/set preference
numbers for addresses.  Make ifconfig(8) set/display preference
numbers.

To activate source-address selection policies in your kernel, add
'options IPSELSRC' to your kernel configuration.

Miscellaneous changes in support of source-address selection:

        1 Factor out some common code, producing rt_replace_ifa().

        2 Abbreviate a for-loop with TAILQ_FOREACH().

        3 Add the predicates on IPv4 addresses IN_LINKLOCAL() and
          IN_PRIVATE(), that are true for link-local unicast
          (169.254/16) and RFC1918 private addresses, respectively.
          Add the predicate IN_ANY_LOCAL() that is true for link-local
          unicast and multicast.

        4 Add IPv4-specific interface attach/detach routines,
          in_domifattach and in_domifdetach, which build #ifdef
          IPSELSRC.

See in_getifa(9) for a more thorough description of source-address
selection policy.
2006-11-13 05:13:38 +00:00
christos
4d595fd7b1 - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
2006-10-12 01:30:41 +00:00
christos
7ca3c62d31 - fix initializers
- add const
- remove dead code
2006-09-02 07:22:44 +00:00
kardel
de4337ab21 merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
  time.tv_sec -> time_second
- struct timeval mono_time is gone
  mono_time.tv_sec -> time_uptime
- access to time via
	{get,}{micro,nano,bin}time()
	get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
  Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
  NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
2006-06-07 22:33:33 +00:00
liamjfoy
4876c304b1 Integrate Common Address Redundancy Procotol (CARP) from OpenBSD
'pseudo-device	carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@
2006-05-18 09:05:49 +00:00
christos
826c34719e Coverity CID 857: Prevent NULL deref. 2006-04-15 00:09:29 +00:00
rpaulo
1acb7de56f From KAME via SUZUKI Shinsuke:
fixed a memory leak when net.inet6.icmp6.nd6_maxqueuelen is
	greater than 1.
2006-03-24 19:24:38 +00:00
rpaulo
8c2379fd97 NDP-related improvements:
RFC4191
	- supports host-side router-preference

	RFC3542
	- if DAD fails on a interface, disables IPv6 operation on the
          interface
	- don't advertise MLD report before DAD finishes

	Others
	- fixes integer overflow for valid and preferred lifetimes
	- improves timer granularity for MLD, using callout-timer.
	- reflects rtadvd's IPv6 host variable information into kernel
	  (router only)
	- adds a sysctl option to enable/disable pMTUd for multicast
          packets
	- performs NUD on PPP/GRE interface by default
	- Redirect works regardless of ip6_accept_rtadv
	- removes RFC1885-related code

From the KAME project via SUZUKI Shinsuke.
Reviewed by core.
2006-03-05 23:47:08 +00:00
rpaulo
eb35daf5b2 Fix typos in comments.
From: the KAME project via SUZUKI Shinsuke.
2006-03-03 14:07:06 +00:00
dyoung
8bc5b41214 In nd6_llinfo_timer, don't duplicate part of nd6_llinfo_settimer's
logic, and then call nd6_llinfo_settimer.   Instead, call
nd6_llinfo_settimer immediately.

This should cause no functional change.  I've been running this
patch for months.
2006-03-02 05:11:31 +00:00
rpaulo
78678b130a Better support of IPv6 scoped addresses.
- most of the kernel code will not care about the actual encoding of
  scope zone IDs and won't touch "s6_addr16[1]" directly.
- similarly, most of the kernel code will not care about link-local
  scoped addresses as a special case.
- scope boundary check will be stricter.  For example, the current
  *BSD code allows a packet with src=::1 and dst=(some global IPv6
  address) to be sent outside of the node, if the application do:
    s = socket(AF_INET6);
    bind(s, "::1");
    sendto(s, some_global_IPv6_addr);
  This is clearly wrong, since ::1 is only meaningful within a single
  node, but the current implementation of the *BSD kernel cannot
  reject this attempt.
- and, while there, don't try to remove the ff02::/32 interface route
  entry in in6_ifdetach() as it's already gone.

This also includes some level of support for the standard source
address selection algorithm defined in RFC3484, which will be
completed on in the future.

From the KAME project via JINMEI Tatuya.
Approved by core@.
2006-01-21 00:15:35 +00:00
christos
95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
christos
2ab31527e2 - avoid shadowed variables
- sprinkle const.
2005-05-29 21:43:51 +00:00
seanb
40b52d3132 - Arithmetic error when calculating ticks to nd6_llinfo_settimer().
- Reviewed by christos.
2005-05-27 22:26:25 +00:00
tron
6589458a53 Make sure that prefixes get purged. This fixes PR kern/21189,
PR kern/25968 and PR kern/27873.
2005-04-03 11:02:27 +00:00
peter
396b87b8c2 Convert lo(4) to a clonable device.
This also removes the loif array and changes all code to use the new
lo0ifp pointer which points to the lo0 ifnet structure.

Approved by christos.
2004-12-04 16:10:25 +00:00
itojun
32e4b55076 do not loop on nd6_output() when transmission fails. from kame 2004-05-19 17:45:05 +00:00
itojun
6c8714a95e avoid ugly typecast 2004-02-11 10:37:33 +00:00
simonb
a2facef339 Remove some assigned-to but otherwise unused variables. 2003-10-30 01:43:08 +00:00
itojun
4e6aca94c2 correct missing inclusion of opt_ipsec.h 2003-08-22 22:11:44 +00:00
itojun
2cadb8ca7a split ND6 cache timer management to per-entry. increased accuracy,
no O(N) loop.   sync w/ kame
2003-06-27 08:41:08 +00:00
itojun
6d4a3c4191 remove unneeded checks of accept_rtadv. from kame 2003-06-24 07:54:47 +00:00
itojun
adb5d5afb4 * kame/sys/netinet6/nd6.c (nd6_rtrequest): changed a condition to
decide whether to create an empty llinfo stricter so that a user
can manually change the link-layer address of an existing neighbor
cache.
Pointed out by: KIU Shueng Chuan

from kame
2003-06-24 07:49:03 +00:00
itojun
194f048bd9 use time.tv_sec directly 2003-06-24 07:39:24 +00:00
itojun
5b0c3f9506 clear ln_hold earlier. from kame 2003-06-24 07:32:03 +00:00