Commit Graph

79 Commits

Author SHA1 Message Date
dyoung 72f0a6dfb0 Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing.  Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously.  Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs.  I have
  introduced routines for allocating, copying, and duplicating,
  and freeing sockaddrs:

        struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
        struct sockaddr *sockaddr_copy(struct sockaddr *dst,
                                       const struct sockaddr *src);
        struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
        void sockaddr_free(struct sockaddr *sa);

  sockaddr_alloc() returns either a sockaddr from the pool belonging
  to the specified family, or NULL if the pool is exhausted.  The
  returned sockaddr has the right size for that family; sa_family
  and sa_len fields are initialized to the family and sockaddr
  length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
  sockaddr_in).  sockaddr_free() puts the given sockaddr back into
  its family's pool.

  sockaddr_dup() and sockaddr_copy() work analogously to strdup()
  and strcpy(), respectively.  sockaddr_copy() KASSERTs that the
  family of the destination and source sockaddrs are alike.

  The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
  passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
  family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
  etc.  They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more.  All protocol families
  use struct route.  I have changed the route cache, 'struct route',
  so that it does not contain storage space for a sockaddr.  Instead,
  struct route points to a sockaddr coming from the pool the sockaddr
  belongs to.  I added a new method to struct route, rtcache_setdst(),
  for setting the cache destination:

        int rtcache_setdst(struct route *, const struct sockaddr *);

  rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
  available to create the sockaddr storage.

  It is now possible for rtcache_getdst() to return NULL if, say,
  rtcache_setdst() failed.  I check the return value for NULL
  everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
  caches, dom_rtcache.  rtflushall(sa_family_t af) looks up the
  domain indicated by 'af', walks the domain's list of route caches
  and invalidates each one.
2007-05-02 20:40:22 +00:00
dyoung 5493f188c7 KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
   in6_src.c, avoid casts by changing several route_in6 pointers
   to struct route pointers.  Remove unnecessary casts to caddr_t
   elsewhere.

Pave the way for eliminating address family-specific route caches:
   soon, struct route will not embed a sockaddr, but it will hold
   a reference to an external sockaddr, instead.  We will set the
   destination sockaddr using rtcache_setdst().  (I created a stub
   for it, but it isn't used anywhere, yet.)  rtcache_free() will
   free the sockaddr.  I have extracted from rtcache_free() a helper
   subroutine, rtcache_clear().  rtcache_clear() will "forget" a
   cached route, but it will not forget the destination by releasing
   the sockaddr.  I use rtcache_clear() instead of rtcache_free()
   in rtcache_update(), because rtcache_update() is not supposed
   to forget the destination.

Constify:

   1 Introduce const accessor for route->ro_dst, rtcache_getdst().

   2 Constify the 'dst' argument to ifnet->if_output().  This
     led me to constify a lot of code called by output routines.

   3 Constify the sockaddr argument to protosw->pr_ctlinput.  This
     led me to constify a lot of code called by ctlinput routines.

   4 Introduce const macros for converting from a generic sockaddr
     to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
     satocsin, et cetera.
2007-02-17 22:34:07 +00:00
dyoung a25eaede91 Add a source-address selection policy mechanism to the kernel.
Also, add ioctls SIOCGIFADDRPREF/SIOCSIFADDRPREF to get/set preference
numbers for addresses.  Make ifconfig(8) set/display preference
numbers.

To activate source-address selection policies in your kernel, add
'options IPSELSRC' to your kernel configuration.

Miscellaneous changes in support of source-address selection:

        1 Factor out some common code, producing rt_replace_ifa().

        2 Abbreviate a for-loop with TAILQ_FOREACH().

        3 Add the predicates on IPv4 addresses IN_LINKLOCAL() and
          IN_PRIVATE(), that are true for link-local unicast
          (169.254/16) and RFC1918 private addresses, respectively.
          Add the predicate IN_ANY_LOCAL() that is true for link-local
          unicast and multicast.

        4 Add IPv4-specific interface attach/detach routines,
          in_domifattach and in_domifdetach, which build #ifdef
          IPSELSRC.

See in_getifa(9) for a more thorough description of source-address
selection policy.
2006-11-13 05:13:38 +00:00
liamjfoy 4876c304b1 Integrate Common Address Redundancy Procotol (CARP) from OpenBSD
'pseudo-device	carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@
2006-05-18 09:05:49 +00:00
perry fbae48b901 Change "inline" back to "__inline" in .h files -- C99 is still too
new, and some apps compile things in C89 mode. C89 keywords stay.

As per core@.
2006-02-16 20:17:12 +00:00
perry 0f0296d88a Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete. 2005-12-24 20:45:08 +00:00
christos f01a2f5714 Define INADDR_NONE when we are in the kernel too. 2005-12-20 19:32:30 +00:00
christos 95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
elad 6439f2618f Add sysctls for IP, ICMP, TCP, and UDP statistics. 2005-08-05 09:21:25 +00:00
kim c9f56c04dc Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.
2005-01-31 23:49:36 +00:00
thorpej 7994b6f95e Don't perform checksums on loopback interfaces. They can be reenabled with
the net.inet.*.do_loopback_cksum sysctl.

Approved by: groo
2004-12-15 04:25:19 +00:00
manu 6e3c639957 IPv4 PIM support, based on a submission from Pavlin Radoslavov posted on
tech-net@
2004-09-04 23:29:44 +00:00
jonathan 85b3ba5bf1 Redo net.inet.* sysctl subtree for fast-ipsec from scratch.
Attach FAST-IPSEC statistics with 64-bit counters to new sysctl MIB.
Rework netstat to show FAST_IPSEC statistics, via sysctl,  for
netstat -p ipsec.

New kernel files:
	sys/netipsec/Makefile		(new file; install *_var.h includes)
	sys/netipsec/ipsec_var.h	(new 64-bit mib counter struct)

Changed kernel files:
	sys/Makefile			(recurse into sys/netipsec/)
	sys/netinet/in.h		(fake IP_PROTO name for fast_ipsec
					sysctl subtree.)
	sys/netipsec/ipsec.h		(minimal userspace inclusion)
	sys/netipsec/ipsec_osdep.h	(minimal userspace inclusion)
	sys/netipsec/ipsec_netbsd.c	(redo sysctl subtree from scratch)
	sys/netipsec/key*.c		(fix broken net.key subtree)

	sys/netipsec/ah_var.h		(increase all counters to 64 bits)
	sys/netipsec/esp_var.h		(increase all counters to 64 bits)
	sys/netipsec/ipip_var.h		(increase all counters to 64 bits)
	sys/netipsec/ipcomp_var.h	(increase all counters to 64 bits)

	sys/netipsec/ipsec.c		(add #include netipsec/ipsec_var.h)
	sys/netipsec/ipsec_mbuf.c	(add #include netipsec/ipsec_var.h)
	sys/netipsec/ipsec_output.c	(add #include netipsec/ipsec_var.h)

	sys/netinet/raw_ip.c		(add #include netipsec/ipsec_var.h)
	sys/netinet/tcp_input.c		(add #include netipsec/ipsec_var.h)
	sys/netinet/udp_usrreq.c	(add #include netipsec/ipsec_var.h)

Changes to usr.bin/netstat to print the new fast-ipsec sysctl tree
for "netstat -s -p ipsec":

New file:
	usr.bin/netstat/fast_ipsec.c	(print fast-ipsec counters)

Changed files:
	usr.bin/netstat/Makefile	(add fast_ipsec.c)
	usr.bin/netstat/netstat.h	(declarations for fast_ipsec.c)
	usr.bin/netstat/main.c		(call KAME-vs-fast-ipsec dispatcher)
2004-05-07 00:55:14 +00:00
itojun 0f06e31eb6 no space between function name and paren: foo (blah) -> foo(blah) 2004-04-21 17:49:46 +00:00
matt db6a0b431a De __P() 2004-04-18 21:00:35 +00:00
jonathan 130f3bfc26 Patch back support for (badly) randomized IP ids, by request:
* Include "opt_inet.h" everywhere IP-ids are generated with ip_newid(),
  so the RANDOM_IP_ID option is visible. Also in ip_id(), to ensure
  the prototype for ip_randomid() is made visible.

* Add new sysctl to enable randomized IP-ids, provided the kernel was
  configured with RANDOM_IP_ID. (The sysctl defaults to zero, and is
  a read-only zero if RANDOM_IP_ID is not configured).

Note that the implementation of randomized IP ids is still defective,
and should not be enabled at all (even if configured) without
very careful deliberation. Caveat emptor.
2003-11-19 18:39:34 +00:00
jonathan b86d07f435 Allocate sysctl oid for ipv4 sysctl node "ifq", define symbolic name, and
bump IPCTL_MAXID. (Should have been committed with other ifq sysctl changes).
2003-11-10 20:50:29 +00:00
agc aad01611e7 Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22364, verified by myself.
2003-08-07 16:26:28 +00:00
bjh21 4be7a2dcf3 Add a new feature-test macro, _NETBSD_SOURCE. If this is defined
by the application, all NetBSD interfaces are made visible, even
if some other feature-test macro (like _POSIX_C_SOURCE) is defined.
<sys/featuretest.h> defined _NETBSD_SOURCE if none of _ANSI_SOURCE,
_POSIX_C_SOURCE and _XOPEN_SOURCE is defined, so as to preserve
existing behaviour.

This has two major advantages:
+ Programs that require non-POSIX facilities but define _POSIX_C_SOURCE
  can trivially be overruled by putting -D_NETBSD_SOURCE in their CFLAGS.
+ It makes most of the #ifs simpler, in that they're all now ORs of the
  various macros, rather than having checks for (!defined(_ANSI_SOURCE) ||
  !defined(_POSIX_C_SOURCE) || !defined(_XOPEN_SOURCE)) all over the place.

I've tried not to change the semantics of the headers in any case where
_NETBSD_SOURCE wasn't defined, but there were some places where the
current semantics were clearly mad, and retaining them was harder than
correcting them.  In particular, I've mostly normalised things so that
_ANSI_SOURCE gets you the smallest set of stuff, then _POSIX_C_SOURCE,
_XOPEN_SOURCE and _NETBSD_SOURCE in that order.

Tested by building for vax, encouraged by thorpej, and uncontested in
tech-userlevel for a week.
2003-04-28 23:16:11 +00:00
dogcow 6045dfd329 PR/991: Darren Reed: Add a sysctl (checkinteface) to implement this. This
implementation is taken from FreeBSD, but we default to off.
XXX: We should really do this on a per ifaddr basis as jason suggested.
2003-04-12 00:17:49 +00:00
kleink 8fb0c069dc C++ does not permit static a data member to have the same name as its
class, so in a C++ environment rename the ip_opts member to Ip_opts as
observed in several other implementations; from Jon Olsson in
PR toolchain/19880.
2003-01-27 09:57:09 +00:00
kleink 1b8d8d79a8 Define uint{8,32}_t locally, per XNS5.2/POSIX-2001, and use them in this
header where applicable; use private fixed-width integer types otherwise.
2002-05-13 13:34:32 +00:00
kleink 602066c0d6 Provide local definitions of in_{addr,port}_t in <netinet/in.h> and use
them where deemed appropriate by XNS5.2/POSIX-2001.
2002-05-12 23:04:15 +00:00
martin a7d662b71c Clear M_BCAST and M_MCAST on outgoing mbufs.
Don't copy ttl from the inner packet to the encapsulating packet. Make
the outer ttl sysctl'able. This should close PR 14269 from Jasper Wallace
(change partly from there) and it makes traceroute work over gre tunnels.
2002-02-24 17:22:20 +00:00
thorpej ad9d3794b0 Implement support for IP/TCP/UDP checksum offloading provided by
network interfaces.  This works by pre-computing the pseudo-header
checksum and caching it, delaying the actual checksum to ip_output()
if the hardware cannot perform the sum for us.  In-bound checksums
can either be fully-checked by hardware, or summed up for final
verification by software.  This method was modeled after how this
is done in FreeBSD, although the code is significantly different in
most places.

We don't delay checksums for IPv6/TCP, but we do take advantage of the
cached pseudo-header checksum.

Note: hardware-assisted checksumming defaults to "off".  It is
enabled with ifconfig(8).  See the manual page for details.

Implement hardware-assisted checksumming on the DP83820 Gigabit Ethernet,
3c90xB/3c90xC 10/100 Ethernet, and Alteon Tigon/Tigon2 Gigabit Ethernet.
2001-06-02 16:17:09 +00:00
itojun e44d476e4e typo in comment 2001-05-27 23:46:51 +00:00
itojun 4b72eeeee5 net.inet.ip.maxfragpackets defines the maximum size of ip reass queue
(prevents fragment flood from chewing up mbuf memory space).
derived from KAME net.inet6.ip6.maxfragpackets.
2001-03-27 02:24:38 +00:00
kleink 4c96c6b51f Add IPPROTO_VRRP. 2001-01-19 09:01:48 +00:00
simonb 94d3076df3 #define<tab> cleanup. 2000-08-28 02:12:22 +00:00
tron a97bfde931 Add new sysctl variables "net.inet.ip.lowportmin" and
"net.inet.ip.lowportmax" which can be used to the set minimum
and maximum port number assigned to sockets using
IP_PORTRANGE_LOW.
2000-08-25 13:35:05 +00:00
kleink 079b94ad72 Avoid recursion with traditional cpp. 2000-07-28 12:13:32 +00:00
kleink d2787dad27 XNS5.2: define sa_family_t and use it where specified by the standard. 2000-06-26 15:48:19 +00:00
itojun 673e8e6fad move IPPROTO_DONE to IPPROTO_xx group 2000-03-10 15:30:55 +00:00
darrenr fd7edad6c3 Change the use of pfil hooks. There is no longer a single list of all
pfil information, instead, struct protosw now contains a structure
which caontains list heads, etc.  The per-protosw pfil struct is passed
to pfil_hook_get(), along with an in/out flag to get the head of the
relevant filter list.  This has been done for only IPv4 and IPv6, at
present, with these patches only enabling filtering for IPPROTO_IP and
IPPROTO_IPV6, although it is possible to have tcp/udp, etc, dedicated
filters now also.  The ipfilter code has been updated to only filter
IPv4 packets - next major release of ipfilter is required for ipv6.
2000-02-17 10:59:32 +00:00
itojun 59d74f3d21 to improve RFC2553/2292 compliance, and promote use of
RFC2553/2292-compliant header file path, now the following headers are
forbidden:
	netinet6/ip6.h
	netinet6/icmp6.h
	netinet6/in6.h

if you want netinet6/{ip6,icmp6}.h, use netinet/{ip6,icmp6}.h.

if you want netinet6/in6.h, you just need to include netinet/in.h.
it pulls it in.
(we may need to integrate them into netinet/in.h, but for cross-BSD code
sharing i'd like to keep it like this for now)
2000-02-09 00:54:55 +00:00
itojun ea861f0183 sync IPv6 part with latest KAME tree. IPsec part is left unmodified
due to massive changes in KAME side.
- IPv6 output goes through nd6_output
- faith can capture IPv4 packets as well - you can run IPv4-to-IPv6 translator
  using heavily modified DNS servers
- per-interface statistics (required for IPv6 MIB)
- interface autoconfig is revisited
- udp input handling has a big change for mapped address support.
- introduce in4_cksum() for non-overwriting checksumming
- introduce m_pulldown()
- neighbor discovery cleanups/improvements
- netinet/in.h strictly conforms to RFC2553 (no extra defs visible to userland)
- IFA_STATS is fixed a bit (not tested)
- and more more more.

TODO:
- cleanup os-independency #ifdef
- avoid rcvif dual use (for IPsec) to help ifdetach

(sorry for jumbo commit, I can't separate this any more...)
1999-12-13 15:17:17 +00:00
thorpej 046d593425 Add the `packed' attribute to structures which describe wire protocol data. 1999-11-20 00:37:58 +00:00
itojun f8346292af move ipsec sysctl index to IPPROTO_AH (instead of IPPROTO_ESP),
so that you can perform sysctl operation when ESP is not compiled in.
1999-07-02 08:46:47 +00:00
itojun 118d2b1d4f IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
  data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
  package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
  file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.
1999-07-01 08:12:45 +00:00
sommerfeld b7c70d2b2f If the new global variable hostzerobroadcast is zero, no longer assume
address zero of each net/subnet is a broadcast address.
(The default value is nonzero, which preserves the current behavior).

This can be set using sysctl; the boot-time default can also be
configured using the HOSTZEROBROADCAST kernel config option.

While we're here, defopt HOSTZEROBROADCAST and SUBNETSARELOCAL
1999-06-26 06:16:47 +00:00
hwr cf70cc28c7 Typo. :( 1998-09-14 21:15:56 +00:00
hwr 517139017e Some additions.
And IDPR-CMTP is 38 not 39 according to IANA.
1998-09-14 21:09:51 +00:00
hwr 366b9c4515 Add a gre tunnel pseudo network device. Gre = generic route encapsulation.
This device shows up like any other network interface and can be used to
tunnel L3 protocols as e.g. IP over IP.
1998-09-13 20:27:47 +00:00
kleink bb4f7768e4 Protect _XOPEN_SOURCE against sysctl MIB identifiers. 1998-09-05 19:03:25 +00:00
matt 36eac04cc0 Default IP flow to being enabled. Add a sysctl to control the maximum
number of flows (net.inet.ip.maxflows).  If set to 0, will disable fast
path forwarding.
1998-05-04 19:24:53 +00:00
kml 1579dcec47 Add support for deletion of routes added by path MTU discovery;
uses new generic route timeout code.  Add sysctl for timeout period.
1998-04-29 03:44:11 +00:00
perry f73530ba55 add/cleanup multiple inclusion protection. 1998-02-10 01:26:19 +00:00
lukem c80b4400e5 add the following, derived from FreeBSD:
* IP_PORTRANGE socket option, which controls how the ephemeral ports
  are allocated. it takes the following settings:
	IP_PORTRANGE_DEFAULT	use anonportmin (49152) -> anonportmax (65535)
	IP_PORTRANGE_HIGH	as IP_PORTRANGE_DEFAULT (retained for FreeBSD
				compat reasons, where these are separate)
	IP_PORTRANGE_LOW	use 600 -> 1023. only works if uid==0.
* in_pcb flag INP_ANONPORT. set if port was allocated ephmerally
1998-01-07 22:51:22 +00:00
lukem 1f8f74b669 enhance ephemeral port allocation code:
* support sysctl net.inet.ip.anonportmin (lowest ephemeral port)
  and net.inet.ip.anonportmax (highest ephemeral port).
  these can't be set to >65535, < IPPORT_RESERVED (unless IPNOPRIVPORTS
  is defined), and anonportmin has to be < anonportmax.
* use a cleaner way of only cycling through the available set once;
  this will be useful for when a random allocation scheme is used
* define IPPORT_ANON{MIN,MAX} instead of IPPORT_USER{LOW,HIGH}
1998-01-05 09:52:02 +00:00
lukem 0b57ba7265 as per the IANA assigned ports numbers document, use ports
49152..65535 for ephemeral ports (instead of 1024..5000).
closes my [kern/4440], but with correct code :)
1997-12-30 02:54:08 +00:00