Commit Graph

1970 Commits

Author SHA1 Message Date
plunky
8094317b1b constify sockopt in the PRCO_SETOPT path 2008-08-16 21:51:43 +00:00
tls
dba208aabd Change copyright statement to NetBSD 2-clause with correct attribution. 2008-08-10 14:07:41 +00:00
cegger
bbae282081 make this compile as proposed by dholland@ 2008-08-07 06:20:14 +00:00
plunky
fd7356a917 Convert socket options code to use a sockopt structure
instead of laying everything into an mbuf.

approved by core
2008-08-06 15:01:23 +00:00
spz
79462c037e typo fix in comment (drops the ' in drop's :) 2008-08-04 07:01:05 +00:00
matt
3e368ad90b Free the socket only after disposing of the PCB. 2008-08-04 06:29:58 +00:00
tls
c5ddeafa76 Unlock reassembly queue before calling sorwakeup(), not after. In unusual
cases with in-kernel consumers which might send data on the same socket,
we can deadlock on the reassembly queue otherwise (observed while testing
accept filters).
2008-08-04 04:08:47 +00:00
tls
717f903a98 Add accept filters, ported from FreeBSD by Coyote Point Systems. Add inetd
support for specifying an accept filter for a service (mostly as a usage
example, but it can be handy for other things).  Manual pages to follow
in a day or so.

OK core@.
2008-08-04 03:55:47 +00:00
matt
34ac358652 Reacquire softnet_lock after calling soabort which returns with the socket
unlocked.
2008-07-28 18:41:07 +00:00
cyber
76c8d40dd1 Add IANA allocation and header for RFC 5006 (RA RDNSS) IPv6 Router
Advertisement option.
2008-07-11 07:35:05 +00:00
ad
c4e6bfaf85 tcp_input: add a couple of assertions. 2008-07-04 18:22:21 +00:00
ad
4c75eca868 syn_cache_get: remove new endpoint's socket from head's queue if aborting
the connection. Should fix KASSERT(so->so_head == NULL).
2008-07-03 15:35:28 +00:00
yamt
fff57c5525 merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@.  requested by core@
2008-06-18 09:06:25 +00:00
dyoung
a8ad22e5d9 Don't cast to void * unnecessarily. 2008-05-22 01:06:39 +00:00
dyoung
518ccec3d5 bzero -> memset, bcopy -> memcpy. 2008-05-13 18:24:01 +00:00
dyoung
0f58320be0 Cosmetic: use __arraycount(). s/0/NULL/ where appropriate. Pass
"null" instead of 0 to printf %s.  Remove superfluous parentheses
in return statements.  Compare pointers with NULL instead of "testing
truth."
2008-05-13 17:51:26 +00:00
dyoung
62c140415f Cosmetic: compare sa_family with AF_UNSPEC instead of testing truth.
Join a line.  Compare sa_len with 0 instead of testing truth.
2008-05-11 20:17:59 +00:00
dyoung
df0b11bb4e Use memset() instead of Bzero().
In arplookup1(), put the static sockaddr_inarp onto the stack, and
zero it before use.
2008-05-11 20:16:12 +00:00
taca
fd376618e5 Make sure to clear csum_flags before forward the packet.
This change should be fix DIAGNOSTIC kernel's panic when the machine act
as multicast router.

Advised from tls@ and approved by thorpej@.
2008-05-08 08:00:55 +00:00
ad
e071d39c84 - Convert hashinit() to use kmem_alloc(). The hash tables can be large
and it's better to not have them in kmem_map.
- Convert a couple of minor items along the way to kmem_alloc().
- Fix some memory leaks.
2008-05-05 17:11:16 +00:00
thorpej
b129a80c20 Simplify the interface to netstat_sysctl() and allocate space for
the collated counters using kmem_alloc().

PR kern/38577
2008-05-04 07:22:14 +00:00
ad
2830fe3488 PR kern/38497 Out of memory allocating ksiginfo
Work around: don't acquire softnet_lock in protocol drain routines.
2008-05-02 13:40:32 +00:00
martin
ce099b4099 Remove clause 3 and 4 from TNF licenses 2008-04-28 20:22:51 +00:00
yamt
4f47226d42 udp_init: don't forget to allocate udp6stat_percpu. 2008-04-26 08:13:59 +00:00
yamt
167fe02fc8 tcp_init: don't forget to allocate tcpstat_percpu. 2008-04-26 08:13:35 +00:00
ad
15e29e981b Merge the socket locking patch:
- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
2008-04-24 11:38:36 +00:00
thorpej
caf49ea572 Make IPSEC and FAST_IPSEC stats per-cpu. Use <net/net_stats.h> and
netstat_sysctl().
2008-04-23 06:09:04 +00:00
thorpej
33326077b1 Use <net/net_stats.h> / netstat_sysctl(). 2008-04-23 05:26:50 +00:00
dyoung
71455e2d0d C99 does not allow u_int8_t bitfields, so use unsigned int, instead. 2008-04-16 20:58:35 +00:00
thorpej
83dd106948 Make IGMP stats per-cpu. 2008-04-15 16:02:03 +00:00
thorpej
881a947288 Make ARP stats per-cpu. 2008-04-15 15:17:54 +00:00
thorpej
1121526b25 Make CARP status per-cpu. 2008-04-15 06:03:28 +00:00
thorpej
c2da059bc6 Make udp6 stats per-cpu. 2008-04-15 04:43:25 +00:00
thorpej
0dd41b37de Make ip6 and icmp6 stats per-cpu. 2008-04-15 03:57:04 +00:00
thorpej
7ff8d08aae Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.
2008-04-12 05:58:22 +00:00
dyoung
2527883e86 s/8/NBBY/ 2008-04-10 18:09:14 +00:00
thorpej
04e54b2ef5 - ipflow is not used outside ip_flow.c; move its definition there.
- Make ipflow_reap() private to ip_flow.c, and introduce ipflow_prune()
  for external callers to use (avoids returning an ipflow * that is never
  actually used anyway).
2008-04-09 05:14:20 +00:00
thorpej
3f466bce48 Change IPv6 stats from a structure to an array of uint64_t's.
Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.
2008-04-08 23:37:43 +00:00
thorpej
aa8724ff7b Change ICMP6 stats from a structure to an array of uint64_t's.
Note: This is ABI-compatible with the old icmp6stat structure; old netstat
binaries will continue to work properly.
2008-04-08 15:04:35 +00:00
thorpej
f5c68c0b9f Change TCP stats from a structure to an array of uint64_t's.
Note: This is ABI-compatible with the old tcpstat structure; old netstat
binaries will continue to work properly.
2008-04-08 01:03:58 +00:00
thorpej
88d65e9212 Change IP stats from a structure to an array of uint64_t's.
Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.
2008-04-07 06:31:27 +00:00
thorpej
738aabaf82 Change UDP stats from a structure to an array of uint64_t's.
Note: This is ABI-compatible with the old icmpstat structure; old netstat
binaries will continue to work properly.
2008-04-06 20:17:27 +00:00
thorpej
67b7abb1ce Change ICMP stats from a structure to an array of uint64_t's.
Note: This is ABI-compatible with the old icmpstat structure; old netstat
binaries will continue to work properly.
2008-04-06 19:04:48 +00:00
cube
564b60af35 - Make sure we send a reasonable fragment size when IPSEC is configured.
Otherwise we end up sending a dubious "0" whenever we cannot find a
  proper association for the packet.
- Reset sack_newdata along with snd_nxt to avoid improper integer
  arithmetics that lead to sending data from an incorrect place in the
  stream, making it appear as corrupted.

Patch by Michael Van Elst, based on an analysis by Michael for the IPSEC
stuff and I for the SACK issue.
2008-03-27 00:18:56 +00:00
ws
8297b01db8 Set scope on IPv6 multicast address to give carp a chance to work for IPv6, too.
From FreeBSD.
2008-03-15 16:44:03 +00:00
rmind
c6186face4 Welcome to 4.99.55:
- Add a lot of missing selinit() and seldestroy() calls.

- Merge selwakeup() and selnotify() calls into a single selnotify().

- Add an additional 'events' argument to selnotify() call.  It will
  indicate which event (POLL_IN, POLL_OUT, etc) happen.  If unknown,
  zero may be used.

Note: please pass appropriate value of 'events' where possible.
Proposed on: <tech-kern>
2008-03-01 14:16:49 +00:00
matt
a34217b8de Rework tcp congctl selection code so that the congctl entries can be const.
Don't access tcp_congctl stuff outside of tcp_congctl.c, use routines to
update t_congctl.  This code is slightly now more complicated.
2008-02-29 07:39:17 +00:00
matt
a4a1e5ce55 Convert stragglers to ansi definitions from old-style definitons.
Remember that func() is not ansi, func(void) is.
2008-02-27 19:41:51 +00:00
yamt
c3985cffec make TCP_SETUP_ACK, ICMP_CHECK, TCP_FIELDS_TO_HOST, and TCP_FIELDS_TO_NET
static functions.
2008-02-20 11:44:07 +00:00
joerg
862a285bde Explicitly predict panic conditions as false. 2008-02-12 13:05:55 +00:00
joerg
80b711a35e Reimplement in4_cksum to not copy data, but sum up directly.
Tested on sparc and m68k by martin@.
2008-02-07 22:45:20 +00:00
matt
fb71901dbc Add a new ip_id generation scheme based on a Fisher-Yates shuffle over a
sliding window.  XXX replace use of arc4random RSN.
2008-02-06 03:20:50 +00:00
yamt
f35baba8dd - start tcp timestamp from 1 instead of 0.
- add a comment to explain why:
+        * We start with 1, because 0 doesn't work with linux, which
+        * considers timestamp 0 in a SYN packet as a bug and disables
+        * timestamps.
2008-02-05 09:38:47 +00:00
yamt
d5bac2f6b1 redo tcp_input.c rev.1.230 correctly.
revision 1.230
    date: 2005/06/30 02:58:28;  author: christos;  state: Exp;  lines: +20 -4
    Normalize our PAWS code with Free and Open, as mentioned in tech-security.

reviewed by christos@ and matt@.
2008-02-04 23:56:14 +00:00
yamt
a944f4302a revert tcp_output.c 1.253 because it has an ill effect when sending
small (not full-sized) segments.
http://mail-index.NetBSD.org/tech-net/2008/01/27/0009.html
2008-01-29 12:34:47 +00:00
joerg
6e869e402d Refactor in_cksum/in4_cksum/in6_cksum implementations:
- All three functions are included in the kernel by default.
  They call a backend function cpu_in_cksum after possibly
  computing the checksum of the pseudo header.
- cpu_in_cksum is the core to implement the one-complement sum.
  The default implementation is moderate fast on most platforms
  and provides a 32bit accumulator with 16bit addends for L32 platforms
  and a 64bit accumulator with 32bit addends for L64 platforms.
  It handles edge cases like very large mbuf chains (could happen with
  native IPv6 in the future) and provides a good base for new native
  implementations.
- Modify i386 and amd64 assembly to use the new interface.

This disables the MD implementations on !x86 until the conversion is
done. For Alpha, the portable version is faster.
2008-01-25 21:12:10 +00:00
joerg
3615cf7715 Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.
2008-01-20 18:09:03 +00:00
dyoung
2d4e7e5856 Use rtcache_validate() instead of rtcache_getrt(). Shorten staircase
in in_losing().
2008-01-14 04:19:09 +00:00
dyoung
1386ee4adf Good-bye, rtcache_check(). Call both rtcache_validate() and
rtcache_update(,1) instead of rtcache_check().
2008-01-12 02:58:58 +00:00
joerg
71c98bab0d When not compiling for the kernel, use stdio.h instead of sys/systm.h
(printf) and locally define the protoype. Makes it possible to use
in_cksum.c for regression testing.
2008-01-09 17:13:52 +00:00
joerg
a7a33965fc Anyone seriously interested in implementing in_cksum on a new platform
should read RFC 1071, so point them to it.
2008-01-09 17:01:46 +00:00
dyoung
f9c1ba02ee Constify a bit. 2008-01-04 23:28:07 +00:00
dyoung
a4455600d4 Replace rtcache_down() with rtcache_validate() and update rtcache_down()
uses.
2008-01-04 23:26:44 +00:00
degroote
d23595095d Restore correctly the sp level in case of FAST_IPSEC + IPSEC_NAT_T 2007-12-29 15:13:55 +00:00
degroote
61e79ba32a Simplify the FAST_IPSEC output path
Only record an IPSEC_OUT_DONE tag when we have finished the processing
In ip{,6}_output, check this tag to know if we have already processed this
packet.
Remove some dead code (IPSEC_PENDING_TDB is not used in NetBSD)

Fix pr/36870
2007-12-29 14:53:24 +00:00
perry
b6a2ef7569 Convert many of the uses of __attribute__ to equivalent
__packed, __unused and __dead macros from cdefs.h
2007-12-25 18:33:32 +00:00
matt
f7dbcaa3d4 Make sure ip_newid etal doesn't return an ip_id of 0. 2007-12-22 16:04:45 +00:00
matt
0ec69f885b Fix offset calculation.
Make sure that all frags use the same TOS.
2007-12-22 15:41:11 +00:00
matt
f064a5136b Add ipq_tos to struct ipqe. (Doesn't increase size since the last member
was a u_int16_t).
2007-12-22 15:40:02 +00:00
matt
1f3ca215ea Also make sure the first is at 68 bytes long. 2007-12-21 23:49:09 +00:00
matt
6f23ff186c Prevent TCP blind data attacks by not allowing non-initial fragments to
start at less than 68 bytes (minimal fragment size).
2007-12-21 18:58:55 +00:00
matt
15c4637507 Add fix for ip_id information leakage. Since the leakage information is
primarily used with TCP SYN and RST packets and such packets are less than
the smallest sized packet that an IP stack is allowed to fragment, we simply
set ip_id to 0 for all packets 68 bytes or less.
2007-12-21 02:07:54 +00:00
dyoung
6f3852fab4 Constify struct ifnet->if_sadl and every use throughout the tree.
Add if_set_sadl() that both sets the link-layer address length and
replaces the current link-layer address with a new one, and use it
throughout the tree.
2007-12-20 21:08:17 +00:00
martin
7080c9db1e A few missing ifdefs to make non-INET6 kernels build again. 2007-12-20 20:24:49 +00:00
dyoung
72fa642a86 Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt.  Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address.  Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c.  Move the rtflush()
code into rtcache_clear() and delete rtflush().  Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt().  They compile, but I have not
tested them.  I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.
2007-12-20 19:53:29 +00:00
elad
ce55394a89 Oops. Remove kauth.h inclusion.
Pointed out by gdt@, thanks.
2007-12-16 18:39:57 +00:00
elad
7beaf4911f Really fix low port allocation, by always passing a valid lwp to
in_pcbbind().

Okay dyoung@.

Note that the network code is another candidate for major cleanup... also
note that this issue is likely to be present in netinet6 code, too.
2007-12-16 14:12:34 +00:00
lukem
456279df8f use __KERNEL_RCSID() 2007-12-11 12:29:11 +00:00
elad
3668e580ae Use struct initializers. No functional change. 2007-12-07 19:46:18 +00:00
dyoung
b579a81e92 Use ifa_insert(), ifa_remove(). 2007-12-06 00:28:36 +00:00
dyoung
b8f324fabd Extract common code, creating a subroutine if_purgeaddrs(ifp,
family, purgeaddr) which applies function `purgeaddr' to each
address on `ifp' belonging to `family'.
2007-12-05 23:47:17 +00:00
dyoung
0bf994db38 Use IFADDR_FIRST() and IFADDR_NEXT(). 2007-12-05 22:56:51 +00:00
dyoung
73b0c685df Use IFADDR_FOREACH(). 2007-12-04 10:31:14 +00:00
dyoung
79d53b3100 Move IN_NEED_CHECKSUM() to in_offload.h for re-use. 2007-11-28 04:14:11 +00:00
christos
a9c710744b require that the options argument is the right size, not that it is greater
or equal to the requested size. Suggested by Matt Thomas.
2007-11-27 22:45:29 +00:00
yamt
8ed07fbf78 inetctlerrmap: use designated initializer. 2007-11-26 08:40:46 +00:00
cube
cb1f63b2dc Follow up on arc -> arcnet renaming. Pointed out by joerg@. 2007-11-14 01:11:14 +00:00
dyoung
94b72f0f97 Change macros SYN_CACHE_PUT() and SYN_CACHE_RM() into inline
subroutines syn_cache_put() and syn_cache_rm().
2007-11-09 23:55:58 +00:00
dyoung
9250821580 KNF. Remove superfluous casts and parentheses. 2007-11-09 23:53:13 +00:00
dyoung
e54fbb261f Use sockaddr_in_init(). KNF. No functional change intended. 2007-11-09 23:42:56 +00:00
kefren
9536f25523 Don't MCLAIM in ipintr() because we do it anyway in ip_input() 2007-11-09 06:59:33 +00:00
rmind
d63e75f696 Pick the smallest possible TCP window scaling factor that will still allow
us to scale up to sb_max.  This might fix the problems with some firewalls.

Taken from FreeBSD (silby).
OK by <dyoung>.
2007-11-04 11:04:26 +00:00
ad
a2a3828545 machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h 2007-10-19 11:59:34 +00:00
dyoung
60149b1ce8 Work in progress: use a raw socket for GRE in IP encapsulation
instead of adding/subtracting our own IPv4 header.

There are many benefits:  gre(4) needn't grok the outer encapsulation
header any longer, so this simplifies the gre(4) code.  The IP
stack needn't grok GRE, so it is simplified, too.  gre(4) will
benefit from optimizations in the socket code.  Eventually, gre(4)
will gain an IPv6 encapsulation with very few new lines of code.

There is a small performance loss.  A 133 MHz, 486-class AMD Elan
sinks/sources a TCP stream over GRE with about 93% the throughput
of the old code.  TCP throughput on a 266 MHz, 586-class AMD Geode
is about 96% the throughput of the old code.  A 175-MHz ADM5120
(MIPS) only sinks a TCP stream over GRE at about 90% of the old
code; I am still investigating that.

I produced stripped-down versions of sosend() and soreceive() for
gre(4) to use.  They are guaranteed not to block, so they can be
called from a software interrupt and from a socket upcall,
respectively.

A kernel thread is no longer necessary for socket transmit/receive,
but I didn't get around to removing it, yet.

Thanks to Matt Thomas for suggesting the use of stripped-down socket
code and software interrupts, and to Andrew Doran for advice and
answers concerning software interrupts, threads, and performance.
2007-10-05 03:28:12 +00:00
dyoung
d07b0a69f6 Delete the unused second argument to ip_stripoptions(), move it
closer to its single caller in if_eon.c, try to move fewer bytes
by moving the IP header forward instead of moving the tail of the
mbuf backward, and use m_adj(9) instead of fiddling directly with
mbuf data members.
2007-10-02 20:35:04 +00:00
dyoung
3cdf25631c Don't use INADDR_ANY to initialize a const struct, because INADDR_ANY
is not necessarily const.
2007-09-19 18:52:55 +00:00
dyoung
43390716bc Constify sockaddr argument to ether_multiaddr(). Change struct
ifreq * arguments to ether_addmulti() and ether_delmulti() to const
struct sockaddr *, since ether_{add,del}multi() only ever read the
sockaddr ifreq member, ifr_addr.  Update uses in carp(4) and in
vlan(4).
2007-09-19 05:25:33 +00:00
dyoung
4c9b6756a5 1) Introduce a new socket option, (SOL_SOCKET, SO_NOHEADER), that
tells a socket that it should both add a protocol header to tx'd
   datagrams and remove the header from rx'd datagrams:

        int onoff = 1, s = socket(...);
        setsockopt(s, SOL_SOCKET, SO_NOHEADER, &onoff);

2) Add an implementation of (SOL_SOCKET, SO_NOHEADER) for raw IPv4
   sockets.

3) Reorganize the protocols' pr_ctloutput implementations a bit.
   Consistently return ENOPROTOOPT when an option is unsupported,
   and EINVAL if a supported option's arguments are incorrect.
   Reorganize the flow of code so that it's more clear how/when
   options are passed down the stack until they are handled.

   Shorten some pr_ctloutput staircases for readability.

4) Extract common mbuf code into subroutines, add new sockaddr
   methods, and introduce a new subroutine, fsocreate(), for reuse
   later; use it first in sys_socket():

struct mbuf *m_getsombuf(struct socket *so)

        Create an mbuf and make its owner the socket `so'.

struct mbuf *m_intopt(struct socket *so, int val)

        Create an mbuf, make its owner the socket `so', put the
        int `val' into it, and set its length to sizeof(int).


int fsocreate(..., int *fd)

        Create a socket, a la socreate(9), put the socket into the
        given LWP's descriptor table, return the descriptor at `fd'
        on success.

void *sockaddr_addr(struct sockaddr *sa, socklen_t *slenp)
const void *sockaddr_const_addr(const struct sockaddr *sa, socklen_t *slenp)

        Extract a pointer to the address part of a sockaddr.  Write
        the length of the address  part at `slenp', if `slenp' is
        not NULL.

socklen_t sockaddr_getlen(const struct sockaddr *sa)

        Return the length of a sockaddr.  This just evaluates to
        sa->sa_len.  I only add this for consistency with code that
        appears in a portable userland library that I am going to
        import.

const struct sockaddr *sockaddr_any(const struct sockaddr *sa)

        Return the "don't care" sockaddr in the same family as
        `sa'.  This is the address a client should sobind(9) if it
        does not care the source address and, if applicable, the
        port et cetera that it uses.

const void *sockaddr_anyaddr(const struct sockaddr *sa, socklen_t *slenp)

        Return the "don't care" sockaddr in the same family as
        `sa'.  This is the address a client should sobind(9) if it
        does not care the source address and, if applicable, the
        port et cetera that it uses.
2007-09-19 04:33:42 +00:00
degroote
640e23d7c9 In some FAST_IPSEC, spl level is not restored correctly. Fix that.
Spotted by Wolfgang Stukenbrock in pr/36800
2007-09-11 14:18:09 +00:00
dyoung
99975917cd We cannot sleep in a software interrupt, so do not sockaddr_dl_alloc(...,
M_WAITOK).  Instead, sockaddr_dl_init() a sockaddr_dl on the stack.
2007-09-05 05:29:35 +00:00
dyoung
88399b6877 We cannot sleep in a software interrupt, so do not sockaddr_dl_alloc(...,
M_WAITOK).  Instead, sockaddr_dl_init() a sockaddr_dl on the stack.
2007-09-02 19:42:21 +00:00
dyoung
db10b0d586 m_copym(..., 0, M_COPYALL, ...) -> m_copypacket(..., ...). 2007-09-02 07:18:55 +00:00
dyoung
6173a47677 m_copy() was deprecated, apparently, long ago. m_copy(...) ->
m_copym(..., M_DONTWAIT).
2007-09-02 03:12:23 +00:00
dyoung
0af5ef16d6 Be consistent: use the prefix sc_ for all members of the gre_softc. 2007-09-02 01:49:49 +00:00
dyoung
2fc102750d Use ifreq_setaddr(), ifreq_getaddr(), sockaddr_in_init(), and
sockaddr_copy().  Constify.  Compare pointers with NULL, not 0.
Don't "test truth" of pointers, but compare with NULL.
2007-09-01 04:32:50 +00:00
dyoung
f06b9f6f72 Fix bug in last: add missing ampersand. 2007-08-31 23:40:08 +00:00
dyoung
353d6b2744 Stop sharing a sockaddr_in template among multicast routines,
because that's just going to cause problems down the road.  (Suppose
we can have two CPUs in the network stack someday?)  Instead, use
sockaddr_in_init() to initialize a sockaddr_in on the stack.

Use ifreq_setaddr() to initialize ifreq.ifr_addr.
2007-08-31 21:56:43 +00:00
dyoung
b3fc296326 Use malloc(9) for sockaddrs instead of pool(9), and remove dom_sa_pool
and dom_sa_len members from struct domain.  Pools of fixed-size
objects are too rigid for sockaddr_dls, whose size can vary over
a wide range.

Return sockaddr_dl to its "historical" size.  Now that I'm using
malloc(9) instead of pool(9) to allocate sockaddr_dl, I can create
a sockaddr_dl of any size in the kernel, so expanding sockaddr_dl
is useless.

Avoid using sizeof(struct sockaddr_dl) in the kernel.

Introduce sockaddr_dl_alloc() for allocating & initializing an
arbitrary sockaddr_dl on the heap.

Add an argument, the sockaddr length, to sockaddr_alloc(),
sockaddr_copy(), and sockaddr_dl_setaddr().

Constify: LLADDR() -> CLLADDR().

Where the kernel overwrites LLADDR(), use sockaddr_dl_setaddr(),
instead.  Used properly, sockaddr_dl_setaddr() will not overrun
the end of the sockaddr.
2007-08-30 02:17:34 +00:00
cube
2eca33e853 Fix ipv4 multicast that could sometimes send packets with the wrong
Ethernet multicast address.

Reported by jmcneill@, fix discussed with dyoung@, _very_ light testing by
myself, some more money for my dealer of anxiolytics after reading
ip_output()'s twisted code maze.
2007-08-28 23:45:39 +00:00
dyoung
64bfe92e2b Cosmetic: 0 -> NULL. Remove unnecessary cast. 2007-08-27 05:39:44 +00:00
dyoung
7caec74f02 Reorganize and extract arplookup1() for code-sharing. Share
null_sdl.  Introduce arp_setgate() for initializing a link-layer
nexthop, and use it to fulfill RTM_SETGATE requests.
2007-08-27 01:13:09 +00:00
dyoung
5204966a96 Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.
2007-08-26 22:59:08 +00:00
dyoung
473d5fc042 Use sockaddr_in_init(). 2007-08-21 08:34:33 +00:00
dyoung
bd98464c6f Don't call rtcache_check() from the fast-forward code, which runs
at IPL_NET, because rtcache_check() may read the forwarding table.
Elsewhere, the kernel only blocks interrupts at priority IPL_SOFTNET
and below while it modifies the forwarding table, so rtcache_check()
could be reading the table in an inconsistent state.  Use
rtcache_done(), instead.

XXX netinet/ip_flow.c and netinet6/ip6_flow.c are virtually identical.
XXX They should share code.
2007-08-20 19:42:34 +00:00
dyoung
b40a86e49c Use sockaddr_dl_init(). 2007-08-10 22:46:16 +00:00
dyoung
0640a03023 Use satocsdl() et cetera instead of SDL(). Constify. 2007-08-07 04:37:04 +00:00
yamt
7431e54c17 make rfbuf_ts a tcp timestamp so that calculations in tcp_input make sense. 2007-08-02 13:12:35 +00:00
yamt
e74ee454c1 our tcp timestamps are in PR_SLOWHZ, not HZ. 2007-08-02 13:06:30 +00:00
rmind
4175f8693b TCP socket buffers automatic sizing - ported from FreeBSD.
http://mail-index.netbsd.org/tech-net/2007/02/04/0006.html

! Disabled by default, marked as experimental. Testers are very needed.
! Someone should thoroughly test this, and improve if possible.

Discussed on <tech-net>:
http://mail-index.netbsd.org/tech-net/2007/07/12/0002.html
Thanks Greg Troxel for comments.

OK by the long silence on <tech-net>.
2007-08-02 02:42:40 +00:00
dyoung
08e6f22226 Take steps to hide the radix_node implementation of the forwarding table
from the forwarding table's users:

        Introduce rt_walktree() for walking the routing table and
        applying a function to each rtentry.  Replace most
        rn_walktree() calls with it.

        Use rt_getkey()/rt_setkey() to get/set a route's destination.
        Keep a pointer to the sockaddr key in the rtentry, so that
        rtentry users do not have to grovel in the radix_node for
        the key.

        Add a RTM_GET method to rtrequest.  Use that instead of
        radix_node lookups in, e.g., carp(4).

Add sys/net/link_proto.c, which supplies sockaddr routines for
link-layer socket addresses (sockaddr_dl).

Cosmetic:

        Constify.  KNF.  Stop open-coding LIST_FOREACH, TAILQ_FOREACH,
        et cetera.  Use NULL instead of 0 for null pointers.  Use
        __arraycount().  Reduce gratuitous parenthesization.

        Stop using variadic arguments for rip6_output(), it is
        unnecessary.

        Remove the unnecessary rtentry member rt_genmask and the
        code to maintain it, since nothing actually used it.

        Make rt_maskedcopy() easier to read by using meaningful variable
        names.

        Extract a subroutine intern_netmask() for looking up a netmask in
        the masks table.

        Start converting backslash-ridden IPv6 macros in
        sys/netinet6/in6_var.h into inline subroutines that one
        can read without special eyeglasses.

One functional change: when the kernel serves an RTM_GET, RTM_LOCK,
or RTM_CHANGE request, it applies the netmask (if supplied) to a
destination before searching for it in the forwarding table.

I have changed sys/netinet/ip_carp.c, carp_setroute(), to remove
the unlawful radix_node knowledge.

Apart from the changes to carp(4), netiso, ATM, and strip(4), I
have run the changes on three nodes in my wireless routing testbed,
which involves IPv4 + IPv6 dynamic routing acrobatics, and it's
working beautifully so far.
2007-07-19 20:48:52 +00:00
xtraeme
48e23b4a25 Replace a simple lock with a mutex and make it static. 2007-07-11 21:34:16 +00:00
ad
88ab7da936 Merge some of the less invasive changes from the vmlocking branch:
- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
2007-07-09 20:51:58 +00:00
christos
d1ffad0af7 Handle mapped and scoped ipv6 addresses. From Anon Ymous. 2007-06-28 21:11:12 +00:00
degroote
4ddfe916ff Add support for options IPSEC_NAT_T (RFC 3947 and 3948) for fast_ipsec(4).
No objection on tech-net@
2007-06-27 20:38:32 +00:00
xtraeme
c18664aad7 Protect inet6_ident_core() with #ifdef INET6, fixes building without
options INET6.
2007-06-26 09:19:36 +00:00
christos
0a36551606 tcpdrop kernel bits (from anon ymous) 2007-06-25 23:35:12 +00:00
christos
eeff189533 - per socket keepalive settings
- settable connection establishment timeout
2007-06-20 15:29:17 +00:00
christos
fc506e028c PR/36484: Pavlin Radoslavov: PIM Register in-kernel encapsulation IP_DF
setting is incorrect
2007-06-13 23:09:59 +00:00
dyoung
ae302fd15c Use __arraycount(). 2007-06-13 21:08:29 +00:00
dyoung
46bc24d79c Use LIST_FOREACH(). 2007-06-13 04:55:25 +00:00
dyoung
779264d3a2 Complete removal of radix_node knowledge. 2007-06-12 22:55:44 +00:00
dyoung
95edb940c2 Get rid of radix_node_head.rnh_walktree, because it is only ever
set to rn_walktree.

Introduce rt_walktree(), which applies a subroutine to every route
in a particular address family.  Use it instead of rn_walktree()
virtually everywhere.  This helps to hide the routing table
implementation.
2007-06-09 03:07:21 +00:00
riz
711b142f07 Fix compilation in the TCP_SIGNATURE case:
- don't use void * for pointer arithmetic
	- don't try to modify const parameters

A kernel with 'options TCP_SIGNATURE' works as well as it ever did, now.
(ie, clunky, but passable)
2007-05-18 21:48:43 +00:00
riz
89c9ca415d Revert a small part of revision 1.254 - remove const qualifier from
the struct tcphdr * argument of tcp_dooptions().  RFC2385 support
(options TCP_SIGNATURE) needs to modify the header during options
processing, and this revision broke it.

OK yamt@.
2007-05-18 21:31:16 +00:00
dyoung
c7cb104b6b KNF. Use sockaddr_in_init(). Shorten staircases. No functional
changes intended.
2007-05-12 02:10:25 +00:00
dyoung
9552c98d25 Use sockaddr_in_init(). 2007-05-12 02:03:15 +00:00
dyoung
e1d4e2922e In AppleTalk, IPv4, and IPv6 routing domains, help sockaddr_cmp()
avoid an indirect function call by comparing the family, length,
and bytes [dom->dom_sa_cmpofs, dom->dom_sa_cmpofs + dom->dom_sa_cmplen),
corresponding to the the sockaddrs' "address" members.

For ISO, actually use sockaddr_iso_cmp, for a change.  Thanks to
yamt@ for pointing out my error.
2007-05-06 02:56:37 +00:00
dyoung
fc86b519e8 Oops, commit this straggler from the last change to net/if_gre.[ch]. 2007-05-06 02:48:38 +00:00
dyoung
8b646d9bb9 Remove obsolete files netinet/in_route.[ch]. 2007-05-02 22:39:03 +00:00
dyoung
38175939e7 Remove unused option. 2007-05-02 20:43:47 +00:00
dyoung
72f0a6dfb0 Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing.  Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously.  Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs.  I have
  introduced routines for allocating, copying, and duplicating,
  and freeing sockaddrs:

        struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
        struct sockaddr *sockaddr_copy(struct sockaddr *dst,
                                       const struct sockaddr *src);
        struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
        void sockaddr_free(struct sockaddr *sa);

  sockaddr_alloc() returns either a sockaddr from the pool belonging
  to the specified family, or NULL if the pool is exhausted.  The
  returned sockaddr has the right size for that family; sa_family
  and sa_len fields are initialized to the family and sockaddr
  length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
  sockaddr_in).  sockaddr_free() puts the given sockaddr back into
  its family's pool.

  sockaddr_dup() and sockaddr_copy() work analogously to strdup()
  and strcpy(), respectively.  sockaddr_copy() KASSERTs that the
  family of the destination and source sockaddrs are alike.

  The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
  passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
  family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
  etc.  They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more.  All protocol families
  use struct route.  I have changed the route cache, 'struct route',
  so that it does not contain storage space for a sockaddr.  Instead,
  struct route points to a sockaddr coming from the pool the sockaddr
  belongs to.  I added a new method to struct route, rtcache_setdst(),
  for setting the cache destination:

        int rtcache_setdst(struct route *, const struct sockaddr *);

  rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
  available to create the sockaddr storage.

  It is now possible for rtcache_getdst() to return NULL if, say,
  rtcache_setdst() failed.  I check the return value for NULL
  everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
  caches, dom_rtcache.  rtflushall(sa_family_t af) looks up the
  domain indicated by 'af', walks the domain's list of route caches
  and invalidates each one.
2007-05-02 20:40:22 +00:00
dyoung
d43d3ae5b8 Get rid of some gratuitous casts and join some lines. 2007-04-25 00:11:18 +00:00
dyoung
2fe02c923a Constify. 2007-04-24 23:43:50 +00:00
dyoung
1c9313a294 In in_rtflushall(), clear the route caches using rtcache_clear()
instead of rtcache_free().  It is not desirable to clear the cached
destination as well as the route, however, rtcache_free() will
eventually release all resources held by the cache, including the
destination.

Add some additional diagnostic assertions.
2007-04-22 06:01:57 +00:00
dyoung
d8fb0f4dac Add optimization hint for compiler. In a debug printf,
s/freeing/flushing/.
2007-04-18 23:22:26 +00:00
dyoung
d60552baa5 Cosmetic: shorten a staircase. bzero -> memset. KNF. 2007-04-15 06:15:58 +00:00
liamjfoy
39b3c7f047 use size_t for indexes
just pass a *ip to ipflow_hash instead of members

ok christos@
2007-04-05 18:11:47 +00:00
liamjfoy
68880dffbf Add a small note regarding further commented code in netinet6/ip6_flow.c 2007-03-26 00:29:15 +00:00
liamjfoy
b8ef59d720 Add net.inet.ip.hashsize to control the IPv4 fast forward hash table size. 2007-03-25 20:12:20 +00:00
liamjfoy
ac43382f1f Don't call ip*flow_reap if we're just looking up maxflows 2007-03-24 00:27:58 +00:00