Commit Graph

306 Commits

Author SHA1 Message Date
plunky d334ec0fc0 fix potential mbuf overflow, from Alexander Danilov on tech-net 2010-12-02 19:07:27 +00:00
bouyer adad9c5471 Make sure SYN_CACHE_TIMER_ARM() has been run before calling syn_cache_put()
as it will reschedule the timer.  Fixes PR kern/43318.
2010-05-26 17:38:29 +00:00
bouyer c638cbeac1 syn_cache_put(): defer all pool_put() to the callout. Reschedule
the callout if needed so frees are not delayed too much.
syn_cache_timer(): we can't call syn_cache_put() here any more,
so move code deleted from syn_cache_put() here.

Avoid KASSERT() in kern_timeout.c because pool_put() is called from
ipintr context, as reported in
http://mail-index.netbsd.org/tech-kern/2010/03/19/msg007762.html
Thanks to Andrew Doran and Mindaugas Rasiukevicius for help and review.
2010-04-21 20:40:16 +00:00
rmind b278cb5138 tcp_input: set ECE flag even if CWR flag is active.
Submitted by Richard Scheffenegger in PR/43150.
2010-04-16 03:13:03 +00:00
tls 4e0229021b Oops. Fix LOCKDEBUG panic -- and spurious calls to tcp_output()! -- in
previous.  Be careful with that {}, Eugene.
2010-04-01 14:31:51 +00:00
tls 994b02bdbe After discussion with ad@: it appears that KERNEL_LOCK also protects
the driver output path (that is, ifp->if_output()).  In the case of
entry through the socket code, we are fine, because pru_usrreq takes
KERNEL_LOCK.  However, there are a few other ways to cause output
which require protection:

	1) direct calls to tcp_output() in tcp_input()
	2) fast-forwarding code (ip_flow) -- protected elsewise
	   against itself by the softnet lock.
	3) *Possibly* the ARP code.  I have currently persuaded
	   myself that it is safe because of how it's called.
	4) Possibly the ICMP code.

This change addresses #1 and #2.
2010-04-01 00:24:41 +00:00
pooka 54b3dc4108 tcp sockbuf autoscaling was initially added turned off because it
was experimental.  People (including myself) have been running with
it turned on for eons now, so flip the default to enabled.
2010-01-26 18:09:07 +00:00
darran ddd44491c6 Make tcp msl (max segment life) tunable via sysctl net.inet.tcp.msl.
Okayed by tls@.
2009-09-09 22:41:28 +00:00
minskim 2708c3c1b9 Check the minimum ttl only when pcb is available. 2009-07-18 23:09:53 +00:00
minskim d0a9c36e4a Add the IP_MINTTL socket option.
The IP_MINTTL option may be used on SOCK_STREAM sockets to discard
packets with a TTL lower than the option value.  This can be used to
implement the Generalized TTL Security Mechanism (GTSM) according to
RFC 3682.

OK'ed by christos@.
2009-07-17 22:02:54 +00:00
christos 8d20d2e953 Follow exactly the recommendation of draft-ietf-tcpm-tcpsecure-11.txt:
Don't check gainst the last ack received, but the expected sequence number.
This makes RST handling independent of delayed ACK. From Joanne M Mikkelson.
2009-06-20 17:29:31 +00:00
cegger c363a9cb62 bzero -> memset 2009-03-18 16:00:08 +00:00
cegger 35fb64746b bcmp -> memcmp 2009-03-18 15:14:29 +00:00
cegger dc56dbbd97 ansify function definitions 2009-03-15 21:23:31 +00:00
pooka c7a407f862 stinkset purge: POOL_INIT -> pool_init
also, make the syncache pool static in scope
2009-01-29 20:38:22 +00:00
tls c5ddeafa76 Unlock reassembly queue before calling sorwakeup(), not after. In unusual
cases with in-kernel consumers which might send data on the same socket,
we can deadlock on the reassembly queue otherwise (observed while testing
accept filters).
2008-08-04 04:08:47 +00:00
matt 34ac358652 Reacquire softnet_lock after calling soabort which returns with the socket
unlocked.
2008-07-28 18:41:07 +00:00
ad c4e6bfaf85 tcp_input: add a couple of assertions. 2008-07-04 18:22:21 +00:00
ad 4c75eca868 syn_cache_get: remove new endpoint's socket from head's queue if aborting
the connection. Should fix KASSERT(so->so_head == NULL).
2008-07-03 15:35:28 +00:00
martin ce099b4099 Remove clause 3 and 4 from TNF licenses 2008-04-28 20:22:51 +00:00
ad 15e29e981b Merge the socket locking patch:
- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
2008-04-24 11:38:36 +00:00
thorpej caf49ea572 Make IPSEC and FAST_IPSEC stats per-cpu. Use <net/net_stats.h> and
netstat_sysctl().
2008-04-23 06:09:04 +00:00
thorpej 7ff8d08aae Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.
2008-04-12 05:58:22 +00:00
thorpej f5c68c0b9f Change TCP stats from a structure to an array of uint64_t's.
Note: This is ABI-compatible with the old tcpstat structure; old netstat
binaries will continue to work properly.
2008-04-08 01:03:58 +00:00
rmind c6186face4 Welcome to 4.99.55:
- Add a lot of missing selinit() and seldestroy() calls.

- Merge selwakeup() and selnotify() calls into a single selnotify().

- Add an additional 'events' argument to selnotify() call.  It will
  indicate which event (POLL_IN, POLL_OUT, etc) happen.  If unknown,
  zero may be used.

Note: please pass appropriate value of 'events' where possible.
Proposed on: <tech-kern>
2008-03-01 14:16:49 +00:00
matt a4a1e5ce55 Convert stragglers to ansi definitions from old-style definitons.
Remember that func() is not ansi, func(void) is.
2008-02-27 19:41:51 +00:00
yamt c3985cffec make TCP_SETUP_ACK, ICMP_CHECK, TCP_FIELDS_TO_HOST, and TCP_FIELDS_TO_NET
static functions.
2008-02-20 11:44:07 +00:00
yamt f35baba8dd - start tcp timestamp from 1 instead of 0.
- add a comment to explain why:
+        * We start with 1, because 0 doesn't work with linux, which
+        * considers timestamp 0 in a SYN packet as a bug and disables
+        * timestamps.
2008-02-05 09:38:47 +00:00
yamt d5bac2f6b1 redo tcp_input.c rev.1.230 correctly.
revision 1.230
    date: 2005/06/30 02:58:28;  author: christos;  state: Exp;  lines: +20 -4
    Normalize our PAWS code with Free and Open, as mentioned in tech-security.

reviewed by christos@ and matt@.
2008-02-04 23:56:14 +00:00
yamt a944f4302a revert tcp_output.c 1.253 because it has an ill effect when sending
small (not full-sized) segments.
http://mail-index.NetBSD.org/tech-net/2008/01/27/0009.html
2008-01-29 12:34:47 +00:00
dyoung 2d4e7e5856 Use rtcache_validate() instead of rtcache_getrt(). Shorten staircase
in in_losing().
2008-01-14 04:19:09 +00:00
martin 7080c9db1e A few missing ifdefs to make non-INET6 kernels build again. 2007-12-20 20:24:49 +00:00
dyoung 72fa642a86 Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt.  Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address.  Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c.  Move the rtflush()
code into rtcache_clear() and delete rtflush().  Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt().  They compile, but I have not
tested them.  I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.
2007-12-20 19:53:29 +00:00
elad 7beaf4911f Really fix low port allocation, by always passing a valid lwp to
in_pcbbind().

Okay dyoung@.

Note that the network code is another candidate for major cleanup... also
note that this issue is likely to be present in netinet6 code, too.
2007-12-16 14:12:34 +00:00
dyoung 94b72f0f97 Change macros SYN_CACHE_PUT() and SYN_CACHE_RM() into inline
subroutines syn_cache_put() and syn_cache_rm().
2007-11-09 23:55:58 +00:00
rmind d63e75f696 Pick the smallest possible TCP window scaling factor that will still allow
us to scale up to sb_max.  This might fix the problems with some firewalls.

Taken from FreeBSD (silby).
OK by <dyoung>.
2007-11-04 11:04:26 +00:00
yamt e74ee454c1 our tcp timestamps are in PR_SLOWHZ, not HZ. 2007-08-02 13:06:30 +00:00
rmind 4175f8693b TCP socket buffers automatic sizing - ported from FreeBSD.
http://mail-index.netbsd.org/tech-net/2007/02/04/0006.html

! Disabled by default, marked as experimental. Testers are very needed.
! Someone should thoroughly test this, and improve if possible.

Discussed on <tech-net>:
http://mail-index.netbsd.org/tech-net/2007/07/12/0002.html
Thanks Greg Troxel for comments.

OK by the long silence on <tech-net>.
2007-08-02 02:42:40 +00:00
ad 88ab7da936 Merge some of the less invasive changes from the vmlocking branch:
- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
2007-07-09 20:51:58 +00:00
christos eeff189533 - per socket keepalive settings
- settable connection establishment timeout
2007-06-20 15:29:17 +00:00
riz 711b142f07 Fix compilation in the TCP_SIGNATURE case:
- don't use void * for pointer arithmetic
	- don't try to modify const parameters

A kernel with 'options TCP_SIGNATURE' works as well as it ever did, now.
(ie, clunky, but passable)
2007-05-18 21:48:43 +00:00
riz 89c9ca415d Revert a small part of revision 1.254 - remove const qualifier from
the struct tcphdr * argument of tcp_dooptions().  RFC2385 support
(options TCP_SIGNATURE) needs to modify the header during options
processing, and this revision broke it.

OK yamt@.
2007-05-18 21:31:16 +00:00
dyoung 72f0a6dfb0 Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing.  Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously.  Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs.  I have
  introduced routines for allocating, copying, and duplicating,
  and freeing sockaddrs:

        struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
        struct sockaddr *sockaddr_copy(struct sockaddr *dst,
                                       const struct sockaddr *src);
        struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
        void sockaddr_free(struct sockaddr *sa);

  sockaddr_alloc() returns either a sockaddr from the pool belonging
  to the specified family, or NULL if the pool is exhausted.  The
  returned sockaddr has the right size for that family; sa_family
  and sa_len fields are initialized to the family and sockaddr
  length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
  sockaddr_in).  sockaddr_free() puts the given sockaddr back into
  its family's pool.

  sockaddr_dup() and sockaddr_copy() work analogously to strdup()
  and strcpy(), respectively.  sockaddr_copy() KASSERTs that the
  family of the destination and source sockaddrs are alike.

  The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
  passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
  family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
  etc.  They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more.  All protocol families
  use struct route.  I have changed the route cache, 'struct route',
  so that it does not contain storage space for a sockaddr.  Instead,
  struct route points to a sockaddr coming from the pool the sockaddr
  belongs to.  I added a new method to struct route, rtcache_setdst(),
  for setting the cache destination:

        int rtcache_setdst(struct route *, const struct sockaddr *);

  rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
  available to create the sockaddr storage.

  It is now possible for rtcache_getdst() to return NULL if, say,
  rtcache_setdst() failed.  I check the return value for NULL
  everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
  caches, dom_rtcache.  rtflushall(sa_family_t af) looks up the
  domain indicated by 'af', walks the domain's list of route caches
  and invalidates each one.
2007-05-02 20:40:22 +00:00
ad 59d979c5f1 Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.
2007-03-12 18:18:22 +00:00
christos 53524e44ef Kill caddr_t; there will be some MI fallout, but it will be fixed shortly. 2007-03-04 05:59:00 +00:00
thorpej 7cc07e11dc TRUE -> true, FALSE -> false 2007-02-22 06:16:03 +00:00
degroote e2211411a4 Commit my SoC work
Add ipv6 support for fast_ipsec
Note that currently, packet with extensions headers are not correctly
supported
Change the ipcomp logic
2007-02-10 09:43:05 +00:00
joerg eb04733c4e Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cashed and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either a valid again
or NULL.
rtcache_copy is used internally.

Adjust to callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is sementically equivalent though.
etherip doesn't need sc_route_expire similiar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.
2006-12-15 21:18:52 +00:00
dyoung c308b1c661 Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route).  Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL.  Provide
in_rtcache() for adding a route to the chain.  Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches.  In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain.  In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.
2006-12-09 05:33:04 +00:00
yamt 8836e5995d add some more tcp mowners. 2006-12-06 09:10:45 +00:00