Commit Graph

213 Commits

Author SHA1 Message Date
ad 59d979c5f1 Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.
2007-03-12 18:18:22 +00:00
christos 53524e44ef Kill caddr_t; there will be some MI fallout, but it will be fixed shortly. 2007-03-04 05:59:00 +00:00
dyoung 5493f188c7 KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
   in6_src.c, avoid casts by changing several route_in6 pointers
   to struct route pointers.  Remove unnecessary casts to caddr_t
   elsewhere.

Pave the way for eliminating address family-specific route caches:
   soon, struct route will not embed a sockaddr, but it will hold
   a reference to an external sockaddr, instead.  We will set the
   destination sockaddr using rtcache_setdst().  (I created a stub
   for it, but it isn't used anywhere, yet.)  rtcache_free() will
   free the sockaddr.  I have extracted from rtcache_free() a helper
   subroutine, rtcache_clear().  rtcache_clear() will "forget" a
   cached route, but it will not forget the destination by releasing
   the sockaddr.  I use rtcache_clear() instead of rtcache_free()
   in rtcache_update(), because rtcache_update() is not supposed
   to forget the destination.

Constify:

   1 Introduce const accessor for route->ro_dst, rtcache_getdst().

   2 Constify the 'dst' argument to ifnet->if_output().  This
     led me to constify a lot of code called by output routines.

   3 Constify the sockaddr argument to protosw->pr_ctlinput.  This
     led me to constify a lot of code called by ctlinput routines.

   4 Introduce const macros for converting from a generic sockaddr
     to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
     satocsin, et cetera.
2007-02-17 22:34:07 +00:00
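Item 4 of the "Constify" list above describes const conversions from a generic sockaddr to a family-specific one. The following is a minimal userland sketch of that idea, not the kernel's definition: example_satocsin() and example_dst_port() are hypothetical names, and an inline function is used here in place of the commit's macros (satocsin, satocsin6).

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical stand-in for a const conversion such as satocsin():
 * view a const generic sockaddr as a const sockaddr_in without
 * casting away the const qualifier.
 */
static inline const struct sockaddr_in *
example_satocsin(const struct sockaddr *sa)
{
	assert(sa->sa_family == AF_INET);
	return (const struct sockaddr_in *)(const void *)sa;
}

/*
 * A read-only consumer, in the spirit of an output routine whose
 * 'dst' argument has been constified.  Returns the port in host order.
 */
static in_port_t
example_dst_port(const struct sockaddr *dst)
{
	return ntohs(example_satocsin(dst)->sin_port);
}
```

Because the conversion preserves const, a caller that only holds a const pointer to the destination can still inspect it, and the compiler rejects accidental writes through the result.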
degroote e2211411a4 Commit my SoC work
Add ipv6 support for fast_ipsec
Note that currently, packets with extension headers are not correctly
supported
Change the ipcomp logic
2007-02-10 09:43:05 +00:00
yamt 8836e5995d add some more tcp mowners. 2006-12-06 09:10:45 +00:00
christos 168cd830d2 __unused removal on arguments; approved by core. 2006-11-16 01:32:37 +00:00
yamt 81463c93c7 implement RFC3465 appropriate byte counting.
from Kentaro A. Kurahone, with minor adjustments by me.
the ack prediction part of the original patch was omitted because
it's a separate change.  reviewed by Rui Paulo.
2006-10-19 11:40:51 +00:00
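For readers unfamiliar with RFC 3465, the window update it specifies can be sketched as below. This is an illustration of the RFC's rules, not the code from the commit above: struct abc_state, abc_on_ack() and ABC_L are hypothetical names, and the sketch assumes the RFC's recommended limit of L = 2 during slow start (the RFC asks for L = 1 after a retransmission timeout, which is not modelled here).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-connection state; all sizes are in bytes. */
struct abc_state {
	uint32_t cwnd;		/* congestion window */
	uint32_t ssthresh;	/* slow-start threshold */
	uint32_t smss;		/* sender maximum segment size */
	uint32_t bytes_acked;	/* bytes acked since the last increase */
};

#define ABC_L 2	/* slow-start limit, in segments (assumed value) */

/* Apply one ACK that newly acknowledges 'acked' bytes. */
static void
abc_on_ack(struct abc_state *s, uint32_t acked)
{
	assert(s->smss > 0);
	if (s->cwnd < s->ssthresh) {
		/* Slow start: grow by the bytes acked, capped at L*SMSS. */
		uint32_t limit = ABC_L * s->smss;
		s->cwnd += (acked < limit) ? acked : limit;
	} else {
		/* Congestion avoidance: one SMSS per cwnd bytes acked. */
		s->bytes_acked += acked;
		if (s->bytes_acked >= s->cwnd) {
			s->bytes_acked -= s->cwnd;
			s->cwnd += s->smss;
		}
	}
}
```

Counting bytes rather than ACKs means a receiver that delays or stretches its ACKs no longer slows the sender's window growth.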
dogcow 372e6ef309 now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.
2006-10-17 18:21:29 +00:00
dogcow 44603cac1f more unused variable fallout. 2006-10-13 18:28:06 +00:00
christos 4d595fd7b1 - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
2006-10-12 01:30:41 +00:00
dogcow 55ddfc9aae change the MOWNER_INIT define to take two args; fix extant struct mowner
decls to use it. Makes options MBUFTRACE compile again and not whinge about
missing structure declarations. (Also makes initialization consistent.)
2006-10-10 21:49:14 +00:00
rpaulo f3330397f0 Modular (I tried ;-) TCP congestion control API. Whenever certain conditions
happen in the TCP stack, this interface calls the specified callback to
handle the situation according to the currently selected congestion
control algorithm.
A new sysctl node was created: net.inet.tcp.congctl.{available,selected}
with obvious meanings.
The old net.inet.tcp.newreno MIB was removed.
The API is discussed in tcp_congctl(9).

In the near future, it will be possible to select a congestion control
algorithm on a per-socket basis.

Discussed on tech-net and reviewed by <yamt>.
2006-10-09 16:27:07 +00:00
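The callback arrangement described in the commit above can be pictured roughly as follows. This is a sketch under stated assumptions, not the interface documented in tcp_congctl(9): every name here (struct cc_algo, cc_select(), cc_on_timeout(), the "noop" algorithm) is hypothetical, and the Reno reactions are simplified.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced connection state (bytes). */
struct conn { unsigned cwnd, ssthresh, mss; };

/* One congestion control algorithm: a name plus event callbacks. */
struct cc_algo {
	const char *name;
	void (*on_dupacks)(struct conn *);	/* loss inferred from dupacks */
	void (*on_timeout)(struct conn *);	/* retransmission timeout */
};

static unsigned
half_window(const struct conn *c)
{
	unsigned half = c->cwnd / 2, floor = 2 * c->mss;
	return half > floor ? half : floor;
}

/* Simplified Reno-style reactions. */
static void reno_dupacks(struct conn *c) { c->ssthresh = half_window(c); c->cwnd = c->ssthresh; }
static void reno_timeout(struct conn *c) { c->ssthresh = half_window(c); c->cwnd = c->mss; }

/* Toy algorithm that never reacts; for illustration only. */
static void noop_event(struct conn *c) { (void)c; }

static const struct cc_algo cc_table[] = {
	{ "reno", reno_dupacks, reno_timeout },
	{ "noop", noop_event, noop_event },
};
static const struct cc_algo *cc_selected = &cc_table[0];

/* Select by name, as a "selected" sysctl might; -1 if unknown. */
static int
cc_select(const char *name)
{
	for (size_t i = 0; i < sizeof(cc_table) / sizeof(cc_table[0]); i++) {
		if (strcmp(cc_table[i].name, name) == 0) {
			cc_selected = &cc_table[i];
			return 0;
		}
	}
	return -1;
}

/* The stack reports events; the selected algorithm decides what to do. */
static void cc_on_dupacks(struct conn *c) { cc_selected->on_dupacks(c); }
static void cc_on_timeout(struct conn *c) { cc_selected->on_timeout(c); }
```

The point of the indirection is that the TCP input and timer paths only report events, so adding an algorithm means adding a table entry rather than touching those paths.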
yamt 38fb8d4a38 revert tcp_close part of tcp_subr.c rev.1.200 because it's unnecessary.
all callers of tcp_close are at splsoftnet already:
	tcp_close
		tcp_input ok
		tcp_disconnect
			tcp_usrreq ok
		tcp_usrclosed
			tcp_usrreq ok
			tcp_disconnect
		tcp_timer_2msl ok
		tcp_drop
			tcp_usrreq
			tcp_disconnect
			tcp_timer_rexmt ok
			tcp_timer_persist ok
			tcp_timer_keep ok
			tcp_input
			syn_cache_get
				tcp_input
2006-10-07 19:53:42 +00:00
tls 8cc016b4bc Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or data structure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

	s=splfoo();
	/* frob data structure */
	splx(s);
	pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next.  "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them.  One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html
2006-10-05 17:35:19 +00:00
rpaulo 2fb2ae3251 Import of TCP ECN algorithm for congestion control.
Both available for IPv4 and IPv6.
Basic implementation test results are available at
http://netbsd-soc.sourceforge.net/projects/ecn/testresults.html.

Work sponsored by the Google Summer of Code project 2006.
Special thanks to Kentaro Kurahone, Allen Briggs and Matt Thomas for their
help, comments and support during the project.
2006-09-05 00:29:35 +00:00
christos ddb5372e69 Coverity CID 1149: Add KASSERT before deref. 2006-04-15 02:30:39 +00:00
christos 7a396ae9a9 Coverity CID 1148: Add KASSERT before deref. 2006-04-15 02:29:12 +00:00
christos 95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
yamt f02551ec2d move {tcp,udp}_do_loopback_cksum back to tcp/udp
so that they can be referenced by ipv6.
2005-08-10 13:06:49 +00:00
yamt 8220c378e6 device independent part of ipv6 rx checksum offloading. 2005-08-10 13:05:16 +00:00
he 4047396e46 Make this build without INET6. 2005-07-20 08:05:43 +00:00
christos 89940190d0 Implement PMTU checks from:
http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html

1. Don't act on ICMP-need-frag immediately if ad hoc checks on the
advertised MTU fail. The MTU update is delayed until a TCP retransmit
happens.
2. Ignore ICMP Source Quench messages meant for TCP connections.

From OpenBSD.
2005-07-19 17:00:02 +00:00
christos ea2d4204b6 - add const
- remove bogus casts
- avoid nested variables
2005-05-29 21:41:23 +00:00
yamt e5a2b5a4a4 fix problems related to loopback interface checksum omission. PR/29971.
- for ipv4, defer decision to ip layer as h/w checksum offloading does
  so that it can check the actual interface the packet is going to.
- for ipv6, disable it.
  (maybe will be revisited when it implements h/w checksum offloading.)

ok'ed by Jason Thorpe.
2005-04-18 21:50:25 +00:00
kurahone f7707899c1 Added sysctl tunable limits for the maximum number of SACK holes
per connection and per system.

Idea taken from FreeBSD.
2005-04-05 01:07:17 +00:00
yamt 8b0967ff45 protect tcpipqent with splvm. 2005-03-29 20:10:16 +00:00
yamt df05ca7085 simplify data receiver side sack processing.
- introduce t_segqlen, the number of segments in segq/timeq.
  the name is from freebsd.
- rather than maintaining a copy of sack blocks (rcv_sack_block[]),
  build it directly from the segment list when needed.
2005-03-16 00:39:56 +00:00
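Building SACK blocks on demand from the reassembly list, as the commit above describes, might look roughly like this. It is a sketch only: struct seg, struct sack_block and build_sack_blocks() are hypothetical names, the list is assumed to be sorted and non-overlapping, sequence-number wraparound is ignored, and the RFC 2018 requirement that the first block cover the most recently received segment is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical queued out-of-order segment covering [start, end). */
struct seg {
	uint32_t start, end;
	struct seg *next;	/* next segment, in ascending sequence order */
};

struct sack_block { uint32_t left, right; };

/*
 * Walk the segment list and emit at most 'max' SACK blocks,
 * merging segments that are contiguous in sequence space.
 * Returns the number of blocks written to 'out'.
 */
static size_t
build_sack_blocks(const struct seg *head, struct sack_block *out, size_t max)
{
	size_t n = 0;

	assert(out != NULL || max == 0);
	for (const struct seg *s = head; s != NULL; s = s->next) {
		if (n > 0 && out[n - 1].right == s->start) {
			/* Contiguous with the previous block: extend it. */
			out[n - 1].right = s->end;
		} else {
			if (n == max)
				break;	/* no room for another block */
			out[n].left = s->start;
			out[n].right = s->end;
			n++;
		}
	}
	return n;
}
```

Deriving the blocks from the list avoids keeping a second copy (such as rcv_sack_block[]) in sync with the queue on every insertion and removal.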
yamt 0446b7c3e3 - use full sized segments unless we actually have SACKs to send.
- avoid TSO duplicate D-SACK.
- send SACKs regardless of TF_ACKNOW.
- don't clear rcv_sack_num when transmitting.

discussed on tech-net@.
2005-03-16 00:38:27 +00:00
simonb e491fee6a5 s/quence/quench/. 2005-03-09 04:24:12 +00:00
simonb 3792275475 Add an extra `i' to notifes/notifed. 2005-03-09 04:23:33 +00:00
jonathan 4ae1f36dc9 Commit TCP SACK patches from Kentaro A. Kurahone's patch at:
http://www.sigusr1.org/~kurahone/tcp-sack-netbsd-02152005.diff.gz

Fixes in that patch for pre-existing TCP pcb initializations were already
committed to NetBSD-current, so are not included in this commit.

The SACK patch has been observed to correctly negotiate and respond
to SACKs in wide-area traffic.

There are two independently observed, as-yet-unresolved anomalies:
first, unexplained delays in fast retransmission
(potentially explainable by a 0.2sec RTT between adjacent
ethernet/wifi NICs); and second, peculiar and unexplained TCP
retransmits observed over an ath0 card.

After discussion with several interested developers, I'm committing
this now, as-is, for more eyes to use and look over.  Current hypothesis
is that the anomalies above may in fact be due to link-level (hardware,
driver, HAL, firmware) aberrations in the test setup, affecting both
Kentaro's wired-Ethernet NIC and my two (different) WiFi NICs.
2005-02-28 16:20:59 +00:00
perry f07677dd81 nuke trailing whitespace 2005-02-26 22:45:09 +00:00
briggs a825f3e77c Initialize t_partialacks in the tcpcb template.
From Kentaro A. Kurahone.
2005-02-16 14:59:40 +00:00
heas 52b0cd6b47 ntohs->htons for ip6 plen (payload length).
It is not technically necessary to set plen here, since ip6_output() starts
off by calculating it, but leaving it keeps it consistent with other code.
2005-02-12 01:24:07 +00:00
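The direction of the conversion in the commit above matters for readability even though htons() and ntohs() perform the same byte swap: a value being stored into a header goes host to network. A small userland sketch, assuming the standard struct ip6_hdr from <netinet/ip6.h>; example_set_plen() and example_get_plen() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <netinet/ip6.h>
#include <arpa/inet.h>

/* Store a payload length into the header: host order -> network order. */
static void
example_set_plen(struct ip6_hdr *ip6, size_t payload_len)
{
	/* Larger payloads would need a jumbogram option instead. */
	assert(payload_len <= UINT16_MAX);
	ip6->ip6_plen = htons((uint16_t)payload_len);
}

/* Read the payload length back: network order -> host order. */
static uint16_t
example_get_plen(const struct ip6_hdr *ip6)
{
	return ntohs(ip6->ip6_plen);
}
```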
perry b02c92c5bf ANSIfy function declarations 2005-02-03 23:50:33 +00:00
perry 3494482345 de-__P -- will ANSIfy .c files later. 2005-02-02 21:41:55 +00:00
heas fe4b3cd078 In tcp_respond(), clear the m_pkthdr.csum_flags that was inherited from the
received packet so that the checksum is not performed twice.  Also,
tcp_respond() does not fill-in the m_pkthdr.csum_data, so a h/w checksum may
have the wrong offset.

OK from Jason Thorpe.
2005-01-03 19:47:30 +00:00
christos 77e7bdb8aa yamt's changes seem to fix all the checksumming issues. Turn the loopback
checksums back off so we can make sure that everything works.
2004-12-19 06:42:24 +00:00
christos 60fb5c0ece Turn checksumming on loopback back on until we fix the bugs in it.
Connect over tcp on the loopback is broken:

  4729 amq      0.000007 CALL  connect(4,0x804f2a0,0x1c)
  4729 amq      75.007420 RET   connect -1 errno 60 Connection timed out
2004-12-17 22:54:52 +00:00
thorpej 7994b6f95e Don't perform checksums on loopback interfaces. They can be reenabled with
the net.inet.*.do_loopback_cksum sysctl.

Approved by: groo
2004-12-15 04:25:19 +00:00
yamt 0ea22c32fa fix ipqent pool corruption problems. make tcp reass code use
its own pool of ipqent rather than sharing it with ip reass code.
PR/24782.
2004-09-15 09:21:22 +00:00
itojun 4ebcfcf29a fix MD5 signature support to actually validate inbound signature, and
drop packet if fails.
2004-05-18 14:44:14 +00:00
matt da67d85073 Use EVCNT_ATTACH_STATIC{,2} 2004-05-01 02:20:42 +00:00
itojun 362e07a3c9 zero-clear ip6?pseudo before use 2004-04-26 05:18:13 +00:00
itojun f103f9aee9 declare ip6_hdr_pseudo (for kernel only) and use it for TCP MD5 signature 2004-04-26 05:15:47 +00:00
itojun 67372cc454 sync comment with reality 2004-04-26 05:05:49 +00:00
itojun e0395ac8f0 make TCP MD5 signature work with KAME IPSEC (#define IPSEC).
support IPv6 if KAME IPSEC (RFC is not explicit about how we make data stream
for checksum with IPv6, but i'm pretty sure using normal pseudo-header is the
right thing).

XXX
current TCP MD5 signature code has giant flaw:
it does not validate signature on input (can't believe it! what is the point?)
2004-04-26 03:54:28 +00:00
jonathan 887b782b0b Initial commit of a port of the FreeBSD implementation of RFC 2385
(MD5 signatures for TCP, as used with BGP).  Credit for original
FreeBSD code goes to Bruce M. Simpson, with FreeBSD sponsorship
credited to sentex.net.  Shortening of the setsockopt() name
attributed to Vincent Jardin.

This commit is a minimal, working version of the FreeBSD code, as
MFC'ed to FreeBSD-4. It has received minimal testing with a ttcp
modified to set the TCP-MD5 option; BMS's additions to tcpdump-current
(tcpdump -M) confirm that the MD5 signatures are correct.  Committed
as-is for further testing between a NetBSD BGP speaker (e.g., quagga)
and industry-standard BGP speakers (e.g., Cisco, Juniper).


NOTE: This version has two potential flaws. First, I do not see any code
that verifies received TCP-MD5 signatures.  Second, the TCP-MD5
options are internally padded and assumed to be 32-bit aligned. A more
space-efficient scheme is to pack all TCP options densely (and
possibly unaligned) into the TCP header; then do one final padding to
a 4-byte boundary.  Pre-existing comments note that accounting for
TCP-option space when we add SACK is yet to be done. For now, I'm
punting on that; we can solve it properly, in a way that will handle
SACK blocks, as a separate exercise.

In case a pullup to NetBSD-2 is requested, this adds sys/netipsec/xform_tcp.c,
and modifies:

sys/net/pfkeyv2.h,v 1.15
sys/netinet/files.netinet,v 1.5
sys/netinet/ip.h,v 1.25
sys/netinet/tcp.h,v 1.15
sys/netinet/tcp_input.c,v 1.200
sys/netinet/tcp_output.c,v 1.109
sys/netinet/tcp_subr.c,v 1.165
sys/netinet/tcp_usrreq.c,v 1.89
sys/netinet/tcp_var.h,v 1.109
sys/netipsec/files.netipsec,v 1.3
sys/netipsec/ipsec.c,v 1.11
sys/netipsec/ipsec.h,v 1.7
sys/netipsec/key.c,v 1.11
share/man/man4/tcp.4,v 1.16
lib/libipsec/pfkey.c,v 1.20
lib/libipsec/pfkey_dump.c,v 1.17
lib/libipsec/policy_token.l,v 1.8
sbin/setkey/parse.y,v 1.14
sbin/setkey/setkey.8,v 1.27
sbin/setkey/token.l,v 1.15

Note that the preceding two revisions to tcp.4 will be
required to cleanly apply this diff.
2004-04-25 22:25:03 +00:00
simonb b5d0e6bf06 Initialise (most) pools from a link set instead of explicit calls
to pool_init.  Untouched pools are ones that are either in arch-specific
code or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.
2004-04-25 16:42:40 +00:00
tls 7eb2f214d5 Change the default state of two tunables; bring our TCP a little bit
closer to normal behaviour for the current century.

New Reno is now on by default (which is really the only reasonable
choice, since we don't do SACK); instead of an initial window of 1
for non-local nets, we now use Sally Floyd's magic 4K rule.
2004-04-22 02:19:39 +00:00
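The "magic 4K rule" mentioned in the commit above presumably refers to the larger initial window proposed in RFC 2414 (later RFC 3390), which bounds the initial window at min(4*MSS, max(2*MSS, 4380 bytes)). A sketch of that formula follows; initial_window() is a hypothetical name and this is not the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Initial congestion window in bytes per RFC 2414 / RFC 3390:
 *   min(4 * MSS, max(2 * MSS, 4380))
 */
static uint32_t
initial_window(uint32_t mss)
{
	assert(mss > 0);
	uint32_t lower = 2 * mss;
	uint32_t upper = 4 * mss;
	uint32_t mid = lower > 4380 ? lower : 4380;

	return upper < mid ? upper : mid;
}
```

With a typical Ethernet MSS of 1460 bytes this gives 4380 bytes (three segments); with a small MSS of 536 it gives four segments, and with a very large MSS it gives two.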