Commit Graph

186 Commits

Author SHA1 Message Date
yamt
0446b7c3e3 - use full sized segments unless we actually have SACKs to send.
- avoid TSO duplicate D-SACK.
- send SACKs regardless of TF_ACKNOW.
- don't clear rcv_sack_num when transmitting.

discussed on tech-net@.
2005-03-16 00:38:27 +00:00
simonb
e491fee6a5 s/quence/quench/. 2005-03-09 04:24:12 +00:00
simonb
3792275475 Add an extra `i' to notifes/notifed. 2005-03-09 04:23:33 +00:00
jonathan
4ae1f36dc9 Commit TCP SACK patches from Kentaro A. Karahone's patch at:
http://www.sigusr1.org/~kurahone/tcp-sack-netbsd-02152005.diff.gz

Fixes in that patch for pre-existing TCP pcb initializations were already
committed to NetBSD-current, so are not included in this commit.

The SACK patch has been observed to correctly negotiate and respond,
to SACKs in wide-area traffic.

There are two indepenently-observed, as-yet-unresolved anomalies:
First, seeing unexplained delays between in fast retransmission
(potentially explainable by an 0.2sec RTT between adjacent
ethernet/wifi NICs); and second, peculiar and unepxlained TCP
retransmits observed over an ath0 card.

After discussion with several interested developers, I'm committing
this now, as-is, for more eyes to use and look over.  Current hypothesis
is that the anomalies above may in fact be due to link/level (hardware,
driver, HAL, firmware) abberations in the test setup, affecting  both
Kentaro's  wired-Ethernet NIC and in my two (different) WiFi NICs.
2005-02-28 16:20:59 +00:00
perry
f07677dd81 nuke trailing whitespace 2005-02-26 22:45:09 +00:00
briggs
a825f3e77c Initialize t_partialacks in the tcpcb template.
From Kentaro A. Kurahone.
2005-02-16 14:59:40 +00:00
heas
52b0cd6b47 ntohs->htons for ip6 plen (payload length).
It is not technically necessary to set plen here, since ip6_output() starts
off by calculating it, but leaving it keeps it consistent with other code.
2005-02-12 01:24:07 +00:00
perry
b02c92c5bf ANSIfy function declarations 2005-02-03 23:50:33 +00:00
perry
3494482345 de-__P -- will ANSIfy .c files later. 2005-02-02 21:41:55 +00:00
heas
fe4b3cd078 In tcp_respond(), clear the m_pkthdr.csum_flags that was inherited from the
received packet so that the checksum is not performed twice.  Also,
tcp_respond() does not fill-in the m_pkthdr.csum_data, so a h/w checksum may
have the wrong offset.

OK from Jason Thorpe.
2005-01-03 19:47:30 +00:00
christos
77e7bdb8aa yamt's changes seem to fix all the checksumming issues. Turn the loopback
checksums back off so we can make sure that everything works.
2004-12-19 06:42:24 +00:00
christos
60fb5c0ece Turn checksumming on loopback back on until we fix the bugs in it.
Connect over tcp on the loopback is broken:

  4729 amq      0.000007 CALL  connect(4,0x804f2a0,0x1c)
  4729 amq      75.007420 RET   connect -1 errno 60 Connection timed out
2004-12-17 22:54:52 +00:00
thorpej
7994b6f95e Don't perform checksums on loopback interfaces. They can be reenabled with
the net.inet.*.do_loopback_cksum sysctl.

Approved by: groo
2004-12-15 04:25:19 +00:00
yamt
0ea22c32fa fix ipqent pool corruption problems. make tcp reass code use
its own pool of ipqent rather than sharing it with ip reass code.
PR/24782.
2004-09-15 09:21:22 +00:00
itojun
4ebcfcf29a fix MD5 signature support to actually validate inbound signature, and
drop packet if fails.
2004-05-18 14:44:14 +00:00
matt
da67d85073 Use EVCNT_ATTACH_STATIC{,2} 2004-05-01 02:20:42 +00:00
itojun
362e07a3c9 zero-clear ip6?pseudo before use 2004-04-26 05:18:13 +00:00
itojun
f103f9aee9 declare ip6_hdr_pseudo (for kernel only) and use it for TCP MD5 signature 2004-04-26 05:15:47 +00:00
itojun
67372cc454 sync comment with reality 2004-04-26 05:05:49 +00:00
itojun
e0395ac8f0 make TCP MD5 signature work with KAME IPSEC (#define IPSEC).
support IPv6 if KAME IPSEC (RFC is not explicit about how we make data stream
for checksum with IPv6, but i'm pretty sure using normal pseudo-header is the
right thing).

XXX
current TCP MD5 signature code has giant flaw:
it does not validate signature on input (can't believe it! what is the point?)
2004-04-26 03:54:28 +00:00
jonathan
887b782b0b Initial commit of a port of the FreeBSD implementation of RFC 2385
(MD5 signatures for TCP, as used with BGP).  Credit for original
FreeBSD code goes to Bruce M. Simpson, with FreeBSD sponsorship
credited to sentex.net.  Shortening of the setsockopt() name
attributed to Vincent Jardin.

This commit is a minimal, working version of the FreeBSD code, as
MFC'ed to FreeBSD-4. It has received minimal testing with a ttcp
modified to set the TCP-MD5 option; BMS's additions to tcpdump-current
(tcpdump -M) confirm that the MD5 signatures are correct.  Committed
as-is for further testing between a NetBSD BGP speaker (e.g., quagga)
and industry-standard BGP speakers (e.g., Cisco, Juniper).


NOTE: This version has two potential flaws. First, I do see any code
that verifies recieved TCP-MD5 signatures.  Second, the TCP-MD5
options are internally padded and assumed to be 32-bit aligned. A more
space-efficient scheme is to pack all TCP options densely (and
possibly unaligned) into the TCP header ; then do one final padding to
a 4-byte boundary.  Pre-existing comments note that accounting for
TCP-option space when we add SACK is yet to be done. For now, I'm
punting on that; we can solve it properly, in a way that will handle
SACK blocks, as a separate exercise.

In case a pullup to NetBSD-2 is requested, this adds sys/netipsec/xform_tcp.c
,and modifies:

sys/net/pfkeyv2.h,v 1.15
sys/netinet/files.netinet,v 1.5
sys/netinet/ip.h,v 1.25
sys/netinet/tcp.h,v 1.15
sys/netinet/tcp_input.c,v 1.200
sys/netinet/tcp_output.c,v 1.109
sys/netinet/tcp_subr.c,v 1.165
sys/netinet/tcp_usrreq.c,v 1.89
sys/netinet/tcp_var.h,v 1.109
sys/netipsec/files.netipsec,v 1.3
sys/netipsec/ipsec.c,v 1.11
sys/netipsec/ipsec.h,v 1.7
sys/netipsec/key.c,v 1.11
share/man/man4/tcp.4,v 1.16
lib/libipsec/pfkey.c,v 1.20
lib/libipsec/pfkey_dump.c,v 1.17
lib/libipsec/policy_token.l,v 1.8
sbin/setkey/parse.y,v 1.14
sbin/setkey/setkey.8,v 1.27
sbin/setkey/token.l,v 1.15

Note that the preceding two revisions to tcp.4 will be
required to cleanly apply this diff.
2004-04-25 22:25:03 +00:00
simonb
b5d0e6bf06 Initialise (most) pools from a link set instead of explicit calls
to pool_init.  Untouched pools are ones that either in arch-specific
code, or aren't initialiased during initial system startup.

 Convert struct session, ucred and lockf to pools.
2004-04-25 16:42:40 +00:00
tls
7eb2f214d5 Change the default state of two tunables; bring our TCP a little bit
closer to normal behaviour for the current century.

New Reno is now on by default (which is really the only reasonable
choice, since we don't do SACK); instead of an initial window of 1
for non-local nets, we now use Sally Floyd's magic 4K rule.
2004-04-22 02:19:39 +00:00
itojun
f2e796b13f - respond to RST by ACK, as suggested in NISCC recommendation
- rate-limit ACKs against RSTs and SYNs
2004-04-20 16:52:12 +00:00
christos
90e1f431ca adjust to the sbreserve prototype change. 2004-04-17 15:18:53 +00:00
christos
99d2bc9467 PR/22551: Invoking tcpcb's get erroneously free'd resulting in to_ticks <= 0
assertion. Approved by he.
2004-04-05 21:49:21 +00:00
matt
9196bdd1f8 When accepting a peer's MSS, never let it drop below 256 (SLIP + TCP will
be the lowest MSS we should ever enounter).
2004-01-07 19:15:43 +00:00
thorpej
db71356cd1 - Change callout_setfunc() to require that the callout handle is already
initialized.  Update the txp(4) to compensate.
- Statically initialize the TCP timer callout handles in the tcpcb
  template.  We still use callout_setfunc(), but that call is now much
  less expensive.  Add a comment that the compiler is likely to unroll
  the loop (so don't sweat that it's there).
2003-10-27 16:52:01 +00:00
christos
649137925e initialize off 2003-10-25 08:13:28 +00:00
thorpej
9e4220c00a Oops, a little to aggressive in the previous patch; TCP_TIMER_INIT()
still needs to be in tcp_newtcpcb(), for now.  Pointed out by enami.
2003-10-22 05:55:54 +00:00
thorpej
31923baa46 Rather than zeroing a tcpcb structure and filling in all the fields
individually, create a tcpcb template pre-initialized (and pre-zero'd)
with the static and mostly-static tcpcb parameters.  The template is
now copied into the new tcpcb, which zeros and initializes most of the
tcpcb in one pass.  The template is kept up-to-date as TCP sysctl
variables are changed.

Combined with the previous sb_max change, TCP socket creation is now
25% faster.
2003-10-22 02:45:57 +00:00
thorpej
861856caa0 Add event counters that measure FAST_MBSEARCH. 2003-10-21 21:17:20 +00:00
mycroft
3114965161 Fix glaring errors in recent changes. 2003-09-25 00:59:31 +00:00
itojun
495bd5ff91 initialize ip_hl for ipsec policy lookup. PR kern/22715 2003-09-08 02:06:34 +00:00
itojun
32e3deae21 randomize IPv4/v6 fragment ID and IPv6 flowlabel. avoids predictability
of these fields.  ip_id.c is from openbsd.  ip6_id.c is adapted by kame.
2003-09-06 03:36:30 +00:00
itojun
495906ca8e revamp inpcb/in6pcb so that they are more aligned with each other.
in6pcb lookup now uses hash(9).
2003-09-04 09:16:57 +00:00
itojun
58f57a60fd tp could be null in tcp_respond() 2003-08-22 22:27:07 +00:00
itojun
11ede1ed88 remove ipsec_set/getsocket. now we explicitly pass socket * to ip{,6}_output. 2003-08-22 22:00:36 +00:00
itojun
82eb4ce914 change the additional arg to be passed to ip{,6}_output to struct socket *.
this fixes KAME policy lookup which was broken by the previous commit.
2003-08-22 21:53:01 +00:00
jonathan
902669955f Replace the set_socket() method of passing an extra struct socket*
argument to ip6_output() with a new explicit struct in6pcb* argument.
(The underlying socket can be obtained via in6pcb->inp6_socket.)

In preparation for fast-ipsec.  Reviewed by itojun.
2003-08-22 20:20:09 +00:00
jonathan
28b5f5dfab (fast-ipsec): Add hooks to pass IPv4 IPsec traffic into fast-ipsec, if
configured with ``options FAST_IPSEC''.  Kernels with KAME IPsec or
with no IPsec should work as before.

All calls to ip_output() now always pass an additional compulsory
argument: the inpcb associated with the packet being sent,
or 0 if no inpcb is available.

Fast-ipsec tested with ICMP or UDP over ESP. TCP doesn't work, yet.
2003-08-15 03:42:00 +00:00
agc
aad01611e7 Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22364, verified by myself.
2003-08-07 16:26:28 +00:00
he
80ccb5520c As a temporary workaround, apply the fix from PR#20390, thereby
cooperating with the callout code in working around the race
condition caused by the TCP code's use of the callout facility.

Instead of unconditionally releasing memory in tcp_close() and
SYN_CACHE_PUT(), check whether any of the related callout handlers
are about to be invoked (but have not yet done callout_ack()), and
if so, just mark the associated data structure (tcpcb or syn cache
entry) as "dead", and test for this (and release storage) in the
callout handler functions.
2003-07-20 16:35:07 +00:00
ragge
9e2d68cb61 Make it possible to set TCP_INIT_WIN and TCP_INIT_WIN_LOCAL in the config
file as options.
2003-07-03 08:28:16 +00:00
fvdl
d5aece61d6 Back out the lwp/ktrace changes. They contained a lot of colateral damage,
and need to be examined and discussed more.
2003-06-29 22:28:00 +00:00
ragge
679db94879 Add code to remember where in the send queue of mbufs the last packet was
sent from. This change avoid a linear search through all mbufs when using
large TCP windows, and therefore permit high-speed connections on long
distances.

Tested on a 1 Gigabit connection between Luleå and San Francisco, a distance
of about 15000km.  With TCP windows of just over 20 Mbytes it could keep up
with 950Mbit/s.

After discussions with Matt Thomas and Jason Thorpe.
2003-06-29 18:58:26 +00:00
martin
d505b18964 Make sure to include opt_foo.h if a defflag option FOO is used. 2003-06-23 11:00:59 +00:00
thorpej
cdf1b0026c Allow TCP connections to hosts on a local network to use a larger
slow start initial window.  Default this larger initial window to
4 packets, allowing it to be adjusted with net.inet.tcp.init_win_local.
2003-03-01 04:40:27 +00:00
matt
65e5548a17 Add MBUFTRACE kernel option.
Do a little mbuf rework while here.  Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *).  These are not performance critical and making them
call m_get saves considerable space.  Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
2003-02-26 06:31:08 +00:00
scw
5521093d4b Quell an uninitialised variable warning. 2002-11-24 10:52:47 +00:00