Commit Graph

76 Commits

Author SHA1 Message Date
rpaulo f3330397f0 Modular (I tried ;-) TCP congestion control API. Whenever certain conditions
happen in the TCP stack, this interface calls the specified callback to
handle the situation according to the currently selected congestion
control algorithm.
A new sysctl node was created: net.inet.tcp.congctl.{available,selected}
with obvious meanings.
The old net.inet.tcp.newreno MIB was removed.
The API is discussed in tcp_congctl(9).

In the near future, it will be possible to selected a congestion control
algorithm on a per-socket basis.

Discussed on tech-net and reviewed by <yamt>.
2006-10-09 16:27:07 +00:00
elad 874fef3711 integrate kauth. 2006-05-14 21:19:33 +00:00
christos 49cd195740 Coverity CID 1153: Add KASSERT before deref. 2006-04-15 02:33:41 +00:00
christos 95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
christos 89940190d0 Implement PMTU checks from:
http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html

1. Don't act on ICMP-need-frag immediately if adhoc checks on the
advertised MTU fail. The MTU update is delayed until a TCP retransmit
happens.
2. Ignore ICMP Source Quench messages meant for TCP connections.

From OpenBSD.
2005-07-19 17:00:02 +00:00
mycroft c9f058f65e Copyright maintenance. 2005-03-02 10:20:18 +00:00
jonathan 4ae1f36dc9 Commit TCP SACK patches from Kentaro A. Karahone's patch at:
http://www.sigusr1.org/~kurahone/tcp-sack-netbsd-02152005.diff.gz

Fixes in that patch for pre-existing TCP pcb initializations were already
committed to NetBSD-current, so are not included in this commit.

The SACK patch has been observed to correctly negotiate and respond,
to SACKs in wide-area traffic.

There are two indepenently-observed, as-yet-unresolved anomalies:
First, seeing unexplained delays between in fast retransmission
(potentially explainable by an 0.2sec RTT between adjacent
ethernet/wifi NICs); and second, peculiar and unepxlained TCP
retransmits observed over an ath0 card.

After discussion with several interested developers, I'm committing
this now, as-is, for more eyes to use and look over.  Current hypothesis
is that the anomalies above may in fact be due to link/level (hardware,
driver, HAL, firmware) abberations in the test setup, affecting  both
Kentaro's  wired-Ethernet NIC and in my two (different) WiFi NICs.
2005-02-28 16:20:59 +00:00
perry b02c92c5bf ANSIfy function declarations 2005-02-03 23:50:33 +00:00
mycroft e236dc1c36 Whoops. Exit fast recovery when handling a timeout. 2005-01-27 18:45:41 +00:00
mycroft 5283ca74ad Fix two problems in our TCP stack:
1) If an echoed RFC 1323 time stamp appears to be later than the current time,
   ignore it and fall back to old-style RTT calculation.  This prevents ending
   up with a negative RTT and panicking later.

2) Fix NewReno.  This involves a few changes:

   a) Implement the send_high variable in RFC 2582.  Our implementation is
      subtly different; it is one *past* the last sequence number transmitted
      rather than being equal to it.  This simplifies some logic and makes
      the code smaller.  Additional logic was required to prevent sequence
      number wraparound problems; this is not mentioned in RFC 2582.

   b) Make sure we reset t_dupacks on new acks, but *not* on a partial ack.
      All of the new ack code is pushed out into tcp_newreno().  (Later this
      will probably be a pluggable function.)  Thus t_dupacks keeps track of
      whether we're in fast recovery all the time, with Reno or NewReno, which
      keeps some logic simpler.

   c) We do not need to update snd_recover when we're not in fast recovery.
      See tech-net for an explanation of this.

   d) In the gratuitous fast retransmit prevention case, do not send a packet.
      RFC 2582 specifically says that we should "do nothing".

   e) Do not inflate the congestion window on a partial ack.  (This is done by
      testing t_dupacks to see whether we're still in fast recovery.)

This brings the performance of NewReno back up to the same as Reno in a few
random test cases (e.g. transferring peer-to-peer over my wireless network).
I have not concocted a good test case for the behavior specific to NewReno.
2005-01-26 21:49:27 +00:00
itojun 344b08b44b some corrections from markus@openbsd;
- callout_ack() was called with wrong argument
2004-01-02 15:51:04 +00:00
itojun 3fef2ba893 make it compilable with TCP_DEBUG defined 2003-10-27 07:43:01 +00:00
agc aad01611e7 Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22364, verified by myself.
2003-08-07 16:26:28 +00:00
he 80ccb5520c As a temporary workaround, apply the fix from PR#20390, thereby
cooperating with the callout code in working around the race
condition caused by the TCP code's use of the callout facility.

Instead of unconditionally releasing memory in tcp_close() and
SYN_CACHE_PUT(), check whether any of the related callout handlers
are about to be invoked (but have not yet done callout_ack()), and
if so, just mark the associated data structure (tcpcb or syn cache
entry) as "dead", and test for this (and release storage) in the
callout handler functions.
2003-07-20 16:35:07 +00:00
thorpej 882dec6ba3 Test callout_pending(), not callout_active(), and eliminate now-unnecessary
callout_deactivate() calls.
2003-02-03 23:50:59 +00:00
scw 5521093d4b Quell an uninitialised variable warning. 2002-11-24 10:52:47 +00:00
simonb 4dd4549b31 Guard use of "so" in tcp_timer_persist() and tcp_timer_2msl() with
#ifdef TCP_DEBUG.
2002-10-22 03:11:03 +00:00
itojun f192b66b94 whitespace 2002-06-09 16:33:36 +00:00
itojun 3e7ae517e0 path MTU discovery blackhole detection.
PR 12790 (sorry for not committing it for a long time)
2002-05-26 16:05:43 +00:00
lukem ea1cd7eb08 add RCSIDs 2001-11-13 00:32:34 +00:00
matt 47577dca93 Change a few variable/tables to const since they are read-only. 2001-11-04 13:42:27 +00:00
thorpej 050e9de009 Use callouts for SYN cache timers, rather than traversing time queues
in tcp_slowtimo().
2001-09-11 21:03:20 +00:00
thorpej 4745c7f252 Update copyrights. 2001-09-10 22:45:46 +00:00
thorpej 6d0e813f6c Use callouts for TCP timers, rather than traversing the list of
all open TCP connections in tcp_slowtimo() (which is called 2x
per second).  It's fairly rare for TCP timers to actually fire,
so saving this list traversal is good, especially if you want
to scale to thousands of open connections.
2001-09-10 22:14:26 +00:00
thorpej 413e5cb878 Initialize TCP timer variables in a new function, tcp_timer_init(). 2001-09-10 20:36:43 +00:00
thorpej 45e02f5ee8 Split tcp_timers() into multiple functions, one for each timer,
and call it directly from tcp_slowtimo() (via a table) rather
than going through tcp_userreq().

This will allow us to call TCP timers directly from callouts,
in a future revision.
2001-09-10 20:15:14 +00:00
thorpej 7446fd2bc8 Change the way receive idle time and round trip time are measured.
Instead of incrementing t_idle and t_rtt in tcp_slowtimo(), we now
take a timstamp (via tcp_now) and use subtraction to compute the
delta when we actually need it (using unsigned arithmetic so that
tcp_now wrapping is handled correctly).

Based on similar changes in FreeBSD.
2001-09-10 15:23:09 +00:00
thorpej 783db90019 Use a callout for the delayed ACK timer, and delete tcp_fasttimo().
Expose the delayed ACK timer as net.inet.tcp.delack_ticks.
2001-09-10 04:24:24 +00:00
itojun 9183e2dc4e remove #ifdef TCP6. it is not likely for us to bring in sys/netinet6/tcp6*.c
(separate TCP/IPv6 stack) into netbsd-current.
2000-10-19 20:22:59 +00:00
itojun a7e15e4935 be more friendly with INET-less build.
XXX we need to do more to do a working INET-less build
2000-10-17 03:06:42 +00:00
augustss 8529438fe6 Remove register declarations. 2000-03-30 12:51:13 +00:00
itojun 685747d56c Use proper ip protocol # field and tcp hdr on sending RST against SYN,
when ip header and tcp header are not adjacent to each other
(i.e. when ip6 options are attached).

To test this, try
	telnet @::1@::1 port
toward a port without responding server.  Prior to the fix, the kernel will
generate broken RST packet.
1999-07-14 22:37:13 +00:00
itojun 118d2b1d4f IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
  data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
  package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
  file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.
1999-07-01 08:12:45 +00:00
mouse b95116821c Create tcp.keepidle, tcp.keepintvl, tcp.keepcnt, tcp.slowhz sysctls. 1998-09-10 10:46:03 +00:00
mycroft 2f501074f8 Fix a couple of bogons related to tcp_new_iss():
* Don't add tcp_iss_seq when creating a new ISS from TIME-WAIT state.
* Do the clock increment even when using the rnd device.
1998-09-04 22:29:54 +00:00
thorpej 1c4ff0a086 Comment where we use the Loss Window. 1998-07-17 22:18:49 +00:00
thorpej c296923d2f Loss window MUST be one segment, per draft-floyd-incr-init-win-03. 1998-06-02 17:22:26 +00:00
thorpej 49573284f5 Make sure a timer is marked "disarmed" once it has expired. 1998-05-11 20:52:18 +00:00
thorpej 5596fe2614 Nuke TUBA per my note to tech-net; there's no reason to keep it around. 1998-05-11 19:57:23 +00:00
thorpej dc49b0342e Define all TCP timers in terms of PRT timers. 1998-05-07 01:30:46 +00:00
thorpej 1ffa60ac01 Use macros from tcp_timer.h to manipulate TCP timers, so that their
implementation can be changed easily.
1998-05-06 01:21:20 +00:00
kml e173e7a084 Remove bogus black hole discovery code 1998-05-01 01:15:55 +00:00
thorpej 13f972a4d6 Make use of the work-arounds for ancient broken TCP peers run-time
conditional (tcp_compat_42).  The kernel config option TCP_COMPAT_42
will still enable this by default, or disable this by default if the
option is not included (i.e. current behavior).  This will be made a
sysctl soon.
1998-04-29 05:16:46 +00:00
kml 1579dcec47 Add support for deletion of routes added by path MTU discovery;
uses new generic route timeout code.  Add sysctl for timeout period.
1998-04-29 03:44:11 +00:00
thorpej 2da6c91259 Fix a potential-congestion case in the larger initial congestion window
code, as clarified in the TCPIMPL WG meeting at IETF #41: If the SYN
(active open) or SYN,ACK (passive open) was retransmitted, the initial
congestion window for the first slow start of that connection must be
one segment.
1998-03-31 22:49:09 +00:00
kml 123232e156 Fix a retransmission bug introduced by the Brakmo and Peterson
RTO estimation changes.  Under some circumstances it would return a value
of 0, while the old Van Jacobson RTO code would return a minimum of 3.
This would result in 12 retransmissions, each 1 second apart.
This takes care of those instances, and ensures that t_rttmin is
used everywhere as a lower bound.
1998-03-19 22:29:33 +00:00
thorpej 5837cc6b07 Update copyright (sigh, should have done this long ago). 1998-02-19 02:36:42 +00:00
scottr 3cdcd5e1c7 Use option header file for TCP_COMPAT_42 1998-01-12 03:00:42 +00:00
thorpej e5e283e02d Finishing merging 4.4BSD-Lite2 netinet. At this point, the only changes
left were SCCS IDs and Copyright dates.
1998-01-05 10:31:44 +00:00
thorpej 673fb149c6 Implement a queue for delayed ACK processing. This queue is used in
tcp_fasttimo() in lieu of scanning all open TCP connections.
1997-12-31 03:31:23 +00:00