Commit Graph

196 Commits

Author SHA1 Message Date
itojun
a7e15e4935 be more friendly with INET-less build.
XXX we need to do more to do a working INET-less build
2000-10-17 03:06:42 +00:00
thorpej
d839a91f5f Add an IP_MTUDISC flag to the flags that can be passed to
ip_output().  This flag, if set, causes ip_output() to set
DF in the IP header if the MTU in the route is not locked.

This allows a bunch of redundant code, which I was never
really all that happy about adding in the first place, to
be eliminated.

Inspired by a similar change made by provos@openbsd.org when
he integrated NetBSD's Path MTU Discovery code into OpenBSD.
2000-10-17 02:57:01 +00:00
itojun
6e3a9bc311 validate mbuf chain length on *_ctlinput. remote node may be able to
transmit a truncated icmp6 packet and panic the system.  sync with kame.
2000-10-13 17:53:44 +00:00
itojun
dde2adf8e4 for t_template, allocate mbuf cluster only if really necessary.
this avoids too aggressive memory usage on heavy load web server, for example.
From: Kevin Lahey <kml@dotrocket.com>

release and reallocate t_template, if t_template->m_len changes.
(this happens if we connect to IPv4 mapped destination and then IPv6
destination, on a single AF_INET6 socket)

KAME 1.26 -> 1.28
2000-09-19 18:21:41 +00:00
itojun
23f6a4f4e8 remove old mbuf assumption (ip header and tcp header are on the same mbuf).
this is for m_pulldown use. (sync with kame)
2000-06-30 16:44:33 +00:00
augustss
8529438fe6 Remove register declarations. 2000-03-30 12:51:13 +00:00
simonb
1058c2aba9 Delete redundant decl of zeroin6_addr, it's in <netinet6/in6_var.h>. 2000-03-30 02:38:53 +00:00
itojun
04ac848d6f introduce m->m_pkthdr.aux to hold random data which needs to be passed
between protocol handlers.

ipsec socket pointers, ipsec decryption/auth information, tunnel
decapsulation information are in my mind - there can be several other usage.
at this moment, we use this for ipsec socket pointer passing.  this will
avoid reuse of m->m_pkthdr.rcvif in ipsec code.

due to the change, MHLEN will be decreased by sizeof(void *) - for example,
for i386, MHLEN was 100 bytes, but is now 96 bytes.
we may want to increase MSIZE from 128 to 256 for some of our architectures.

take caution if you use it for keeping some data item for long period
of time - use extra caution on M_PREPEND() or m_adj(), as they may result
in loss of m->m_pkthdr.aux pointer (and mbuf leak).

this will bump kernel version.

(as discussed in tech-net, tested in kame tree)
2000-03-01 12:49:27 +00:00
itojun
82ab98145f ensure tcp window size does not overflow (16bit unsigned after window scale).
FreeBSD PR: 16914
2000-02-29 05:25:49 +00:00
itojun
76064f5770 don't chase mbuf pointer when it is NULL. 2000-02-06 08:06:43 +00:00
itojun
1a2a1e2b1f bring in latest KAME ipsec tree.
- interop issues in ipcomp is fixed
- padding type (after ESP) is configurable
- key database memory management (need more fixes)
- policy specification is revisited

XXX m->m_pkthdr.rcvif is still overloaded - hope to fix it soon
2000-01-31 14:18:52 +00:00
itojun
abddb5f851 do not overwrite traffic class field when we write IPv6 version field. 1999-12-15 06:28:43 +00:00
itojun
ea861f0183 sync IPv6 part with latest KAME tree. IPsec part is left unmodified
due to massive changes in KAME side.
- IPv6 output goes through nd6_output
- faith can capture IPv4 packets as well - you can run IPv4-to-IPv6 translator
  using heavily modified DNS servers
- per-interface statistics (required for IPv6 MIB)
- interface autoconfig is revisited
- udp input handling has a big change for mapped address support.
- introduce in4_cksum() for non-overwriting checksumming
- introduce m_pulldown()
- neighbor discovery cleanups/improvements
- netinet/in.h strictly conforms to RFC2553 (no extra defs visible to userland)
- IFA_STATS is fixed a bit (not tested)
- and more more more.

TODO:
- cleanup os-independency #ifdef
- avoid rcvif dual use (for IPsec) to help ifdetach

(sorry for jumbo commit, I can't separate this any more...)
1999-12-13 15:17:17 +00:00
ragge
713b50cde9 Avoid GCC complaints (under some circumstances). 1999-12-12 19:51:49 +00:00
itojun
313f5eb9cd do not drop from IP header to tcp option until sbappend(), to reduce
requirement to mbuf chain.
part of KAME sync, committed separately for its (possible) impact.
1999-12-08 16:22:20 +00:00
enami
5326516a15 Make this compile without INET6. 1999-09-23 04:02:27 +00:00
itojun
9474edfcd8 cleanup and correct TCP MSS consideration with IPsec headers.
MSS advertisement must always be:
	max(if mtu) - ip hdr siz - tcp hdr siz
We violated this in the previous code so it was fixed.

tcp_mss_to_advertise() now takes af (af on wire) as its argument,
to compute right ip hdr siz.

tcp_segsize() will take care of IPsec header size.
One thing I'm not really sure is how to handle IPsec header size in
*rxsegsizep (inbound segment size estimation).
The current code subtracts possible *outbound* IPsec size from *rxsegsizep,
hoping that the peer is using the same IPsec policy as me.
It may not be applicable, could TCP gulu please comment...
1999-09-23 02:21:30 +00:00
itojun
4597cff18d fix tcp mss consideration on ipsec operation.
now tcp-over-ipsec should not experience fragmentation due to
addition of ipsec header.

From: proff@suburbia.net (Julian Assange)
1999-08-27 02:56:14 +00:00
itojun
809ab7f1ff When listening socket goes away, remove assockated syn cache entires.
Stale syn cache entries are useless because none of them will be used
if there is no listening socket, as tcp_input looks up listening socket by
in_pcblookup*() before looking into syn cache.

This fixes race condition due to dangling socket pointer from syn cache
entries to listening socket (this was introduced when ipsec is merged in).

This should preserve currently implemented behavior (but not 4.4BSD
behavior prior to syn cache).

Tested in KAME repository before commit, but we'd better run some
regression tests.
1999-08-25 15:23:12 +00:00
itojun
d48c55f4f0 ctlinput handling must look at ip6_src, not ip6_dst.
(this makes path mtu handling wrong)
1999-08-25 12:38:14 +00:00
itojun
a9b7fe4621 return with doing nothing from xx_ctlinput(), when sa->sa_family
is not the expected one.

I see PRC_REDIRECT_HOST with sa->sa_family == AF_UNIX coming to
{tcp,udp}_ctlinput() when I use dhclient, and I feel like adding
more sanity checks, without logging - if we log it it is too noisy.
1999-08-09 10:55:29 +00:00
itojun
70ada0957e sync with recent KAME.
- loosen ipsec restriction on packet diredtion.
- revise icmp6 redirect handling on IsRouter bit.
- tcp/udp notification processing (link-local address case)
- cosmetic fixes (better code share across *BSD).
1999-07-31 18:41:15 +00:00
itojun
42c5caafe7 do not include unnecessary include files. 1999-07-23 15:21:17 +00:00
itojun
7fee35f579 - implement IPv6 pmtud, which is necessary for TCP6.
- fix memory leak on SO_DEBUG over TCP.
1999-07-22 12:56:56 +00:00
itojun
685747d56c Use proper ip protocol # field and tcp hdr on sending RST against SYN,
when ip header and tcp header are not adjacent to each other
(i.e. when ip6 options are attached).

To test this, try
	telnet @::1@::1 port
toward a port without responding server.  Prior to the fix, the kernel will
generate broken RST packet.
1999-07-14 22:37:13 +00:00
drochner
46f90cb053 make sending of keepalive messages work again:
-remove bogus sanity check involving an uninitialized variable
-correct mbuf cluster allocation
-(non-critical) remove redundant check in cleanup after error
1999-07-14 22:08:52 +00:00
thorpej
f9a7668b3f defopt IPSEC and IPSEC_ESP (both into opt_ipsec.h). 1999-07-09 22:57:15 +00:00
fvdl
e3fa5cc725 Fix for -Wunitialized warnings broke compiles without INET6, refix. 1999-07-02 21:02:05 +00:00
itojun
4b961b81e3 avoid "variable not initialized" warnings on some of the platforms. 1999-07-02 12:45:32 +00:00
itojun
118d2b1d4f IPv6 kernel code, based on KAME/NetBSD 1.4, SNAP kit 19990628.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.

- sys/kern: do not assume single mbuf, accept chained mbuf on passing
  data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
  package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
  file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen

In my understanding no code here is subject to export control so it
should be safe.
1999-07-01 08:12:45 +00:00
explorer
cff4c9630b Don't mix in data just to stir the rnd pool. Extracting data will do that,
any network packets received might, too, so this is already taken care of.
1999-02-28 13:41:24 +00:00
thorpej
6c30816c15 Fix a slight error in previous. Rearrange some code in tcp_respond() so
that a DIAGNOSTIC check against the destination address is actually
checking the destination address.  "oops."
1999-01-26 08:28:50 +00:00
thorpej
a43786143f Fix a problem pointed out by Charles Hannum; DF wasn't being set in
SYN,ACK packets during Path MTU Discovery.  Fix tcp_respond() to do the
appropriate route lookup and set DF as appropriate.

Also, fixup similar code in tcp_output() to relookup the route if it
is down.
1999-01-20 03:39:54 +00:00
thorpej
4f177aec90 Add a lock around the TCPCB's sequence queue, to prevent tcp_drain()
from corrupting the queue if called from a device's interrupt context.

Similar in nature to the problem reported in PR #5684.
1998-12-18 21:38:02 +00:00
thorpej
974aa74abd Use the pool allocator for ipqent structures. 1998-10-08 01:19:25 +00:00
thorpej
6cfb33b4e4 Use the pool allocator for the tcpcb's TCP/IP header template. 1998-10-07 23:20:03 +00:00
matt
8e8f38e0f2 Add a sysctl for newreno (default to off). 1998-10-06 00:20:44 +00:00
mycroft
31347e4671 Always send a 0 window with a RST. Suggested by Darren Reed. 1998-09-19 04:02:52 +00:00
mycroft
2f501074f8 Fix a couple of bogons related to tcp_new_iss():
* Don't add tcp_iss_seq when creating a new ISS from TIME-WAIT state.
* Do the clock increment even when using the rnd device.
1998-09-04 22:29:54 +00:00
thorpej
833061914a Use the pool allocator for tcpcbs. 1998-08-02 00:36:19 +00:00
thorpej
3a9ed00799 Document that we are more conservative after doing MTU discovery than the
suggestion in draft-floyd-incr-init-win-03.  Rather than scaling cwnd back
by the ratio of new segment size to old segment size, we perform a slow start
using the Initial Window, computed with the new segment size.
1998-07-17 23:09:58 +00:00
thorpej
0f909866c0 Clarify that we're using the Loss Window when we receive a source quench. 1998-07-17 23:02:38 +00:00
kml
dd5ed34b88 Changed initialization of peermss to ensure that it didn't have
the TCP and IP options lengths removed from it -- the IP options can
change over the course of a connection...
1998-05-12 21:45:51 +00:00
kml
1216b9a560 Change comments on tcp_mss_to_advertise to match actual arguments 1998-05-07 22:30:23 +00:00
thorpej
ce3d776874 Rework the syn cache code somewhat:
- Don't use home-grown queue manipulation.  Use <sys/queue.h> instead.  The
  data structures are a little larger, but we are otherwise wasting the
  memory chunk anyway (we're already a 64-byte malloc bucket).
- Fix a bug in the cache-is-full case: if the oldest element removed from
  the first non-empty bucket was the only element in the bucket, the
  bucket wouldn't be removed from the bucket cache, causing queue corruption
  later.
- Optimize the syn cache timers by using PRT timers rather than home-grown
  decrement-and-propagate timers.

This code is now a fair bit smaller, and significantly easier to read
and understand.
1998-05-07 01:37:27 +00:00
thorpej
1ffa60ac01 Use macros from tcp_timer.h to manipulate TCP timers, so that their
implementation can be changed easily.
1998-05-06 01:21:20 +00:00
thorpej
e44c4fb7d3 Once again, move a declaration for the benefit of TUBA (grumble). 1998-05-03 19:54:56 +00:00
matt
334f006538 New TCP reassembly code. The new code reduces the memory needed by
out-of-order packets and builds the infrastructure needed for sending
SACK blocks (to be added shortly).
1998-04-29 20:43:29 +00:00
thorpej
13f972a4d6 Make use of the work-arounds for ancient broken TCP peers run-time
conditional (tcp_compat_42).  The kernel config option TCP_COMPAT_42
will still enable this by default, or disable this by default if the
option is not included (i.e. current behavior).  This will be made a
sysctl soon.
1998-04-29 05:16:46 +00:00
kml
fcf0227962 Fix to ensure that the correct MSS is advertised for loopback
TCP connections by using the MTU of the interface.  Also added
a knob, mss_ifmtu, to force all connections to use the MTU of
the interface to calculate the advertised MSS.
1998-04-13 21:18:19 +00:00
thorpej
2da6c91259 Fix a potential-congestion case in the larger initial congestion window
code, as clarified in the TCPIMPL WG meeting at IETF #41: If the SYN
(active open) or SYN,ACK (passive open) was retransmitted, the initial
congestion window for the first slow start of that connection must be
one segment.
1998-03-31 22:49:09 +00:00
thorpej
d725b1a332 Remove a comment in tcp_mss_to_advertise() that no longer applies. 1998-03-28 19:39:57 +00:00
kml
96954c2a53 Ensure that we take the IP option length into account when we calculate
the effective maximum send size for TCP.  ip_optlen() and tcp_optlen()
should probably be inlined for efficiency.
1998-03-24 03:10:02 +00:00
kml
123232e156 Fix a retransmission bug introduced by the Brakmo and Peterson
RTO estimation changes.  Under some circumstances it would return a value
of 0, while the old Van Jacobson RTO code would return a minimum of 3.
This would result in 12 retransmissions, each 1 second apart.
This takes care of those instances, and ensures that t_rttmin is
used everywhere as a lower bound.
1998-03-19 22:29:33 +00:00
kml
ffb211fb9d Ensure that the TCP segment size reflects the size of TCP options
in the packet.  This fixes a bug that was resulting in extra packets
in retransmissions (the second packet would be 12 bytes long,
reflecting the RFC1323 timestamp option size).
1998-03-17 23:50:30 +00:00
thorpej
5837cc6b07 Update copyright (sigh, should have done this long ago). 1998-02-19 02:36:42 +00:00
mellon
27a5a0a616 Take PCB off delayed ack queue before freeing. 1998-01-30 08:42:11 +00:00
scottr
3cdcd5e1c7 Use option header file for TCP_COMPAT_42 1998-01-12 03:00:42 +00:00
thorpej
e5e283e02d Finishing merging 4.4BSD-Lite2 netinet. At this point, the only changes
left were SCCS IDs and Copyright dates.
1998-01-05 10:31:44 +00:00
thorpej
673fb149c6 Implement a queue for delayed ACK processing. This queue is used in
tcp_fasttimo() in lieu of scanning all open TCP connections.
1997-12-31 03:31:23 +00:00
thorpej
c02a72fcd0 Implement an infrastructure to allow larger initial congestion windows.
The sysctl'able variable "tcp_init_win", when set to 0, selects an
auto-tuning algorithm for selecting the initial window, based on transmit
segment size, per discussion in the IETF tcpimpl working group.

Default initial window is still 1 segment, but will soon become 2 segments,
per discussion in tcpimpl.
1997-12-11 22:47:24 +00:00
thorpej
c40f4eb3cc Implement tcp_drain(). 1997-12-10 01:58:07 +00:00
kml
3b9fc85803 Remove an extraneous call to rtfree() in the path mtu discovery code;
this was causing negative reference counts on routes...
1997-11-11 21:10:50 +00:00
kml
86275dc497 TCP MSS fixes to provide cleaner slow-start and recovery. 1997-11-08 02:35:22 +00:00
kml
6b86b260cb change sysctl net.inet.icmp.mtudisc to net.inet.ip.mtudisc 1997-10-18 21:18:28 +00:00
kml
323c04642b Path MTU Discovery support. This is turned off by default.
Use sysctl -w net.inet.icmp.mtudisc=1 to turn on.
Still to come:  path removal after some period, black hole detection
1997-10-17 22:12:14 +00:00
explorer
80513cb5ae o Make usage of /dev/random dependant on
pseudo-device   rnd                     # /dev/random and in-kernel generator
  in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
  that this code is derived in part from Ted Tyso's linux code.
1997-10-13 00:46:08 +00:00
explorer
790e114732 Add hooks to use the kernel random system to generate TCP sequence numbers. 1997-10-10 01:51:07 +00:00
thorpej
4ed600dbd0 Fix several annoyances related to MSS handling in BSD TCP:
- Don't overload t_maxseg.  Previous behavior was to set it to the min
  of the peer's advertised MSS, our advertised MSS, and tcp_mssdflt
  (for non-local networks).  This breaks PMTU discovery running on
  either host.  Instead, remember the MSS we advertise, and use it
  as appropriate (in silly window avoidance).
- Per last bullet, split tcp_mss() into several functions for handling
  MSS (ours and peer's), and performing various tasks when a connection
  becomes ESTABLISHED.
- Introduce a new function, tcp_segsize(), which computes the max size
  for every segment transmitted in tcp_output().  This will eventually
  be used to hook in PMTU discovery.
1997-09-22 21:49:55 +00:00
thorpej
efa8881dbe Pull SYN_cache_branch down into the main line. 1997-07-23 21:26:40 +00:00
thorpej
a0e791807e Eliminate use of dtom() from the network code, allowing more flexible
use of mbuf external storage and increasing performance (by eliminating
an m_pullup() for clusters in the IP reassembly code).

Changes from Koji Imada <koji@math.human.nagoya-u.ac.jp>, in PR #3628
and #3480, with ever-so-slight integration changes by me.
1997-06-24 02:25:59 +00:00
mycroft
315bb1ab50 Fix RTT scaling problems introduced with Brakmo and Peterson changes. 1996-12-10 18:20:19 +00:00
mycroft
9bfa240a98 Hash unconnected PCBs. 1996-09-15 18:11:06 +00:00
mycroft
62a6cce9ca Add in_nullhost() and in_hosteq() macros, to hide some protocol
details.  Also, fix a bug in TCP wrt SYN+URG packets.
1996-09-09 14:51:07 +00:00
christos
14d9cd33af netinet prototypes 1996-02-13 23:40:59 +00:00
mycroft
67e78477db Build a hash table of PCBs. Hash function needs tweaking. 1996-01-31 03:49:23 +00:00
cgd
dfad729a16 make netinet work on systems where pointers and longs are 64 bits
(like the alpha).  Biggest problem: IP headers were overlayed with
structure which included pointers, and which therefore didn't overlay
properly on 64-bit machines.  Solution: instead of threading pointers
through IP header overlays, add a "queue element" structure to do
the threading, and point it at the ip headers.
1995-11-21 01:07:34 +00:00
mycroft
351cfd5ed8 Fix bogon in previous. 1995-06-12 06:48:54 +00:00
mycroft
22687aa834 Change in_pcbnotify*() to take an errno value. Make inetctlerrmap[] an
array on ints, not u_chars.
1995-06-12 06:46:34 +00:00
mycroft
10a4696964 Oops. Make source quench work again. 1995-06-12 06:24:21 +00:00
mycroft
6897f39ae9 Various cleanup, including:
* Convert several data structures to use queue.h.
* Split in_pcbnotify() into two parts; one for notifying a specific PCB, and
one for notifying all PCBs for a particular foreign address.
1995-06-12 00:46:47 +00:00
mycroft
2be9b519ac As suggested by Brakmo and Peterson:
* Don't add the extra 1/8 of the mss when ramping up the congestion window.
* Scale the RTT values slightly to adjust for rounding errors.
* Set the lower bound of the RTO to RTT+2.
1995-06-11 20:39:22 +00:00
mycroft
0a99592372 Clean up many more casts. 1995-06-04 05:06:49 +00:00
cgd
80929f8527 be a bit more careful and explicit with types. (basically a large no-op.) 1995-04-13 06:35:38 +00:00
mycroft
63bb09e6da Don't return received data to the user until the initial handshake is complete.
Also use TCPS_HAVEESTABLISHED() in a few other places.
1994-10-14 16:01:48 +00:00
cgd
cf92afd66e New RCS ID's, take two. they're more aesthecially pleasant, and use 'NetBSD' 1994-06-29 06:29:24 +00:00
mycroft
07b4f2ab54 Update to 4.4-Lite networking code, with a few local changes. 1994-05-13 06:02:48 +00:00
mycroft
627b841797 Change the counters to be all the same type -- u_long. 1994-01-10 23:27:39 +00:00
mycroft
b79490fcca Should compile now with or without `options MULTICAST'. 1994-01-10 20:14:14 +00:00
mycroft
e43117185e Prototypes. 1994-01-08 23:07:16 +00:00
mycroft
4fe12e6e88 Fix some inconsistent spacing; spaces at the end of lines, etc. 1994-01-08 21:21:28 +00:00
mycroft
95b048b53a Canonicalize all #includes. 1993-12-18 00:40:47 +00:00
cgd
fe1802950b add include of select.h if necessary for protos, or delete if extraneous 1993-05-22 11:40:42 +00:00
cgd
8d6c77881c make kernel select interface be one-stop shopping & clean it all up. 1993-05-18 18:18:40 +00:00
mycroft
348f9280dc Ignore forged ICMP_UNREACH with dport==0 and sport==0. 1993-04-12 11:07:57 +00:00
cgd
61f282557f initial import of 386bsd-0.1 sources 1993-03-21 09:45:37 +00:00