Commit Graph

670 Commits

Author SHA1 Message Date
mycroft
94895652e1 Modify the checksum slightly so that the htons()s can all be combined. 1999-01-24 12:57:38 +00:00
thorpej
a58f271406 Oops, forgot to update copyright notice in previous. 1999-01-24 01:21:18 +00:00
thorpej
86e2c3fbc6 * Completely rewrite syn_cache_respond().
- Don't use tcp_respond(), instead create the tcp/ip header from scratch,
and send it ourself.
- Reuse the mbuf that carried the SYN, or allocate one if that is not
available.
- Cache the route we look up to do the Path MTU Discovery check, and
transfer the reference to that route to the inpcb when the connection
completes.
* Macro'ize a small, but often repeated code fragment.
1999-01-24 01:19:28 +00:00
mycroft
b790730226 Fix problems with fr_tcpsum() that prevented the FTP proxy from working. 1999-01-23 08:50:52 +00:00
thorpej
a43786143f Fix a problem pointed out by Charles Hannum; DF wasn't being set in
SYN,ACK packets during Path MTU Discovery.  Fix tcp_respond() to do the
appropriate route lookup and set DF as appropriate.

Also, fixup similar code in tcp_output() to relookup the route if it
is down.
1999-01-20 03:39:54 +00:00
mycroft
231a906c71 There's just no plausible reason to byte-swap ip_id internally. It's opaque. 1999-01-19 23:39:56 +00:00
mycroft
7eeb5a04da Don't screw with ip_len; just subtract from it where we actually use the
value.
1999-01-19 23:03:20 +00:00
mycroft
d3ea3de1af Fix byte-swapping of ip_len in returned IP header. 1999-01-19 22:10:42 +00:00
mycroft
fc1211a6ab Don't overwrite the checksum fields when checking them. There's no reason to
do this, and it screws up ICMP replies.
XXX The returned IP checksum and length are still wrong.
1999-01-19 21:58:40 +00:00
thorpej
4b0e6bb4dc Domains are associated with protocol families, not address families. 1999-01-14 01:16:55 +00:00
thorpej
98d3903da8 Use the count supplied to the pseudo-device attach routine to dynamically
allocate (once) the ipip_softc array; don't assume NIPIP contains the count.
1999-01-13 23:01:20 +00:00
thorpej
555784ccd5 Fix byte order and ip_len inconsistencies in ICMP reply code. Also, fix
some formatting and HTONS(foo) vs. foo = htons(foo) inconsistencies.

PR #6602, Darren Reed.
1999-01-11 22:35:06 +00:00
thorpej
6ae68b4feb Pull the IP-in-IP tunneling support out of the GRE code. It's not handled
by a separate IP-IP input path.

XXX Should eventually do the same thing for IPPROTO_MOBILE.
1999-01-11 21:32:13 +00:00
thorpej
9508f259bf Adjust for the new IP-IP input path. mrt_ipip_input() is called from
ipip_input(), and returns non-zero if mrt_ipip_input() handled the
packet.

XXX Eventually, the multicast code should probably use regular IP-IP
XXX `interfaces', but mrouted knows about the VIF table, etc.
1999-01-11 21:31:03 +00:00
thorpej
9d996b9e4e Adjust for the new IP-IP input path. 1999-01-11 21:28:28 +00:00
thorpej
9f9be750f6 Separate out the IP-in-IP implementation from the GRE code. This cleans
up the interface to ip_mroute.c somewhat, and properly separates IP-IP
from GRE.  (They are similar, but they are different protocols, and should
not be implemented in the same place.)
1999-01-11 21:26:53 +00:00
thorpej
5f69dedb2c ipip_input() -> mrt_ipip_input(). 1998-12-22 02:51:32 +00:00
thorpej
54377d1212 Simplify the tunnel lookup routine. 1998-12-22 01:49:04 +00:00
thorpej
12632ebf71 Reverse the copyright-notice-swap. It went against existing practice. 1998-12-19 02:46:12 +00:00
thorpej
4f177aec90 Add a lock around the TCPCB's sequence queue, to prevent tcp_drain()
from corrupting the queue if called from a device's interrupt context.

Similar in nature to the problem reported in PR #5684.
1998-12-18 21:38:02 +00:00
thorpej
ca15e01c76 Add a lock around the IP fragment reassembly queue, to prevent ip_drain()
from corrupting the queue if called from a device's interrupt context.

Should fix PR #5684.
1998-12-18 21:35:11 +00:00
thorpej
93454aafc6 Delay sending if SS_MORETOCOME is set in so_state. This avoids the case
where the user issued a write with a length greater than MLEN but less
than MINCLSIZE, thus causing two mbufs to be used.  The loop in sosend()
would then call PRU_SEND twice, causing TCP to transmit 2 packets when
it could have transmitted one.

Suggested by Justin Walker <justin@apple.com> on the freebsd-net
mailing list.
1998-12-16 00:33:14 +00:00
mrg
2f2fd097ef remove this insanity. appeared with ipfilter 3.2.10... 1998-12-11 23:47:16 +00:00
drochner
914642e439 correction to the previous: protect against _LKM too
pointed out by Todd Whitesel <toddpw@best.com>
1998-12-11 11:57:33 +00:00
drochner
36b809fed6 correcton tp previous: don't try to include kernel option headers in
userland
fixes PR kern/6561 (Takahiro Kambe)
1998-12-11 09:15:42 +00:00
christos
ce96f9960a defopt IPFILTER_LOG 1998-12-10 15:50:59 +00:00
christos
c7578c510a defopt 1998-12-10 11:01:01 +00:00
mrg
a94214bdd0 add a patch from darren reed, to make ipfilter use our cksum routine. 1998-11-26 12:21:47 +00:00
sommerfe
0cdf66e377 Fragments should start with a header mbuf allocated by MGETHDR() 1998-11-25 21:13:58 +00:00
mrg
4dd9bebb58 add two more prototypes. noted missing by mjacob. 1998-11-22 23:30:36 +00:00
mrg
78db9d7d95 merge ipf 3.2.10 1998-11-22 15:17:18 +00:00
lukem
0cd1643609 if INADDR_ANY is given in in_pcbconnect(), choose the ia_addr of the first
interface, not the ia_broadaddr.  should fix [standards/5645] and [kern/6425]
1998-11-16 05:47:19 +00:00
drochner
1658ac64a8 fix the previous: "securelevel" in kernel only 1998-11-15 17:36:19 +00:00
tls
da1c106b85 In 'highly secure' mode (securelevel >= 2), the filter lists may not be tampered with. It might be desirable to allow enabling of preset filter lists, but it seems too good a candidate for a denial-of-service attack, so we don't. 1998-11-14 07:42:37 +00:00
lukem
cc41dfe747 simplify test in in_pcbbind() for setting wild=1; no need to check if
((so->so_proto->pr_flags & PR_CONNREQUIRED) == 0 ||
	(so->so_options & SO_ACCEPTCONN) == 0)
since the latter is always true, so the former test in unnecessary.
from `TCP/IP Illustrated, Volume 2', W. Richard Stevens, p 730.
1998-11-13 10:50:10 +00:00
thorpej
0e3a0a7f80 Once a fragmented IP packet has been reassembled, recompute the packet
length before passing it up the stack.  From FreeBSD.
1998-11-13 03:24:22 +00:00
ws
ede30e2813 Fix a buglet when looking up an interface for multicast:
Zero out the routing structure before calling the route lookup code
in order to correctly match addresses.
1998-10-26 17:31:01 +00:00
matt
3ad026ac87 vax -> __vax__ (and mips to __mips__ in ultrix_misc.c) 1998-10-20 01:46:27 +00:00
kim
cd7e3136ad Use ETHERTYPE_ATALK instead of ETHERTYPE_AT. The former seems more common.
Our other constants also use "ATALK".

Added many new ETHERTYPE constants to sys/net/ethertypes.h, including the
ones from libpcap and tcpdump "ethertype.h" files.
1998-10-13 02:34:31 +00:00
thorpej
14f5ac9081 Use the pool allocator for ipflow entries. 1998-10-08 01:41:45 +00:00
thorpej
974aa74abd Use the pool allocator for ipqent structures. 1998-10-08 01:19:25 +00:00
thorpej
588ccb2d75 Fix some typos in comments, and clean up some whitespace. 1998-10-07 23:33:02 +00:00
thorpej
6cfb33b4e4 Use the pool allocator for the tcpcb's TCP/IP header template. 1998-10-07 23:20:03 +00:00
matt
bf4e491879 Fix boolean dyslexic test. Duh! 1998-10-06 00:41:13 +00:00
matt
8e8f38e0f2 Add a sysctl for newreno (default to off). 1998-10-06 00:20:44 +00:00
lukem
a1ea50ee45 * in_pcblookup_port(): deprecate INPLOOKUP_WILDCARD and flags in favour
of a lookup_wildcard arg; simplifies the logic a bit.
* when assigning ephemeral ports in in_pcbbind(), always call
  in_pcblookup_port() with lookup_wildcard=1, so that ephemeral port
  allocation on sockets with SO_REUSEADDR set won't potentially bind to a
  port in use by something else (principle of least surprise).
1998-10-05 14:33:14 +00:00
matt
25054b5cf7 Adapt the NEWRENO changes from the UCSB diffs of BSDI 3.0's TCP
to NetBSD.  Ignore the SACK & FACK stuff for now.
1998-10-04 21:33:52 +00:00
kleink
c68106edad Use #error instead of causing a parse error. 1998-10-02 21:21:04 +00:00
drochner
5ddf423985 print reason for arplookup() failure (ala FreeBSD) 1998-10-01 11:04:24 +00:00
tls
c4730d65cf Switch order of TNF and UCB copyrights so UCB copyright is first; this seems more appropriate since UCB wrote the original code, after all. 1998-09-30 21:52:24 +00:00
hwr
eaccb9cd8d Start supporting IPPROTO_MOBILE (55) encapsulation. This is yet
another tunneling protocol used by the Mobile-IP people. See RFC 2004
for this.
1998-09-30 05:59:27 +00:00
christos
e74ca32804 SIOCGIFALIAS should not be restricted to the superuser. 1998-09-28 12:32:43 +00:00
mycroft
4a000a54e6 Fix a typo (not mine) in a comment. 1998-09-19 04:34:34 +00:00
mycroft
04ef3bf88d If we're in LISTEN state and all of RST, SYN and ACK are clear, send a RST. 1998-09-19 04:32:51 +00:00
mycroft
31347e4671 Always send a 0 window with a RST. Suggested by Darren Reed. 1998-09-19 04:02:52 +00:00
hwr
cf70cc28c7 Typo. :( 1998-09-14 21:15:56 +00:00
hwr
517139017e Some additions.
And IDPR-CMTP is 38 not 39 according to IANA.
1998-09-14 21:09:51 +00:00
hwr
366b9c4515 Add a gre tunnel pseudo network device. Gre = generic route encapsulation.
This device shows up like any other network interface and can be used to
tunnel L3 protocols as e.g. IP over IP.
1998-09-13 20:27:47 +00:00
christos
66dd35d72c Fix copyright spacing and 'Van' -> 'van' for consistency. 1998-09-13 15:45:40 +00:00
tv
235fc6a6a9 egcs {brace} warning fix 1998-09-10 19:53:28 +00:00
mouse
b95116821c Create tcp.keepidle, tcp.keepintvl, tcp.keepcnt, tcp.slowhz sysctls. 1998-09-10 10:46:03 +00:00
thorpej
9fd57e8917 Make a diagnostic printf more sensible, PR #5951, Heiko W. Rupp. 1998-09-09 04:57:18 +00:00
thorpej
4dbfe05f1f Use an algorithm similar to that in tcp_notify() to determine if
syn_cache_unreach() should remove the entry, or just continue on.

Algorithm is to only remove the entry if we've had more than one unreach
error and have retransmitted 3 or more times.  This prevents the following
scenario, as noted in PR #5909 (PR from Ty Sarna, scenario from
Charles Hannum):

	* Host A sends a SYN.
	* Host A retransmits the SYN.
	* Host B gets the first SYN and sends a SYN-ACK.
	* Host B gets the second SYN and sends a SYN-ACK.
	* One of the SYN-ACK bounces with an
	  ICMP unreachable, causing the `SYN cache' entry to be
	  removed with no notification.
	* Host A receives the other SYN-ACK, sends an ACK, and goes to
	  ESTABLISHED state.

Should fix PR #5909.
1998-09-09 01:32:27 +00:00
christos
0f024deb52 Add SIOCGIFALIAS 1998-09-06 17:52:01 +00:00
kleink
bb4f7768e4 Protect _XOPEN_SOURCE against sysctl MIB identifiers. 1998-09-05 19:03:25 +00:00
mycroft
e2cb6dad8d Make the randomized part of the ISS 24 bits. 1998-09-04 22:34:51 +00:00
mycroft
2f501074f8 Fix a couple of bogons related to tcp_new_iss():
* Don't add tcp_iss_seq when creating a new ISS from TIME-WAIT state.
* Do the clock increment even when using the rnd device.
1998-09-04 22:29:54 +00:00
scottr
e3e7e1673f Fix the NEXT_IA_WITH_SAME_ADDR macro introduced in 1.27: it was finding
the first in_ifaddr structure with a different internet address!  Reverse
the sense of the test.  Spotted by and fix from Eric Haszlakiewicz.
1998-08-14 06:57:54 +00:00
mrg
4a75265273 defopt PFIL_HOOKS. 1998-08-09 08:58:18 +00:00
thorpej
833061914a Use the pool allocator for tcpcbs. 1998-08-02 00:36:19 +00:00
thorpej
d319e4b419 Use the pool allocator for syn_cache entries. 1998-08-02 00:35:51 +00:00
thorpej
47e9dcf841 Use the pool allocator for inpcbs. 1998-08-02 00:35:31 +00:00
tls
31d0752b99 change IN_IFADDR_HASH_SIZE to 509, which actually uses no more space than 293 due to rounding up to nearest power of two in hashinit. 1998-07-29 05:18:54 +00:00
pk
84840da908 in_pcballoc(): we can't afford to wait for memory. 1998-07-23 08:24:33 +00:00
mycroft
cca4e566a9 Implement a better fix for the `gratuitous FIN' problem, as
mentioned on tcp-impl but with a bit more commentary.
1998-07-21 10:46:00 +00:00
thorpej
3a9ed00799 Document that we are more conservative after doing MTU discovery than the
suggestion in draft-floyd-incr-init-win-03.  Rather than scaling cwnd back
by the ratio of new segment size to old segment size, we perform a slow start
using the Initial Window, computed with the new segment size.
1998-07-17 23:09:58 +00:00
thorpej
0f909866c0 Clarify that we're using the Loss Window when we receive a source quench. 1998-07-17 23:02:38 +00:00
thorpej
fa20f24cd9 Add a comment wrt. a current issue w/ CWM. 1998-07-17 23:00:02 +00:00
thorpej
a3f4316cba Clarify that we are using the Loss Window if a retransmission occurred
during the three-way handshake.
1998-07-17 22:58:56 +00:00
thorpej
830879a809 Comment where the Restart Window is computed, and in the non-CWM case,
make sure it never _increases_ cwnd.
1998-07-17 22:52:01 +00:00
thorpej
1c4ff0a086 Comment where we use the Loss Window. 1998-07-17 22:18:49 +00:00
sommerfe
69b1b4758d Fix PR5559: if fast-forwarding, DF set, and packet too large, send ICMP error.. 1998-07-17 00:35:23 +00:00
sommerfe
534520d815 Fix PR5508: ipfil cut-through forwarding causes panic 1998-07-17 00:28:00 +00:00
tls
deac3540de Put original hash function back. It wastes a little bit of space, but is much more even -- think of the case of a web service provider, some of whose customers end up getting 'inferior service' because they're on addresses that happen to be out at the end of a hash chain. With webservers with thousands of addresses, this is a real issue. If the wasted space is a big deal, we could pick a prime number that's slightly _less_ than a power of two... 1998-07-16 06:45:09 +00:00
thorpej
389da54091 Garbage collect imp' and hy'. We don't have the rest of the code, and
it's not like anyone is ever going to be using either of them.
1998-07-15 17:39:20 +00:00
veego
97ab1bd53b Resolve conflicts from the import. 1998-07-12 15:23:59 +00:00
mycroft
3a64270ca6 Back out the change from TCP/IP vol 2, in revision 1.7, which removed TH_FIN
from the output flags for CLOSING state.  There is no harm in retransmitting
the FIN, and this change has unexpected side effects that break simultaneous
close behaviour.
1998-07-09 05:49:56 +00:00
sommerfe
065cac9798 Delete bogus (void) cast of m_freem (which is already a void function..) 1998-07-07 00:04:59 +00:00
jonathan
b37021c1a1 defopt NATM. 1998-07-05 22:48:05 +00:00
jonathan
9bf2ba0928 Garbage-collect ``needs-flag'' from attributes ether, fddi, arc:
NETHER, NFDDI, NARC are  not used anywhere. Remove #include "ether.h",
   which had no effect.
Removes clash with "options NATM" for native-ATM network protocol stack.
1998-07-05 22:29:51 +00:00
jonathan
011f2bda08 defopt NS, NSIP. 1998-07-05 06:49:00 +00:00
jonathan
5c0c5dd0b4 defopt ISO TPIP. 1998-07-05 04:37:35 +00:00
jonathan
f2a2327e0a defopt EON. 1998-07-05 01:06:49 +00:00
jonathan
3751946b97 defopt INET, NETATALK. 1998-07-05 00:51:04 +00:00
jonathan
466e784ee1 defopt DDB. 1998-07-04 22:18:13 +00:00
thorpej
8cfe8959a6 Fix TCPS_HAVERCVDFIN() to actually catch all TCP states in which a FIN
has been received (CLOSE_WAIT, CLOSING, LAST_ACK, and TIME_WAIT).

From David Borman <dab@bsdi.com>.
1998-07-03 05:39:56 +00:00
is
0ca02c68a7 Thinko in last fix: we have to actually check each address for a copy on
our ifp, else we might fail for some strange configurations.
1998-07-02 14:00:39 +00:00
is
d8b8a41918 The rewrite of if_arp.c to work with the hashed interface address lists
(1.44) missed a test for the right interface, making some machines answer
to some bogus arp requests (like for WHO-HAS 127.0.0.1).

The quick patch in 1.46-1.47 does not work for so-called "unnumbered"
interfaces, that is, (point-to-point) interfaces that share their local
address with another (e.g., the Ethernet) interface.

We add a macro to in_var.h, to step (in the current implementation) through
the hash chain and fine more entries with the same address, and use that
in if_arp.c to find one which belongs to our interface.
1998-07-02 11:39:56 +00:00
tls
b0d2c08b6b Fix buglet where we might respond to arp on wrong interface. 1998-06-25 20:47:48 +00:00
cgd
651b44e211 Rework the way kernel include files are installed. In the new method,
as with user-land programs, include files are installed by each directory
in the tree that has includes to install.  (This allows more flexibility
as to what gets installed, makes 'partial installs' easier, and gives us
more options as to which machines' includes get installed at any given
time.)  The old SYS_INCLUDES={symlinks,copies} behaviours are _both_
still supported, though at least one bug in the 'symlinks' case is
fixed by this change.  Include files can't be build before installation,
so directories that have includes as targets (e.g. dev/pci) have to move
those targets into a different Makefile.
1998-06-12 23:22:30 +00:00
sommerfe
a90d5cd12e Truncate mbufs to the correct length before forwarding; fixes pr5560 1998-06-10 00:47:57 +00:00
thorpej
b22946827d Add a comment explaining why we do _not_ ACK data that might accompany
a SYN (avoidance of a DoS attack).
1998-06-02 18:33:02 +00:00
thorpej
c296923d2f Loss window MUST be one segment, per draft-floyd-incr-init-win-03. 1998-06-02 17:22:26 +00:00
thorpej
edc01ec330 In addition to the IP flow hash table, put the flows on a list. The table
is used for fast lookup, the list for traversal of all flows.  Also, use
PRT timers.
1998-06-02 15:48:03 +00:00
thorpej
837a8317b5 Eek, we were wasting almost half of the in_ifaddr hash space by modulo'ing
with IN_IFADDR_HASH_SIZE.  Instead, AND with the hash mask computed by
hashinit().
1998-06-01 00:50:07 +00:00
thorpej
08b5a4ecb8 Protect the ipflow_reap() call with splsoftnet. 1998-06-01 00:39:37 +00:00
cgd
dd8ed56342 Another demonstration that when you're converting variables from 'long's
to fixed 32-bit integers, you have to exercise care.
1998-05-31 19:39:13 +00:00
veego
6250554a65 Fix compiler warnings: Add missing ()'s. 1998-05-29 20:28:17 +00:00
veego
630030601c Fix some compiler warnings: Missing prototype and ()'s. 1998-05-29 20:27:18 +00:00
veego
a4c89e3e2e Resolve conflicts from the import of IPFilter 3.2.7. 1998-05-29 20:24:36 +00:00
matt
c0a1601f95 Change arp so its console log messages print out IP addresses in
dotted quad format instead of hex.
1998-05-29 15:34:24 +00:00
thorpej
f555f6d93f Fix OBOB in IP timestamp option processing, as noted in FreeBSD PR 6738,
from Jennifer Dawn Meyers <jdm@enteract.com>.
1998-05-24 20:14:53 +00:00
matt
f070ddb8ed Move the ppcb pointer towards the front of the structure so that it and the
pcb chain pointers can possibly be in the same cache line.
1998-05-18 17:10:37 +00:00
matt
1b2b1d801b Fix two bugs. 1998-05-18 17:08:56 +00:00
veego
82423e3d01 Resolve conflicts 1998-05-17 16:50:15 +00:00
kml
dd5ed34b88 Changed initialization of peermss to ensure that it didn't have
the TCP and IP options lengths removed from it -- the IP options can
change over the course of a connection...
1998-05-12 21:45:51 +00:00
thorpej
c5fc2e9acd Back out previous. This problem was already fixed in a different way. 1998-05-11 23:13:40 +00:00
matt
53b04a8d3c Let usr.sbin/tcpdump build again. 1998-05-11 23:09:35 +00:00
thorpej
49573284f5 Make sure a timer is marked "disarmed" once it has expired. 1998-05-11 20:52:18 +00:00
thorpej
5596fe2614 Nuke TUBA per my note to tech-net; there's no reason to keep it around. 1998-05-11 19:57:23 +00:00
kml
1216b9a560 Change comments on tcp_mss_to_advertise to match actual arguments 1998-05-07 22:30:23 +00:00
thorpej
ce3d776874 Rework the syn cache code somewhat:
- Don't use home-grown queue manipulation.  Use <sys/queue.h> instead.  The
  data structures are a little larger, but we are otherwise wasting the
  memory chunk anyway (we're already a 64-byte malloc bucket).
- Fix a bug in the cache-is-full case: if the oldest element removed from
  the first non-empty bucket was the only element in the bucket, the
  bucket wouldn't be removed from the bucket cache, causing queue corruption
  later.
- Optimize the syn cache timers by using PRT timers rather than home-grown
  decrement-and-propagate timers.

This code is now a fair bit smaller, and significantly easier to read
and understand.
1998-05-07 01:37:27 +00:00
thorpej
dc49b0342e Define all TCP timers in terms of PRT timers. 1998-05-07 01:30:46 +00:00
thorpej
34e34c985a Use the monotonically increasing slow timer timestamp provided by
the protocol dispatch layer for TCP timers.  This saves having to
modify a potentially large number of timer values (which were shorts,
and expanded to ... a lot of code on the Alpha).
1998-05-06 01:24:38 +00:00
thorpej
1ffa60ac01 Use macros from tcp_timer.h to manipulate TCP timers, so that their
implementation can be changed easily.
1998-05-06 01:21:20 +00:00
matt
36eac04cc0 Default IP flow to being enabled. Add a sysctl to control the maximum
number of flows (net.inet.ip.maxflows).  If set to 0, will disable fast
path forwarding.
1998-05-04 19:24:53 +00:00
thorpej
447384d6b8 - kern/5380 (Dennis Ferguson): fix incremental IP header checksum.
- kern/5381 (Dennis Ferguson): check IP header checksum in fast forward
  code.
- In ipflow_slowtimo(), if no IP flows are in use, don't bother checking
  all of the hash buckets.
1998-05-04 05:46:04 +00:00
thorpej
e44c4fb7d3 Once again, move a declaration for the benefit of TUBA (grumble). 1998-05-03 19:54:56 +00:00
thorpej
b9fc258065 Oops, move a variable declaration so TUBA won't lose. 1998-05-02 04:23:05 +00:00
thorpej
b71e4ddf4c Reintroduce the immediate ACK-on-PUSH behavior removed in revision 1.47,
but make the decision to do this dependent on the sysctl variable
net.inet.tcp.ack_on_push, which is disabled by default.
1998-05-02 04:21:58 +00:00
thorpej
e1934b4c36 Correct a comment related to Congestion Window Monitoring. 1998-05-02 01:00:24 +00:00
thorpej
be12c489b4 Garbage-collect. 1998-05-01 18:31:12 +00:00
thorpej
77af553e79 If packets are passed through IP Filter at all, don't allow fast-forward
flow entries to be created for them.

Eventually, IP Filter should be extended to allow IP src/dst pairs to
be specified as "fast forward OK".
1998-05-01 03:28:14 +00:00
thorpej
4452bc9a21 Allow packet filters to prevent a packet from creating a fast-forwarding
flow, by setting the "can fast forward" flag in the packet header, and
giving a chance for filters to clear the flag.  If the flag is still
set after the filters have given it a chance, the packet will be used
to create a fast-forward flow entry.
1998-05-01 03:23:24 +00:00
kml
e173e7a084 Remove bogus black hole discovery code 1998-05-01 01:15:55 +00:00
thorpej
ce40806e29 In the CWM code, don't use the Floyd initial window computation as
the burst size allowed, but rather a fixed number of packets, as
described in the Internet Draft.  Default allowed burst is 4 packets,
per the Draft.

Make the use of CWM and the allowed burst size tunable via sysctl.
1998-04-30 18:27:20 +00:00
thorpej
e81920fa23 Make tcp_compat_42 a sysctl option. 1998-04-30 17:55:27 +00:00
thorpej
7e05be912b Need <net/route.h> 1998-04-30 17:47:26 +00:00
matt
d4d709f7d0 Add support for "fast" forwarding. Add hooks in if_ethersubr.c and
if_fddisubr.c to fastpath IP forwarding.  If ip_forward successfully
forwards a packet, it will create a cache (ipflow) entry.  ether_input
and fddi_input will first call ipflow_fastforward with the received
packet and if the packet passes enough tests, it will be forwarded (the
ttl is decremented and the cksum is adjusted incrementally).
1998-04-29 21:37:52 +00:00
matt
37d70e3b46 defopt GATEWAY 1998-04-29 20:45:30 +00:00
matt
334f006538 New TCP reassembly code. The new code reduces the memory needed by
out-of-order packets and builds the infrastructure needed for sending
SACK blocks (to be added shortly).
1998-04-29 20:43:29 +00:00
thorpej
00d50da592 Fix some whitespace. 1998-04-29 05:44:47 +00:00
thorpej
13f972a4d6 Make use of the work-arounds for ancient broken TCP peers run-time
conditional (tcp_compat_42).  The kernel config option TCP_COMPAT_42
will still enable this by default, or disable this by default if the
option is not included (i.e. current behavior).  This will be made a
sysctl soon.
1998-04-29 05:16:46 +00:00
kml
eadcaa201c change path MTU timeout value to match RFC 1191 1998-04-29 03:45:52 +00:00
kml
1579dcec47 Add support for deletion of routes added by path MTU discovery;
uses new generic route timeout code.  Add sysctl for timeout period.
1998-04-29 03:44:11 +00:00
thorpej
100bfaf39a Change RFC1323 timestamp update rule per Section 3.4 of RFC1323.bis. Old
rule was to update the timestamp if the sequence numbers are in range.  New
rule adds a check that the timestamp is advancing, thus preventing our notion
of the most recent timestamp from incorrectly moving backwards.
1998-04-29 00:43:46 +00:00
thorpej
df750b93da Log the peer's IP address on received window scale factors larger than
TCP_MAX_WINSHIFT (14), as recommended in Section 2.3 of RFC1323.
1998-04-28 21:52:16 +00:00
matt
5b43c678b7 Only transmit fragments if the send queue of interface can actually hold
all of the fragments.  Use the mtu of route in preference of the MTU of the
interface when doing fragmentation decisions.  (ie. Fragment to the path
mtu if it is available).
1998-04-28 15:26:00 +00:00
kml
fcf0227962 Fix to ensure that the correct MSS is advertised for loopback
TCP connections by using the MTU of the interface.  Also added
a knob, mss_ifmtu, to force all connections to use the MTU of
the interface to calculate the advertised MSS.
1998-04-13 21:18:19 +00:00
thorpej
47b4697587 Remember any source routes that may have accompanied a SYN. 1998-04-07 05:09:19 +00:00
thorpej
04d3f25df8 Now that we have a flags word in the syn cache entry, use a flag to indicate
"peer will do timestamps" rather than a bitfield, and give the now-unsed
bit to the hash, making it now 32 bits.
1998-04-03 08:02:45 +00:00
thorpej
b7c562b21c Clean up some comments wrt. the syn cache code. 1998-04-03 07:54:01 +00:00
thorpej
30fcf99ef8 Fix a bug which would cause a panic in soreceive() if multiple raw
receivers ask for ancillary data.

Noted by Francis Dupont <Francis.Dupont@inria.fr> on tech-net.
1998-04-03 07:49:16 +00:00
thorpej
f9463514bf Implement Congestion Window Monitoring as described in the TCPIMPL
meeting of IETF #41 by Amy Hughes <ahughes@isi.edu>, and in an upcoming
internet draft from Hughes, Touch, and Heidemann.

CWM eliminates line-rate bursts after idle periods by counting pending
(unacknowledged) packets and limiting the congestion window to the
initial congestion window plus the pending packet count.  This has the
effect of allowing us to use the window as long as we continue to transmit,
but as soon as we stop transmitting, we go back to a slow-start (also known
as `use it or lose it').

This is not enabled by default.  You can enable this behavior by patching
the "tcp_cwm" global (set it to non-zero) or by building a kernel with the
TCP_CWM option.
1998-04-01 22:15:52 +00:00
thorpej
1b176d9395 Back out a change made some time ago, that would cause the NetBSD TCP
to ACK immediately any packet that arrived with PSH set.  This breaks
delayed ACKs in a few specific common cases that delayed ACKs were
supposed to help, and ends up not making much (if any) difference in
the case where where this ACK-on-PSH change was supposed to help.

Per discussion with several members of the TCPIMPL and TCPSAT IETF
working groups.
1998-03-31 23:44:09 +00:00
thorpej
2da6c91259 Fix a potential-congestion case in the larger initial congestion window
code, as clarified in the TCPIMPL WG meeting at IETF #41: If the SYN
(active open) or SYN,ACK (passive open) was retransmitted, the initial
congestion window for the first slow start of that connection must be
one segment.
1998-03-31 22:49:09 +00:00
scottr
81a5bfdf33 Change from IP-Filter 3.2.3: avoid infinite loop in nat_new() when
NAT'ing to a single IP address.
1998-03-29 22:56:00 +00:00
thorpej
d725b1a332 Remove a comment in tcp_mss_to_advertise() that no longer applies. 1998-03-28 19:39:57 +00:00
kml
96954c2a53 Ensure that we take the IP option length into account when we calculate
the effective maximum send size for TCP.  ip_optlen() and tcp_optlen()
should probably be inlined for efficiency.
1998-03-24 03:10:02 +00:00
kml
123232e156 Fix a retransmission bug introduced by the Brakmo and Peterson
RTO estimation changes.  Under some circumstances it would return a value
of 0, while the old Van Jacobson RTO code would return a minimum of 3.
This would result in 12 retransmissions, each 1 second apart.
This takes care of those instances, and ensures that t_rttmin is
used everywhere as a lower bound.
1998-03-19 22:29:33 +00:00
mrg
45159fa631 convert pfil(9) in and out lists from <sys/queue.h> LISTs to TAILQs, and
change pfil_add_hook to put output filters at the tail of the queue,
while continuing to place input filters at the head of the queue.  update
the two users of these functions, and document these changes.

fixes PR#4593.
1998-03-19 15:45:30 +00:00
kml
ffb211fb9d Ensure that the TCP segment size reflects the size of TCP options
in the packet.  This fixes a bug that was resulting in extra packets
in retransmissions (the second packet would be 12 bytes long,
reflecting the RFC1323 timestamp option size).
1998-03-17 23:50:30 +00:00
thorpej
5837cc6b07 Update copyright (sigh, should have done this long ago). 1998-02-19 02:36:42 +00:00
tls
91de585d5f Add correct copyright notice for IP address hash change. This code is donated to TNF by the original copyright holder, Panix. 1998-02-15 18:24:23 +00:00
tls
c9934a9084 Change list of interface IP addresses to a hash. Improves performance on hosts with a large number of IP addresses significantly. 1998-02-13 18:21:38 +00:00
kleink
a8bd1c7e84 Fix variable declarations: register -> register int. 1998-02-13 10:23:49 +00:00
perry
f73530ba55 add/cleanup multiple inclusion protection. 1998-02-10 01:26:19 +00:00
chs
f64abc7b4c add flags arg to hashinit(), to pass to malloc(). 1998-02-07 02:44:44 +00:00
mellon
27a5a0a616 Take PCB off delayed ack queue before freeing. 1998-01-30 08:42:11 +00:00
thorpej
4c54445530 Use offsetof() from libkern.h 1998-01-28 02:35:10 +00:00
mellon
5685520ac1 Always set sc->sc_timeout (it was missed in one case). This fixes a problem where SYN cache entries are sometimes timed out almost immediately. 1998-01-24 12:27:31 +00:00
mycroft
5ab55e91b7 Fix an old editing error from merging a bug fix into Lite,
that might cause us to erroneously drop a FIN.
Also, minor changes so the code looks more like Stevens vol 2 figure 28.30.
1998-01-24 05:04:27 +00:00
mellon
babb710a0b Never free the mbuf that we give to tcp_respond(). The previous change corrected an inconsistency but in exactly the wrong way. 1998-01-21 01:21:22 +00:00
mellon
ac489008ad In syn_cache_get(), don't free incoming packet before jumping to resetandabort, but do free it after sending the reset. 1998-01-18 05:56:15 +00:00
scottr
54ea074777 Use option header file for MROUTING 1998-01-12 03:02:48 +00:00
scottr
3cdcd5e1c7 Use option header file for TCP_COMPAT_42 1998-01-12 03:00:42 +00:00
lukem
c0e8ee54e9 * start from the top of the given ephemeral range and work down;
results in reserved ephemeral ports starting at the top (as per
  current practice), and shouldn't have a negative effect on normal
  ephemeral ports...
* initialise inpt_lastlow in in_pcbinit
1998-01-08 11:56:50 +00:00
lukem
1a63d90320 add missing ; ... 1998-01-08 00:32:39 +00:00
lukem
c80b4400e5 add the following, derived from FreeBSD:
* IP_PORTRANGE socket option, which controls how the ephemeral ports
  are allocated. it takes the following settings:
	IP_PORTRANGE_DEFAULT	use anonportmin (49152) -> anonportmax (65535)
	IP_PORTRANGE_HIGH	as IP_PORTRANGE_DEFAULT (retained for FreeBSD
				compat reasons, where these are separate)
	IP_PORTRANGE_LOW	use 600 -> 1023. only works if uid==0.
* in_pcb flag INP_ANONPORT. set if port was allocated ephmerally
1998-01-07 22:51:22 +00:00
thorpej
e5e283e02d Finishing merging 4.4BSD-Lite2 netinet. At this point, the only changes
left were SCCS IDs and Copyright dates.
1998-01-05 10:31:44 +00:00
lukem
1f8f74b669 enhance ephemeral port allocation code:
* support sysctl net.inet.ip.anonportmin (lowest ephemeral port)
  and net.inet.ip.anonportmax (highest ephemeral port).
  these can't be set to >65535, < IPPORT_RESERVED (unless IPNOPRIVPORTS
  is defined), and anonportmin has to be < anonportmax.
* use a cleaner way of only cycling through the available set once;
  this will be useful for when a random allocation scheme is used
* define IPPORT_ANON{MIN,MAX} instead of IPPORT_USER{LOW,HIGH}
1998-01-05 09:52:02 +00:00
thorpej
2e85747e9e From 4.4BSD-Lite2 (noted by Frank van der Linden):
so_linger is used as an argument to tsleep(), so was stuffed with
clockticks for the TCP linger time.  However, so_linger is set directly from
l_linger if the linger time is specified, and l_linger is seconds (although
this is not currently documented anywhere).  Fix this to set the TCP
linger time in seconds, and multiply so_linger by hz when tsleep() is
called to actually perform the linger.
1998-01-05 09:12:29 +00:00
thorpej
673fb149c6 Implement a queue for delayed ACK processing. This queue is used in
tcp_fasttimo() in lieu of scanning all open TCP connections.
1997-12-31 03:31:23 +00:00
lukem
0b57ba7265 as per the IANA assigned ports numbers document, use ports
49152..65535 for ephemeral ports (instead of 1024..5000).
closes my [kern/4440], but with correct code :)
1997-12-30 02:54:08 +00:00
thorpej
3c5ff3879d Keep stats on connections dropped due to excessive persist timeout. 1997-12-17 06:06:41 +00:00
thorpej
04ec3df592 From 4.4BSD-Lite2:
- When running the slow timers, skip PCBs in LISTEN state.
- When processing the persist timer, drop the connection if the connection
  idle time exceeds the maximum backoff for retransmit.  Part of
  kern/2335 (pete@daemon.net).
1997-12-17 06:04:17 +00:00
thorpej
82ce1f6a97 From 4.4BSD-Lite2:
- If we fail to allocate mbufs for the outgoing segment, free the header
  and abort.

From Stevens:
- Ensure the persist timer is running if the send window reaches zero.
  Part of the fix for kern/2335 (pete@daemon.net).
1997-12-17 05:59:32 +00:00
thorpej
154fe5a522 Add INADDR_ALLRTRS_GROUP and INADDR_MAX_LOCAL_GROUP. 1997-12-16 00:02:05 +00:00
thorpej
ee84a26869 After further examination of traces of bulk transfers (with help from
Kevin Lahey), undo the "defer window update until next delayed ACK".
1997-12-13 21:02:38 +00:00
thorpej
c02a72fcd0 Implement an infrastructure to allow larger initial congestion windows.
The sysctl'able variable "tcp_init_win", when set to 0, selects an
auto-tuning algorithm for selecting the initial window, based on transmit
segment size, per discussion in the IETF tcpimpl working group.

Default initial window is still 1 segment, but will soon become 2 segments,
per discussion in tcpimpl.
1997-12-11 22:47:24 +00:00
thorpej
3026b32ab3 In the PRU_RCVD entry point, if TF_DELACK is set, don't send the window
update now, since it will be sent within 200ms when the delayed ACK is
sent.  Instrument how many hits we get on this optimization.
1997-12-11 06:53:06 +00:00
thorpej
7f7bb7db17 In tcp_fasttimo(), don't clear TF_DELACK; we need it to count delayed ACKs
in tcp_output(), and it will only be cleared in tcp_output() if the ACK was
transmitted sucessfully.  Also, don't count delayed ACKs here, let tcp_output()
count them.
1997-12-11 06:42:44 +00:00
thorpej
8346cea65d Count delayed ACKs after they have been sucessfully transmitted. 1997-12-11 06:37:48 +00:00
thorpej
6c1840c05c Fix the "stretch ACK violation" bug documented in internet draft
draft-ietf-tcpimpl-prob-02.txt.  Also, fix another bug in the header
prediction case where an ACK would not be sent when it should be.
1997-12-11 06:33:29 +00:00
thorpej
c40f4eb3cc Implement tcp_drain(). 1997-12-10 01:58:07 +00:00
thorpej
eae709d885 Costmetic change: use intotcpcb() in tcp_fasttimo(). 1997-12-09 21:59:17 +00:00
darrenr
9fd3093f39 don't free pointer to static struct. please pullup. 1997-11-28 00:46:39 +00:00
mrg
3300e3e43e fix compile error when "options IPNOPROVPORTS" 1997-11-27 14:03:32 +00:00
mrg
2a9598ccdf fixes for memory leaks in proxying, and byte ordering problems. from darren reed. 1997-11-25 03:14:11 +00:00
thorpej
9f18d18071 Slight change to the previous: just drop the packet in the self-connect
case.  Sending an RST to ourselves is a little silly, considering that
we'll just attempt to remove a non-existent compressed state entry and
then drop the packet anyway.
1997-11-21 06:41:54 +00:00