Basically, in silly window avoidance, don't use the raw MSS we advertised
to the peer. What we really want here is the _expected_ size of received
segments, so we need to account for the path MTU (eventually; right now,
the interface MTU for "local" addresses and loopback or tcp_mssdflt for
non-local addresses). Without this, silly window avoidance would never
kick in if we advertised a very large (e.g. ~64k) MSS to the peer.
- Don't overload t_maxseg. Previous behavior was to set it to the min
of the peer's advertised MSS, our advertised MSS, and tcp_mssdflt
(for non-local networks). This breaks PMTU discovery running on
either host. Instead, remember the MSS we advertise, and use it
as appropriate (in silly window avoidance).
- Per last bullet, split tcp_mss() into several functions for handling
MSS (ours and peer's), and performing various tasks when a connection
becomes ESTABLISHED.
- Introduce a new function, tcp_segsize(), which computes the max size
for every segment transmitted in tcp_output(). This will eventually
be used to hook in PMTU discovery.
where it is now, and adding the specialized for Ethernet version of the ARP
structure, for the benefit of programs which are externally (to us) maintained
and not (yet) ported.
XXX This should NOT be used inside the kernel.
which happen to have a TCB in TIME_WAIT, where an mbuf which had been
advanced past the IP+TCP headers and TCP options would be reused as if
it had not been advanced. Problem found by Juergen Hannken-Illjes, who
also suggested a work-around on which this fix is based.
fixed in FreeBSD by John Polstra:
Fix a bug (apparently very old) that can cause a TCP connection to
be dropped when it has an unusual traffic pattern. For full details
as well as a test case that demonstrates the failure, see the
referenced PR (FreeBSD's kern/3998).
Under certain circumstances involving the persist state, it is
possible for the receive side's tp->rcv_nxt to advance beyond its
tp->rcv_adv. This causes (tp->rcv_adv - tp->rcv_nxt) to become
negative. However, in the code affected by this fix, that difference
was interpreted as an unsigned number by max(). Since it was
negative, it was taken as a huge unsigned number. The effect was
to cause the receiver to believe that its receive window had negative
size, thereby rejecting all received segments including ACKs. As
the test case shows, this led to fruitless retransmissions and
eventually to a dropped connection. Even connections using the
loopback interface could be dropped. The fix substitutes the signed
imax() for the unsigned max() function.
Bill informs me that his research indicates this bug appeared in Reno.
use of mbuf external storage and increasing performance (by eliminating
an m_pullup() for clusters in the IP reassembly code).
Changes from Koji Imada <koji@math.human.nagoya-u.ac.jp>, in PR #3628
and #3480, with ever-so-slight integration changes by me.
a socket, just calling tcp_disconnect() on the tcpcb will do the right thing.
From Thorsten Frueauf <frueauf@ira.uka.de> and W. Richard Stevens in PR/3738
resp. TCP/IP Illustrated, Vol. 2.
operation, since:
- in ipl_enable(), "fr_savep = fr_checkp" is not conditionalized
in the same way (not at all), and
- without this change, it was not possible to enable, disable,
and reenable ipfilter.
silly differences between the NetBSD copy of the code and the
vendor branch, keeping only those which are necessary. Of those
differences that currently exist, several "portability to NetBSD"
issues, which will be fed back to the ipfilter author.