Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)
This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.
1) dupseg_fix_=true from NS: do not count a segment with completely duplicate
data as a duplicate ack. This can occur due to duplicate packets in the
network, or due to fast retransmit from the other side.
2) dupack_reset_=false from NS: do not reset the duplicate ack counter or exit
fast recovery if we happen to get data or a window update along with a
duplicate ack.
3) In the "very old ack" case that itojun added, send an ACK before dropping
the segment, to try to update the other side's send sequence number.
4) Check the ssthresh crossover point with >= rather than >. Otherwise we
start to do "exponential" growth immediately following recovery, where we
should be doing "linear". This is what NS does.
and it could detect even older time stamps. (Really, to be 100% correct, there
should be a timer that clears these out -- but it probably doesn't matter in
the real world.)
* t_partialacks<0 means we are not in fast recovery.
* t_partialacks==0 means we are in fast recovery, but we have not received
any partial acks yet.
* t_partialacks>0 means we are in fast recovery, and we have received
partial acks.
This is used to implement 2 changes in RFC 3782:
* We keep the notion that we are in fast recovery separate from t_dupacks, so
it is not reset due to out-of-order acks. (This affects both the Reno and
NewReno cases.)
* We only reset the retransmit timer on the first partial ack -- preventing us
from possibly taking one RTO per segment once fast recovery is initiated.
As before, it is hard to measure any difference between Reno and NewReno in the
real-world cases that I've tested.
1) If an echoed RFC 1323 time stamp appears to be later than the current time,
ignore it and fall back to old-style RTT calculation. This prevents ending
up with a negative RTT and panicking later.
2) Fix NewReno. This involves a few changes:
a) Implement the send_high variable in RFC 2582. Our implementation is
subtly different; it is one *past* the last sequence number transmitted
rather than being equal to it. This simplifies some logic and makes
the code smaller. Additional logic was required to prevent sequence
number wraparound problems; this is not mentioned in RFC 2582.
b) Make sure we reset t_dupacks on new acks, but *not* on a partial ack.
All of the new ack code is pushed out into tcp_newreno(). (Later this
will probably be a pluggable function.) Thus t_dupacks keeps track of
whether we're in fast recovery all the time, with Reno or NewReno, which
keeps some logic simpler.
c) We do not need to update snd_recover when we're not in fast recovery.
See tech-net for an explanation of this.
d) In the gratuitous fast retransmit prevention case, do not send a packet.
RFC 2582 specifically says that we should "do nothing".
e) Do not inflate the congestion window on a partial ack. (This is done by
testing t_dupacks to see whether we're still in fast recovery.)
This brings the performance of NewReno back up to the same as Reno in a few
random test cases (e.g. transferring peer-to-peer over my wireless network).
I have not concocted a good test case for the behavior specific to NewReno.
- Allow rn_init() to be called multiple times, but do nothing except the
first call.
- Include opt_inet.h so that #ifdef INET works.
- Call rn_init() from encap_init() explicitly rather than depending on the
order of initialization.
received packet so that the checksum is not performed twice. Also,
tcp_respond() does not fill-in the m_pkthdr.csum_data, so a h/w checksum may
have the wrong offset.
OK from Jason Thorpe.
Connect over tcp on the loopback is broken:
4729 amq 0.000007 CALL connect(4,0x804f2a0,0x1c)
4729 amq 75.007420 RET connect -1 errno 60 Connection timed out
and marked as out of window - trying to do the add will result in a failure
and the packet being blocked, incorrectly.
Committed By: darrenr
Tested By: smb
("no domain for AF 0") on if_detach.
- SIOCAIFADDR, SIOCSIFADDR: free an address on error.
- SIOCSIFNETMASK, SIOCSIFDSTADDR: reject operations for an interface which
has no AF_INET addresses.
partly from OpenBSD and FreeBSD.
reviewed by Christos Zoulas on tech-net@.
the interface route and various internal state. Also, it should use an ifreq,
not an if_aliasreq. Addresses PR 9604. (Nothing in our source tree uses
SIOCSIFNETMASK, though. Perhaps it should be deprecated.)
1.) Make sure that "pass" is always initialized.
2.) Make sure the code doesn't use a stale mbuf pointer after fr_makefrip()
has been called. This fixes PR kern/25868.
Analyzed and reviewed by Steve Woodford.
support IPv6 if KAME IPSEC (RFC is not explicit about how we make data stream
for checksum with IPv6, but i'm pretty sure using normal pseudo-header is the
right thing).
XXX
current TCP MD5 signature code has giant flaw:
it does not validate signature on input (can't believe it! what is the point?)
(MD5 signatures for TCP, as used with BGP). Credit for original
FreeBSD code goes to Bruce M. Simpson, with FreeBSD sponsorship
credited to sentex.net. Shortening of the setsockopt() name
attributed to Vincent Jardin.
This commit is a minimal, working version of the FreeBSD code, as
MFC'ed to FreeBSD-4. It has received minimal testing with a ttcp
modified to set the TCP-MD5 option; BMS's additions to tcpdump-current
(tcpdump -M) confirm that the MD5 signatures are correct. Committed
as-is for further testing between a NetBSD BGP speaker (e.g., quagga)
and industry-standard BGP speakers (e.g., Cisco, Juniper).
NOTE: This version has two potential flaws. First, I do see any code
that verifies recieved TCP-MD5 signatures. Second, the TCP-MD5
options are internally padded and assumed to be 32-bit aligned. A more
space-efficient scheme is to pack all TCP options densely (and
possibly unaligned) into the TCP header ; then do one final padding to
a 4-byte boundary. Pre-existing comments note that accounting for
TCP-option space when we add SACK is yet to be done. For now, I'm
punting on that; we can solve it properly, in a way that will handle
SACK blocks, as a separate exercise.
In case a pullup to NetBSD-2 is requested, this adds sys/netipsec/xform_tcp.c
,and modifies:
sys/net/pfkeyv2.h,v 1.15
sys/netinet/files.netinet,v 1.5
sys/netinet/ip.h,v 1.25
sys/netinet/tcp.h,v 1.15
sys/netinet/tcp_input.c,v 1.200
sys/netinet/tcp_output.c,v 1.109
sys/netinet/tcp_subr.c,v 1.165
sys/netinet/tcp_usrreq.c,v 1.89
sys/netinet/tcp_var.h,v 1.109
sys/netipsec/files.netipsec,v 1.3
sys/netipsec/ipsec.c,v 1.11
sys/netipsec/ipsec.h,v 1.7
sys/netipsec/key.c,v 1.11
share/man/man4/tcp.4,v 1.16
lib/libipsec/pfkey.c,v 1.20
lib/libipsec/pfkey_dump.c,v 1.17
lib/libipsec/policy_token.l,v 1.8
sbin/setkey/parse.y,v 1.14
sbin/setkey/setkey.8,v 1.27
sbin/setkey/token.l,v 1.15
Note that the preceding two revisions to tcp.4 will be
required to cleanly apply this diff.
to pool_init. Untouched pools are ones that either in arch-specific
code, or aren't initialiased during initial system startup.
Convert struct session, ucred and lockf to pools.
closer to normal behaviour for the current century.
New Reno is now on by default (which is really the only reasonable
choice, since we don't do SACK); instead of an initial window of 1
for non-local nets, we now use Sally Floyd's magic 4K rule.
left of the receive window, ignore it. Add some additional comments to
the code that deals with received segemnts that are completely to the right
of the receive window. If an invalid SYN is received, force an ACK and
drop it; if the other side really sent the SYN; it'll respond with a reset.
timer, otherwise there is a tiny window where both timers are
active, and this is not correct according to the comments in the
code. I believe that this is the cause of the to_ticks <= 0 assertion
failure in callout_schedule() that I've been getting.
(this can never have worked)
now I can use a "bge" gigabit interface with hw checksumming
ttcp-t: 2147483648 bytes in 18.31 real seconds = 114527.11 KB/sec +++
woow!