between protocol handlers.
ipsec socket pointers, ipsec decryption/auth information, tunnel
decapsulation information are in my mind - there can be several other usage.
at this moment, we use this for ipsec socket pointer passing. this will
avoid reuse of m->m_pkthdr.rcvif in ipsec code.
due to the change, MHLEN will be decreased by sizeof(void *) - for example,
for i386, MHLEN was 100 bytes, but is now 96 bytes.
we may want to increase MSIZE from 128 to 256 for some of our architectures.
take caution if you use it for keeping some data item for long period
of time - use extra caution on M_PREPEND() or m_adj(), as they may result
in loss of m->m_pkthdr.aux pointer (and mbuf leak).
this will bump kernel version.
(as discussed in tech-net, tested in kame tree)
- interop issues in ipcomp is fixed
- padding type (after ESP) is configurable
- key database memory management (need more fixes)
- policy specification is revisited
XXX m->m_pkthdr.rcvif is still overloaded - hope to fix it soon
due to massive changes in KAME side.
- IPv6 output goes through nd6_output
- faith can capture IPv4 packets as well - you can run IPv4-to-IPv6 translator
using heavily modified DNS servers
- per-interface statistics (required for IPv6 MIB)
- interface autoconfig is revisited
- udp input handling has a big change for mapped address support.
- introduce in4_cksum() for non-overwriting checksumming
- introduce m_pulldown()
- neighbor discovery cleanups/improvements
- netinet/in.h strictly conforms to RFC2553 (no extra defs visible to userland)
- IFA_STATS is fixed a bit (not tested)
- and more more more.
TODO:
- cleanup os-independency #ifdef
- avoid rcvif dual use (for IPsec) to help ifdetach
(sorry for jumbo commit, I can't separate this any more...)
MSS advertisement must always be:
max(if mtu) - ip hdr siz - tcp hdr siz
We violated this in the previous code so it was fixed.
tcp_mss_to_advertise() now takes af (af on wire) as its argument,
to compute right ip hdr siz.
tcp_segsize() will take care of IPsec header size.
One thing I'm not really sure is how to handle IPsec header size in
*rxsegsizep (inbound segment size estimation).
The current code subtracts possible *outbound* IPsec size from *rxsegsizep,
hoping that the peer is using the same IPsec policy as me.
It may not be applicable, could TCP gulu please comment...
Stale syn cache entries are useless because none of them will be used
if there is no listening socket, as tcp_input looks up listening socket by
in_pcblookup*() before looking into syn cache.
This fixes race condition due to dangling socket pointer from syn cache
entries to listening socket (this was introduced when ipsec is merged in).
This should preserve currently implemented behavior (but not 4.4BSD
behavior prior to syn cache).
Tested in KAME repository before commit, but we'd better run some
regression tests.
is not the expected one.
I see PRC_REDIRECT_HOST with sa->sa_family == AF_UNIX coming to
{tcp,udp}_ctlinput() when I use dhclient, and I feel like adding
more sanity checks, without logging - if we log it it is too noisy.
when ip header and tcp header are not adjacent to each other
(i.e. when ip6 options are attached).
To test this, try
telnet @::1@::1 port
toward a port without responding server. Prior to the fix, the kernel will
generate broken RST packet.
(Sorry for a big commit, I can't separate this into several pieces...)
Pls check sys/netinet6/TODO and sys/netinet6/IMPLEMENTATION for details.
- sys/kern: do not assume single mbuf, accept chained mbuf on passing
data from userland to kernel (or other way round).
- "midway" ATM card: ATM PVC pseudo device support, like those done in ALTQ
package (ftp://ftp.csl.sony.co.jp/pub/kjc/).
- sys/netinet/tcp*: IPv4/v6 dual stack tcp support.
- sys/netinet/{ip6,icmp6}.h, sys/net/pfkeyv2.h: IETF document assumes those
file to be there so we patch it up.
- sys/netinet: IPsec additions are here and there.
- sys/netinet6/*: most of IPv6 code sits here.
- sys/netkey: IPsec key management code
- dev/pci/pcidevs: regen
In my understanding no code here is subject to export control so it
should be safe.
SYN,ACK packets during Path MTU Discovery. Fix tcp_respond() to do the
appropriate route lookup and set DF as appropriate.
Also, fixup similar code in tcp_output() to relookup the route if it
is down.
suggestion in draft-floyd-incr-init-win-03. Rather than scaling cwnd back
by the ratio of new segment size to old segment size, we perform a slow start
using the Initial Window, computed with the new segment size.
- Don't use home-grown queue manipulation. Use <sys/queue.h> instead. The
data structures are a little larger, but we are otherwise wasting the
memory chunk anyway (we're already a 64-byte malloc bucket).
- Fix a bug in the cache-is-full case: if the oldest element removed from
the first non-empty bucket was the only element in the bucket, the
bucket wouldn't be removed from the bucket cache, causing queue corruption
later.
- Optimize the syn cache timers by using PRT timers rather than home-grown
decrement-and-propagate timers.
This code is now a fair bit smaller, and significantly easier to read
and understand.
conditional (tcp_compat_42). The kernel config option TCP_COMPAT_42
will still enable this by default, or disable this by default if the
option is not included (i.e. current behavior). This will be made a
sysctl soon.
TCP connections by using the MTU of the interface. Also added
a knob, mss_ifmtu, to force all connections to use the MTU of
the interface to calculate the advertised MSS.
code, as clarified in the TCPIMPL WG meeting at IETF #41: If the SYN
(active open) or SYN,ACK (passive open) was retransmitted, the initial
congestion window for the first slow start of that connection must be
one segment.
RTO estimation changes. Under some circumstances it would return a value
of 0, while the old Van Jacobson RTO code would return a minimum of 3.
This would result in 12 retransmissions, each 1 second apart.
This takes care of those instances, and ensures that t_rttmin is
used everywhere as a lower bound.
in the packet. This fixes a bug that was resulting in extra packets
in retransmissions (the second packet would be 12 bytes long,
reflecting the RFC1323 timestamp option size).