code, from netbsd-current repository.
#ifdef'ed version is always available from ftp.kame.net.
XXX please do not make too many diff-unfriendly changes, we'll need to take
a bunch of diffs on upgrade...
AF_INET6 wildcard listening socket. heavily documented in ip6(4).
net.inet6.ip6.bindv6only defines default value. default is 1.
"options INET6_BINDV6ONLY" removes any code fragment that supports
IPV6_BINDV6ONLY == 0 case (not defopt'ed as use of this is rare).
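
For illustration only, a userland sketch of an application asking for
IPv6-only behavior on an AF_INET6 wildcard socket instead of relying on the
system default. It assumes the portable IPV6_V6ONLY socket option from
RFC 3493, not the IPV6_BINDV6ONLY name used above; the helper name
open_v6only_listener() is hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: open an AF_INET6 wildcard TCP socket that is
 * restricted to IPv6 traffic.  Returns the descriptor, or -1 on failure.
 * Assumes IPV6_V6ONLY (RFC 3493) is available on the host.
 */
int
open_v6only_listener(unsigned short port)
{
	struct sockaddr_in6 sin6;
	int on = 1;
	int s;

	s = socket(AF_INET6, SOCK_STREAM, 0);
	if (s < 0)
		return -1;

	/* request IPv6-only explicitly; do not depend on the sysctl default */
	if (setsockopt(s, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on)) < 0) {
		close(s);
		return -1;
	}

	memset(&sin6, 0, sizeof(sin6));
	sin6.sin6_family = AF_INET6;
	sin6.sin6_port = htons(port);
	sin6.sin6_addr = in6addr_any;	/* wildcard address */

	if (bind(s, (struct sockaddr *)&sin6, sizeof(sin6)) < 0 ||
	    listen(s, 5) < 0) {
		close(s);
		return -1;
	}
	return s;
}
```

A caller that also wants IPv4 service would open a separate AF_INET socket
on the same port.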
Patch from Darren sent to the mailing list after he released 3.3.6 and
did a bad job with using the wrong way to update the NetBSD version
of ipfilter.
due to massive changes in KAME side.
- IPv6 output goes through nd6_output
- faith can capture IPv4 packets as well - you can run IPv4-to-IPv6 translator
using heavily modified DNS servers
- per-interface statistics (required for IPv6 MIB)
- interface autoconfig is revisited
- udp input handling has a big change for mapped address support.
- introduce in4_cksum() for non-overwriting checksumming
- introduce m_pulldown()
- neighbor discovery cleanups/improvements
- netinet/in.h strictly conforms to RFC2553 (no extra defs visible to userland)
- IFA_STATS is fixed a bit (not tested)
- and more more more.
TODO:
- cleanup os-independency #ifdef
- avoid rcvif dual use (for IPsec) to help ifdetach
(sorry for jumbo commit, I can't separate this any more...)
(if the definition is like in rfc2553) they are not supposed to be used.
XXX i'm trying to change rfc2553 sockaddr_storage definition to include
"ss_len" and "ss_family". see ipngwg. situation might change soon.
just for reference purposes.
This commit includes 1.4 -> 1.4.1 sync for kame branch.
The branch does not compile at all (due to the lack of ALTQ and some other
source code). Please do not try to modify the branch, this is just for
reference purposes.
synchronization to latest KAME will take place on HEAD branch soon.
RTM_IFINFO is now 0xf, 0xe is RTM_OIFINFO which returns the old (if_msghdr14)
struct with 32bit counters (binary compat, conditioned on COMPAT_14).
Same for sysctl: node 3 is renamed NET_RT_OIFLIST, NET_RT_IFLIST is now node 4.
Change rt_msg1() to add an mbuf to the mbuf chain instead of just panic()
when the message is larger than MHLEN.
Avoid forwarding ip unicast packets which were contained inside
link-level multicast packets; having M_MCAST still set in the packet
header flags will mean that the packet will get multicast to a bogus
group instead of unicast to the next hop.
Malformed packets like this have occasionally been spotted "in the
wild" on a mediaone cable modem segment which also had multiple netbsd
machines running as router/NAT boxes.
Without this, any subnet with multiple netbsd routers receiving all
multicasts will generate a packet storm on receipt of such a
multicast. Note that we already do the same check here for link-level
broadcasts; ip6_forward already does this as well.
Note that multicast forwarding does not go through ip_forward().
Adding some code to if_ethersubr to sanity check link-level
vs. ip-level multicast addresses might also be worthwhile.
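
A rough userland model of the two checks discussed above, not the kernel
code itself. PKT_LL_BCAST/PKT_LL_MCAST are hypothetical stand-ins for the
mbuf flags M_BCAST/M_MCAST, and both function names are made up for this
sketch. The second function models the suggested link-level vs. ip-level
sanity check, which the text only proposes.

```c
#include <stdbool.h>
#include <stdint.h>

/* hypothetical stand-ins for the mbuf header flags M_BCAST and M_MCAST */
#define PKT_LL_BCAST	0x1u
#define PKT_LL_MCAST	0x2u

/*
 * Unicast forwarding path: refuse anything that arrived as a link-level
 * broadcast or multicast, so a stale multicast flag can never reach the
 * output path.
 */
bool
may_forward_unicast(unsigned int ll_flags)
{
	return (ll_flags & (PKT_LL_BCAST | PKT_LL_MCAST)) == 0;
}

/*
 * Proposed extra sanity check: a link-level multicast frame should carry
 * an IPv4 multicast destination (224.0.0.0/4) and vice versa.
 * dst is in host byte order.
 */
bool
ll_ip_mcast_consistent(unsigned int ll_flags, uint32_t dst)
{
	bool ll_mcast = (ll_flags & PKT_LL_MCAST) != 0;
	bool ip_mcast = (dst & 0xf0000000u) == 0xe0000000u;

	return ll_mcast == ip_mcast;
}
```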
back into the packet. (ip_output() clears it since ipsec reuses that
packet field in the output path. by putting it back, we're going to
pretend we're back on the input path now).
This is important, because for most protocols, link level fragmentation is
used, but with different default effective MTUs. (e.g.: IPv4 default MTU
is 1500 octets, IPv6 default MTU is 9072 octets).
MSS advertisement must always be:
max(if mtu) - ip hdr siz - tcp hdr siz
We violated this in the previous code so it was fixed.
tcp_mss_to_advertise() now takes af (af on wire) as its argument,
to compute right ip hdr siz.
tcp_segsize() will take care of IPsec header size.
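
The rule above, sketched as standalone C. This is not the kernel's
tcp_mss_to_advertise(); the function name mss_to_advertise() and the
fixed header sizes (no IP options, no extension headers, no TCP options)
are assumptions made for the example.

```c
#include <sys/socket.h>

#define SKETCH_IPV4_HDR_SIZE	20	/* IPv4 header without options */
#define SKETCH_IPV6_HDR_SIZE	40	/* fixed IPv6 header */
#define SKETCH_TCP_HDR_SIZE	20	/* TCP header without options */

/*
 * Hypothetical helper: MSS to advertise for the address family used on
 * the wire, given the largest interface MTU.  Returns -1 for an unknown
 * family or an MTU too small to carry any payload.
 */
int
mss_to_advertise(int af, int max_if_mtu)
{
	int iphlen, mss;

	switch (af) {
	case AF_INET:
		iphlen = SKETCH_IPV4_HDR_SIZE;
		break;
	case AF_INET6:
		iphlen = SKETCH_IPV6_HDR_SIZE;
		break;
	default:
		return -1;
	}

	mss = max_if_mtu - iphlen - SKETCH_TCP_HDR_SIZE;
	return mss > 0 ? mss : -1;
}
```

With a 1500 octet MTU this gives 1460 for AF_INET and 1440 for AF_INET6,
which is why the af argument matters.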
One thing I'm not really sure about is how to handle IPsec header size in
*rxsegsizep (inbound segment size estimation).
The current code subtracts possible *outbound* IPsec size from *rxsegsizep,
hoping that the peer is using the same IPsec policy as me.
It may not be applicable; could a TCP guru please comment...
This situation happens on severe memory shortage. We may need more
improvements here and there.
- Grab IEEE802 address from IFT_ETHER card, even if the card is
inserted after bootup time. Is there any other card that can be
inserted afterwards? pcmcia fddi card? :-P
- RFC2373 u bit handling suggests that we SHOULD NOT copy interface id from
ethernet card to pseudo interface, when ethernet card has IEEE802/EUI64
with u bit != 0 (this means that IEEE802/EUI64 is not universally unique).
Do not use such address as, for example, interface id for gif interface.
(I have such an ethernet card myself)
This may change interface id for your gif interface. be careful upgrading
rc files.
(sync with recent KAME)
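
A small sketch of the u bit rule above, assuming a 48-bit IEEE802 address
and the interface id construction from RFC 2373 appendix A (insert ff:fe
in the middle, invert the u bit). The function names are hypothetical and
this is not the KAME code.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define IEEE_UL_BIT	0x02	/* universal/local bit, first octet */

/* true if the IEEE802 address claims to be universally unique (u bit == 0) */
bool
mac_is_universal(const uint8_t mac[6])
{
	return (mac[0] & IEEE_UL_BIT) == 0;
}

/*
 * Hypothetical helper: derive an interface id for a pseudo interface
 * (e.g. gif) from an ethernet address.  Refuses addresses with
 * u bit != 0, since those are not universally unique.
 */
bool
pseudo_ifid_from_mac(const uint8_t mac[6], uint8_t ifid[8])
{
	if (!mac_is_universal(mac))
		return false;

	memcpy(ifid, mac, 3);		/* company id */
	ifid[3] = 0xff;			/* inserted filler */
	ifid[4] = 0xfe;
	memcpy(ifid + 5, mac + 3, 3);	/* vendor supplied id */
	ifid[0] ^= IEEE_UL_BIT;		/* invert u bit per RFC 2373 */
	return true;
}
```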
the member is used to pass struct socket to ip{,6}_output for ipsec decisions.
(i agree it is kind of ugly. we need to modify struct mbuf if we are
to do better - which seems to me a bit too much)
New Reno fast recovery code was being executed even when New Reno was
disabled, resulting in an unfortunate interaction with the traditional
fast recovery code, the end result being that the very condition
that would trigger the traditional fast recovery mechanism caused fast
recovery to be disabled!
Problem reported by Ted Lemon, and some analytical help from Charles Hannum.
Stale syn cache entries are useless because none of them will be used
if there is no listening socket, as tcp_input looks up listening socket by
in_pcblookup*() before looking into syn cache.
This fixes race condition due to dangling socket pointer from syn cache
entries to listening socket (this was introduced when ipsec was merged in).
This should preserve currently implemented behavior (but not 4.4BSD
behavior prior to syn cache).
Tested in KAME repository before commit, but we'd better run some
regression tests.
check that the packet is of the right protocol before giving it to the
proxy module, otherwise let the ipnat code handle it.
What happens in kern/7831 is that a router sends back an icmp message for
a TCP SYN, and ip_proxy.c forwards it to ip_ftp_pxy.c which can only
handle TCP packets. The icmp message is properly handled by ipnat, no need to
go to ip_ftp_pxy.c.
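
The dispatch decision described above, modelled in userland. The struct
and function names are invented for the sketch and do not match the real
ipfilter data structures; only IPPROTO_TCP and IPPROTO_ICMP are real
constants.

```c
#include <netinet/in.h>

/* hypothetical, simplified view of a proxy: just the protocol it handles */
struct proxy_sketch {
	int	px_proto;	/* e.g. IPPROTO_TCP for an ftp proxy */
};

enum pkt_handler {
	HANDLE_BY_PROXY,	/* give the packet to the proxy module */
	HANDLE_BY_NAT		/* let the plain NAT code deal with it */
};

/*
 * Only hand a packet to the proxy if it carries the protocol the proxy
 * was written for; everything else (such as an icmp error that refers
 * to a proxied TCP session) stays with the NAT code.
 */
enum pkt_handler
pick_handler(const struct proxy_sketch *px, int pkt_proto)
{
	if (px != 0 && px->px_proto == pkt_proto)
		return HANDLE_BY_PROXY;
	return HANDLE_BY_NAT;
}
```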
- Make sure that snd_recover is always at least snd_una. If we don't do
this, there can be confusion when sequence numbers wrap around on a
large loss-free data transfer.
- When doing a New Reno retransmit, snd_una hasn't been updated yet,
and the socket's send buffer has not yet dropped off ACK'd data, so
don't muddle with snd_una, so that tcp_output() gets the correct data
offset.
- When doing a New Reno retransmit, make sure the congestion window is
open one segment beyond the ACK'd data, so that we can actually perform
the retransmit.
Partially derived from, although more complete than, similar changes in
OpenBSD, which in turn originated from Tom Henderson <tomh@cs.berkeley.edu>.
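
To illustrate the first item (keeping snd_recover at least snd_una), a
minimal sketch using the usual modular sequence comparison. The struct is
a hypothetical two-field stand-in for the TCP control block, and
clamp_snd_recover() is a made-up name; only the comparison idiom follows
the traditional BSD SEQ_LT() macro.

```c
#include <stdint.h>

/* modular "a is before b" comparison on 32-bit sequence numbers */
#define SKETCH_SEQ_LT(a, b)	((int32_t)((uint32_t)(a) - (uint32_t)(b)) < 0)

/* hypothetical stand-in for the two relevant control block fields */
struct tcp_sketch {
	uint32_t snd_una;	/* oldest unacknowledged sequence number */
	uint32_t snd_recover;	/* New Reno recovery point */
};

/*
 * Pull a stale snd_recover forward so it never trails snd_una.  Without
 * this, after enough loss-free data the stale value would compare as
 * being ahead of snd_una once the sequence space wraps.
 */
void
clamp_snd_recover(struct tcp_sketch *tp)
{
	if (SKETCH_SEQ_LT(tp->snd_recover, tp->snd_una))
		tp->snd_recover = tp->snd_una;
}
```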
is not the expected one.
I see PRC_REDIRECT_HOST with sa->sa_family == AF_UNIX coming to
{tcp,udp}_ctlinput() when I use dhclient, and I feel like adding
more sanity checks, without logging - if we log it it is too noisy.
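
The kind of quiet sanity check meant here could look roughly like the
following. ctlinput_family_ok() is a hypothetical helper, not an existing
kernel function; it just shows the test a {tcp,udp}_ctlinput() style
routine might run before touching the address.

```c
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: returns 1 if the notification address is usable
 * for the expected family, 0 if it should be ignored.  Deliberately does
 * not log, since mismatches can be frequent.
 */
int
ctlinput_family_ok(const struct sockaddr *sa, int expected_af)
{
	if (sa == NULL)
		return 0;
	if (sa->sa_family != expected_af)
		return 0;	/* e.g. AF_UNIX handed to an AF_INET handler */
	return 1;
}
```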