- All packets are passed to PFIL_HOOKS as they come off the wire, i.e.
fields in protocol headers in network order, etc.
- Allow for multiple hooks to be registered, using a "key" and a "dlt".
The "dlt" is a BPF data link type, indicating what type of header is
present.
- INET and INET6 register with key == AF_INET or AF_INET6, and
dlt == DLT_RAW.
- PFIL_HOOKS now take an argument for the filter hook, and mbuf **,
an ifnet *, and a direction (PFIL_IN or PFIL_OUT), thus making them
less IP (really, IP Filter) centric.
Maintain compatibility with IP Filter by adding wrapper functions for
IP Filter.
Current status:
Only OHCI chip is supported (fwohci).
ping (IPv4) works with Sony's implementation (SmartConnect) on Win98.
sometimes works but not stable.
Not implemented yet:
IRM (Isochronous Resource Manager) functionality.
Link layer fragmentation.
Topology map.
More to do:
clean ups
MCAP
charactor device part
dhcp
There is no entry in GENERIC config file yet.
Follow sys/dev/ieee1394/IMPLEMENTATION to enable if_fw.
TODO: udp6, and sendto consideration. as pmtud is mandatory for IPv6,
it is rather important for us to support those cases.
TODO: more testing
TODO: kame sync
entering rtentry's for hosts we're not actually communicating
with.
Do this by invoking the ctlinput for the protocol, which is
responsible for validating the ICMP message:
* TCP -- Lookup the connection based on the address/port
pairs in the ICMP message.
* AH/ESP -- Lookup the SA based on the SPI in the ICMP message.
If validation succeeds, ctlinput is responsible for calling
icmp_mtudisc(). icmp_mtudisc() then invokes callbacks registered
by protocols (such as TCP) which want to take some sort of special
action when a path's MTU changes. For TCP, this is where we now
refresh cached routes and re-enter slow-start.
As a side-effect, this fixes the problem where TCP would not be
notified when a path's MTU changed if AH/ESP were being used.
XXX Note, this is only a fix for the IPv4 case. For the IPv6
XXX case, we need to wait for the KAME folks.
Reviewed by sommerfeld@netbsd.org and itojun@netbsd.org.
ip_output(). This flag, if set, causes ip_output() to set
DF in the IP header if the MTU in the route is not locked.
This allows a bunch of redundant code, which I was never
really all that happy about adding in the first place, to
be eliminated.
Inspired by a similar change made by provos@openbsd.org when
he integrated NetBSD's Path MTU Discovery code into OpenBSD.
packets. PR 11082.
This is a short-term workaround. whenever new ipfilter comes out with
proper non-IPv4 support, we should migrate to the new ipfilter.
each in_ifaddr and delete it when an address is purged.
- Don't simply try to delete a multicast address record listed in the
ia_multiaddrs. It results a dangling pointer. Let who holds a
reference to it to delete it.
- when all the interface address is removed from an interface, and there's
multicast groups still left joined, keep it in kludge table.
- when an interface address is added again, recover multicast groups from
kludge table.
this will avoid problem with dangling in_ifaddr on pcmcia card removal,
due to the link from multicast group info (in_multi).
the code is basically from sys/netinet6/in6.c (jinmei@kame).
pointed out by: Shiva Shenoy <shiva_s@yahoo.com>
Without this, if a v6 address is placed before a v4 address in if_addrlist,
a PRU_PURGEIF request for v6 tcp protocol purges also v4 addresses and,
as a result, if_detach fails to request PRU_PURGEIF for v4 protocols
other than tcp.
this avoids too aggressive memory usage on heavy load web server, for example.
From: Kevin Lahey <kml@dotrocket.com>
release and reallocate t_template, if t_template->m_len changes.
(this happens if we connect to IPv4 mapped destination and then IPv6
destination, on a single AF_INET6 socket)
KAME 1.26 -> 1.28
rev 1.35 of ip_nat.c checks if packets are too short.
For ICMP packets, this packet length checking double counts
the length of an IP header contained in ICMP messages.
So, unless ICMP packets are long enough (such as echo-reply),
packets are mistakingly considered too short and are dropped.
multiple addresses from same prefix, onto single interface. PR 10427.
more info:
- 4.4BSD did not check return code from in_ifinit() at all.
4.4BSD does not support multiple address from same prefix.
- past KAME change passed in{,6}_ifinit() to upwards, toward ifconfig(8).
the behavior is filed as PR 10427.
- the commit inhibits EEXIST from rtinit(), hence partially recovers old
4.4BSD behavior.
- the right thing to happen is to properly support multiple address assignment
from the same prefix. KAME tree has more extensive change, however, it needs
much more time to get stabilized (rtentry refcnt change can cause serious
issue, we really need to bake it before bring it to netbsd)
fails, return EIO instead.
- iplioctl(): If performing a NAT operation, and IP Filter is not
yet initialized (e.g. by `ipf -E'), enable it implicitly before
doing the NAT operation.
basis. default: 100pps
set default value for net.inet.tcp.rstratelimit to 0 (disabled),
NOTE: it does not work right for smaller-than-1/hz interval. maybe we should
nuke it, or make it impossible to set smaller-than-1/hz value.
unspecified address (::) to mean "unbounded" or "unconnected",
and can be confused by packets from outside.
use of :: as source is not documented well in IPv6 specification.
not sure if it presents a real threat. the worst case scenario is a DoS
against TCP listening socket:
- outsider transmit TCP SYN with :: as IPv6 source
- receiving side creates TCP control block with:
local address = my addres
remote address = :: (meaning "unconnected")
state = SYN_RCVD
note that SYN ACK will not be sent due to ip6_output() filter.
this stays until it timeouts.
- the TCP control block prevents listening TCP control block from
being contacted (DoS).
udp6/raw6 socket may have similar problem, but as they are connectionless,
it may too much to filter it out.
reports a size smaller than the udp header; defends against bogosity
detected by Assar Westerlund.
This patch and the previous ip_icmp.c change were the joint work of
assar, itojun, and myself.
- allow it to work when icmpreturndatabytes is sufficiently large that the
icmp error message doesn't fit in a header mbuf.
- defend against mbuf chains shorter than their contained ip->ip_len.
- add protection mechanism against ND cache corruption due to bad NUD hints.
- more stats
- icmp6 pps limitation. TOOD: should implement ppsratecheck(9).
in ip_mroute.c for duplicate tunnel entries, too. Well, what
really needs to happen is that the mrouting code needs to be
changed to work w/ `gif' tunnels... but...
<vm/pglist.h> -> <uvm/uvm_pglist.h>
<vm/vm_inherit.h> -> <uvm/uvm_inherit.h>
<vm/vm_kern.h> -> into <uvm/uvm_extern.h>
<vm/vm_object.h> -> nothing
<vm/vm_pager.h> -> into <uvm/uvm_pager.h>
also includes a bunch of <vm/vm_page.h> include removals (due to redudancy
with <vm/vm.h>), and a scattering of other similar headers.
correct timestamp option validation (len and ptr upper/lower bound
based on RFC791).
fill "pointer" field for parameter problem in timestamp option processing.
to protocol handlers, based on src/dst (for ip proto #4/41).
see comment in ip_encap.c for details of the problem we have.
there are too many protocol specs for ip proto #4/41.
backward compatibility with MROUTING case is now provided in ip_encap.c.
fix ipip to work with gif (using ip_encap.c). sorry for breakage.
gif now uses ip_encap.c.
introduce stf pseudo interface (implements 6to4, another IPv6-over-IPv4 code
with ip proto #41).
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.
The old timeout()/untimeout() API has been removed from the kernel.
it was a bit too strong, and forbids multiple addresses from
same prefix to be assigned.
now the behavior is the same as previous - memory leak on interface address
addition failure.
http://orange.kame.net/dev/query-pr.cgi?pr=218
during KAME merge (this is part of WIDE's expeirmental reass code...)
NetBSD PR: 9412
From: Wolfgang Rupprecht <wolfgang@wsrcc.com>
Fix from: ho@crt.se
itojun was notified from: theo
between protocol handlers.
ipsec socket pointers, ipsec decryption/auth information, tunnel
decapsulation information are in my mind - there can be several other usage.
at this moment, we use this for ipsec socket pointer passing. this will
avoid reuse of m->m_pkthdr.rcvif in ipsec code.
due to the change, MHLEN will be decreased by sizeof(void *) - for example,
for i386, MHLEN was 100 bytes, but is now 96 bytes.
we may want to increase MSIZE from 128 to 256 for some of our architectures.
take caution if you use it for keeping some data item for long period
of time - use extra caution on M_PREPEND() or m_adj(), as they may result
in loss of m->m_pkthdr.aux pointer (and mbuf leak).
this will bump kernel version.
(as discussed in tech-net, tested in kame tree)
draft-ietf-ipngwg-icmp-name-lookups-04.txt.
There are certain bitfield change in 04 draft to 05 draft, which makes
04 "ping6 -a" and 05 "ping6 -a" not interoperable. sigh.
- remove net.inet6.ip6.nd6_proxyall. introduce proxy NDP code works
just like "arp -s".
- revise source address selection.
be more careful about use of yet-to-be-valid addresses as source.
- as router, transmit ICMP6_DST_UNREACH_BEYONDSCOPE against out-of-scope
packet forwarding attempt.
- path MTU discovery takes care of routing header properly.
- be more strict about mbuf chain parsing.