as config(8) will warn for value-less defparam options
- minor whitespace/formatting cleanup
- consolidate opt_tcp_recvspace.h and opt_tcp_sendspace.h into opt_tcp_space.h
as discussed in tech-net several weeks ago. It turned out that
KAME had already added this functionality to the IPv6 stack, so
I followed their example in adding the sysctl variables
net.inet.icmp.rediraccept and net.inet.icmp.redirtimeout.
Add capabilities bits that indicate an interface can only perform
in-bound TCPv4 or UDPv4 checksums. There is at least one Gig-E chip
for which this is true (Level One LXT-1001), and this is also the
case for the Intel i82559 10/100 Ethernet chips.
all open TCP connections in tcp_slowtimo() (which is called 2x
per second). It's fairly rare for TCP timers to actually fire,
so saving this list traversal is good, especially if you want
to scale to thousands of open connections.
and call it directly from tcp_slowtimo() (via a table) rather
than going through tcp_userreq().
This will allow us to call TCP timers directly from callouts,
in a future revision.
Instead of incrementing t_idle and t_rtt in tcp_slowtimo(), we now
take a timstamp (via tcp_now) and use subtraction to compute the
delta when we actually need it (using unsigned arithmetic so that
tcp_now wrapping is handled correctly).
Based on similar changes in FreeBSD.
provide a better support for multiple address with the same prefix better.
(like 10.0.0.1/8 and 10.0.0.2/8 on the same interface)
continuation of PR 13311.
remove irrelevant #if 0'ed segment for PR 10427.
- consider non-primary (2nd and beyond) IPv4 address as "local", and prevent
outgoing ARP.
- for routing entries generated by ARP, make sure to set rt->rt_ifa equal to
rt_key, to help IPv4 source address selection for traffic to myself.
PR 13311.
caveats/TODOs:
- interface routes ("connected routes" in cisco terminlogy) is tied with the
primary (1st) IPv4 address on the interface. should be fixed with updates
to rt_ifinit().
- source address selection for offlink locations. 1st address tend to be used
with the current code
(you can configure it right by setting rt->rt_ifa accordingly).
network interfaces. This works by pre-computing the pseudo-header
checksum and caching it, delaying the actual checksum to ip_output()
if the hardware cannot perform the sum for us. In-bound checksums
can either be fully-checked by hardware, or summed up for final
verification by software. This method was modeled after how this
is done in FreeBSD, although the code is significantly different in
most places.
We don't delay checksums for IPv6/TCP, but we do take advantage of the
cached pseudo-header checksum.
Note: hardware-assisted checksumming defaults to "off". It is
enabled with ifconfig(8). See the manual page for details.
Implement hardware-assisted checksumming on the DP83820 Gigabit Ethernet,
3c90xB/3c90xC 10/100 Ethernet, and Alteon Tigon/Tigon2 Gigabit Ethernet.
There is no place in the source where this bit could ever be set (or I'm
to blind to find it).
This fixes PR 12671.
If someone thinks this is the wrong solution, please make sure to (a) reopen
the PR and (b) explain to me how the tested bits would ever get set. I'll
be glad to then look further for the real cause (i.e. the flags not getting
set in the case described in the PR).
(reported by: grendel@heorot.stanford.edu (Ted U)
- Don't cast malloc/realloc/calloc return values because they hide LP64 bugs.
- Don't destroy the whole array when realloc fails
- Use calloc in all cases (malloc was used inconsistently).
- Avoid duplicating code.
Reviewed by: ross
"lots of fragmented packets" DoS attack.
the current default value is derived from ipv6 counterpart, which is
a magical value "200". it should be enough for normal systems, not sure
if it is enough when you take hundreds of thousands of tcp connections on
your system. if you have proposal for a better value with concrete reasons,
let me know.
ISS attacks (which we already fend off quite well).
1. First-cut implementation of RFC1948, Steve Bellovin's cryptographic
hash method of generating TCP ISS values. Note, this code is experimental
and disabled by default (experimental enough that I don't export the
variable via sysctl yet, either). There are a couple of issues I'd
like to discuss with Steve, so this code should only be used by people
who really know what they're doing.
2. Per a recent thread on Bugtraq, it's possible to determine a system's
uptime by snooping the RFC1323 TCP timestamp options sent by a host; in
4.4BSD, timestamps are created by incrementing the tcp_now variable
at 2 Hz; there's even a company out there that uses this to determine
web server uptime. According to Newsham's paper "The Problem With
Random Increments", while NetBSD's TCP ISS generation method is much
better than the "random increment" method used by FreeBSD and OpenBSD,
it is still theoretically possible to mount an attack against NetBSD's
method if the attacker knows how many times the tcp_iss_seq variable
has been incremented. By not leaking uptime information, we can make
that much harder to determine. So, we avoid the leak by giving each
TCP connection a timebase of 0.
normal operation (/var can get filled up by flodding bogus packets).
sysctl net.inet6.icmp6.nd6_debug will turn on diagnostic messages.
(#define ND6_DEBUG will turn it on by default)
improve stats in ND6 code.
lots of synchronziation with kame (including comments and cometic ones).
- let ipfilter look at wire-format packet only (not the decapsulated ones),
so that VPN setting can work with NAT/ipfilter settings.
sync with kame.
TODO: use header history for stricter inbound validation
since we do not have feedback mechanism from path MTU to tunnel MTU
(not sure if we should), and inner packet source will not get informed of
outer PMTUD (we shouldn't do this), 1.15 behavior can lead us to
blackhole behavior.
configurable behavior (as suggested in RFC2401 6.1) would be nice to have,
however, reusing net.inet.ipsec.dfbit would be hairy.
have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost noone is using it anyways).
benefit: the follwoing command now works. previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0
remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.
XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.
the link level name for the interface (ifp->if_sadl) is allocated
before ifp->if_addrlen is initialized, which could lead to allocating
too little space for the link level address.
Do this by splitting allocation of the link level name out of
if_attach() and into if_alloc_sadl(), which is normally called
by functions like ether_ifattach(). Network interfaces which
don't have a link-specific attach routine must call if_alloc_sadl()
themselves (example: gif).
Link level names are freed by if_free_sadl(), which can be called
from e.g. ether_ifdetach(). Drivers never need call if_free_sadl()
themselves as if_detach() will do it if it is not already done.
While here, add the ability to pass an AF_LINK address to
SIOCSIFADDR in ether_ioctl() (this is what caused me to notice
the problem that the above fixes).
by interface index. if the interface address specified is in 0.0.0.0/8
it will be considered as interface index in network byteorder.
getsockopt(IP_MULTICAST_IF) preserves old behavior if
setsockopt(IP_MULTICAST_IF) was done with interface address, and
returns interface index if setsockopt(IP_MULTICAST_IF) was done with
interface index (again using the form in 0.0.0.0/8).
Suggested by Dave Thaler, based on RIPv2 MIB spec (RFC1724 section 3.3).
http://mail-index.netbsd.org/tech-net/2001/01/13/0003.html
"key" and a "dlt", use a "type" (PFIL_TYPE_{AF,IFNET} for now) and
a val/ptr appropriate for that type. This allows for more future
flexibility with the pfil_hook mechanism.
mandatory for IPv6 (so we can't just validate by using connected pcb - we need
to allow traffic from unconnected pcb to do pmtud).
- if the traffic is validated by xx_ctlinput, allow up to "hiwat" pmtud
route entries.
- if the traffic was not validated by xx_ctlinput, allow up to "lowat" pmtud
route entries (there's upper limit, so bad guys cannot blow up our routing
table).
sync with kame
XXX need to think again about default hiwat/lowat value.
XXX victim selection to help starvation case
- All packets are passed to PFIL_HOOKS as they come off the wire, i.e.
fields in protocol headers in network order, etc.
- Allow for multiple hooks to be registered, using a "key" and a "dlt".
The "dlt" is a BPF data link type, indicating what type of header is
present.
- INET and INET6 register with key == AF_INET or AF_INET6, and
dlt == DLT_RAW.
- PFIL_HOOKS now take an argument for the filter hook, and mbuf **,
an ifnet *, and a direction (PFIL_IN or PFIL_OUT), thus making them
less IP (really, IP Filter) centric.
Maintain compatibility with IP Filter by adding wrapper functions for
IP Filter.
Current status:
Only OHCI chip is supported (fwohci).
ping (IPv4) works with Sony's implementation (SmartConnect) on Win98.
sometimes works but not stable.
Not implemented yet:
IRM (Isochronous Resource Manager) functionality.
Link layer fragmentation.
Topology map.
More to do:
clean ups
MCAP
charactor device part
dhcp
There is no entry in GENERIC config file yet.
Follow sys/dev/ieee1394/IMPLEMENTATION to enable if_fw.