Commit Graph

886 Commits

Author SHA1 Message Date
itojun
696bcad865 put attribute(packed) for ip6 option headers. they will appear at
strange alignment positions.  sync with kame
2001-01-23 07:21:07 +00:00
itojun
5a4703fbfe revert revision 1.15 (on ingress, DF bit copied from inner to outer).
since we do not have feedback mechanism from path MTU to tunnel MTU
(not sure if we should), and inner packet source will not get informed of
outer PMTUD (we shouldn't do this), 1.15 behavior can lead us to
blackhole behavior.

configurable behavior (as suggested in RFC2401 6.1) would be nice to have,
however, reusing net.inet.ipsec.dfbit would be hairy.
2001-01-22 07:57:34 +00:00
itojun
a836499e32 make it possible to turn off ingress filter on gif/stf tunnel egress,
by using IFF_LINK2.  (part of) PR 11163 from Ken Raeburn.
2001-01-22 07:51:01 +00:00
itojun
93deb6a97f fix RR result bit in little endian systems. sync with kame 2001-01-22 02:28:02 +00:00
itojun
69622e75ab sync with latest kame.
- make icmp6.h spec conformant to 2292bis-02, regarding to router reumbering
  flag bit.
- latest rtadvd.
2001-01-21 15:39:32 +00:00
kleink
4c96c6b51f Add IPPROTO_VRRP. 2001-01-19 09:01:48 +00:00
jdolecek
34c8ae80da constify 2001-01-18 20:28:15 +00:00
itojun
df9784d749 pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).
have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost noone is using it anyways).

benefit: the follwoing command now works.  previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *.  it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.
2001-01-17 04:05:41 +00:00
thorpej
fc5dafc79b Fix a rather annoying problem where the sockaddr_dl which holds
the link level name for the interface (ifp->if_sadl) is allocated
before ifp->if_addrlen is initialized, which could lead to allocating
too little space for the link level address.

Do this by splitting allocation of the link level name out of
if_attach() and into if_alloc_sadl(), which is normally called
by functions like ether_ifattach().  Network interfaces which
don't have a link-specific attach routine must call if_alloc_sadl()
themselves (example: gif).

Link level names are freed by if_free_sadl(), which can be called
from e.g. ether_ifdetach().  Drivers never need call if_free_sadl()
themselves as if_detach() will do it if it is not already done.

While here, add the ability to pass an AF_LINK address to
SIOCSIFADDR in ether_ioctl() (this is what caused me to notice
the problem that the above fixes).
2001-01-17 00:30:49 +00:00
itojun
42bede79da allow IP_MULTICAST_IF and IP_ADD/DROP_MEMBERSHIP to specify interface
by interface index.  if the interface address specified is in 0.0.0.0/8
it will be considered as interface index in network byteorder.

getsockopt(IP_MULTICAST_IF) preserves old behavior if
setsockopt(IP_MULTICAST_IF) was done with interface address, and
returns interface index if setsockopt(IP_MULTICAST_IF) was done with
interface index (again using the form in 0.0.0.0/8).

Suggested by Dave Thaler, based on RIPv2 MIB spec (RFC1724 section 3.3).

http://mail-index.netbsd.org/tech-net/2001/01/13/0003.html
2001-01-13 07:19:33 +00:00
itojun
4a14fb4fd8 on getsockopt(IP_IPSEC_POLICY), make sure to initialize len 2001-01-13 06:01:18 +00:00
thorpej
054b845cec Fix a small typo that would cause IP Filter to not hook in to
pfil_hooks properly on kernels that included IPv6 support.
2000-12-28 21:42:49 +00:00
thorpej
ad5b855ef0 Back out the sledgehammer damage applied by wiz while I was out for
the holiday.
2000-12-28 21:40:59 +00:00
wiz
32e20d8993 Back out previous change. It causes NAT to fail, and was CLEARLY
NOT TESTED before it was committed.
2000-12-25 02:00:46 +00:00
thorpej
d0357bdb4f Slight adjustment to how pfil_head's are registered. Instead of a
"key" and a "dlt", use a "type" (PFIL_TYPE_{AF,IFNET} for now) and
a val/ptr appropriate for that type.  This allows for more future
flexibility with the pfil_hook mechanism.
2000-12-22 20:01:17 +00:00
itojun
b2aef8afe2 fix call to in6_pcbnotify. s/EMSGSIZE/PRC_MSGSIZE/. 2000-12-21 00:45:17 +00:00
thorpej
ca9af6e52e Pull in BPF includes. 2000-12-18 20:58:13 +00:00
thorpej
ed7695a765 Fill in if_dlt. 2000-12-18 19:44:33 +00:00
thorpej
d9a9544a2f Add ALTQ glue. XXX Temporary until ALTQ is changed to use a pfil hook. 2000-12-14 17:36:44 +00:00
thorpej
c5293456da Adapt to bpfattach() changes, and further centralize the bpfattach()
and bpfdetach() calls into link-type subroutines where possible.
2000-12-12 18:00:22 +00:00
itojun
25296369e5 make sure t_family has the correct protocol family, after connect(2)
and/or bind(2).  sync with kame
2000-12-11 00:07:48 +00:00
itojun
9302f377a6 remove NRL code leftover. sync with kame 2000-12-10 23:39:36 +00:00
itojun
5eae50d991 update icmp6 too big validation. the change is necessary since pmtud is
mandatory for IPv6 (so we can't just validate by using connected pcb - we need
to allow traffic from unconnected pcb to do pmtud).
- if the traffic is validated by xx_ctlinput, allow up to "hiwat" pmtud
  route entries.
- if the traffic was not validated by xx_ctlinput, allow up to "lowat" pmtud
  route entries (there's upper limit, so bad guys cannot blow up our routing
  table).
sync with kame

XXX need to think again about default hiwat/lowat value.
XXX victim selection to help starvation case
2000-12-09 01:29:45 +00:00
itojun
7fee705236 more on previous (udp4 multicast fix) 2000-12-04 11:24:20 +00:00
itojun
c2ca545d60 fix multicast inbound packet processing.
NetBSD PR 11629 From: salvet@ics.muni.cz
2000-12-04 11:23:04 +00:00
itojun
f9ed4a5d70 IFA_STATS stability (not complete); don't touch ip if it is NULL. 2000-11-24 03:43:20 +00:00
thorpej
e37508421d Due to a quirk (err, bug?) in IP Filter (mbuf freed without setting *mp
to NULL), the NULL check is insufficient.  Also make sure fr_check()
returned 0.
2000-11-12 19:50:47 +00:00
thorpej
cbf6f69cb2 Oops, the mbuf may have been freed -- do a NULL check in the wrapper. 2000-11-12 19:29:31 +00:00
thorpej
8517807044 Actually, our local ip_off variable isn't needed. 2000-11-11 00:55:51 +00:00
thorpej
65fd25ea82 Restructure the PFIL_HOOKS mechanism a bit:
- All packets are passed to PFIL_HOOKS as they come off the wire, i.e.
  fields in protocol headers in network order, etc.
- Allow for multiple hooks to be registered, using a "key" and a "dlt".
  The "dlt" is a BPF data link type, indicating what type of header is
  present.
- INET and INET6 register with key == AF_INET or AF_INET6, and
  dlt == DLT_RAW.
- PFIL_HOOKS now take an argument for the filter hook, and mbuf **,
  an ifnet *, and a direction (PFIL_IN or PFIL_OUT), thus making them
  less IP (really, IP Filter) centric.

Maintain compatibility with IP Filter by adding wrapper functions for
IP Filter.
2000-11-11 00:52:36 +00:00
ad
642267bcc7 Update for hashinit() change. 2000-11-08 14:28:12 +00:00
itojun
ef8a34f5c3 fix IPv4 TTL selection with AF_INET6 API. sync with kame. From: jdc 2000-11-06 00:50:12 +00:00
onoe
e83458422f First Prototype implementation of network interface part for IEEE1394 (if_fw).
Current status:
	Only OHCI chip is supported (fwohci).
	ping (IPv4) works with Sony's implementation (SmartConnect) on Win98.
	sometimes works but not stable.
Not implemented yet:
	IRM (Isochronous Resource Manager) functionality.
	Link layer fragmentation.
	Topology map.
More to do:
	clean ups
	MCAP
	charactor device part
	dhcp

There is no entry in GENERIC config file yet.
Follow sys/dev/ieee1394/IMPLEMENTATION to enable if_fw.
2000-11-05 17:17:12 +00:00
itojun
be2983be9d cleanup tcp_drop 2000-10-29 06:33:59 +00:00
itojun
7813d4bf6e process IPv4 tcp RST packet right. reported by thorpej. 2000-10-29 06:30:51 +00:00
itojun
80db86454a fix IFA_STATS.
- use hashed in_ifaddr lookup.
- correct endianness.
2000-10-23 03:42:18 +00:00
mjl
986b950535 Mark packets from gre as coming from appropriate gre interface, not
transport interface.
2000-10-20 20:43:26 +00:00
itojun
9183e2dc4e remove #ifdef TCP6. it is not likely for us to bring in sys/netinet6/tcp6*.c
(separate TCP/IPv6 stack) into netbsd-current.
2000-10-19 20:22:59 +00:00
itojun
9288750911 memcpy -> bcopy, for sync with kame tree 2000-10-19 00:40:44 +00:00
itojun
23a03329ef verify ICMPv6 too big messages based on TCP pcbs, and/or IPsec SA.
TODO: udp6, and sendto consideration.  as pmtud is mandatory for IPv6,
it is rather important for us to support those cases.
TODO: more testing
TODO: kame sync
2000-10-18 21:14:12 +00:00
itojun
d7ca32a335 s/mtudisc_callback/icmp_&/ so that we don't feel conflict between IPv4 and
IPv6 counterpart. (or icmp4_&?)
2000-10-18 20:34:00 +00:00
itojun
9e8a83c2a4 count successful path MTU changes. good for debugging.
(there could be some discussion on when to increase the counter...)
2000-10-18 19:20:02 +00:00
thorpej
ea9b5a9106 Restructure the Path MTU Discovery code somewhat to avoid
entering rtentry's for hosts we're not actually communicating
with.

Do this by invoking the ctlinput for the protocol, which is
responsible for validating the ICMP message:
	* TCP -- Lookup the connection based on the address/port
	  pairs in the ICMP message.
	* AH/ESP -- Lookup the SA based on the SPI in the ICMP message.

If validation succeeds, ctlinput is responsible for calling
icmp_mtudisc().  icmp_mtudisc() then invokes callbacks registered
by protocols (such as TCP) which want to take some sort of special
action when a path's MTU changes.  For TCP, this is where we now
refresh cached routes and re-enter slow-start.

As a side-effect, this fixes the problem where TCP would not be
notified when a path's MTU changed if AH/ESP were being used.

XXX Note, this is only a fix for the IPv4 case.  For the IPv6
XXX case, we need to wait for the KAME folks.

Reviewed by sommerfeld@netbsd.org and itojun@netbsd.org.
2000-10-18 17:09:14 +00:00
itojun
06700c02aa move tcp syn cache parameters from in_proto.c to tcp_subr.c.
it makes more sense and helps INET6-only (INET-less) build.
2000-10-18 07:21:10 +00:00
itojun
1eb3565937 allow INET6-less build.
From: smd@ebone.net (Sean Doran)
2000-10-17 21:16:57 +00:00
itojun
a7e15e4935 be more friendly with INET-less build.
XXX we need to do more to do a working INET-less build
2000-10-17 03:06:42 +00:00
thorpej
d839a91f5f Add an IP_MTUDISC flag to the flags that can be passed to
ip_output().  This flag, if set, causes ip_output() to set
DF in the IP header if the MTU in the route is not locked.

This allows a bunch of redundant code, which I was never
really all that happy about adding in the first place, to
be eliminated.

Inspired by a similar change made by provos@openbsd.org when
he integrated NetBSD's Path MTU Discovery code into OpenBSD.
2000-10-17 02:57:01 +00:00
itojun
6e3a9bc311 validate mbuf chain length on *_ctlinput. remote node may be able to
transmit a truncated icmp6 packet and panic the system.  sync with kame.
2000-10-13 17:53:44 +00:00
itojun
6572421763 make sure we don't share external mbuf between m and mcopy, in ip_forward().
should solve PR 11201.
2000-10-13 01:50:04 +00:00
itojun
8fa0e6b9f7 sync with kame ($KAME$) 2000-10-10 16:26:43 +00:00
itojun
97c873b9b0 ipfilter currently supports IPv4 only. do not try to touch non-IPv4
packets.  PR 11082.

This is a short-term workaround.  whenever new ipfilter comes out with
proper non-IPv4 support, we should migrate to the new ipfilter.
2000-10-08 13:01:30 +00:00
enami
a2b260195e - Keep track of allhost multicast address record we joined into
each in_ifaddr and delete it when an address is purged.
- Don't simply try to delete a multicast address record listed in the
  ia_multiaddrs.  It results a dangling pointer.  Let who holds a
  reference to it to delete it.
2000-10-08 09:15:28 +00:00
itojun
48cc942e89 implement multicast kludge table for IPv4.
- when all the interface address is removed from an interface, and there's
  multicast groups still left joined, keep it in kludge table.
- when an interface address is added again, recover multicast groups from
  kludge table.
this will avoid problem with dangling in_ifaddr on pcmcia card removal,
due to the link from multicast group info (in_multi).

the code is basically from sys/netinet6/in6.c (jinmei@kame).

pointed out by: Shiva Shenoy <shiva_s@yahoo.com>
2000-10-08 02:05:47 +00:00
enami
d127401d7f Cosmetic changes to previous commit; indent break statement sanely. 2000-10-06 10:21:06 +00:00
enami
358aa75755 Just call matching purgeif/pcbpurgeif routine for the protocol family.
Without this, if a v6 address is placed before a v4 address in if_addrlist,
a PRU_PURGEIF request for v6 tcp protocol purges also v4 addresses and,
as a result, if_detach fails to request PRU_PURGEIF for v4 protocols
other than tcp.
2000-10-06 09:24:40 +00:00
itojun
654a1d9555 remove obsolete handling code for SIOCSIFPHY*. they are now in ifioctl().
sync with kame.
2000-10-06 05:07:41 +00:00
itojun
dcfe05e7c1 fix compilation without INET. fix confusion between ipsecstat and ipsec6stat.
sync with kame.
2000-10-02 03:55:41 +00:00
itojun
dde2adf8e4 for t_template, allocate mbuf cluster only if really necessary.
this avoids too aggressive memory usage on heavy load web server, for example.
From: Kevin Lahey <kml@dotrocket.com>

release and reallocate t_template, if t_template->m_len changes.
(this happens if we connect to IPv4 mapped destination and then IPv6
destination, on a single AF_INET6 socket)

KAME 1.26 -> 1.28
2000-09-19 18:21:41 +00:00
itojun
29a4fb39d9 minor typo. s/iPsec/IPsec/ 2000-08-30 15:04:45 +00:00
simonb
94d3076df3 #define<tab> cleanup. 2000-08-28 02:12:22 +00:00
itojun
26dc854c41 make sure anonport{min,max} is not negative number 2000-08-26 10:41:29 +00:00
tron
a97bfde931 Add new sysctl variables "net.inet.ip.lowportmin" and
"net.inet.ip.lowportmax" which can be used to the set minimum
and maximum port number assigned to sockets using
IP_PORTRANGE_LOW.
2000-08-25 13:35:05 +00:00
mjl
8358c07048 Add bpf tap to gre interface. 2000-08-25 00:51:20 +00:00
sommerfeld
867ca7767a Fill in next mtu field of NEEDFRAG ICMP error message.
From Marc Horowitz, pr10857
2000-08-22 16:02:16 +00:00
itojun
32e6a89b31 net.inet.tcp.rstratelimit is deprecated. make it invalid and return
ENOPROTOOPT.
2000-08-15 22:13:02 +00:00
jhawk
b70721109d Add kernel counters for arp events, displayable with netstat -s -f arp 2000-08-15 20:24:57 +00:00
enami
beb808e530 Put # endif directive after the right (i.e., matching) close brace
to prevent compilation error.
2000-08-12 14:17:13 +00:00
veego
fea1509f80 Apply fix from IWAMOTO Toshihiro in pr#10813:
rev 1.35 of ip_nat.c checks if packets are too short.
 For ICMP packets, this packet length checking double counts
 the length of an IP header contained in ICMP messages.
 So, unless ICMP packets are long enough (such as echo-reply),
 packets are mistakingly considered too short and are dropped.
2000-08-12 08:08:54 +00:00
veego
35df2482b0 Protect a IPLLOG with ifdef IPFILTER_LOG. Patch from Darren Reed. 2000-08-12 08:04:18 +00:00
veego
b3d0df91fb Resolve conflicts. 2000-08-09 21:00:39 +00:00
itojun
5e868d1e49 clearifications in icmp6 node query support.
XXX previous commit included "supported qtypes" icmp6 node query support.
sorry commit message was mistaken.
2000-08-03 16:30:37 +00:00
itojun
afa5315364 correct typo in #define. ICMP6_NI_SUCESS -> SUCCESS (notice missing C).
sync with kame.
2000-08-03 14:31:04 +00:00
itojun
6574aa66e8 inhibit error code from rtinit(). this happens when we try to assign
multiple addresses from same prefix, onto single interface.  PR 10427.


more info:
- 4.4BSD did not check return code from in_ifinit() at all.
  4.4BSD does not support multiple address from same prefix.
- past KAME change passed in{,6}_ifinit() to upwards, toward ifconfig(8).
  the behavior is filed as PR 10427.
- the commit inhibits EEXIST from rtinit(), hence partially recovers old
  4.4BSD behavior.
- the right thing to happen is to properly support multiple address assignment
  from the same prefix.  KAME tree has more extensive change, however, it needs
  much more time to get stabilized (rtentry refcnt change can cause serious
  issue, we really need to bake it before bring it to netbsd)
2000-08-02 15:03:02 +00:00
thorpej
bdb0f01b7c Slight adjustment to last, to allow the userland version to build. 2000-08-01 15:03:51 +00:00
thorpej
ead5ad8885 - ipl_enable(): -1 is not an error return. If initializing IP Filter
fails, return EIO instead.

- iplioctl(): If performing a NAT operation, and IP Filter is not
  yet initialized (e.g. by `ipf -E'), enable it implicitly before
  doing the NAT operation.
2000-08-01 03:46:09 +00:00
kleink
079b94ad72 Avoid recursion with traditional cpp. 2000-07-28 12:13:32 +00:00
itojun
63de4c2cb9 nuke the following sysctl variables. "ppsratelimit" should work better.
need to recompile sbin/sysctl after updating /usr/include.
	net.inet.tcp.rstratelimit
	net.inet.icmp.errratelimit
	net.inet6.icmp6.errratelimit
2000-07-28 04:06:52 +00:00
itojun
7abf4641c6 forgot to call tcp6_quench(). sync with kame. 2000-07-28 02:39:45 +00:00
itojun
928dfa5233 do not disable icmp error rate limitation for local address.
local address can be abused too.  pps rate limitation should work fine for
moderate amount of icmp errors.
2000-07-27 11:36:14 +00:00
itojun
dd9f2f7f1d implement net.inet.tcp.rstppslimit to limit TCP RSTs by packet-per-second
basis.  default: 100pps

set default value for net.inet.tcp.rstratelimit to 0 (disabled),
NOTE: it does not work right for smaller-than-1/hz interval.  maybe we should
nuke it, or make it impossible to set smaller-than-1/hz value.
2000-07-27 11:34:06 +00:00
itojun
a18c2d780f be proactive about unspecified IPv6 source address. pcb layer uses
unspecified address (::) to mean "unbounded" or "unconnected",
and can be confused by packets from outside.

use of :: as source is not documented well in IPv6 specification.

not sure if it presents a real threat.  the worst case scenario is a DoS
against TCP listening socket:
- outsider transmit TCP SYN with :: as IPv6 source
- receiving side creates TCP control block with:
	local address = my addres
	remote address = ::     (meaning "unconnected")
	state = SYN_RCVD
  note that SYN ACK will not be sent due to ip6_output() filter.
  this stays until it timeouts.
- the TCP control block prevents listening TCP control block from
  being contacted (DoS).

udp6/raw6 socket may have similar problem, but as they are connectionless,
it may too much to filter it out.
2000-07-27 06:18:13 +00:00
sommerfeld
73b6d9485c Drop packet, increment udps_badlen if the udp header length field
reports a size smaller than the udp header; defends against bogosity
detected by Assar Westerlund.

This patch and the previous ip_icmp.c change were the joint work of
assar, itojun, and myself.
2000-07-24 03:46:57 +00:00
sommerfeld
a0c29e06a3 Improve robustness of icmp_error():
- allow it to work when icmpreturndatabytes is sufficiently large that the
icmp error message doesn't fit in a header mbuf.
 - defend against mbuf chains shorter than their contained ip->ip_len.
2000-07-24 03:32:31 +00:00
itojun
ca777cb72c add an DIAGNOSTIC case for MCLBYTES assumption 2000-07-23 05:00:01 +00:00
itojun
f5211e847a remove m_pulldown statistics code. it is highly experimental and belong
to kame tree only (not for *bsd).
2000-07-13 05:34:21 +00:00
itojun
ab492849bc implement net.inet.icmp.errppslimit.
make default value for net.inet.icmp.erratelimit to 0, as < 10ms value
does not do the right thing.
2000-07-10 09:31:29 +00:00
itojun
8a661b9beb be more cautious about tcp option length field. drop bogus ones earlier.
not sure if there is a real threat or not, but it seems that there's
possibility for overrun/underrun (like non-NOP option with optlen > cnt).
2000-07-09 12:49:08 +00:00
itojun
ec67eee51f sync with kame.
introduce in6_{recover,embed}scope, for in-kernel scoped-address manipulation.
improve in6_pcbnotify.
2000-07-07 15:54:16 +00:00
itojun
210a3e2f80 remove unnecessary #include <netkey/key_debug.h>. from kame. 2000-07-06 12:51:39 +00:00
itojun
0a1e211454 - do not use bitfield for router renumbering header.
- add protection mechanism against ND cache corruption due to bad NUD hints.
- more stats
- icmp6 pps limitation.  TOOD: should implement ppsratecheck(9).
2000-07-06 12:36:18 +00:00
thorpej
70140a566d Some slight cleanup. 2000-07-06 04:34:26 +00:00
thorpej
9c86b65a92 Fix an omission in the gre cloning changes. 2000-07-05 22:45:25 +00:00
thorpej
6a900bc9ff Fix some zero-vs-NULL confusion. 2000-07-05 21:45:14 +00:00
thorpej
f77f419c50 Make that note that we really should be checking the viftable
in ip_mroute.c for duplicate tunnel entries, too.  Well, what
really needs to happen is that the mrouting code needs to be
changed to work w/ `gif' tunnels... but...
2000-07-05 21:32:51 +00:00
thorpej
4348603862 RFCs 1853, 2003, 2401 -- copy the DF bit. 2000-07-05 21:01:38 +00:00
thorpej
e5c397199f Use LIST_HEAD_INITIALIZER(), for correctness sake. 2000-07-05 18:45:26 +00:00
christos
f142d4254d added a linted comment about non-portable bitfields. Unfortunately it cannot
be fixed portably.
2000-07-05 02:45:03 +00:00
itojun
f0d7296dc1 typo in previous 2000-07-02 21:25:41 +00:00
itojun
e29fba4ba7 do not touch struct ip6stat on non-INET6 compilation.
From: Paul Goyette <paul@whooppee.com>
2000-07-02 21:05:41 +00:00
itojun
8ff902fca1 repair kernel faithd(8) support. there were two mistakes:
(1) tcp6_input dropped packets for translation
(2) in6_pcblookup_connect was too strict
2000-07-02 08:04:10 +00:00