Commit Graph

1117 Commits

Author SHA1 Message Date
itojun 9d27b7540e one too many whitespace 2002-09-25 07:37:12 +00:00
martti 15e6ca78da Fix ipmon problems on 64-bit platforms (PR#17403 and PR#17404). 2002-09-25 06:43:17 +00:00
sommerfeld 17aee57321 Relax overly-conservative TCP option parsing used by ipnat when
hunting for an MSS option to clamp.  The previous code assumed that at least
one more byte of options (such as a TCPOPT_EOL) would follow the MSS
option; now, we allow the MSS option to end on the last byte of the
TCP header.

Packets have been observed "in the wild" with a TCP header length of
'6' (24 bytes.. 20 bytes fixed header, 4 bytes options) with a 4-byte
MSS option exactly filling the 4 bytes of options payload and no
following TCPOPT_EOL.

RFC793 is quite explicit that the EOL byte:

	" .. need only be used if the end of the options would not
	otherwise coincide with the end of the TCP header."
2002-09-24 14:14:25 +00:00
itojun 38e6856368 revert mtudisc_timeout value to the old one if update falis 2002-09-23 13:43:27 +00:00
simonb 4e3613273b Remove breaks after returns, unreachable returns and returns after
returns(!).
2002-09-23 05:51:10 +00:00
martti b69124b84c Resync with official IPF 2002-09-19 08:12:43 +00:00
martti 87f18f024e Upgraded IPFilter to 3.4.29 2002-09-19 08:08:14 +00:00
darrenr 04978010b2 From FreeBSD (1.164) courtesy of Maxim Konovalov:
"In rare cases when there is no room for ip options ip_insertoptions()
can fail and corrupt a header length.  Initialize len and check what
ip_insertoptions() returns."
2002-09-17 13:10:59 +00:00
mycroft 129af72834 In the txsegsize bounding code, it is not necessary to adjust for the options
length.
2002-09-13 18:26:55 +00:00
itojun 9401012487 KNF - return is not a function. sync w/kame. 2002-09-11 02:46:42 +00:00
itojun 6dedde045a correct signedness mixup in pointer passing. sync w/kame 2002-09-11 02:41:19 +00:00
enami c2428db9db Make usr.sbin/ipf/ipftest compiles again. 2002-09-07 00:10:24 +00:00
gehenna 5747ad0039 The device switch ``ipl_cdevsw'' is defined after 1.6H. 2002-09-06 14:00:00 +00:00
gehenna 77a6b82b27 Merge the gehenna-devsw branch into the trunk.
This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches is defined as a constant structure in device drivers.

- The new grammer ``device-major'' is introduced to ``files''.

	device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
  by using this grammer.

- Added the new naming convention.
  The name of the device switch must be <prefix>_[bc]devsw for auto-generation
  of device switch tables.

- The backward compatibility of loading block/character device
  switch by LKM framework is broken. This is necessary to convert
  from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
  We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
  the LKM framework will refer it to assign device major number dynamically.
2002-09-06 13:18:43 +00:00
itojun 530771e5ef always consult SS_CANTRCVMORE. PR 18185 2002-09-05 23:02:18 +00:00
itojun 98ba20f9e4 backout 1.78, ioctl(SIOCSIFADDR) is needed to test if the interface
supports AF_INET or not
2002-09-04 03:45:01 +00:00
itojun 91d888cd38 avoid SIOCSIFADDR if there's an IPv4 address already.
the comment doesn't match the behavior, it seems that the code assumed that
there's only one IPv4 address on an interface.  sync w/kame
2002-09-04 00:03:58 +00:00
thorpej ec09d2df2a Fix a problem introduced in rev 1.103, where we recycle a TIME_WAIT
TCPCB .. the fields need to be converted back to net-order, because
the packet is checksummed after the TCPCB lookup happens.

From YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>.
2002-08-28 02:23:57 +00:00
thorpej c23fa5a752 Never send more than half a socket buffer of data. This insures that
we can always keep 2 packets on the wire, no matter what SO_SNDBUF is,
and therefore ACKs will never be delayed unless we run out of data to
transmit.  The problem is quite easy to tickle when the MTU of the
outgoing interface is larger than the socket buffer size (e.g. loopback).

Fix from Charles Hannum.
2002-08-20 16:29:42 +00:00
itojun 436f2a58ac better sync w/kame on deprecated address handling. check af == AF_INET6. 2002-08-19 02:17:54 +00:00
itojun f00291d88b pull in deprecated address handling from KAME sys/netinet6/tcp6_input.c. 2002-08-19 02:13:46 +00:00
itojun c00fa8dfd9 avoid swapping endian of ip_len and ip_off on mbuf, to meet with M_LEADINGSPACE
optimization made last year.  should solve PR 17867 and 10195.

IP_HDRINCL behavior of raw ip socket is kept unchanged.  we may want to
provide IP_HDRINCL variant that does not swap endian.
2002-08-14 00:23:27 +00:00
itojun 6446feb7a7 inject GRE packet to raw ip socket input, to support userland GRE decapsulator.
discussed on openbsd developers list.
2002-08-10 05:40:54 +00:00
itojun fc50f2e011 bring back old copyright notice lost in rev 1.15 (which is the authors' intent). 2002-07-31 04:07:20 +00:00
itojun d5e0a4aba9 remove packed attribute as it will cause data be unaligned 2002-07-31 03:18:04 +00:00
itojun f8e5e9c295 be friendly with gcc-3.1.1 -O2, which takes advantage of ANSI C
pointer aliasing rule (gcc optimization/7427).  from tsubai, sync w/kame
2002-07-29 09:14:36 +00:00
wrstuden 332b66d974 When a new connection arrives on a listening port, copy over the
value of the TCP_NODELAY socket option from the listener to the
newly connected connection. Agrees with how Linux & FreeBSD behave,
and goes more with the spirit of accept(2) creating a socket with
the same properties as the listener.

Analysis by Kevin Lahey. Closes PR 17616 by myself.
2002-07-18 03:23:01 +00:00
itojun 572c4c4a3f need to bzero() before rtalloc. KAME PR 432 2002-07-14 21:09:17 +00:00
thorpej 668640a43d Rename sbappend_stream() to sbappendstream(), per suggestion from
Jonathan Stone.
2002-07-03 21:36:57 +00:00
thorpej 0585ce1489 Make insertion of data into socket buffers O(C):
* Keep pointers to the first and last mbufs of the last record in the
  socket buffer.
* Use the sb_lastrecord pointer in the sbappend*() family of functions
  to avoid traversing the packet chain to find the last record.
* Add a new sbappend_stream() function for stream protocols which
  guarantee that there will never be more than one record in the
  socket buffer.  This function uses the sb_mbtail pointer to perform
  the data insertion.  Make TCP use sbappend_stream().

On a profiling run, this makes sbappend of a TCP transmission using
a 1M socket buffer go from 50% of the time to .02% of the time.

Thanks to Bill Sommerfeld and YAMAMOTO Takashi for their debugging
assistance!
2002-07-03 19:06:47 +00:00
itojun 390ee363bd check AF_INET6 socketes when IPv4 "too big" messages arrive.
PR 17448
2002-07-01 20:51:25 +00:00
christos dad84218d6 Fix iplog problem on sparc64 [from Tomi Nylund]
1. size_t is 64 bits, so use a u_32_t for iplused
	2. microtime() and friends expect a struct timeval,
	   passing the first of two unsigned longs will not cut it.
2002-07-01 13:55:35 +00:00
thorpej 10c252ba47 Changes to allow the IPv4 and IPv6 layers to align headers themseves,
as necessary:
* Implement a new mbuf utility routine, m_copyup(), is is like
  m_pullup(), except that it always prepends and copies, rather
  than only doing so if the desired length is larger than m->m_len.
  m_copyup() also allows an offset into the destination mbuf, which
  allows space for packet headers, in the forwarding case.
* Add *_HDR_ALIGNED_P() macros for IP, IPv6, ICMP, and IGMP.  These
  macros expand to 1 if __NO_STRICT_ALIGNMENT is defined, so that
  architectures which do not have strict alignment constraints don't
  pay for the test or visit the new align-if-needed path.
* Use the new macros to check if a header needs to be aligned, or to
  assert that it already is, as appropriate.

Note: This code is still somewhat experimental.  However, the new
code path won't be visited if individual device drivers continue
to guarantee that packets are delivered to layer 3 already properly
aligned (which are rules that are already in use).
2002-06-30 22:40:32 +00:00
yamt 58077442ae split logging code in order to reduce maximum stack usage. 2002-06-29 04:13:21 +00:00
enami 6aad1636a8 If we need to fix up ar_hrd field, we must do it before using ar_tpa/tha. 2002-06-25 04:16:31 +00:00
itojun a5b52729e6 in arprequest(), fill ar_hrd only for IEEE1394. for other cases,
ifp->if_output will fill it for us.
2002-06-25 04:04:53 +00:00
enami 96fe4d7666 No need to include same file twice. 2002-06-25 02:55:14 +00:00
enami 4b27343d39 Use if_addrlen macro rather than if_data.ifi_addrlen. 2002-06-25 02:53:27 +00:00
enami 37f335b28b The ieee1394 arp reply should be broadcast. 2002-06-24 21:25:34 +00:00
enami 36f1c19838 Don't use a pointer before it is initialized. 2002-06-24 10:52:15 +00:00
itojun 570a3e1f3d set ar_hrd for RFC-defined cases 2002-06-24 08:42:33 +00:00
itojun e03a874f74 set ia as well 2002-06-24 08:11:30 +00:00
itojun 0143dfc42f integrate IEEE1394 ARP into generic ARP logic.
XXX there's no check at all in ar_hrd, and we don't set ar_hrd on outgoing.
it seems like a bad thing.
2002-06-24 08:06:20 +00:00
itojun c474c560dd do not consult routing table under the following condition:
- the destination is IPv4 multicast or 255.255.255.255, and
- outgoing interface is specified via socket option

this simplifies operation of routed
(no longer reqiure 224.0.0.0/4 to be set up)
2002-06-24 08:01:35 +00:00
thorpej 8038dd2cbe Disable TCP Congestion Window Monitoring by default; there are
performance problems in the face of tinygrams.
2002-06-13 16:31:05 +00:00
itojun 9368c444df set IPv4 parameter to modern value.
- turn on path MTU discovery (previous: turned off)
- ICMPv4 redirect entry timeout = 600 sec (previous: never timeout)
2002-06-13 16:25:54 +00:00
itojun fa53d749ff share policy-on-pcb for listening socket. sync w/kame
todo: share even more, avoid frequent updates of spidx
2002-06-11 19:39:59 +00:00
itojun 2a8a7da29d style 2002-06-09 19:49:49 +00:00
itojun f192b66b94 whitespace 2002-06-09 16:33:36 +00:00
itojun 39af55e317 enforce IPv4 link MTU for FDDI and ARCNET even in RTF_GATEWAY case.
PR 17151.
2002-06-09 05:09:26 +00:00
itojun 6d8d0d63d8 sync with latest KAME in6_ifaddr/prefix/default router manipulation.
behavior changes:
- two iocts used by ndp(8) are now obsolete (backward compat provided).
  use sysctl path instead.
- lo0 does not get ::1 automatically.  it will get ::1 when lo0 comes up.
2002-06-08 21:22:29 +00:00
itojun 14df31ceb3 look at rmx_mtu on IPsec tunnel MTU computation.
From: David Waitzman <djw@bbn.com>
2002-06-07 13:43:47 +00:00
itojun f45a8e9eb0 typo/bound check fix from YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp> 2002-06-05 13:11:34 +00:00
itojun fb9b52398c in mss clamping code, do not go past TCPOPT_EOL. enforce stricter
boundary checking.  discussed on tech-net
2002-06-04 10:06:27 +00:00
yamt 0f40d327f4 make "keep state" work for SYN without win scale option. 2002-06-01 07:21:11 +00:00
itojun 02dd12d915 since if_mtu is u_long, use u_long for mtu. 2002-05-31 05:26:42 +00:00
itojun 5c1df51d53 attach nd_ifinfo structure into if_afdata.
split IPv6 link MTU (advertised by RA) from real link MTU.
sync with kame
2002-05-29 07:53:39 +00:00
itojun ede265fffd move per-interface ip6/icmp6 stat to ifnet->if_afdata. sync w/kame 2002-05-29 02:58:28 +00:00
itojun bbc84065b6 use arc4random 2002-05-29 01:33:45 +00:00
itojun 4121fa09fc correct in*_pcbrtentry. check cached value correctly. 2002-05-28 11:10:52 +00:00
itojun b9f810de55 use arc4random() on tcp iss generation 2002-05-28 10:17:27 +00:00
itojun d208a22daa use arc4random() where possible.
XXX is it necessary to do microtime() on tcp syn cache?
2002-05-28 10:11:49 +00:00
itojun 7410ea60ca in in*_pcbrtentry(), check if route is still valid (RTF_UP),
and address family is still valid.
2002-05-28 10:07:51 +00:00
itojun 3e7ae517e0 path MTU discovery blackhole detection.
PR 12790 (sorry for not committing it for a long time)
2002-05-26 16:05:43 +00:00
kleink 1b8d8d79a8 Define uint{8,32}_t locally, per XNS5.2/POSIX-2001, and use them in this
header where applicable; use private fixed-width integer types otherwise.
2002-05-13 13:34:32 +00:00
kleink 602066c0d6 Provide local definitions of in_{addr,port}_t in <netinet/in.h> and use
them where deemed appropriate by XNS5.2/POSIX-2001.
2002-05-12 23:04:15 +00:00
matt c03e11f081 Eliminate commons. 2002-05-12 20:33:50 +00:00
wiz d30d25dc1a Spelling fixes, from Sergey Svishchev in kern/16650. 2002-05-12 15:48:36 +00:00
itojun 31a6ad2757 backout 1.72. it is not correct for the kernel to remove routes by itself,
and the code was buggy (dereferenced null pointer when IFAFREE removes the
route).
2002-05-09 06:49:15 +00:00
matt e5555e5c26 Change struct ipqe to use TAILQ's instead of LIST's (primarily for TCP's
benefit currently).  Rework tcp_reass code to optimize the 4 most likely causes
of out-of-order packets: first OoO pkt, next OoO pkt in seq, OoO pkt is part
of new chuck of OoO packets, and the OoO pkt fills the first hole.  Add evcnts
to instrument tcp_reass (enabled by the options TCP_REASS_COUNTERS).  This is
part 1/2 of tcp_reass changes.
2002-05-07 02:59:38 +00:00
martti 6f5d858e4b Fix compilation problems 2002-05-02 17:13:27 +00:00
martti e74092de02 Upgraded IPFilter to 3.4.27 2002-05-02 17:11:37 +00:00
thorpej 9054daca3e * Instrument tcp_build_datapkt().
* Remove the code that allocates a cluster if the packet would
  fit in one; it totally defeats doing references to M_EXT mbufs
  in the socket buffer.  This drastically reduces the number of
  data copies in the tcp_output() path for applications which use
  large writes.  Kudos to Matt Thomas for pointing me in the right
  direction.
2002-04-27 01:47:58 +00:00
matt 79b1afa490 Change test for M_EXT to M_READONLY for MROUTING. We only need to to do
a pullup if we aren't allowed to modify the packet.
2002-04-18 22:33:21 +00:00
itojun 45451927ec correct variable initialization. reported by fujitsu folks 2002-04-10 09:18:57 +00:00
thorpej f0bde82437 Add missing #else 2002-04-09 02:20:10 +00:00
jdolecek b10eb8758b Disable the H.323 proxy again - it's too buggy to be supported option
for now. Suggested by Matthew Green and Bernd Ernesti.
2002-04-01 18:07:10 +00:00
jdolecek af2aedbe22 put back ip_h323_pxy.c - the QNX licence seems to be okay upon
further examination
2002-04-01 16:50:08 +00:00
jdolecek c56211c431 add __KERNEL_RCSID() 2002-04-01 16:47:46 +00:00
jdolecek 69b18217c3 add RCS IDs 2002-04-01 16:45:24 +00:00
jdolecek 905b8db7c7 add __KERNEL_RCSID() 2002-04-01 16:44:28 +00:00
jdolecek cedc0276dc Import H.323 proxy of IPFilter 3.4.25. Upon closer examination,
the QNX licence seems to be allow both non-commercial and commercial
use actually.

According to Darren, the H.323 proxy code is buggy ATM, but is imported
here for reference anyway.
2002-04-01 16:29:31 +00:00
itojun 2f227734df do not consider /32 address itself as broadcast.
with /32 address, in_addr == in_broadaddr.
2002-03-30 00:40:32 +00:00
christos 4f0742e306 Change the multicast/broadcast test to happen later, and when we are
in listen mode. Fixes panic with telnet ::1 port, where the port is an
ipv4 open port.
2002-03-24 17:09:01 +00:00
itojun bd5373f4e2 no need to check in_broadaddr/IN_MULTICAST in dropwithreset label.
suggested by enami
2002-03-22 04:31:01 +00:00
itojun 1f14081709 make sure we don't touch "ip" in IPv6 path 2002-03-22 03:21:13 +00:00
christos 9c8babbd46 Drop connections to the broadcast address. From BUGTRAQ. This is a security
issue because it can by-pass ipf rules unintentionally.
2002-03-19 14:35:20 +00:00
itojun 38f3d28842 have tcp6_drain 2002-03-15 09:25:41 +00:00
martin 58d564bc8c Add MSS clamping to the IP Filter NAT subsystem.
Configured by a new option "mssclamp" in NAT rules, like:

 map pppoe0 192.168.1.0/24 -> 0/32 mssclamp 1452

This is based on work by Xiaodan Tang <xtang@qnx.com>.
2002-03-14 21:46:54 +00:00
martti dd7a744e5a Added (char *) for pointer arithmetic 2002-03-14 12:34:29 +00:00
martti 3e033bc0f1 Removed unused proxy file 2002-03-14 12:34:25 +00:00
martti 83b3487b70 Upgraded IPFilter to 3.4.25 2002-03-14 12:32:36 +00:00
itojun 7f7fe98c2c support tcp_log_refused for IPv6. From: Andrew Brown <atatat@atatdot.net> 2002-03-12 04:36:47 +00:00
martin 0039b1300a KNFify my last change. 2002-03-11 10:06:12 +00:00
thorpej a180cee23b Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map).  Try to deal with this:

* Group all information about the backend allocator for a pool in a
  separate structure.  The pool references this structure, rather than
  the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
  to become available, but will still fail if it cannot callocate KVA
  space for the pages.  If this happens, carefully drain all pools using
  the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
  some pages, and use that information to make draining easier and more
  efficient.
* Get rid of PR_URGENT.  There was only one use of it, and it could be
  dealt with by the caller.

From art@openbsd.org.
2002-03-08 20:48:27 +00:00
itojun ac36f7cb2c bring in latest ALTQ from kjc. ALTQify some of the drivers. 2002-03-05 04:12:57 +00:00
sommerfeld 3406f0a3dd The "gif*" tunnelling interface does everything ipip does.
Move usage example from ipip.4 to gif.4
Excise ipip and stitch up the scars.
2002-03-04 13:24:06 +00:00
thorpej 1caa35aa0f In tcp_segsize(), move a label so that option length is considered
when using the default TCP MSS as well.  From Matt Thomas.
2002-03-01 22:54:09 +00:00
thorpej 10444ca48f In in_savemkludge() and in_restoremkludge(), don't insert into a new
list without removing from the old one first.

From Matt Thomas.
2002-03-01 22:51:28 +00:00
martin 75c5a16cfc Enforce a lower bound of 32 for tcp_mssdflt.
This avoids kernel crashes when we don't handle nonsensial values
like 0 gracefully. Better check here once beforehand than having to
check for non meaningful values in time critical paths (like tcp_output).

Fixes PR 15709.
2002-02-28 20:26:17 +00:00