Commit Graph

1099 Commits

Author SHA1 Message Date
thorpej c23fa5a752 Never send more than half a socket buffer of data. This insures that
we can always keep 2 packets on the wire, no matter what SO_SNDBUF is,
and therefore ACKs will never be delayed unless we run out of data to
transmit.  The problem is quite easy to tickle when the MTU of the
outgoing interface is larger than the socket buffer size (e.g. loopback).

Fix from Charles Hannum.
2002-08-20 16:29:42 +00:00
itojun 436f2a58ac better sync w/kame on deprecated address handling. check af == AF_INET6. 2002-08-19 02:17:54 +00:00
itojun f00291d88b pull in deprecated address handling from KAME sys/netinet6/tcp6_input.c. 2002-08-19 02:13:46 +00:00
itojun c00fa8dfd9 avoid swapping endian of ip_len and ip_off on mbuf, to meet with M_LEADINGSPACE
optimization made last year.  should solve PR 17867 and 10195.

IP_HDRINCL behavior of raw ip socket is kept unchanged.  we may want to
provide IP_HDRINCL variant that does not swap endian.
2002-08-14 00:23:27 +00:00
itojun 6446feb7a7 inject GRE packet to raw ip socket input, to support userland GRE decapsulator.
discussed on openbsd developers list.
2002-08-10 05:40:54 +00:00
itojun fc50f2e011 bring back old copyright notice lost in rev 1.15 (which is the authors' intent). 2002-07-31 04:07:20 +00:00
itojun d5e0a4aba9 remove packed attribute as it will cause data be unaligned 2002-07-31 03:18:04 +00:00
itojun f8e5e9c295 be friendly with gcc-3.1.1 -O2, which takes advantage of ANSI C
pointer aliasing rule (gcc optimization/7427).  from tsubai, sync w/kame
2002-07-29 09:14:36 +00:00
wrstuden 332b66d974 When a new connection arrives on a listening port, copy over the
value of the TCP_NODELAY socket option from the listener to the
newly connected connection. Agrees with how Linux & FreeBSD behave,
and goes more with the spirit of accept(2) creating a socket with
the same properties as the listener.

Analysis by Kevin Lahey. Closes PR 17616 by myself.
2002-07-18 03:23:01 +00:00
itojun 572c4c4a3f need to bzero() before rtalloc. KAME PR 432 2002-07-14 21:09:17 +00:00
thorpej 668640a43d Rename sbappend_stream() to sbappendstream(), per suggestion from
Jonathan Stone.
2002-07-03 21:36:57 +00:00
thorpej 0585ce1489 Make insertion of data into socket buffers O(C):
* Keep pointers to the first and last mbufs of the last record in the
  socket buffer.
* Use the sb_lastrecord pointer in the sbappend*() family of functions
  to avoid traversing the packet chain to find the last record.
* Add a new sbappend_stream() function for stream protocols which
  guarantee that there will never be more than one record in the
  socket buffer.  This function uses the sb_mbtail pointer to perform
  the data insertion.  Make TCP use sbappend_stream().

On a profiling run, this makes sbappend of a TCP transmission using
a 1M socket buffer go from 50% of the time to .02% of the time.

Thanks to Bill Sommerfeld and YAMAMOTO Takashi for their debugging
assistance!
2002-07-03 19:06:47 +00:00
itojun 390ee363bd check AF_INET6 socketes when IPv4 "too big" messages arrive.
PR 17448
2002-07-01 20:51:25 +00:00
christos dad84218d6 Fix iplog problem on sparc64 [from Tomi Nylund]
1. size_t is 64 bits, so use a u_32_t for iplused
	2. microtime() and friends expect a struct timeval,
	   passing the first of two unsigned longs will not cut it.
2002-07-01 13:55:35 +00:00
thorpej 10c252ba47 Changes to allow the IPv4 and IPv6 layers to align headers themseves,
as necessary:
* Implement a new mbuf utility routine, m_copyup(), is is like
  m_pullup(), except that it always prepends and copies, rather
  than only doing so if the desired length is larger than m->m_len.
  m_copyup() also allows an offset into the destination mbuf, which
  allows space for packet headers, in the forwarding case.
* Add *_HDR_ALIGNED_P() macros for IP, IPv6, ICMP, and IGMP.  These
  macros expand to 1 if __NO_STRICT_ALIGNMENT is defined, so that
  architectures which do not have strict alignment constraints don't
  pay for the test or visit the new align-if-needed path.
* Use the new macros to check if a header needs to be aligned, or to
  assert that it already is, as appropriate.

Note: This code is still somewhat experimental.  However, the new
code path won't be visited if individual device drivers continue
to guarantee that packets are delivered to layer 3 already properly
aligned (which are rules that are already in use).
2002-06-30 22:40:32 +00:00
yamt 58077442ae split logging code in order to reduce maximum stack usage. 2002-06-29 04:13:21 +00:00
enami 6aad1636a8 If we need to fix up ar_hrd field, we must do it before using ar_tpa/tha. 2002-06-25 04:16:31 +00:00
itojun a5b52729e6 in arprequest(), fill ar_hrd only for IEEE1394. for other cases,
ifp->if_output will fill it for us.
2002-06-25 04:04:53 +00:00
enami 96fe4d7666 No need to include same file twice. 2002-06-25 02:55:14 +00:00
enami 4b27343d39 Use if_addrlen macro rather than if_data.ifi_addrlen. 2002-06-25 02:53:27 +00:00
enami 37f335b28b The ieee1394 arp reply should be broadcast. 2002-06-24 21:25:34 +00:00
enami 36f1c19838 Don't use a pointer before it is initialized. 2002-06-24 10:52:15 +00:00
itojun 570a3e1f3d set ar_hrd for RFC-defined cases 2002-06-24 08:42:33 +00:00
itojun e03a874f74 set ia as well 2002-06-24 08:11:30 +00:00
itojun 0143dfc42f integrate IEEE1394 ARP into generic ARP logic.
XXX there's no check at all in ar_hrd, and we don't set ar_hrd on outgoing.
it seems like a bad thing.
2002-06-24 08:06:20 +00:00
itojun c474c560dd do not consult routing table under the following condition:
- the destination is IPv4 multicast or 255.255.255.255, and
- outgoing interface is specified via socket option

this simplifies operation of routed
(no longer reqiure 224.0.0.0/4 to be set up)
2002-06-24 08:01:35 +00:00
thorpej 8038dd2cbe Disable TCP Congestion Window Monitoring by default; there are
performance problems in the face of tinygrams.
2002-06-13 16:31:05 +00:00
itojun 9368c444df set IPv4 parameter to modern value.
- turn on path MTU discovery (previous: turned off)
- ICMPv4 redirect entry timeout = 600 sec (previous: never timeout)
2002-06-13 16:25:54 +00:00
itojun fa53d749ff share policy-on-pcb for listening socket. sync w/kame
todo: share even more, avoid frequent updates of spidx
2002-06-11 19:39:59 +00:00
itojun 2a8a7da29d style 2002-06-09 19:49:49 +00:00
itojun f192b66b94 whitespace 2002-06-09 16:33:36 +00:00
itojun 39af55e317 enforce IPv4 link MTU for FDDI and ARCNET even in RTF_GATEWAY case.
PR 17151.
2002-06-09 05:09:26 +00:00
itojun 6d8d0d63d8 sync with latest KAME in6_ifaddr/prefix/default router manipulation.
behavior changes:
- two iocts used by ndp(8) are now obsolete (backward compat provided).
  use sysctl path instead.
- lo0 does not get ::1 automatically.  it will get ::1 when lo0 comes up.
2002-06-08 21:22:29 +00:00
itojun 14df31ceb3 look at rmx_mtu on IPsec tunnel MTU computation.
From: David Waitzman <djw@bbn.com>
2002-06-07 13:43:47 +00:00
itojun f45a8e9eb0 typo/bound check fix from YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp> 2002-06-05 13:11:34 +00:00
itojun fb9b52398c in mss clamping code, do not go past TCPOPT_EOL. enforce stricter
boundary checking.  discussed on tech-net
2002-06-04 10:06:27 +00:00
yamt 0f40d327f4 make "keep state" work for SYN without win scale option. 2002-06-01 07:21:11 +00:00
itojun 02dd12d915 since if_mtu is u_long, use u_long for mtu. 2002-05-31 05:26:42 +00:00
itojun 5c1df51d53 attach nd_ifinfo structure into if_afdata.
split IPv6 link MTU (advertised by RA) from real link MTU.
sync with kame
2002-05-29 07:53:39 +00:00
itojun ede265fffd move per-interface ip6/icmp6 stat to ifnet->if_afdata. sync w/kame 2002-05-29 02:58:28 +00:00
itojun bbc84065b6 use arc4random 2002-05-29 01:33:45 +00:00
itojun 4121fa09fc correct in*_pcbrtentry. check cached value correctly. 2002-05-28 11:10:52 +00:00
itojun b9f810de55 use arc4random() on tcp iss generation 2002-05-28 10:17:27 +00:00
itojun d208a22daa use arc4random() where possible.
XXX is it necessary to do microtime() on tcp syn cache?
2002-05-28 10:11:49 +00:00
itojun 7410ea60ca in in*_pcbrtentry(), check if route is still valid (RTF_UP),
and address family is still valid.
2002-05-28 10:07:51 +00:00
itojun 3e7ae517e0 path MTU discovery blackhole detection.
PR 12790 (sorry for not committing it for a long time)
2002-05-26 16:05:43 +00:00
kleink 1b8d8d79a8 Define uint{8,32}_t locally, per XNS5.2/POSIX-2001, and use them in this
header where applicable; use private fixed-width integer types otherwise.
2002-05-13 13:34:32 +00:00
kleink 602066c0d6 Provide local definitions of in_{addr,port}_t in <netinet/in.h> and use
them where deemed appropriate by XNS5.2/POSIX-2001.
2002-05-12 23:04:15 +00:00
matt c03e11f081 Eliminate commons. 2002-05-12 20:33:50 +00:00
wiz d30d25dc1a Spelling fixes, from Sergey Svishchev in kern/16650. 2002-05-12 15:48:36 +00:00
itojun 31a6ad2757 backout 1.72. it is not correct for the kernel to remove routes by itself,
and the code was buggy (dereferenced null pointer when IFAFREE removes the
route).
2002-05-09 06:49:15 +00:00
matt e5555e5c26 Change struct ipqe to use TAILQ's instead of LIST's (primarily for TCP's
benefit currently).  Rework tcp_reass code to optimize the 4 most likely causes
of out-of-order packets: first OoO pkt, next OoO pkt in seq, OoO pkt is part
of new chuck of OoO packets, and the OoO pkt fills the first hole.  Add evcnts
to instrument tcp_reass (enabled by the options TCP_REASS_COUNTERS).  This is
part 1/2 of tcp_reass changes.
2002-05-07 02:59:38 +00:00
martti 6f5d858e4b Fix compilation problems 2002-05-02 17:13:27 +00:00
martti e74092de02 Upgraded IPFilter to 3.4.27 2002-05-02 17:11:37 +00:00
thorpej 9054daca3e * Instrument tcp_build_datapkt().
* Remove the code that allocates a cluster if the packet would
  fit in one; it totally defeats doing references to M_EXT mbufs
  in the socket buffer.  This drastically reduces the number of
  data copies in the tcp_output() path for applications which use
  large writes.  Kudos to Matt Thomas for pointing me in the right
  direction.
2002-04-27 01:47:58 +00:00
matt 79b1afa490 Change test for M_EXT to M_READONLY for MROUTING. We only need to to do
a pullup if we aren't allowed to modify the packet.
2002-04-18 22:33:21 +00:00
itojun 45451927ec correct variable initialization. reported by fujitsu folks 2002-04-10 09:18:57 +00:00
thorpej f0bde82437 Add missing #else 2002-04-09 02:20:10 +00:00
jdolecek b10eb8758b Disable the H.323 proxy again - it's too buggy to be supported option
for now. Suggested by Matthew Green and Bernd Ernesti.
2002-04-01 18:07:10 +00:00
jdolecek af2aedbe22 put back ip_h323_pxy.c - the QNX licence seems to be okay upon
further examination
2002-04-01 16:50:08 +00:00
jdolecek c56211c431 add __KERNEL_RCSID() 2002-04-01 16:47:46 +00:00
jdolecek 69b18217c3 add RCS IDs 2002-04-01 16:45:24 +00:00
jdolecek 905b8db7c7 add __KERNEL_RCSID() 2002-04-01 16:44:28 +00:00
jdolecek cedc0276dc Import H.323 proxy of IPFilter 3.4.25. Upon closer examination,
the QNX licence seems to be allow both non-commercial and commercial
use actually.

According to Darren, the H.323 proxy code is buggy ATM, but is imported
here for reference anyway.
2002-04-01 16:29:31 +00:00
itojun 2f227734df do not consider /32 address itself as broadcast.
with /32 address, in_addr == in_broadaddr.
2002-03-30 00:40:32 +00:00
christos 4f0742e306 Change the multicast/broadcast test to happen later, and when we are
in listen mode. Fixes panic with telnet ::1 port, where the port is an
ipv4 open port.
2002-03-24 17:09:01 +00:00
itojun bd5373f4e2 no need to check in_broadaddr/IN_MULTICAST in dropwithreset label.
suggested by enami
2002-03-22 04:31:01 +00:00
itojun 1f14081709 make sure we don't touch "ip" in IPv6 path 2002-03-22 03:21:13 +00:00
christos 9c8babbd46 Drop connections to the broadcast address. From BUGTRAQ. This is a security
issue because it can by-pass ipf rules unintentionally.
2002-03-19 14:35:20 +00:00
itojun 38f3d28842 have tcp6_drain 2002-03-15 09:25:41 +00:00
martin 58d564bc8c Add MSS clamping to the IP Filter NAT subsystem.
Configured by a new option "mssclamp" in NAT rules, like:

 map pppoe0 192.168.1.0/24 -> 0/32 mssclamp 1452

This is based on work by Xiaodan Tang <xtang@qnx.com>.
2002-03-14 21:46:54 +00:00
martti dd7a744e5a Added (char *) for pointer arithmetic 2002-03-14 12:34:29 +00:00
martti 3e033bc0f1 Removed unused proxy file 2002-03-14 12:34:25 +00:00
martti 83b3487b70 Upgraded IPFilter to 3.4.25 2002-03-14 12:32:36 +00:00
itojun 7f7fe98c2c support tcp_log_refused for IPv6. From: Andrew Brown <atatat@atatdot.net> 2002-03-12 04:36:47 +00:00
martin 0039b1300a KNFify my last change. 2002-03-11 10:06:12 +00:00
thorpej a180cee23b Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map).  Try to deal with this:

* Group all information about the backend allocator for a pool in a
  separate structure.  The pool references this structure, rather than
  the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
  to become available, but will still fail if it cannot callocate KVA
  space for the pages.  If this happens, carefully drain all pools using
  the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
  some pages, and use that information to make draining easier and more
  efficient.
* Get rid of PR_URGENT.  There was only one use of it, and it could be
  dealt with by the caller.

From art@openbsd.org.
2002-03-08 20:48:27 +00:00
itojun ac36f7cb2c bring in latest ALTQ from kjc. ALTQify some of the drivers. 2002-03-05 04:12:57 +00:00
sommerfeld 3406f0a3dd The "gif*" tunnelling interface does everything ipip does.
Move usage example from ipip.4 to gif.4
Excise ipip and stitch up the scars.
2002-03-04 13:24:06 +00:00
thorpej 1caa35aa0f In tcp_segsize(), move a label so that option length is considered
when using the default TCP MSS as well.  From Matt Thomas.
2002-03-01 22:54:09 +00:00
thorpej 10444ca48f In in_savemkludge() and in_restoremkludge(), don't insert into a new
list without removing from the old one first.

From Matt Thomas.
2002-03-01 22:51:28 +00:00
martin 75c5a16cfc Enforce a lower bound of 32 for tcp_mssdflt.
This avoids kernel crashes when we don't handle nonsensial values
like 0 gracefully. Better check here once beforehand than having to
check for non meaningful values in time critical paths (like tcp_output).

Fixes PR 15709.
2002-02-28 20:26:17 +00:00
itojun 8832af6e59 correctly enforce ipsec policy check on forwarding case.
From: Greg Troxel <gdt@ir.bbn.com>, Bill Chiarchiaro <wjc@work.cleartech.com>
2002-02-25 02:17:55 +00:00
martin a7d662b71c Clear M_BCAST and M_MCAST on outgoing mbufs.
Don't copy ttl from the inner packet to the encapsulating packet. Make
the outer ttl sysctl'able. This should close PR 14269 from Jasper Wallace
(change partly from there) and it makes traceroute work over gre tunnels.
2002-02-24 17:22:20 +00:00
christos 61e29fb60a Sean amended his patch not to include the IFAFREE() 2002-02-21 22:39:17 +00:00
christos 2446cd0b68 PR/15662: Sean Boudreau: make sure we clean all routes of an interface when
we change its ip address.
2002-02-21 21:59:16 +00:00
itojun 9c68db2bfc suppress source quence message, based on router-req RFC (also could be abused
as DoS traffic generator).  from kjc/kame
2002-02-21 08:39:33 +00:00
thorpej 35a343b018 IFF_POINTTOPOINT interfaces can also transmit packets to broadcast
destinations.
2002-02-07 21:47:45 +00:00
thorpej eb79ee01a8 ip_mloopback(): process the delayed checksum on the copy, not
the original mbuf.
2002-02-06 18:00:01 +00:00
itojun d303c80bfb correct bad ip checksum on multicast loopback packet. PR14597 2002-01-31 07:45:22 +00:00
martti b035470c38 Fixed initialization 2002-01-24 08:24:59 +00:00
martti 7a8f11612c Re-sync with IPFilter 2002-01-24 08:23:40 +00:00
martti b9920d0f43 Upgraded IPFilter to 3.4.23 2002-01-24 08:21:30 +00:00
martti b0499f9062 Import IPFilter 3.4.23 2002-01-24 08:18:28 +00:00
itojun a709c83618 place NRL copyright notice itself, not a reference to it. 2002-01-24 02:12:29 +00:00
itojun ae1b9c29e9 make sure to check address family on route cache. with IPv4 mapped
address we can see both AF_INET/INET6.
2002-01-22 03:53:55 +00:00
itojun 1cc58965b6 don't panic when there's no interface address exist for the specified multicast
outgoing interface (ia == NULL after IFP_TO_IA).

historic behavior (up to revision 1.43) was to use 0.0.0.0 as source address,
but it seems like a mistake according to RFC1112/1122.
2002-01-08 10:05:13 +00:00
itojun 28922b9973 use radix table for inbound tunnel lookup (would increase performance
for machines with a lot of tunnels).
update route cache for IPvX-over-IPv6 tunnel on path MTU discovery.
snyc with kame
2001-12-21 06:30:43 +00:00
itojun af7e7f7b93 whitespace. protect from multiple inclusion. sync with kame 2001-12-21 04:11:24 +00:00
itojun 9fe96e61e6 call rip_ctlinput on icmp4 inputs 2001-12-21 04:07:25 +00:00