Commit Graph

186 Commits

Author SHA1 Message Date
ragge da20a11a23 Fix the bug in the tcp transmit prediction code.
During testing the prediction counters show a hit-rate on about 85% for
packets sent on a local LAN, and better than 99% for intercontinental
high-speed bulk traffic (!).
2003-10-24 10:25:40 +00:00
mycroft 5a8b331f54 Remove all the code to maintain ia_inpcbs. This information was only used to
close sockets on address changes, which was deemed to be a bad idea and was
summarily removed, so there is no point in wasting effort on maintaining it
any more.
2003-10-23 20:55:08 +00:00
itojun 644a4857fb cut-and-paste error. Valeriy E. Ushakov 2003-09-10 01:46:27 +00:00
itojun 99bc41d6fd if IPsec inbound policy mismatches, respond to SYN with RST (instead of
just dropping it), allow client to react quickly.
2003-09-10 00:58:29 +00:00
itojun 175c9afa3f clarify flowlabel handling 2003-09-06 03:12:51 +00:00
itojun 495906ca8e revamp inpcb/in6pcb so that they are more aligned with each other.
in6pcb lookup now uses hash(9).
2003-09-04 09:16:57 +00:00
itojun a3bad645a4 make sure so is properly initialized 2003-08-22 22:49:34 +00:00
itojun 11ede1ed88 remove ipsec_set/getsocket. now we explicitly pass socket * to ip{,6}_output. 2003-08-22 22:00:36 +00:00
itojun 82eb4ce914 change the additional arg to be passed to ip{,6}_output to struct socket *.
this fixes KAME policy lookup which was broken by the previous commit.
2003-08-22 21:53:01 +00:00
jonathan 902669955f Replace the set_socket() method of passing an extra struct socket*
argument to ip6_output() with a new explicit struct in6pcb* argument.
(The underlying socket can be obtained via in6pcb->inp6_socket.)

In preparation for fast-ipsec.  Reviewed by itojun.
2003-08-22 20:20:09 +00:00
jonathan 6196bbe72d Honour the M_CSUM_NO_PSEUDOHDR, if set on inbound TCP and UDP packets.
Tested against  bcm5700 with patched if_bge.c.
2003-08-21 14:49:49 +00:00
jonathan 28b5f5dfab (fast-ipsec): Add hooks to pass IPv4 IPsec traffic into fast-ipsec, if
configured with ``options FAST_IPSEC''.  Kernels with KAME IPsec or
with no IPsec should work as before.

All calls to ip_output() now always pass an additional compulsory
argument: the inpcb associated with the packet being sent,
or 0 if no inpcb is available.

Fast-ipsec tested with ICMP or UDP over ESP. TCP doesn't work, yet.
2003-08-15 03:42:00 +00:00
agc aad01611e7 Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22364, verified by myself.
2003-08-07 16:26:28 +00:00
he 80ccb5520c As a temporary workaround, apply the fix from PR#20390, thereby
cooperating with the callout code in working around the race
condition caused by the TCP code's use of the callout facility.

Instead of unconditionally releasing memory in tcp_close() and
SYN_CACHE_PUT(), check whether any of the related callout handlers
are about to be invoked (but have not yet done callout_ack()), and
if so, just mark the associated data structure (tcpcb or syn cache
entry) as "dead", and test for this (and release storage) in the
callout handler functions.
2003-07-20 16:35:07 +00:00
ragge c6308a0598 Fix previous bug. Thanks to Enami for spotting the (obvious) error, and
to other people with much help with bug reports etc.
While fixing, change some of the code I added last time to make it
cleaner and simpler.
2003-07-02 19:33:20 +00:00
fvdl d5aece61d6 Back out the lwp/ktrace changes. They contained a lot of colateral damage,
and need to be examined and discussed more.
2003-06-29 22:28:00 +00:00
ragge 679db94879 Add code to remember where in the send queue of mbufs the last packet was
sent from. This change avoid a linear search through all mbufs when using
large TCP windows, and therefore permit high-speed connections on long
distances.

Tested on a 1 Gigabit connection between Luleå and San Francisco, a distance
of about 15000km.  With TCP windows of just over 20 Mbytes it could keep up
with 950Mbit/s.

After discussions with Matt Thomas and Jason Thorpe.
2003-06-29 18:58:26 +00:00
matt 27e1742142 Change the way multicasts are kept. They now use a hash table in the same
manner as the ifaddr hash table.  By doing this, the mkludge code can go
away.  At the same time, keep track of what pcbs are using what ifaddr and
when an address is deleted from an interface, notify/abort all sockets
that have that address as a source.  Switch IGMP and multicasts to use pools
for allocation.  Fix a number of potential problems in the igmp code where
allocation failures could cause a trap/panic.
2003-06-15 02:49:32 +00:00
itojun 7cc3e999f7 inherit IPV6_V6ONLY from listening socket. PR 21713 2003-05-30 01:15:04 +00:00
itojun 6ca34aa391 no need for ip_v recovery in output path too
(tcp_template includes ip_v setting)
2003-05-17 17:16:20 +00:00
itojun b29a40989d ip checksum logic no longer damage ip_v 2003-05-17 17:08:15 +00:00
itojun 4008ec1218 use strlcpy 2003-05-16 03:56:49 +00:00
itojun 346e0198f0 always use PULLDOWN_TEST codepath. 2003-05-14 06:47:33 +00:00
thorpej cdf1b0026c Allow TCP connections to hosts on a local network to use a larger
slow start initial window.  Default this larger initial window to
4 packets, allowing it to be adjusted with net.inet.tcp.init_win_local.
2003-03-01 04:40:27 +00:00
matt 65e5548a17 Add MBUFTRACE kernel option.
Do a little mbuf rework while here.  Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *).  These are not performance critical and making them
call m_get saves considerable space.  Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
2003-02-26 06:31:08 +00:00
he 27bc436921 Swap neighboring lines of callout_init() and bzero() of container
struct in syn_cache_add(); the bzero() invalidates whatever
callout_init() has done (which might matter, but presently doesn't).
2003-02-25 22:12:24 +00:00
wiz 617b132aac Spell output with two ts. 2003-01-04 23:43:02 +00:00
perry 6858187df6 /*CONTCOND*/ while (0)'ed macros 2002-11-02 07:20:42 +00:00
thorpej 163bdfc19e Make sure TF_REQ_TSTMP and TF_REQ_SCALE get set correctly in the new
TCPCB in the passive-open case.

Fixes PR 18677.
2002-10-22 04:24:50 +00:00
simonb ce9de06a5d In tcp_input():
Remove the set-but-not-used "proto" variable.
 Guard the "ostate" variable in #ifdef TCP_DEBUG.
Remove the set-but-not-used "parentinpcb" variable in syn_cache_get().
2002-10-22 03:07:06 +00:00
itojun 2fffb9beb4 correct log_refused check (TH_SYN, !TH_RST and !TH_ACK). PR 18669 2002-10-16 15:15:28 +00:00
itojun 6dedde045a correct signedness mixup in pointer passing. sync w/kame 2002-09-11 02:41:19 +00:00
itojun 530771e5ef always consult SS_CANTRCVMORE. PR 18185 2002-09-05 23:02:18 +00:00
thorpej ec09d2df2a Fix a problem introduced in rev 1.103, where we recycle a TIME_WAIT
TCPCB .. the fields need to be converted back to net-order, because
the packet is checksummed after the TCPCB lookup happens.

From YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>.
2002-08-28 02:23:57 +00:00
itojun 436f2a58ac better sync w/kame on deprecated address handling. check af == AF_INET6. 2002-08-19 02:17:54 +00:00
itojun f00291d88b pull in deprecated address handling from KAME sys/netinet6/tcp6_input.c. 2002-08-19 02:13:46 +00:00
itojun c00fa8dfd9 avoid swapping endian of ip_len and ip_off on mbuf, to meet with M_LEADINGSPACE
optimization made last year.  should solve PR 17867 and 10195.

IP_HDRINCL behavior of raw ip socket is kept unchanged.  we may want to
provide IP_HDRINCL variant that does not swap endian.
2002-08-14 00:23:27 +00:00
wrstuden 332b66d974 When a new connection arrives on a listening port, copy over the
value of the TCP_NODELAY socket option from the listener to the
newly connected connection. Agrees with how Linux & FreeBSD behave,
and goes more with the spirit of accept(2) creating a socket with
the same properties as the listener.

Analysis by Kevin Lahey. Closes PR 17616 by myself.
2002-07-18 03:23:01 +00:00
thorpej 668640a43d Rename sbappend_stream() to sbappendstream(), per suggestion from
Jonathan Stone.
2002-07-03 21:36:57 +00:00
thorpej 0585ce1489 Make insertion of data into socket buffers O(C):
* Keep pointers to the first and last mbufs of the last record in the
  socket buffer.
* Use the sb_lastrecord pointer in the sbappend*() family of functions
  to avoid traversing the packet chain to find the last record.
* Add a new sbappend_stream() function for stream protocols which
  guarantee that there will never be more than one record in the
  socket buffer.  This function uses the sb_mbtail pointer to perform
  the data insertion.  Make TCP use sbappend_stream().

On a profiling run, this makes sbappend of a TCP transmission using
a 1M socket buffer go from 50% of the time to .02% of the time.

Thanks to Bill Sommerfeld and YAMAMOTO Takashi for their debugging
assistance!
2002-07-03 19:06:47 +00:00
thorpej 10c252ba47 Changes to allow the IPv4 and IPv6 layers to align headers themseves,
as necessary:
* Implement a new mbuf utility routine, m_copyup(), is is like
  m_pullup(), except that it always prepends and copies, rather
  than only doing so if the desired length is larger than m->m_len.
  m_copyup() also allows an offset into the destination mbuf, which
  allows space for packet headers, in the forwarding case.
* Add *_HDR_ALIGNED_P() macros for IP, IPv6, ICMP, and IGMP.  These
  macros expand to 1 if __NO_STRICT_ALIGNMENT is defined, so that
  architectures which do not have strict alignment constraints don't
  pay for the test or visit the new align-if-needed path.
* Use the new macros to check if a header needs to be aligned, or to
  assert that it already is, as appropriate.

Note: This code is still somewhat experimental.  However, the new
code path won't be visited if individual device drivers continue
to guarantee that packets are delivered to layer 3 already properly
aligned (which are rules that are already in use).
2002-06-30 22:40:32 +00:00
yamt 58077442ae split logging code in order to reduce maximum stack usage. 2002-06-29 04:13:21 +00:00
itojun fa53d749ff share policy-on-pcb for listening socket. sync w/kame
todo: share even more, avoid frequent updates of spidx
2002-06-11 19:39:59 +00:00
itojun f192b66b94 whitespace 2002-06-09 16:33:36 +00:00
itojun d208a22daa use arc4random() where possible.
XXX is it necessary to do microtime() on tcp syn cache?
2002-05-28 10:11:49 +00:00
matt e5555e5c26 Change struct ipqe to use TAILQ's instead of LIST's (primarily for TCP's
benefit currently).  Rework tcp_reass code to optimize the 4 most likely causes
of out-of-order packets: first OoO pkt, next OoO pkt in seq, OoO pkt is part
of new chuck of OoO packets, and the OoO pkt fills the first hole.  Add evcnts
to instrument tcp_reass (enabled by the options TCP_REASS_COUNTERS).  This is
part 1/2 of tcp_reass changes.
2002-05-07 02:59:38 +00:00
christos 4f0742e306 Change the multicast/broadcast test to happen later, and when we are
in listen mode. Fixes panic with telnet ::1 port, where the port is an
ipv4 open port.
2002-03-24 17:09:01 +00:00
itojun bd5373f4e2 no need to check in_broadaddr/IN_MULTICAST in dropwithreset label.
suggested by enami
2002-03-22 04:31:01 +00:00
itojun 1f14081709 make sure we don't touch "ip" in IPv6 path 2002-03-22 03:21:13 +00:00
christos 9c8babbd46 Drop connections to the broadcast address. From BUGTRAQ. This is a security
issue because it can by-pass ipf rules unintentionally.
2002-03-19 14:35:20 +00:00