Commit Graph

173 Commits

Author SHA1 Message Date
yamt
0ea22c32fa fix ipqent pool corruption problems. make tcp reass code use
its own pool of ipqent rather than sharing it with ip reass code.
PR/24782.
2004-09-15 09:21:22 +00:00
itojun
4ebcfcf29a fix MD5 signature support to actually validate inbound signature, and
drop packet if fails.
2004-05-18 14:44:14 +00:00
matt
da67d85073 Use EVCNT_ATTACH_STATIC{,2} 2004-05-01 02:20:42 +00:00
itojun
362e07a3c9 zero-clear ip6?pseudo before use 2004-04-26 05:18:13 +00:00
itojun
f103f9aee9 declare ip6_hdr_pseudo (for kernel only) and use it for TCP MD5 signature 2004-04-26 05:15:47 +00:00
itojun
67372cc454 sync comment with reality 2004-04-26 05:05:49 +00:00
itojun
e0395ac8f0 make TCP MD5 signature work with KAME IPSEC (#define IPSEC).
support IPv6 if KAME IPSEC (RFC is not explicit about how we make data stream
for checksum with IPv6, but i'm pretty sure using normal pseudo-header is the
right thing).

XXX
current TCP MD5 signature code has giant flaw:
it does not validate signature on input (can't believe it! what is the point?)
2004-04-26 03:54:28 +00:00
jonathan
887b782b0b Initial commit of a port of the FreeBSD implementation of RFC 2385
(MD5 signatures for TCP, as used with BGP).  Credit for original
FreeBSD code goes to Bruce M. Simpson, with FreeBSD sponsorship
credited to sentex.net.  Shortening of the setsockopt() name
attributed to Vincent Jardin.

This commit is a minimal, working version of the FreeBSD code, as
MFC'ed to FreeBSD-4. It has received minimal testing with a ttcp
modified to set the TCP-MD5 option; BMS's additions to tcpdump-current
(tcpdump -M) confirm that the MD5 signatures are correct.  Committed
as-is for further testing between a NetBSD BGP speaker (e.g., quagga)
and industry-standard BGP speakers (e.g., Cisco, Juniper).


NOTE: This version has two potential flaws. First, I do see any code
that verifies recieved TCP-MD5 signatures.  Second, the TCP-MD5
options are internally padded and assumed to be 32-bit aligned. A more
space-efficient scheme is to pack all TCP options densely (and
possibly unaligned) into the TCP header ; then do one final padding to
a 4-byte boundary.  Pre-existing comments note that accounting for
TCP-option space when we add SACK is yet to be done. For now, I'm
punting on that; we can solve it properly, in a way that will handle
SACK blocks, as a separate exercise.

In case a pullup to NetBSD-2 is requested, this adds sys/netipsec/xform_tcp.c
,and modifies:

sys/net/pfkeyv2.h,v 1.15
sys/netinet/files.netinet,v 1.5
sys/netinet/ip.h,v 1.25
sys/netinet/tcp.h,v 1.15
sys/netinet/tcp_input.c,v 1.200
sys/netinet/tcp_output.c,v 1.109
sys/netinet/tcp_subr.c,v 1.165
sys/netinet/tcp_usrreq.c,v 1.89
sys/netinet/tcp_var.h,v 1.109
sys/netipsec/files.netipsec,v 1.3
sys/netipsec/ipsec.c,v 1.11
sys/netipsec/ipsec.h,v 1.7
sys/netipsec/key.c,v 1.11
share/man/man4/tcp.4,v 1.16
lib/libipsec/pfkey.c,v 1.20
lib/libipsec/pfkey_dump.c,v 1.17
lib/libipsec/policy_token.l,v 1.8
sbin/setkey/parse.y,v 1.14
sbin/setkey/setkey.8,v 1.27
sbin/setkey/token.l,v 1.15

Note that the preceding two revisions to tcp.4 will be
required to cleanly apply this diff.
2004-04-25 22:25:03 +00:00
simonb
b5d0e6bf06 Initialise (most) pools from a link set instead of explicit calls
to pool_init.  Untouched pools are ones that either in arch-specific
code, or aren't initialiased during initial system startup.

 Convert struct session, ucred and lockf to pools.
2004-04-25 16:42:40 +00:00
tls
7eb2f214d5 Change the default state of two tunables; bring our TCP a little bit
closer to normal behaviour for the current century.

New Reno is now on by default (which is really the only reasonable
choice, since we don't do SACK); instead of an initial window of 1
for non-local nets, we now use Sally Floyd's magic 4K rule.
2004-04-22 02:19:39 +00:00
itojun
f2e796b13f - respond to RST by ACK, as suggested in NISCC recommendation
- rate-limit ACKs against RSTs and SYNs
2004-04-20 16:52:12 +00:00
christos
90e1f431ca adjust to the sbreserve prototype change. 2004-04-17 15:18:53 +00:00
christos
99d2bc9467 PR/22551: Invoking tcpcb's get erroneously free'd resulting in to_ticks <= 0
assertion. Approved by he.
2004-04-05 21:49:21 +00:00
matt
9196bdd1f8 When accepting a peer's MSS, never let it drop below 256 (SLIP + TCP will
be the lowest MSS we should ever enounter).
2004-01-07 19:15:43 +00:00
thorpej
db71356cd1 - Change callout_setfunc() to require that the callout handle is already
initialized.  Update the txp(4) to compensate.
- Statically initialize the TCP timer callout handles in the tcpcb
  template.  We still use callout_setfunc(), but that call is now much
  less expensive.  Add a comment that the compiler is likely to unroll
  the loop (so don't sweat that it's there).
2003-10-27 16:52:01 +00:00
christos
649137925e initialize off 2003-10-25 08:13:28 +00:00
thorpej
9e4220c00a Oops, a little to aggressive in the previous patch; TCP_TIMER_INIT()
still needs to be in tcp_newtcpcb(), for now.  Pointed out by enami.
2003-10-22 05:55:54 +00:00
thorpej
31923baa46 Rather than zeroing a tcpcb structure and filling in all the fields
individually, create a tcpcb template pre-initialized (and pre-zero'd)
with the static and mostly-static tcpcb parameters.  The template is
now copied into the new tcpcb, which zeros and initializes most of the
tcpcb in one pass.  The template is kept up-to-date as TCP sysctl
variables are changed.

Combined with the previous sb_max change, TCP socket creation is now
25% faster.
2003-10-22 02:45:57 +00:00
thorpej
861856caa0 Add event counters that measure FAST_MBSEARCH. 2003-10-21 21:17:20 +00:00
mycroft
3114965161 Fix glaring errors in recent changes. 2003-09-25 00:59:31 +00:00
itojun
495bd5ff91 initialize ip_hl for ipsec policy lookup. PR kern/22715 2003-09-08 02:06:34 +00:00
itojun
32e3deae21 randomize IPv4/v6 fragment ID and IPv6 flowlabel. avoids predictability
of these fields.  ip_id.c is from openbsd.  ip6_id.c is adapted by kame.
2003-09-06 03:36:30 +00:00
itojun
495906ca8e revamp inpcb/in6pcb so that they are more aligned with each other.
in6pcb lookup now uses hash(9).
2003-09-04 09:16:57 +00:00
itojun
58f57a60fd tp could be null in tcp_respond() 2003-08-22 22:27:07 +00:00
itojun
11ede1ed88 remove ipsec_set/getsocket. now we explicitly pass socket * to ip{,6}_output. 2003-08-22 22:00:36 +00:00
itojun
82eb4ce914 change the additional arg to be passed to ip{,6}_output to struct socket *.
this fixes KAME policy lookup which was broken by the previous commit.
2003-08-22 21:53:01 +00:00
jonathan
902669955f Replace the set_socket() method of passing an extra struct socket*
argument to ip6_output() with a new explicit struct in6pcb* argument.
(The underlying socket can be obtained via in6pcb->inp6_socket.)

In preparation for fast-ipsec.  Reviewed by itojun.
2003-08-22 20:20:09 +00:00
jonathan
28b5f5dfab (fast-ipsec): Add hooks to pass IPv4 IPsec traffic into fast-ipsec, if
configured with ``options FAST_IPSEC''.  Kernels with KAME IPsec or
with no IPsec should work as before.

All calls to ip_output() now always pass an additional compulsory
argument: the inpcb associated with the packet being sent,
or 0 if no inpcb is available.

Fast-ipsec tested with ICMP or UDP over ESP. TCP doesn't work, yet.
2003-08-15 03:42:00 +00:00
agc
aad01611e7 Move UCB-licensed code from 4-clause to 3-clause licence.
Patches provided by Joel Baker in PR 22364, verified by myself.
2003-08-07 16:26:28 +00:00
he
80ccb5520c As a temporary workaround, apply the fix from PR#20390, thereby
cooperating with the callout code in working around the race
condition caused by the TCP code's use of the callout facility.

Instead of unconditionally releasing memory in tcp_close() and
SYN_CACHE_PUT(), check whether any of the related callout handlers
are about to be invoked (but have not yet done callout_ack()), and
if so, just mark the associated data structure (tcpcb or syn cache
entry) as "dead", and test for this (and release storage) in the
callout handler functions.
2003-07-20 16:35:07 +00:00
ragge
9e2d68cb61 Make it possible to set TCP_INIT_WIN and TCP_INIT_WIN_LOCAL in the config
file as options.
2003-07-03 08:28:16 +00:00
fvdl
d5aece61d6 Back out the lwp/ktrace changes. They contained a lot of colateral damage,
and need to be examined and discussed more.
2003-06-29 22:28:00 +00:00
ragge
679db94879 Add code to remember where in the send queue of mbufs the last packet was
sent from. This change avoid a linear search through all mbufs when using
large TCP windows, and therefore permit high-speed connections on long
distances.

Tested on a 1 Gigabit connection between Luleå and San Francisco, a distance
of about 15000km.  With TCP windows of just over 20 Mbytes it could keep up
with 950Mbit/s.

After discussions with Matt Thomas and Jason Thorpe.
2003-06-29 18:58:26 +00:00
martin
d505b18964 Make sure to include opt_foo.h if a defflag option FOO is used. 2003-06-23 11:00:59 +00:00
thorpej
cdf1b0026c Allow TCP connections to hosts on a local network to use a larger
slow start initial window.  Default this larger initial window to
4 packets, allowing it to be adjusted with net.inet.tcp.init_win_local.
2003-03-01 04:40:27 +00:00
matt
65e5548a17 Add MBUFTRACE kernel option.
Do a little mbuf rework while here.  Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *).  These are not performance critical and making them
call m_get saves considerable space.  Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
2003-02-26 06:31:08 +00:00
scw
5521093d4b Quell an uninitialised variable warning. 2002-11-24 10:52:47 +00:00
lukem
3b5f6123fa fix typo in previous: s/tip/top/ 2002-10-22 07:22:19 +00:00
simonb
8b9702b758 Micro-optimisation: don't check if the high bit is set and then mask it
off - just mask it off anyways.  Saves a branch 50% of the time.
2002-10-22 02:53:59 +00:00
itojun
167b0b8ebd minor KNF 2002-09-25 11:19:23 +00:00
itojun
c00fa8dfd9 avoid swapping endian of ip_len and ip_off on mbuf, to meet with M_LEADINGSPACE
optimization made last year.  should solve PR 17867 and 10195.

IP_HDRINCL behavior of raw ip socket is kept unchanged.  we may want to
provide IP_HDRINCL variant that does not swap endian.
2002-08-14 00:23:27 +00:00
itojun
390ee363bd check AF_INET6 socketes when IPv4 "too big" messages arrive.
PR 17448
2002-07-01 20:51:25 +00:00
itojun
f192b66b94 whitespace 2002-06-09 16:33:36 +00:00
itojun
5c1df51d53 attach nd_ifinfo structure into if_afdata.
split IPv6 link MTU (advertised by RA) from real link MTU.
sync with kame
2002-05-29 07:53:39 +00:00
itojun
b9f810de55 use arc4random() on tcp iss generation 2002-05-28 10:17:27 +00:00
itojun
3e7ae517e0 path MTU discovery blackhole detection.
PR 12790 (sorry for not committing it for a long time)
2002-05-26 16:05:43 +00:00
matt
c03e11f081 Eliminate commons. 2002-05-12 20:33:50 +00:00
matt
e5555e5c26 Change struct ipqe to use TAILQ's instead of LIST's (primarily for TCP's
benefit currently).  Rework tcp_reass code to optimize the 4 most likely causes
of out-of-order packets: first OoO pkt, next OoO pkt in seq, OoO pkt is part
of new chuck of OoO packets, and the OoO pkt fills the first hole.  Add evcnts
to instrument tcp_reass (enabled by the options TCP_REASS_COUNTERS).  This is
part 1/2 of tcp_reass changes.
2002-05-07 02:59:38 +00:00
thorpej
9054daca3e * Instrument tcp_build_datapkt().
* Remove the code that allocates a cluster if the packet would
  fit in one; it totally defeats doing references to M_EXT mbufs
  in the socket buffer.  This drastically reduces the number of
  data copies in the tcp_output() path for applications which use
  large writes.  Kudos to Matt Thomas for pointing me in the right
  direction.
2002-04-27 01:47:58 +00:00
itojun
38f3d28842 have tcp6_drain 2002-03-15 09:25:41 +00:00