Commit Graph

1426 Commits

Author SHA1 Message Date
ragge 79edf5fba0 Send an arp request before the arp entry times out if the entry is active,
to avoid deleting active entries.
Add sysctl support to tune the default arp timeout values.
2004-04-28 14:09:36 +00:00
matt 5a0de7507d When a packet is received that overlaps the left side of the window,
check for RST *before* trimming data and adjust its sequence number.
2004-04-27 14:46:07 +00:00
itojun 362e07a3c9 zero-clear ip6?pseudo before use 2004-04-26 05:18:13 +00:00
itojun f103f9aee9 declare ip6_hdr_pseudo (for kernel only) and use it for TCP MD5 signature 2004-04-26 05:15:47 +00:00
itojun 67372cc454 sync comment with reality 2004-04-26 05:05:49 +00:00
itojun e0395ac8f0 make TCP MD5 signature work with KAME IPSEC (#define IPSEC).
support IPv6 if KAME IPSEC (RFC is not explicit about how we make data stream
for checksum with IPv6, but i'm pretty sure using normal pseudo-header is the
right thing).

XXX
current TCP MD5 signature code has giant flaw:
it does not validate signature on input (can't believe it! what is the point?)
2004-04-26 03:54:28 +00:00
matt 5413745100 Remove #else clause of __STDC__ 2004-04-26 01:31:56 +00:00
jonathan 887b782b0b Initial commit of a port of the FreeBSD implementation of RFC 2385
(MD5 signatures for TCP, as used with BGP).  Credit for original
FreeBSD code goes to Bruce M. Simpson, with FreeBSD sponsorship
credited to sentex.net.  Shortening of the setsockopt() name
attributed to Vincent Jardin.

This commit is a minimal, working version of the FreeBSD code, as
MFC'ed to FreeBSD-4. It has received minimal testing with a ttcp
modified to set the TCP-MD5 option; BMS's additions to tcpdump-current
(tcpdump -M) confirm that the MD5 signatures are correct.  Committed
as-is for further testing between a NetBSD BGP speaker (e.g., quagga)
and industry-standard BGP speakers (e.g., Cisco, Juniper).


NOTE: This version has two potential flaws. First, I do see any code
that verifies recieved TCP-MD5 signatures.  Second, the TCP-MD5
options are internally padded and assumed to be 32-bit aligned. A more
space-efficient scheme is to pack all TCP options densely (and
possibly unaligned) into the TCP header ; then do one final padding to
a 4-byte boundary.  Pre-existing comments note that accounting for
TCP-option space when we add SACK is yet to be done. For now, I'm
punting on that; we can solve it properly, in a way that will handle
SACK blocks, as a separate exercise.

In case a pullup to NetBSD-2 is requested, this adds sys/netipsec/xform_tcp.c
,and modifies:

sys/net/pfkeyv2.h,v 1.15
sys/netinet/files.netinet,v 1.5
sys/netinet/ip.h,v 1.25
sys/netinet/tcp.h,v 1.15
sys/netinet/tcp_input.c,v 1.200
sys/netinet/tcp_output.c,v 1.109
sys/netinet/tcp_subr.c,v 1.165
sys/netinet/tcp_usrreq.c,v 1.89
sys/netinet/tcp_var.h,v 1.109
sys/netipsec/files.netipsec,v 1.3
sys/netipsec/ipsec.c,v 1.11
sys/netipsec/ipsec.h,v 1.7
sys/netipsec/key.c,v 1.11
share/man/man4/tcp.4,v 1.16
lib/libipsec/pfkey.c,v 1.20
lib/libipsec/pfkey_dump.c,v 1.17
lib/libipsec/policy_token.l,v 1.8
sbin/setkey/parse.y,v 1.14
sbin/setkey/setkey.8,v 1.27
sbin/setkey/token.l,v 1.15

Note that the preceding two revisions to tcp.4 will be
required to cleanly apply this diff.
2004-04-25 22:25:03 +00:00
simonb b5d0e6bf06 Initialise (most) pools from a link set instead of explicit calls
to pool_init.  Untouched pools are ones that either in arch-specific
code, or aren't initialiased during initial system startup.

 Convert struct session, ucred and lockf to pools.
2004-04-25 16:42:40 +00:00
itojun 22bdfd729d fix how we send RST against ACK. markus@openbsd 2004-04-25 03:29:11 +00:00
itojun 8a0aba4304 indent for little bit better readability 2004-04-25 00:08:54 +00:00
itojun 3b87628cfb fix comment; we no longer move ip+tcp into the same mbuf 2004-04-24 23:59:13 +00:00
matt 41478e7f33 Always include <sys/param.h> first! 2004-04-24 19:59:19 +00:00
ragge febf637b17 Avoid performance problem in tcp_reass() when appending mbufs to a chain
by keeping a pointer to the last mbuf in the chain.
2004-04-22 15:05:33 +00:00
tls 7eb2f214d5 Change the default state of two tunables; bring our TCP a little bit
closer to normal behaviour for the current century.

New Reno is now on by default (which is really the only reasonable
choice, since we don't do SACK); instead of an initial window of 1
for non-local nets, we now use Sally Floyd's magic 4K rule.
2004-04-22 02:19:39 +00:00
matt e50668c7fa Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols) are loaded.
2004-04-22 01:01:40 +00:00
itojun d2f1c029b9 kill sprintf, use snprintf 2004-04-21 18:40:37 +00:00
itojun e133d13e80 kill some strcpy 2004-04-21 18:16:14 +00:00
itojun 0f06e31eb6 no space between function name and paren: foo (blah) -> foo(blah) 2004-04-21 17:49:46 +00:00
matt e3b919c754 Constify if.c radix.c and route.c (and fix related fallout). 2004-04-21 04:17:28 +00:00
matt 30e63c6236 export tcpstates for _KERNEL and remove tcp_usrreq.c's incorrect
declartion.
2004-04-20 22:54:31 +00:00
itojun 6a16706746 follow draft-ietf-tcpm-tcpsecure-00.txt 3.2 (B):
if SYN is coming and RCV.NXT == SEG.SEQ, then ACK with value - 1.
2004-04-20 19:49:15 +00:00
itojun f2e796b13f - respond to RST by ACK, as suggested in NISCC recommendation
- rate-limit ACKs against RSTs and SYNs
2004-04-20 16:52:12 +00:00
matt 5060b3b780 ANSI'fy and de __P 2004-04-18 23:35:56 +00:00
matt db6a0b431a De __P() 2004-04-18 21:00:35 +00:00
matt 35b9f3ec72 If a segment is received with RST set and the segment is completely to the
left of the receive window, ignore it.  Add some additional comments to
the code that deals with received segemnts that are completely to the right
of the receive window.  If an invalid SYN is received, force an ACK and
drop it; if the other side really sent the SYN; it'll respond with a reset.
2004-04-17 23:35:37 +00:00
christos 90e1f431ca adjust to the sbreserve prototype change. 2004-04-17 15:18:53 +00:00
ragge 0a7fe37708 Add back one line which was accidentially removed (by me) a while ago.
Spotted by Markus Friedl (markus at openbsd.org).
2004-04-14 18:07:52 +00:00
christos 99d2bc9467 PR/22551: Invoking tcpcb's get erroneously free'd resulting in to_ticks <= 0
assertion. Approved by he.
2004-04-05 21:49:21 +00:00
matt efc47093e2 In ip_reass_ttl_descr, make i signed since it's compared to >= 0 2004-04-01 22:47:55 +00:00
martin 8afe56f1c5 A few more ioctl vs. copyin changes, spotted by Bill Studenmund. 2004-04-01 21:54:41 +00:00
martin 9d16150a8e Untangle ioctl copyin/copyout confusion. IP-Filter now actually works
on sparc64 (and probably everywhere else).
2004-04-01 09:24:58 +00:00
dyoung 957f9ce691 Only #define COPYIN copyin, et cetera, in the kernel. That is, only
when when _KERNEL is defined.
2004-03-31 20:58:15 +00:00
darrenr 077337039d COPYIN/COPYOUT macros need to call copyin/out on NetBSD rather than just use
bcopy.
2004-03-31 11:41:45 +00:00
itojun 7cd01f1c20 clean previous commit (uh_sum != 0 check in IPv6) 2004-03-31 07:57:06 +00:00
itojun 8d81738de0 drop packet if IPv6 udp packet does not have checksum (checksum is mandatory
in IPv6).
2004-03-31 07:54:00 +00:00
christos dc9378460c Make sure we disarm the persist timer before we arm the rexmit
timer, otherwise there is a tiny window where both timers are
active, and this is not correct according to the comments in the
code. I believe that this is the cause of the to_ticks <= 0 assertion
failure in callout_schedule() that I've been getting.
2004-03-30 19:58:14 +00:00
atatat 83b193a052 Make these compile without INET. tcp_input probably needs a lot more
work...
2004-03-29 04:59:02 +00:00
martin 665588c20c Cast 64 bit pointers only with (intptr_t) care. 2004-03-28 12:12:28 +00:00
martti 621e9bac7f Sync with official IPFilter 2004-03-28 09:01:26 +00:00
martti 24d567d60d Upgraded IPFilter to 4.1.1 2004-03-28 09:00:53 +00:00
martti ad9b29ed97 Import IPFilter 4.1.1 2004-03-28 08:55:20 +00:00
atatat 19af35fd0d Tango on sysctl_createv() and flags. The flags have all been renamed,
and sysctl_createv() now uses more arguments.
2004-03-24 15:34:46 +00:00
itojun 3811eef49d typo 2004-03-23 05:31:54 +00:00
drochner 6a4fbf616c fix tcp/udp checksum test in the M_CSUM_NO_PSEUDOHDR case
(this can never have worked)
now I can use a "bge" gigabit interface with hw checksumming
ttcp-t: 2147483648 bytes in 18.31 real seconds = 114527.11 KB/sec +++
woow!
2004-03-10 18:50:45 +00:00
wiz e8f4f5ba76 No need to include netinet/ip_mroute.h twice.
Closes PR 24652 by Kailash Sethuraman.
2004-03-04 15:15:06 +00:00
thorpej 8387ab32c5 Use IPSEC_PCB_SKIP_IPSEC() to short-circuit calls to ipsec{4,6}_hdrsiz_tcp(). 2004-03-03 05:59:38 +00:00
thorpej 2803ff0955 Use the new IPSEC_PCB_SKIP_IPSEC() to bypass a socket policy lookup
when possible.  This shaves several cycles from the output path for
non-IPsec connections, even if the policy is cached in the PCB.
2004-03-02 02:28:28 +00:00
thorpej 00f100daae Call ipsec_pcbconn() and ipsec_pcbdisconn() for FAST_IPSEC, too. 2004-03-02 02:26:28 +00:00
thorpej 979f197a86 Define a sotoinpcb_hdr() macro (a'la sotoinpcb()). 2004-03-02 02:11:14 +00:00
itojun 8ef33296ff KNF 2004-02-26 02:34:59 +00:00
wiz 73e1501b98 parameter with two es. From Peter Postma. 2004-02-24 15:22:01 +00:00
wiz f05e6f1a3a occured -> occurred. From Peter Postma. 2004-02-24 15:12:51 +00:00
itojun d334411bcd deal with IPv6 path MTU < 1280 (RFC2460 section 5 last paragraph).
check if there really is room for TCP data.
2004-02-04 05:36:03 +00:00
abs c02c2d8844 Allow DEF_NAT_AGE to be set in kernel config. 2004-01-16 09:01:22 +00:00
itojun 0146a277ba correct typo in 1.94 -> 1.95. pointed out by Shiva Shenoy 2004-01-15 05:13:17 +00:00
itojun 3ffdb9507a avoid deref-after-free.
http://sources.zabbadoz.net/freebsd/patchset/106-ipsec-pcb-discon.diff
2004-01-13 06:17:14 +00:00
matt 9196bdd1f8 When accepting a peer's MSS, never let it drop below 256 (SLIP + TCP will
be the lowest MSS we should ever enounter).
2004-01-07 19:15:43 +00:00
tron 784a553ad1 Remove extra tokens at end of #undef directive. 2004-01-03 22:34:38 +00:00
itojun 4fc59b19d5 no need for tmp = arc4randomid here 2004-01-02 20:51:51 +00:00
itojun 7cddb2827b whitespace 2004-01-02 15:51:45 +00:00
itojun 344b08b44b some corrections from markus@openbsd;
- callout_ack() was called with wrong argument
2004-01-02 15:51:04 +00:00
itojun 5377ace199 some corrections from markus@openbsd;
- callout_ack() was called with wrong argument
- no need for xor with timestamp as we are using arc4random()
- minor typo/cleanup
2004-01-02 12:01:39 +00:00
wiz d46bc94200 Niels Provos kindly agreed to drop clauses 3 and 4 from the
license -- thanks.
Based on OpenBSD commit and hints by itojun.
2003-12-26 19:04:55 +00:00
abs 8724ebf7f9 Comment out #undef LARGE_NAT so LARGE_NAT can be set in a kernel config file
without having to edit this file as well.
2003-12-16 12:15:04 +00:00
thorpej 0c4c58a70b Fix syntax errors in CHECK_NMBCLUSTER_PARAMS(). 2003-12-14 01:14:24 +00:00
jonathan 9c1a5c5570 Second part of hashed IP_reassembly changes:
When under pressure for mbufs or we have too many fragments in the IP
reassembly queue, drop half of all fragments. This multiplicative-drop
strategy ensures we return to a healthy state, even under borderline
denial-of-service from extremely lossy NFS-over-UDP peers.
The multiplicative-drop phase currently drops 50% of fragments, but
has pre-placed support for implementing drop-fractions other than 50%

The threshhold for the `drop-half' phase is the new variable,
ip_maxfrags which is calculated as nmbclusters/4.

ip_input.c now keeps ip_nmbclusters, a cached copy of nmbclusters.
Before using limits derived from nmbclusters, we check if nmbclusters
and ip_nmclusters are equal. If not, we recompute Ip parameters
derived from nmbclusters.  Based on a suggestion by Jason Thorpe.
ip_maxfrags is currently auto-recalcuated.

The counters ip_nfrags and ip_nfragpacketsr are now declared static
and uninitialized (bss), to discourage tampering with them.
2003-12-14 00:09:24 +00:00
scw 6aec1d6812 Make fast-ipsec and ipflow (Fast Forwarding) interoperate.
The idea is that we only clear M_CANFASTFWD if an SPD exists
for the packet. Otherwise, it's safe to add a fast-forward
cache entry for the route.

To make this work properly, we invalidate the entire ipflow
cache if a fast-ipsec key is added or changed.
2003-12-12 21:17:59 +00:00
itojun aa8a6718f0 use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index has different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
since when we have introduced dynamically-created interfaces.  from kame
2003-12-10 11:46:33 +00:00
itojun c81f32fe6c comment from niels provos;
- seed2 is necessary, but use it as "seed2 + x" not "seed2 ^ x".
- skipping number is not needed, so disable it for 16bit generator (makes
  the repetition period to 30000)
2003-12-10 05:22:18 +00:00
jonathan 626b230d59 Add new field ipq_nfrags to struct ipq. Maintain count of fragments
(fragments, not fragmented packets) in each queue entry.
Use ipq_nfrags to maintain a count of total fragments in reassembly queue.
2003-12-08 02:23:27 +00:00
jonathan 27171efb6d KNF: s/unsigned/u_int/, in a couple of places I missed. 2003-12-07 01:18:26 +00:00
jonathan c56097abb8 Replace the single global IP reassembly list/listhead, with a
hashtable of list-heads. Independently re-invented, then reworked to
match similar code in FreeBSD.
2003-12-06 23:56:10 +00:00
atatat 13f8d2ce5f Dynamic sysctl.
Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al.  Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.

Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded.  Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.

All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.

PS - I'm sorry, but there's a distinct lack of documentation at the
moment.  I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
2003-12-04 19:38:21 +00:00
christos 0aac876eae fix unused variable warnings when LARGE_NAT is defined. 2003-12-04 15:32:01 +00:00
scw 7ef39665ff ipflow (IP fast forwarding) is not compatible with FAST_IPSEC either.
XXX: The decision whether or not to fast forward should be made
XXX: dynamically. Using the current approach seriously reduces
XXX: routing performance on gateways with IPsec enabled.
2003-12-04 10:02:35 +00:00
itojun a748550c99 always compile ip_id.c 2003-11-26 21:26:56 +00:00
itojun 326cfe57d2 define RANDOM_IP_ID by default (unifdef -DRANDOM_IP_ID).
one use remains in sys/netipsec, which is kept for freebsd source code compat.
2003-11-26 21:15:47 +00:00
itojun 0864b4939d "seed2" was ruining non-repeating property, so remove it. discussed on tech-net 2003-11-25 18:13:55 +00:00
itojun f51095cf7c knf 2003-11-25 14:44:13 +00:00
scw fd11abcb03 For FAST_IPSEC, ipfilter gets to see wire-format IPsec-encapsulated packets
only. Decapsulated packets bypass ipfilter. This mimics current behaviour
for Kame IPsec.
2003-11-24 20:54:59 +00:00
yamt bba8d5af45 comments on tcp_outflags. 2003-11-20 16:21:48 +00:00
fvdl f2fdecfc92 Correct number of arguments to sysctl_rdint. 2003-11-19 22:40:55 +00:00
jonathan b6e73d53fb Footwork for fast-ipsec and IPv6: when compiling sys/netinet/tcp_input.c
for both FAST_IPSEC and INET6, include <netipsec/ipsec6.h>.
2003-11-19 20:47:00 +00:00
jonathan 130f3bfc26 Patch back support for (badly) randomized IP ids, by request:
* Include "opt_inet.h" everywhere IP-ids are generated with ip_newid(),
  so the RANDOM_IP_ID option is visible. Also in ip_id(), to ensure
  the prototype for ip_randomid() is made visible.

* Add new sysctl to enable randomized IP-ids, provided the kernel was
  configured with RANDOM_IP_ID. (The sysctl defaults to zero, and is
  a read-only zero if RANDOM_IP_ID is not configured).

Note that the implementation of randomized IP ids is still defective,
and should not be enabled at all (even if configured) without
very careful deliberation. Caveat emptor.
2003-11-19 18:39:34 +00:00
jonathan de80d1419e Diff to netinet/ip_input.c (restore ip_id, initialize) for ip_id fix:
Revert the (default) ip_id algorithm to the pre-randomid algorithm,
due to demonstrated low-period repeated IDs from the randomized IP_id
code.  Consensus is that the low-period repetition (much less than
2^15) is not suitable for general-purpose use.

Allocators of new IPv4 IDs should now call the function ip_newid().
Randomized IP_ids is now a config-time option, "options RANDOM_IP_ID".
ip_newid() can use ip_random-id()_IP_ID if and only if configured
with RANDOM_IP_ID. A sysctl knob should be  provided.

This API may be reworked in the near future to support linear ip_id
counters per (src,dst) IP-address pair.
2003-11-17 22:34:16 +00:00
jonathan 995c532c33 Revert the (default) ip_id algorithm to the pre-randomid algorithm,
due to demonstrated low-period repeated IDs from the randomized IP_id
code.  Consensus is that the low-period repetition (much less than
2^15) is not suitable for general-purpose use.

Allocators of new IPv4 IDs should now call the function ip_newid().
Randomized IP_ids is now a config-time option, "options RANDOM_IP_ID".
ip_newid() can use ip_random-id()_IP_ID if and only if configured
with RANDOM_IP_ID. A sysctl knob should be  provided.

This API may be reworked in the near future to support linear ip_id
counters per (src,dst) IP-address pair.
2003-11-17 21:34:27 +00:00
jonathan fa24e6f3f8 Add m_tag_delete_nonpesrsistent(), for deleting all packet tags on
mbuf chains which are recycled (e.g., ICMP reflection, loopback
interface).  A consensus was reached that such recycled packets should
behave (more-or-less) the same way if a new chain had been allocated
and the contents copied to that chain.

Some packet tags may in future be marked as "persistent" (e.g., for
mandatory access controls) and should persist across such deletion.
NetBSD as yet hos no persistent tags, so m_tag_delete_nonpersistent()
just deletes all tags. This should not be relied upon.
2003-11-13 01:48:12 +00:00
itojun d46ad3421a KNF 2003-11-12 15:00:05 +00:00
ragge 4a9b211e76 Remove the FAST_MBSEARCH ifdef, send packet prediction is now default. 2003-11-12 10:48:04 +00:00
jonathan 79bf8521a5 Change global head-of-local-IP-address list from in_ifaddr to
in_ifaddrhead. Recent changes in struct names caused a namespace
collision in fast-ipsec, which are most cleanly fixed by using
"in_ifaddrhead" as the listhead name.
2003-11-11 20:25:26 +00:00
jonathan b86d07f435 Allocate sysctl oid for ipv4 sysctl node "ifq", define symbolic name, and
bump IPCTL_MAXID. (Should have been committed with other ifq sysctl changes).
2003-11-10 20:50:29 +00:00
jonathan 88ba77e705 Make per-protocol network input queue stats visible to userland via
sysctl. Add a protocol-independent sysctl handler to show the per-protocol
"struct ifq' statistics. Add IP(v4) specific call to the handler.
Other protocols can show their per-protocol input statistics by
allocating a sysclt node and calling sysctl_ifq() with their own struct ifq *.

As posted to tech-kern plus improvements/cleanup suggested by Andrew Brown.
2003-11-10 20:03:29 +00:00
simonb a2facef339 Remove some assigned-to but otherwise unused variables. 2003-10-30 01:43:08 +00:00
mycroft d7f0f6de8f Do the previous differently. 2003-10-28 20:27:22 +00:00
provos 57755c156a use a hash table to bind to local ports; suggested by markus friedl
approved: fvdl@
2003-10-28 17:18:37 +00:00
thorpej db71356cd1 - Change callout_setfunc() to require that the callout handle is already
initialized.  Update the txp(4) to compensate.
- Statically initialize the TCP timer callout handles in the tcpcb
  template.  We still use callout_setfunc(), but that call is now much
  less expensive.  Add a comment that the compiler is likely to unroll
  the loop (so don't sweat that it's there).
2003-10-27 16:52:01 +00:00
itojun 3fef2ba893 make it compilable with TCP_DEBUG defined 2003-10-27 07:43:01 +00:00
christos 2017bf9a94 Fix uninitialized variable warning 2003-10-25 18:31:59 +00:00
christos 649137925e initialize off 2003-10-25 08:13:28 +00:00
ragge da20a11a23 Fix the bug in the tcp transmit prediction code.
During testing the prediction counters show a hit-rate on about 85% for
packets sent on a local LAN, and better than 99% for intercontinental
high-speed bulk traffic (!).
2003-10-24 10:25:40 +00:00
enami 935b3c7ad5 Make this file compile again when TCP_OUTPUT_COUNTERS defined. 2003-10-24 03:12:53 +00:00
mycroft 5a8b331f54 Remove all the code to maintain ia_inpcbs. This information was only used to
close sockets on address changes, which was deemed to be a bad idea and was
summarily removed, so there is no point in wasting effort on maintaining it
any more.
2003-10-23 20:55:08 +00:00
thorpej e8a98ee63e Oops, FAST_MBSEARCH counters were swapped; fix it. Pointed out by yamt@. 2003-10-23 17:02:23 +00:00
thorpej 9e4220c00a Oops, a little to aggressive in the previous patch; TCP_TIMER_INIT()
still needs to be in tcp_newtcpcb(), for now.  Pointed out by enami.
2003-10-22 05:55:54 +00:00
thorpej 31923baa46 Rather than zeroing a tcpcb structure and filling in all the fields
individually, create a tcpcb template pre-initialized (and pre-zero'd)
with the static and mostly-static tcpcb parameters.  The template is
now copied into the new tcpcb, which zeros and initializes most of the
tcpcb in one pass.  The template is kept up-to-date as TCP sysctl
variables are changed.

Combined with the previous sb_max change, TCP socket creation is now
25% faster.
2003-10-22 02:45:57 +00:00
thorpej 861856caa0 Add event counters that measure FAST_MBSEARCH. 2003-10-21 21:17:20 +00:00
enami e51f5c64e5 Fix indent. 2003-10-18 13:05:45 +00:00
enami bae9643b84 Increment stats when packet is dropped since there is no room
to put all fragments in the interfaces's send queue.  Some large
UDP packets are dropped here and administrator may want to bump ifqmaxlen.
2003-10-17 20:31:12 +00:00
itojun 5e7b0c710b more correction to ip_fragment; free mbuf correctly if ENOBUFS is raised
during fragmenting.
2003-10-14 06:36:48 +00:00
itojun 00af50df1b avoid mbuf leak on ip_fragment(); obey 4.4bsd mbuf passing rule (mbuf passed
to a function must be freed by the called function on error).
pointed out by enami
2003-10-14 03:38:49 +00:00
mycroft f2fc15d4b5 There is also no reason to use arc4random() here. 2003-10-07 21:24:56 +00:00
itojun 98d5598feb when dropping M_PKTHDR, need to free m_tag associated with it. 2003-10-03 20:56:11 +00:00
itojun 899b67c09a correct ip_fragment() wrt ip->ip_off handling.
do not send out incomplete fragment due to ENOBUFS (behavior change from 4.4BSD)
2003-10-01 23:54:40 +00:00
tls b911732f2a Increase default socket-buffer sizes from 16K to 32K. This increases
throughput significantly in a wide variety of test cases, including
local gigabit ethernet with both jumbo and standard frames,
transcontinental (U.S.) connections with e2e bandwidths ranging from
10Mbit/sec to 155Mbit/sec, and on a variety of test connections
between the NetBSD Project public servers and machines in Australia.

The impact of this change is less dramatic for high-delay connections
when Path MTU is in use but still measurable.

For optimal performance on local gigabit networks, a higher socket
buffer size (at least 64K) will still yield a substantial improvement
in performance, but 32K gets us most of the way there in my test
cases, with only a cost of _doubling_ memory use per socket rather
than _quadrupling_ it.

N.B. Windows NT, at least since Win2k SP2, uses a default socket buffer
     size (or their analogue thereof) of 64K, which is a useful data
     point.
2003-09-29 21:39:35 +00:00
mycroft ca96c7c4ec Remove some code that breaks AH tunnels completely. The comment describing
the purpose of this code appears to be on crack -- it's talking about
end-to-end authentication, but the purpose of an AH tunnel is NOT end-to-end
authentication; it's authentication of the tunnel endpoints.

NB: This does not fix the fact that IPsec leaks "packet tags."
2003-09-28 04:45:14 +00:00
mycroft 3114965161 Fix glaring errors in recent changes. 2003-09-25 00:59:31 +00:00
itojun 8d9a724638 on arplookup() failure, nuke cloned route - otherwise outsider could use massive
number of bogus ARPs for DoS attack.  FreeBSD-SA-03:14.arp
2003-09-24 06:52:47 +00:00
jonathan 5923dedaeb Fast-ipsec can call ip_output() with a null 'struct socket *so'
argument.  So check so is non-NULL before doing the pointer-chasing
dance to find the PCB. (Unless and until we rework fast-ipsec and
KAME, to pass a struct in_pcbhdr * instead of the struct socket *).
2003-09-19 00:27:56 +00:00
itojun a3931fc5ab exp is reserved name under posix 2003-09-16 00:31:55 +00:00
itojun 6b33d95e22 send icmp admin prohibit if socket policy mismatches. 2003-09-12 09:55:22 +00:00
itojun 644a4857fb cut-and-paste error. Valeriy E. Ushakov 2003-09-10 01:46:27 +00:00
itojun 99bc41d6fd if IPsec inbound policy mismatches, respond to SYN with RST (instead of
just dropping it), allow client to react quickly.
2003-09-10 00:58:29 +00:00
itojun 495bd5ff91 initialize ip_hl for ipsec policy lookup. PR kern/22715 2003-09-08 02:06:34 +00:00
itojun 32e3deae21 randomize IPv4/v6 fragment ID and IPv6 flowlabel. avoids predictability
of these fields.  ip_id.c is from openbsd.  ip6_id.c is adapted by kame.
2003-09-06 03:36:30 +00:00
itojun 175c9afa3f clarify flowlabel handling 2003-09-06 03:12:51 +00:00
itojun dd45bfac41 backout previous, we don't know if arc4random() corrides on reboot. 2003-09-06 00:24:54 +00:00
itojun 9636351c96 u_short -> u_int16_t 2003-09-05 23:02:40 +00:00
itojun 186bd1ad6a initialize fragment ID with arc4random, not by time.tv_sec 2003-09-05 22:09:38 +00:00
itojun 495906ca8e revamp inpcb/in6pcb so that they are more aligned with each other.
in6pcb lookup now uses hash(9).
2003-09-04 09:16:57 +00:00
itojun 5c39f4aaa7 don't intiialize m by m0, m0 is not initialized (by introduction of ip_fragment) 2003-08-27 02:09:59 +00:00
itojun 3e76200c67 need sys/domain.h for FAST_IPSEC case; jonathan 2003-08-23 01:41:10 +00:00
itojun a3bad645a4 make sure so is properly initialized 2003-08-22 22:49:34 +00:00
itojun 58f57a60fd tp could be null in tcp_respond() 2003-08-22 22:27:07 +00:00
itojun 4e6aca94c2 correct missing inclusion of opt_ipsec.h 2003-08-22 22:11:44 +00:00
itojun 11ede1ed88 remove ipsec_set/getsocket. now we explicitly pass socket * to ip{,6}_output. 2003-08-22 22:00:36 +00:00
itojun 82eb4ce914 change the additional arg to be passed to ip{,6}_output to struct socket *.
this fixes KAME policy lookup which was broken by the previous commit.
2003-08-22 21:53:01 +00:00
jonathan 9339ef0381 Change KAME code for ip_output()/ip6_output() to obtain struct socket*
from the explicit inpcb*/in6pcb* argument.  set_socket() becomes redundant.
2003-08-22 20:29:00 +00:00
jonathan 902669955f Replace the set_socket() method of passing an extra struct socket*
argument to ip6_output() with a new explicit struct in6pcb* argument.
(The underlying socket can be obtained via in6pcb->inp6_socket.)

In preparation for fast-ipsec.  Reviewed by itojun.
2003-08-22 20:20:09 +00:00
jonathan 6196bbe72d Honour the M_CSUM_NO_PSEUDOHDR, if set on inbound TCP and UDP packets.
Tested against  bcm5700 with patched if_bge.c.
2003-08-21 14:49:49 +00:00
itojun b83dd2f98b remove unneeded #ifdef __NetBSD__ 2003-08-19 08:00:54 +00:00
itojun ade8129bdc make ip_fragment public (it is for coming PF integration) 2003-08-19 01:20:03 +00:00
christos ae572737ba make ip_fragment static and add prototype. 2003-08-19 00:54:41 +00:00
itojun 4f8ba921cd correct ip_multicast_if fix to always set ifp (tnx Shiva) 2003-08-19 00:17:38 +00:00
itojun 449b5c43d4 since we cope with packets with addess on !IFF_UP interface in ip_input()
properly, IFF_UP check in INADDR_TO_IA is obsolete (or too much).
2003-08-18 22:28:51 +00:00
itojun 122edbc337 fix problem we can't drop membership on !IFF_UP interface.
reported by Shiva Shenoy

while we're here, fix another problem when the same interface address is
assigned to !IFF_MULTICAST and IFF_MULTICAST interface.  if ip_multicast_if()
returns the first one, join/leave will fail, which is not an desired effect.
2003-08-18 22:23:22 +00:00
itojun 3bcba4f62b do not disconnect L4 connections on IP address removal. the behavior
is too extreme (consider DHCP/PPP-based fixed address allocation).
see tech-net for more info.
2003-08-16 11:30:35 +00:00
martti 03506a6ef3 Fix return-rst for IPv6 (PR#22157 by Peter Postma). 2003-08-15 08:11:09 +00:00
jonathan 28b5f5dfab (fast-ipsec): Add hooks to pass IPv4 IPsec traffic into fast-ipsec, if
configured with ``options FAST_IPSEC''.  Kernels with KAME IPsec or
with no IPsec should work as before.

All calls to ip_output() now always pass an additional compulsory
argument: the inpcb associated with the packet being sent,
or 0 if no inpcb is available.

Fast-ipsec tested with ICMP or UDP over ESP. TCP doesn't work, yet.
2003-08-15 03:42:00 +00:00
itojun fd3f06dabb enforce ipsec policy on raw wildcard. 2003-08-14 07:57:40 +00:00