Commit Graph

241 Commits

Author SHA1 Message Date
ad dd85fd121f ipintr(): check if the queue is empty before looping. Hardly a giant
win, but removed 30% of splnet() calls in one local test.
2006-12-22 05:34:02 +00:00
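
The effect of checking for an empty queue before looping can be sketched in userland. This is only an illustration, not the actual ipintr() code: `splnet_mock`, `splx_mock`, `struct pktq`, `pktq_dequeue` and `drain` are hypothetical stand-ins, and the mock spl functions merely count how often they are called.

```c
#include <stddef.h>

/* Hypothetical stand-ins for splnet()/splx(); they only count calls. */
static int spl_calls;
static int splnet_mock(void) { spl_calls++; return 0; }
static void splx_mock(int s) { (void)s; }

struct pkt { struct pkt *next; };
struct pktq { struct pkt *head; };

/* Dequeue one packet with the queue protected by the (mock) spl. */
static struct pkt *
pktq_dequeue(struct pktq *q)
{
	int s = splnet_mock();
	struct pkt *p = q->head;

	if (p != NULL)
		q->head = p->next;
	splx_mock(s);
	return p;
}

/* Drain the queue and return the number of packets processed. */
static int
drain(struct pktq *q)
{
	int n = 0;

	/* Cheap emptiness check first: no spl call when nothing is queued. */
	while (q->head != NULL) {
		if (pktq_dequeue(q) == NULL)
			break;
		n++;
	}
	return n;
}
```

With the check in place, draining N packets costs N protected dequeues instead of N + 1, and an empty queue costs none.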
joerg eb04733c4e Introduce new helper functions to abstract the route caching.
rtcache_init and rtcache_init_noclone lookup ro_dst and store
the result in ro_rt, taking care of the reference counting and
calling the domain specific route cache.
rtcache_free checks if a route was cached and frees the reference.
rtcache_copy copies ro_dst of the given struct route, checking that
enough space is available and incrementing the reference count of the
cached rtentry if necessary.
rtcache_check validates that the cached route is still up. If it isn't,
it tries to look it up again. Afterwards ro_rt is either valid again
or NULL.
rtcache_copy is used internally.

Adjust the callers of rtalloc/rtflush in the tree to check the sanity of
ro_dst first (if necessary). If it doesn't fit the expectations, free
the cache, otherwise check if the cached route is still valid. After
that combination, a single check for ro_rt == NULL is enough to decide
whether a new lookup needs to be done with a different ro_dst.
Make the route checking in gre stricter by repeating the loop check
after revalidation.
Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly
changed here to first validate the route and check RTF_GATEWAY
afterwards. This is semantically equivalent though.
etherip doesn't need sc_route_expire, similar to the gif changes from
dyoung@ earlier.

Based on the earlier patch from dyoung@, reviewed and discussed with
him.
2006-12-15 21:18:52 +00:00
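
The caller pattern described in this commit (check ro_dst, free or revalidate, then test ro_rt == NULL once) might look roughly like the following userland sketch. The structures are heavily simplified, the helper signatures are assumptions rather than the kernel's, and `toy_lookup`, `table_rt` and `route_for` are hypothetical names introduced for illustration.

```c
#include <stddef.h>

#define RTF_UP 0x1

/* Simplified stand-ins; the real structures carry far more state. */
struct rtentry { int rt_dst; int rt_flags; int rt_refcnt; };
struct route { int ro_dst; struct rtentry *ro_rt; };

/* Toy routing table with a single entry (hypothetical). */
static struct rtentry table_rt = { 42, RTF_UP, 0 };
static int lookups;

static struct rtentry *
toy_lookup(int dst)
{
	lookups++;
	if (table_rt.rt_dst == dst && (table_rt.rt_flags & RTF_UP)) {
		table_rt.rt_refcnt++;
		return &table_rt;
	}
	return NULL;
}

/* Look up ro_dst and cache the result, taking a reference. */
static void
rtcache_init(struct route *ro)
{
	ro->ro_rt = toy_lookup(ro->ro_dst);
}

/* Drop the cached route and its reference, if there is one. */
static void
rtcache_free(struct route *ro)
{
	if (ro->ro_rt != NULL) {
		ro->ro_rt->rt_refcnt--;
		ro->ro_rt = NULL;
	}
}

/* Revalidate: if the cached route is no longer up, look it up again. */
static void
rtcache_check(struct route *ro)
{
	if (ro->ro_rt != NULL && !(ro->ro_rt->rt_flags & RTF_UP)) {
		rtcache_free(ro);
		rtcache_init(ro);
	}
}

/* Caller pattern: sanity-check ro_dst, then a single ro_rt == NULL test. */
static struct rtentry *
route_for(struct route *ro, int dst)
{
	if (ro->ro_dst != dst)
		rtcache_free(ro);
	else
		rtcache_check(ro);
	if (ro->ro_rt == NULL) {
		ro->ro_dst = dst;
		rtcache_init(ro);
	}
	return ro->ro_rt;
}
```

Repeated calls to `route_for` with the same destination reuse the cached entry without a new lookup; a different destination frees the reference before looking up again.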
dyoung c308b1c661 Here are various changes designed to protect against bad IPv4
routing caused by stale route caches (struct route).  Route caches
are sprinkled throughout PCBs, the IP fast-forwarding table, and
IP tunnel interfaces (gre, gif, stf).

Stale IPv6 and ISO route caches will be treated by separate patches.

Thank you to Christoph Badura for suggesting the general approach
to invalidating route caches that I take here.

Here are the details:

Add hooks to struct domain for tracking and for invalidating each
domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall.

Introduce helper subroutines, rtflush(ro) for invalidating a route
cache, rtflushall(family) for invalidating all route caches in a
routing domain, and rtcache(ro) for notifying the domain of a new
cached route.

Chain together all IPv4 route caches where ro_rt != NULL.  Provide
in_rtcache() for adding a route to the chain.  Provide in_rtflush()
and in_rtflushall() for invalidating IPv4 route caches.  In
in_rtflush(), set ro_rt to NULL, and remove the route from the
chain.  In in_rtflushall(), walk the chain and remove every route
cache.

In rtrequest1(), call rtflushall() to invalidate route caches when
a route is added.

In gif(4), discard the workaround for stale caches that involves
expiring them every so often.

Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a
call to rtflush(ro).

Update ipflow_fastforward() and all other users of route caches so
that they expect a cached route, ro->ro_rt, to turn to NULL.

Take care when moving a 'struct route' to rtflush() the source and
to rtcache() the destination.

In domain initializers, use .dom_xxx tags.

KNF here and there.
2006-12-09 05:33:04 +00:00
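
A minimal sketch of the chaining idea, using queue(3) list macros in userland. The function names follow the commit message, but the signatures, the `ro_chain` linkage field and the list head `rtcache_head` are assumptions made for this illustration; the kernel code differs.

```c
#include <stddef.h>
#include <sys/queue.h>

struct rtentry { int rt_refcnt; };

struct route {
	struct rtentry *ro_rt;
	LIST_ENTRY(route) ro_chain;	/* hypothetical linkage field */
};

/* Chain of all route caches with ro_rt != NULL (assumed layout). */
static LIST_HEAD(, route) rtcache_head = LIST_HEAD_INITIALIZER(rtcache_head);

/* Add a cache that holds a route to the chain. */
static void
in_rtcache(struct route *ro)
{
	if (ro->ro_rt != NULL)
		LIST_INSERT_HEAD(&rtcache_head, ro, ro_chain);
}

/* Invalidate one cache: drop the reference, clear ro_rt, unchain. */
static void
in_rtflush(struct route *ro)
{
	if (ro->ro_rt == NULL)
		return;
	ro->ro_rt->rt_refcnt--;
	ro->ro_rt = NULL;
	LIST_REMOVE(ro, ro_chain);
}

/* Invalidate every chained cache, e.g. after a route is added. */
static void
in_rtflushall(void)
{
	struct route *ro;

	while ((ro = LIST_FIRST(&rtcache_head)) != NULL)
		in_rtflush(ro);
}
```

Because only caches with ro_rt != NULL are chained, flushing everything is a walk over the chain, and users of a cache must be prepared to find ro_rt == NULL afterwards.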
dyoung d7a8741d84 KNF. 2006-12-06 00:39:56 +00:00
dyoung 0394fe1e42 KNF. 2006-12-06 00:38:16 +00:00
christos 168cd830d2 __unused removal on arguments; approved by core. 2006-11-16 01:32:37 +00:00
christos 4d595fd7b1 - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
2006-10-12 01:30:41 +00:00
dogcow 55ddfc9aae change the MOWNER_INIT define to take two args; fix extant struct mowner
decls to use it. Makes options MBUFTRACE compile again and not whinge about
missing structure declarations. (Also makes initialization consistent.)
2006-10-10 21:49:14 +00:00
tls 8cc016b4bc Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

	s=splfoo();
	/* frob data structure */
	splx(s);
	pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next.  "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them.  One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html
2006-10-05 17:35:19 +00:00
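
The difference between the pattern quoted in this commit and the protected variant can be sketched with mocks. Everything here is a userland stand-in: `splfoo` and `splx` only track a fake priority level, and `pool_put_mock`, `release_before` and `release_after` are hypothetical names.

```c
#include <stddef.h>

static int cur_ipl;		/* mock priority level: 0 = nothing blocked */
static int unprotected_puts;	/* puts made while nothing was blocked */

/* Mock spl functions: raise the level, then restore the saved one. */
static int splfoo(void) { int s = cur_ipl; cur_ipl = 1; return s; }
static void splx(int s) { cur_ipl = s; }

/* Mock pool_put that records whether it ran unprotected. */
static void
pool_put_mock(void *item)
{
	(void)item;
	if (cur_ipl == 0)
		unprotected_puts++;
}

/* The pattern the commit message objects to: put after splx(). */
static void
release_before(void *x)
{
	int s = splfoo();
	/* frob data structure */
	splx(s);
	pool_put_mock(x);
}

/* The pattern after the change: put inside the protected region. */
static void
release_after(void *x)
{
	int s = splfoo();
	/* frob data structure */
	pool_put_mock(x);
	splx(s);
}
```

Only `release_before` increments `unprotected_puts`; moving the put inside the spl section costs nothing extra when the level is already raised.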
elad 83a5239b28 Remove ugly (void *) casts from network scope authorization wrapper and
calls to it.

While here, adapt code for system scope listeners to avoid some more
casts (forgotten in previous run).

Update documentation.
2006-09-19 21:42:29 +00:00
elad bada0c776a Don't use KAUTH_RESULT_* where it's not applicable.
Prompted by yamt@.
2006-09-13 10:07:42 +00:00
elad 5f7169ccb1 First take at security model abstraction.
- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
  opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
  security model, called "bsd44". This is the default (and only) model we
  have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

  * There's a sample overlay model, sitting on-top of "bsd44", for
    fast experimenting with tweaking just a subset of an existing model.

    This is pretty cool because it's *really* straightforward to do stuff
    you had to use ugly hacks for until now...

  * And of course, documentation describing how to do the above for quick
    reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

	http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

  - Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
  - Checks 'securelevel' directly,
  - Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; it's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)
2006-09-08 20:58:56 +00:00
christos da6e78aa67 fix initializer 2006-08-30 18:55:09 +00:00
elad 5446ee0ef6 ugh.. more stuff that's overdue and should not be in 4.0: remove the
sysctl(9) flags CTLFLAG_READONLY[12]. luckily they're not documented
so it's only half regression.

only two knobs used them; proc.curproc.corename (check added in the
existing handler; it's CTLFLAG_ANYWRITE, yay) and net.inet.ip.forwsrcrt,
that got its own handler now too.
2006-07-30 17:38:19 +00:00
kardel de4337ab21 merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
  time.tv_sec -> time_second
- struct timeval mono_time is gone
  mono_time.tv_sec -> time_uptime
- access to time via
	{get,}{micro,nano,bin}time()
	get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
  Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
  NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
2006-06-07 22:33:33 +00:00
liamjfoy 64c2ef1711 #if -> #ifdef
ok christos
2006-05-08 18:50:12 +00:00
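
Why `#ifdef` can be the safer test is easiest to see with a macro that is defined but has no value. The macro `TOY_OPTION` below is hypothetical and is not the one this commit touched.

```c
/* Hypothetical option, defined with no value. */
#define TOY_OPTION

static int
option_enabled(void)
{
	/*
	 * "#if TOY_OPTION" would not compile here, because the macro
	 * expands to nothing and leaves #if without an expression.
	 * "#ifdef" only asks whether the name is defined.
	 */
#ifdef TOY_OPTION
	return 1;
#else
	return 0;
#endif
}
```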
christos f190fa88ca Coverity CID 1134: Protect against NULL deref. 2006-04-15 02:24:12 +00:00
joerg 34096c9b32 Print the source and destination IP in ip_forward's DIAGNOSTIC code
with inet_ntoa, making it more human friendly.

From Liam J. Foy in private mail.
2006-02-18 17:47:07 +00:00
perry 0f0296d88a Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete. 2005-12-24 20:45:08 +00:00
christos 95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
christos 8481673c7a Don't decrement the ttl, until we are sure that we can forward this packet.
Before if there was no route, we would call icmp_error with a datagram
packet that has an incorrect checksum. (From Liam Foy)
2005-11-01 21:21:09 +00:00
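
The ordering this commit describes, sketched in userland: the route check comes first, and the TTL is only decremented once forwarding is certain, so an error path sees the header unmodified. `struct toy_ip`, `toy_have_route` and `toy_forward` are hypothetical, and the value of `IPTTLDEC` is assumed to be 1.

```c
#include <stdbool.h>
#include <stdint.h>

#define IPTTLDEC 1	/* assumed TTL decrement per hop */

struct toy_ip { uint8_t ip_ttl; uint32_t ip_dst; };

enum fwd_result { FWD_OK, FWD_TTL_EXCEEDED, FWD_NO_ROUTE };

/* Toy route lookup: only 10.0.0.1 is reachable (hypothetical). */
static bool
toy_have_route(uint32_t dst)
{
	return dst == 0x0a000001;
}

static enum fwd_result
toy_forward(struct toy_ip *ip)
{
	if (ip->ip_ttl <= IPTTLDEC)
		return FWD_TTL_EXCEEDED;
	if (!toy_have_route(ip->ip_dst))
		return FWD_NO_ROUTE;	/* header untouched for the error path */
	ip->ip_ttl -= IPTTLDEC;		/* decrement only now */
	return FWD_OK;
}
```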
christos ff7f1eddad No need to pass an interface when only the mtu is needed. From OpenBSD via
Liam Foy.
2005-10-23 18:38:53 +00:00
elad 6439f2618f Add sysctls for IP, ICMP, TCP, and UDP statistics. 2005-08-05 09:21:25 +00:00
seanb d7185c5796 - Return ICMP_UNREACH_NET when no route found as per
section 4.3.3.1 of rfc1812.
2005-06-28 19:38:58 +00:00
atatat 420d91208b Properly fix the constipated lossage wrt -Wcast-qual and the sysctl
code.  I know it's not the prettiest code, but it seems to work rather
well in spite of itself.
2005-06-09 02:19:59 +00:00
blymn e703150707 Unconstify rnode to prevent compile error when GATEWAY option set. 2005-06-01 09:45:15 +00:00
yamt 34c3fec469 move decl of inetsw to its own header to avoid array of incomplete type.
found by gcc4.  reported by Adam Ciarcinski.
2005-04-29 10:39:09 +00:00
yamt e5a2b5a4a4 fix problems related to loopback interface checksum omission. PR/29971.
- for ipv4, defer decision to ip layer as h/w checksum offloading does
  so that it can check the actual interface the packet is going to.
- for ipv6, disable it.
  (maybe will be revisited when it implements h/w checksum offloading.)

ok'ed by Jason Thorpe.
2005-04-18 21:50:25 +00:00
yamt 2c742b20e6 ip_reass: clear stale csum_flags. 2005-03-29 09:37:08 +00:00
perry f07677dd81 nuke trailing whitespace 2005-02-26 22:45:09 +00:00
perry 402f8626b1 ANSIfy function declarations 2005-02-03 22:51:50 +00:00
perry 3494482345 de-__P -- will ANSIfy .c files later. 2005-02-02 21:41:55 +00:00
matt 027c11539b Add IFNET_FOREACH and IFADDR_FOREACH macros and start using them. 2005-01-24 21:25:09 +00:00
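
Iteration macros of this kind are typically thin wrappers around the queue(3) macros. The sketch below assumes that shape; `struct toy_ifnet`, `toy_ifnet_list` and `TOY_IFNET_FOREACH` are hypothetical and do not reproduce the kernel definitions.

```c
#include <stddef.h>
#include <sys/queue.h>

struct toy_ifnet {
	int if_index;
	TAILQ_ENTRY(toy_ifnet) if_list;
};

/* Hypothetical global interface list. */
static TAILQ_HEAD(, toy_ifnet) toy_ifnet_list =
    TAILQ_HEAD_INITIALIZER(toy_ifnet_list);

/* Wrapper that hides the list head and the linkage field name. */
#define TOY_IFNET_FOREACH(ifp) \
	TAILQ_FOREACH(ifp, &toy_ifnet_list, if_list)

static int
count_interfaces(void)
{
	struct toy_ifnet *ifp;
	int n = 0;

	TOY_IFNET_FOREACH(ifp)
		n++;
	return n;
}
```

Callers no longer spell out the head and field at every loop, so the list representation can change in one place.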
christos 77e7bdb8aa yamt's changes seem to fix all the checksumming issues. Turn the loopback
checksums back off so we can make sure that everything works.
2004-12-19 06:42:24 +00:00
christos 60fb5c0ece Turn checksumming on loopback back on until we fix the bugs in it.
Connect over tcp on the loopback is broken:

  4729 amq      0.000007 CALL  connect(4,0x804f2a0,0x1c)
  4729 amq      75.007420 RET   connect -1 errno 60 Connection timed out
2004-12-17 22:54:52 +00:00
thorpej 7994b6f95e Don't perform checksums on loopback interfaces. They can be reenabled with
the net.inet.*.do_loopback_cksum sysctl.

Approved by: groo
2004-12-15 04:25:19 +00:00
darrenr 0543239818 Add a comment to document what setting "srcrt" is really on about in ipintr() 2004-10-06 01:34:11 +00:00
christos d790aa42d0 PR/27081: Sean Boudreau: ip_input() bad csum count not incremented on sw csum 2004-09-29 21:28:34 +00:00
atatat 4de3747b89 Sysctl descriptions under net subtree (net.key not done) 2004-05-25 04:33:59 +00:00
darrenr 39ee9f396a at line 543, we do a pullup here of hlen bytes into the mbuf,
so these later ones are superfluous.
2004-05-02 05:02:53 +00:00
matt da67d85073 Use EVCNT_ATTACH_STATIC{,2} 2004-05-01 02:20:42 +00:00
simonb b5d0e6bf06 Initialise (most) pools from a link set instead of explicit calls
to pool_init.  Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

 Convert struct session, ucred and lockf to pools.
2004-04-25 16:42:40 +00:00
matt e50668c7fa Constify protosw arrays. This can reduce the kernel .data section by
over 4K (if all the network protocols are loaded).
2004-04-22 01:01:40 +00:00
matt efc47093e2 In ip_reass_ttl_descr, make i signed since it's compared to >= 0 2004-04-01 22:47:55 +00:00
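
The general hazard behind this fix: a loop that counts down with the condition `i >= 0` only terminates if `i` is signed, because an unsigned index wraps around instead of going negative. A minimal sketch with a hypothetical function, not the code from ip_reass_ttl_descr:

```c
/* Sum an array from the last element down to index 0. */
static int
sum_down(const int *v, int n)
{
	int total = 0;
	int i;	/* must be signed: with an unsigned i, "i >= 0" is always true */

	for (i = n - 1; i >= 0; i--)
		total += v[i];
	return total;
}
```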
atatat 19af35fd0d Tango on sysctl_createv() and flags. The flags have all been renamed,
and sysctl_createv() now uses more arguments.
2004-03-24 15:34:46 +00:00
itojun 0146a277ba correct typo in 1.94 -> 1.95. pointed out by Shiva Shenoy 2004-01-15 05:13:17 +00:00
thorpej 0c4c58a70b Fix syntax errors in CHECK_NMBCLUSTER_PARAMS(). 2003-12-14 01:14:24 +00:00
jonathan 9c1a5c5570 Second part of hashed IP_reassembly changes:
When under pressure for mbufs or we have too many fragments in the IP
reassembly queue, drop half of all fragments. This multiplicative-drop
strategy ensures we return to a healthy state, even under borderline
denial-of-service from extremely lossy NFS-over-UDP peers.
The multiplicative-drop phase currently drops 50% of fragments, but
has pre-placed support for implementing drop-fractions other than 50%

The threshold for the `drop-half' phase is the new variable,
ip_maxfrags which is calculated as nmbclusters/4.

ip_input.c now keeps ip_nmbclusters, a cached copy of nmbclusters.
Before using limits derived from nmbclusters, we check if nmbclusters
and ip_nmbclusters are equal. If not, we recompute IP parameters
derived from nmbclusters.  Based on a suggestion by Jason Thorpe.
ip_maxfrags is currently auto-recalculated.

The counters ip_nfrags and ip_nfragpackets are now declared static
and uninitialized (bss), to discourage tampering with them.
2003-12-14 00:09:24 +00:00
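
The threshold and the cached-parameter check can be sketched as follows. The variable names come from the commit message; the functions `toy_check_params` and `toy_maybe_drop`, the `percent` parameter, the initial value of `nmbclusters` and the exact comparison against ip_maxfrags are assumptions, and no fragments are actually freed here.

```c
static int nmbclusters = 1024;	/* stand-in for the kernel tunable */
static int ip_nmbclusters;	/* cached copy of nmbclusters */
static int ip_maxfrags;		/* derived limit: nmbclusters / 4 */
static int ip_nfrags;		/* fragments currently queued */

/* Recompute derived parameters if nmbclusters changed since last use. */
static void
toy_check_params(void)
{
	if (nmbclusters != ip_nmbclusters) {
		ip_maxfrags = nmbclusters / 4;
		ip_nmbclusters = nmbclusters;
	}
}

/*
 * If too many fragments are queued, drop the given percentage of them.
 * Returns the number of fragments dropped.
 */
static int
toy_maybe_drop(int percent)
{
	int dropped;

	toy_check_params();
	if (ip_nfrags <= ip_maxfrags)
		return 0;
	dropped = ip_nfrags * percent / 100;
	ip_nfrags -= dropped;
	return dropped;
}
```

With nmbclusters at 1024 the limit is 256 fragments; 300 queued fragments and a 50% drop leave 150. Each drop phase removes a fixed fraction, so repeated pressure shrinks the queue geometrically.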
scw 6aec1d6812 Make fast-ipsec and ipflow (Fast Forwarding) interoperate.
The idea is that we only clear M_CANFASTFWD if an SPD exists
for the packet. Otherwise, it's safe to add a fast-forward
cache entry for the route.

To make this work properly, we invalidate the entire ipflow
cache if a fast-ipsec key is added or changed.
2003-12-12 21:17:59 +00:00
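
The interaction described here can be sketched with mocks: the fast-forward flag is cleared only when a policy exists for the packet, and the whole flow cache is invalidated when a key changes. All names and the flag value below are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdbool.h>

#define TOY_M_CANFASTFWD 0x1	/* hypothetical flag value */

struct toy_mbuf { int m_flags; };

static int flow_cache_entries;	/* size of the toy fast-forward cache */

/* Clear the flag only if a security policy applies to this packet. */
static void
toy_check_policy(struct toy_mbuf *m, bool spd_exists)
{
	if (spd_exists)
		m->m_flags &= ~TOY_M_CANFASTFWD;
}

/* Add a fast-forward entry if the packet is still eligible. */
static bool
toy_flow_add(const struct toy_mbuf *m)
{
	if (!(m->m_flags & TOY_M_CANFASTFWD))
		return false;
	flow_cache_entries++;
	return true;
}

/* A key was added or changed: invalidate the entire cache. */
static void
toy_key_changed(void)
{
	flow_cache_entries = 0;
}
```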
jonathan 626b230d59 Add new field ipq_nfrags to struct ipq. Maintain count of fragments
(fragments, not fragmented packets) in each queue entry.
Use ipq_nfrags to maintain a count of total fragments in reassembly queue.
2003-12-08 02:23:27 +00:00
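
Keeping a per-queue count makes the global total cheap to maintain when a whole queue entry is freed. A minimal sketch, assuming a reduced `struct ipq` and hypothetical helpers `frag_enqueue` and `ipq_free_toy`:

```c
struct ipq { int ipq_nfrags; };	/* reduced to the new field only */

static int ip_nfrags;		/* total fragments across all queues */

/* Account for one more fragment in this queue entry. */
static void
frag_enqueue(struct ipq *fp)
{
	fp->ipq_nfrags++;
	ip_nfrags++;
}

/* Freeing a queue entry subtracts all of its fragments at once. */
static void
ipq_free_toy(struct ipq *fp)
{
	ip_nfrags -= fp->ipq_nfrags;
	fp->ipq_nfrags = 0;
}
```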