NetBSD

Commit Graph

Author	SHA1	Message	Date
dyoung	72f0a6dfb0	Eliminate address family-specific route caches (struct route, struct route_in6, struct route_iso), replacing all caches with a struct route. The principle benefit of this change is that all of the protocol families can benefit from route cache-invalidation, which is necessary for correct routing. Route-cache invalidation fixes an ancient PR, kern/3508, at long last; it fixes various other PRs, also. Discussions with and ideas from Joerg Sonnenberger influenced this work tremendously. Of course, all design oversights and bugs are mine. DETAILS 1 I added to each address family a pool of sockaddrs. I have introduced routines for allocating, copying, and duplicating, and freeing sockaddrs: struct sockaddr sockaddr_alloc(sa_family_t af, int flags); struct sockaddr sockaddr_copy(struct sockaddr dst, const struct sockaddr src); struct sockaddr sockaddr_dup(const struct sockaddr src, int flags); void sockaddr_free(struct sockaddr sa); sockaddr_alloc() returns either a sockaddr from the pool belonging to the specified family, or NULL if the pool is exhausted. The returned sockaddr has the right size for that family; sa_family and sa_len fields are initialized to the family and sockaddr length---e.g., sa_family = AF_INET and sa_len = sizeof(struct sockaddr_in). sockaddr_free() puts the given sockaddr back into its family's pool. sockaddr_dup() and sockaddr_copy() work analogously to strdup() and strcpy(), respectively. sockaddr_copy() KASSERTs that the family of the destination and source sockaddrs are alike. The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is passed directly to pool_get(9). 2 I added routines for initializing sockaddrs in each address family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(), etc. They are fairly self-explanatory. 3 structs route_in6 and route_iso are no more. All protocol families use struct route. I have changed the route cache, 'struct route', so that it does not contain storage space for a sockaddr. Instead, struct route points to a sockaddr coming from the pool the sockaddr belongs to. I added a new method to struct route, rtcache_setdst(), for setting the cache destination: int rtcache_setdst(struct route , const struct sockaddr *); rtcache_setdst() returns 0 on success, or ENOMEM if no memory is available to create the sockaddr storage. It is now possible for rtcache_getdst() to return NULL if, say, rtcache_setdst() failed. I check the return value for NULL everywhere in the kernel. 4 Each routing domain (struct domain) has a list of live route caches, dom_rtcache. rtflushall(sa_family_t af) looks up the domain indicated by 'af', walks the domain's list of route caches and invalidates each one.	2007-05-02 20:40:22 +00:00
ad	59d979c5f1	Pass an ipl argument to pool_init/POOL_INIT to be used when initializing the pool's lock.	2007-03-12 18:18:22 +00:00
christos	53524e44ef	Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.	2007-03-04 05:59:00 +00:00
dyoung	5493f188c7	KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous parentheses in return statements. Cosmetic: don't open-code TAILQ_FOREACH(). Cosmetic: change types of variables to avoid oodles of casts: in in6_src.c, avoid casts by changing several route_in6 pointers to struct route pointers. Remove unnecessary casts to caddr_t elsewhere. Pave the way for eliminating address family-specific route caches: soon, struct route will not embed a sockaddr, but it will hold a reference to an external sockaddr, instead. We will set the destination sockaddr using rtcache_setdst(). (I created a stub for it, but it isn't used anywhere, yet.) rtcache_free() will free the sockaddr. I have extracted from rtcache_free() a helper subroutine, rtcache_clear(). rtcache_clear() will "forget" a cached route, but it will not forget the destination by releasing the sockaddr. I use rtcache_clear() instead of rtcache_free() in rtcache_update(), because rtcache_update() is not supposed to forget the destination. Constify: 1 Introduce const accessor for route->ro_dst, rtcache_getdst(). 2 Constify the 'dst' argument to ifnet->if_output(). This led me to constify a lot of code called by output routines. 3 Constify the sockaddr argument to protosw->pr_ctlinput. This led me to constify a lot of code called by ctlinput routines. 4 Introduce const macros for converting from a generic sockaddr to family-specific sockaddrs, e.g., sockaddr_in: satocsin6, satocsin, et cetera.	2007-02-17 22:34:07 +00:00
dyoung	d8316ce94e	KNF: bzero -> memset, change (struct in_ifaddr *)0 to NULL.	2007-01-26 19:15:26 +00:00
joerg	eb04733c4e	Introduce new helper functions to abstract the route caching. rtcache_init and rtcache_init_noclone lookup ro_dst and store the result in ro_rt, taking care of the reference counting and calling the domain specific route cache. rtcache_free checks if a route was cashed and frees the reference. rtcache_copy copies ro_dst of the given struct route, checking that enough space is available and incrementing the reference count of the cached rtentry if necessary. rtcache_check validates that the cached route is still up. If it isn't, it tries to look it up again. Afterwards ro_rt is either a valid again or NULL. rtcache_copy is used internally. Adjust to callers of rtalloc/rtflush in the tree to check the sanity of ro_dst first (if necessary). If it doesn't fit the expectations, free the cache, otherwise check if the cached route is still valid. After that combination, a single check for ro_rt == NULL is enough to decide whether a new lookup needs to be done with a different ro_dst. Make the route checking in gre stricter by repeating the loop check after revalidation. Remove some unused RADIX_MPATH code in in6_src.c. The logic is slightly changed here to first validate the route and check RTF_GATEWAY afterwards. This is sementically equivalent though. etherip doesn't need sc_route_expire similiar to the gif changes from dyoung@ earlier. Based on the earlier patch from dyoung@, reviewed and discussed with him.	2006-12-15 21:18:52 +00:00
dyoung	c308b1c661	Here are various changes designed to protect against bad IPv4 routing caused by stale route caches (struct route). Route caches are sprinkled throughout PCBs, the IP fast-forwarding table, and IP tunnel interfaces (gre, gif, stf). Stale IPv6 and ISO route caches will be treated by separate patches. Thank you to Christoph Badura for suggesting the general approach to invalidating route caches that I take here. Here are the details: Add hooks to struct domain for tracking and for invalidating each domain's route caches: dom_rtcache, dom_rtflush, and dom_rtflushall. Introduce helper subroutines, rtflush(ro) for invalidating a route cache, rtflushall(family) for invalidating all route caches in a routing domain, and rtcache(ro) for notifying the domain of a new cached route. Chain together all IPv4 route caches where ro_rt != NULL. Provide in_rtcache() for adding a route to the chain. Provide in_rtflush() and in_rtflushall() for invalidating IPv4 route caches. In in_rtflush(), set ro_rt to NULL, and remove the route from the chain. In in_rtflushall(), walk the chain and remove every route cache. In rtrequest1(), call rtflushall() to invalidate route caches when a route is added. In gif(4), discard the workaround for stale caches that involves expiring them every so often. Replace the pattern 'RTFREE(ro->ro_rt); ro->ro_rt = NULL;' with a call to rtflush(ro). Update ipflow_fastforward() and all other users of route caches so that they expect a cached route, ro->ro_rt, to turn to NULL. Take care when moving a 'struct route' to rtflush() the source and to rtcache() the destination. In domain initializers, use .dom_xxx tags. KNF here and there.	2006-12-09 05:33:04 +00:00
joerg	c882b2cbc1	When a dynamic route is deleted in in_losing and in6_losing, rtrequest is called, but the current reference via the PCB is not removed. This is effectively a leaked reference. Call rtfree unconditional.	2006-12-08 16:06:22 +00:00
christos	168cd830d2	__unused removal on arguments; approved by core.	2006-11-16 01:32:37 +00:00
dyoung	a25eaede91	Add a source-address selection policy mechanism to the kernel. Also, add ioctls SIOCGIFADDRPREF/SIOCSIFADDRPREF to get/set preference numbers for addresses. Make ifconfig(8) set/display preference numbers. To activate source-address selection policies in your kernel, add 'options IPSELSRC' to your kernel configuration. Miscellaneous changes in support of source-address selection: 1 Factor out some common code, producing rt_replace_ifa(). 2 Abbreviate a for-loop with TAILQ_FOREACH(). 3 Add the predicates on IPv4 addresses IN_LINKLOCAL() and IN_PRIVATE(), that are true for link-local unicast (169.254/16) and RFC1918 private addresses, respectively. Add the predicate IN_ANY_LOCAL() that is true for link-local unicast and multicast. 4 Add IPv4-specific interface attach/detach routines, in_domifattach and in_domifdetach, which build #ifdef IPSELSRC. See in_getifa(9) for a more thorough description of source-address selection policy.	2006-11-13 05:13:38 +00:00
christos	4d595fd7b1	- sprinkle __unused on function decls. - fix a couple of unused bugs - no more -Wno-unused for i386	2006-10-12 01:30:41 +00:00
tls	8cc016b4bc	Protect calls to pool_put/pool_get that may occur in interrupt context with spl used to protect other allocations and frees, or datastructure element insertion and removal, in adjacent code. It is almost unquestionably the case that some of the spl()/splx() calls added here are superfluous, but it really seems wrong to see: s=splfoo(); /* frob data structure */ splx(s); pool_put(x); and if we think we need to protect the first operation, then it is hard to see why we should not think we need to protect the next. "Better safe than sorry". It is also almost unquestionably the case that I missed some pool gets/puts from interrupt context with my strategy for finding these calls; use of PR_NOWAIT is a strong hint that a pool may be used from interrupt context but many callers in the kernel pass a "can wait/can't wait" flag down such that my searches might not have found them. One notable area that needs to be looked at is pf. See also: http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html	2006-10-05 17:35:19 +00:00
elad	83a5239b28	Remove ugly (void *) casts from network scope authorization wrapper and calls to it. While here, adapt code for system scope listeners to avoid some more casts (forgotten in previous run). Update documentation.	2006-09-19 21:42:29 +00:00
elad	5f7169ccb1	First take at security model abstraction. - Add a few scopes to the kernel: system, network, and machdep. - Add a few more actions/sub-actions (requests), and start using them as opposed to the KAUTH_GENERIC_ISSUSER place-holders. - Introduce a basic set of listeners that implement our "traditional" security model, called "bsd44". This is the default (and only) model we have at the moment. - Update all relevant documentation. - Add some code and docs to help folks who want to actually use this stuff: * There's a sample overlay model, sitting on-top of "bsd44", for fast experimenting with tweaking just a subset of an existing model. This is pretty cool because it's really straightforward to do stuff you had to use ugly hacks for until now... * And of course, documentation describing how to do the above for quick reference, including code samples. All of these changes were tested for regressions using a Python-based testsuite that will be (I hope) available soon via pkgsrc. Information about the tests, and how to write new ones, can be found on: http://kauth.linbsd.org/kauthwiki NOTE FOR DEVELOPERS: PLEASE don't add any code that does any of the following: - Uses a KAUTH_GENERIC_ISSUSER kauth(9) request, - Checks 'securelevel' directly, - Checks a uid/gid directly. (or if you feel you have to, contact me first) This is still work in progress; It's far from being done, but now it'll be a lot easier. Relevant mailing list threads: http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help stablizing kauth(9). Full credit for the regression tests, making sure these changes didn't break anything, goes to Matt Fleming and Jaime Fournier. Happy birthday Randi! :)	2006-09-08 20:58:56 +00:00
ad	f474dceb13	Use the LWP cached credentials where sane.	2006-07-23 22:06:03 +00:00
elad	874fef3711	integrate kauth.	2006-05-14 21:19:33 +00:00
dsl	c24781af04	Pass the current process structure to in_pcbconnect() so that it can pass it to in_pcbbind() so that can allocate a low numbered port if setsockopt() has been used to set IP_PORTRANGE to IP_PORTRANGE_LOW. While there, fail in_pcbconnect() if the in_pcbbind() fails - rather than sending the request out from a port of zero. This has been largely broken since the socket option was added in 1998.	2005-11-15 18:39:46 +00:00
christos	ea2d4204b6	- add const - remove bogus casts - avoid nested variables	2005-05-29 21:41:23 +00:00
christos	761bd09636	PR/30154: YAMAMOTO Takashi: tcp_close locking botch chgsbsize() as mentioned in the PR can be called from an interrupt context via tcp_close(). Avoid calling uid_find() in chgsbsize(). - Instead of storing so_uid in struct socketvar, store *so_uidinfo - Add a simple lock to struct uidinfo.	2005-05-07 17:42:09 +00:00
perry	51ad03a950	ANSIfy function prototypes. (Still have about 3/5ths of the C files in netinet to go...)	2005-02-03 03:49:01 +00:00
perry	3494482345	de-__P -- will ANSIfy .c files later.	2005-02-02 21:41:55 +00:00
christos	1b492809a0	PR/27082: Sean Boudreau: redundant assignment or NULL dereference in in_pcbconnect()	2004-09-29 21:30:00 +00:00
simonb	b5d0e6bf06	Initialise (most) pools from a link set instead of explicit calls to pool_init. Untouched pools are ones that either in arch-specific code, or aren't initialiased during initial system startup. Convert struct session, ucred and lockf to pools.	2004-04-25 16:42:40 +00:00
thorpej	00f100daae	Call ipsec_pcbconn() and ipsec_pcbdisconn() for FAST_IPSEC, too.	2004-03-02 02:26:28 +00:00
itojun	3ffdb9507a	avoid deref-after-free. http://sources.zabbadoz.net/freebsd/patchset/106-ipsec-pcb-discon.diff	2004-01-13 06:17:14 +00:00
itojun	7cddb2827b	whitespace	2004-01-02 15:51:45 +00:00
jonathan	79bf8521a5	Change global head-of-local-IP-address list from in_ifaddr to in_ifaddrhead. Recent changes in struct names caused a namespace collision in fast-ipsec, which are most cleanly fixed by using "in_ifaddrhead" as the listhead name.	2003-11-11 20:25:26 +00:00
provos	57755c156a	use a hash table to bind to local ports; suggested by markus friedl approved: fvdl@	2003-10-28 17:18:37 +00:00
mycroft	5a8b331f54	Remove all the code to maintain ia_inpcbs. This information was only used to close sockets on address changes, which was deemed to be a bad idea and was summarily removed, so there is no point in wasting effort on maintaining it any more.	2003-10-23 20:55:08 +00:00
itojun	495906ca8e	revamp inpcb/in6pcb so that they are more aligned with each other. in6pcb lookup now uses hash(9).	2003-09-04 09:16:57 +00:00
jonathan	28b5f5dfab	(fast-ipsec): Add hooks to pass IPv4 IPsec traffic into fast-ipsec, if configured with ``options FAST_IPSEC''. Kernels with KAME IPsec or with no IPsec should work as before. All calls to ip_output() now always pass an additional compulsory argument: the inpcb associated with the packet being sent, or 0 if no inpcb is available. Fast-ipsec tested with ICMP or UDP over ESP. TCP doesn't work, yet.	2003-08-15 03:42:00 +00:00
agc	aad01611e7	Move UCB-licensed code from 4-clause to 3-clause licence. Patches provided by Joel Baker in PR 22364, verified by myself.	2003-08-07 16:26:28 +00:00
itojun	e4feda72ab	avoid code dup when check broadcast addr in bind(2)	2003-07-22 02:09:30 +00:00
itojun	074166110c	permit bind(2) to broadcast address, as it was permitted before. (for instance, "ntpd -b" was broken since revision 1.82) found report on http://pc.2ch.net/unix	2003-07-21 07:02:35 +00:00
itojun	ab5963ee1f	check if INADDR_TO_IA gets us valid in_ifaddr or not. hopefully fix PR21964	2003-06-26 00:19:13 +00:00
matt	27e1742142	Change the way multicasts are kept. They now use a hash table in the same manner as the ifaddr hash table. By doing this, the mkludge code can go away. At the same time, keep track of what pcbs are using what ifaddr and when an address is deleted from an interface, notify/abort all sockets that have that address as a source. Switch IGMP and multicasts to use pools for allocation. Fix a number of potential problems in the igmp code where allocation failures could cause a trap/panic.	2003-06-15 02:49:32 +00:00
lukem	b1026bf11c	Enable check in in_pcbbind() to enforce sin_family == AF_INET. If there are any "old programs which incorrectly set this" left, they will now fail with EAFNOSUPPORT. This make in_pcbbind() consistent with in_pcbconnect() and the other protocol families. As per my PR [kern/4441], which has the comment: Steven's "TCP/IP Illustrated, Volume 2", page 730, notes that in_pcbbind() has the check which determines if sin_family == AF_INET commented out, but the same check in in_pcbconnect() is still active.	2003-03-16 03:33:28 +00:00
simonb	e6a79d25e7	"error" in in_pcbbind() was only ever set but not used, remove it.	2002-10-22 02:31:16 +00:00
itojun	fa53d749ff	share policy-on-pcb for listening socket. sync w/kame todo: share even more, avoid frequent updates of spidx	2002-06-11 19:39:59 +00:00
itojun	f192b66b94	whitespace	2002-06-09 16:33:36 +00:00
itojun	4121fa09fc	correct in*_pcbrtentry. check cached value correctly.	2002-05-28 11:10:52 +00:00
itojun	7410ea60ca	in in*_pcbrtentry(), check if route is still valid (RTF_UP), and address family is still valid.	2002-05-28 10:07:51 +00:00
thorpej	a180cee23b	Pool deals fairly well with physical memory shortage, but it doesn't deal with shortages of the VM maps where the backing pages are mapped (usually kmem_map). Try to deal with this: * Group all information about the backend allocator for a pool in a separate structure. The pool references this structure, rather than the individual fields. * Change the pool_init() API accordingly, and adjust all callers. * Link all pools using the same backend allocator on a list. * The backend allocator is responsible for waiting for physical memory to become available, but will still fail if it cannot callocate KVA space for the pages. If this happens, carefully drain all pools using the same backend allocator, so that some KVA space can be freed. * Change pool_reclaim() to indicate if it actually succeeded in freeing some pages, and use that information to make draining easier and more efficient. * Get rid of PR_URGENT. There was only one use of it, and it could be dealt with by the caller. From art@openbsd.org.	2002-03-08 20:48:27 +00:00
itojun	ae1b9c29e9	make sure to check address family on route cache. with IPv4 mapped address we can see both AF_INET/INET6.	2002-01-22 03:53:55 +00:00
lukem	ea1cd7eb08	add RCSIDs	2001-11-13 00:32:34 +00:00
matt	da5a70805c	Convert netinet to not use the internal <sys/queue.h> field names but instead the access macros. Use the FOREACH macros where appropriate.	2001-11-04 20:55:25 +00:00
itojun	57030e2f12	cache IPsec policy on in6?pcb. most of the lookup operations can be bypassed, especially when it is a connected SOCK_STREAM in6?pcb. sync with kame.	2001-08-06 10:25:00 +00:00
itojun	fd5e7077a3	allocate ipsec policy buffer attached to pcb in in*_pcballoc, before giving anyone accesses to pcb (do not reveal an inconsistent ones). sync with kame	2001-07-25 23:28:02 +00:00
itojun	1ff38f4d03	on interface removal, remove multicast groups joined from pcb, before removing interface addresses. without the change, we may deref NULL pointer in in_pcbpurgeif(). from jinmei@kame, sync with kame	2001-07-02 15:25:34 +00:00
ad	642267bcc7	Update for hashinit() change.	2000-11-08 14:28:12 +00:00

1 2 3

117 Commits