NetBSD

Author	SHA1	Message	Date
maxv	8d81257030	Style, localify, remove XXX when there's no issue, and switch 'extra' to int.	2018-01-23 10:55:38 +00:00
maxv	64e3e4a7da	Fix the check on 'maxlen', we are not creating struct icmp6_hdr but struct nd_redirect (which is bigger). Also, make sure we can add a struct nd_opt_rd_hdr. Normally this doesn't change anything, since the mbuf has IPV6_MMTU bytes, and it's always way bigger than what we need.	2018-01-23 10:46:59 +00:00
maxv	c0011bd096	Fix info leak. We are allocating a slot of size: roundup(sizeof(*nd_opt) + ifp->if_addrlen, 8) But we are not filling in the padding caused by the roundup, and therefore several bytes are leaked, in the mbuf we're about to send to the network.	2018-01-23 10:32:50 +00:00
maxv	300aacb2b0	Fix twice the same mistake: 'last' can't be null, so there's no point in having this misleading branch.	2018-01-23 09:21:59 +00:00
maxv	cde3086ed1	Style, and four fixes: * Remove the (disabled) IPPROTO_ESP check. If the packet was decrypted it will have M_DECRYPTED, and this is already checked. * Memory leaks in icmp6_error2. They seem hardly triggerable. * Fix miscomputation in _icmp6_input, the ICMP6 header is not guaranteed to be located right after the IP6 header. ok mlelstv@ * Memory leak in _icmp6_input. This one seems to be impossible to trigger.	2018-01-23 07:02:57 +00:00
ozaki-r	606d8da495	Suppress noisy debugging outputs Even if DEBUG they are too noisy under load.	2018-01-19 08:01:05 +00:00
ozaki-r	1b37ab7b8c	Make DAD destructions (MP-)safe with callout_stop arp_dad_stoptimer and nd6_dad_stoptimer can be called with or without softnet_lock held and unfortunately we have no easy way to statically know which. So it is hard to use callout_halt there. To address the situation, we use callout_stop to make the code safe. The new approach copes with the issue by delegating the destruction of a callout to the callout itself, which allows us to not wait for the callout to finish. This is possible because DAD objects are separated from other data such as ifa. The approach is suggested by riastradh@ Proposed on tech-kern@ and tech-net@	2018-01-16 08:13:47 +00:00
ozaki-r	d8462ff5f2	Revert "Work around softnet_lock handling" as per pgoyette@'s request We should avoid if (mutex_owned(softnet_lock)).	2018-01-16 07:56:55 +00:00
ozaki-r	4090a5f5a5	Remove extra pserialize_perform from in_purgeaddr It's already performed in ifa_remove. Note so there (in in6_unlink_ifa too).	2018-01-15 08:17:34 +00:00
knakahara	fe5d98860a	apply in{,6}_tunnel_validate() to gif(4).	2018-01-10 11:13:26 +00:00
knakahara	4ab3af3e3e	add ipsec(4) interface, which is used for route-based VPN. man and ATF are added later, please see man for details. reviewed by christos@n.o, joerg@n.o and ozaki-r@n.o, thanks. https://mail-index.netbsd.org/tech-net/2017/12/18/msg006557.html	2018-01-10 10:56:30 +00:00
ozaki-r	62fd3b8c8a	Get rid of unnecessary ifdef for IFT_IEEE80211	2018-01-10 07:34:31 +00:00
ozaki-r	013cd23759	Fix a deadlock on callout_halt of nd6_dad_timer We must not call callout_halt of nd6_dad_timer with holding nd6_dad_lock because the lock is taken in nd6_dad_timer. Once softnet_lock goes away, we can pass the lock to callout_halt, but for now we cannot.	2018-01-10 07:11:38 +00:00
ozaki-r	8c09e9f90b	Fix use-after-free of mbuf by ip6flow_create (one more) XXX need pullup-[678]	2018-01-09 04:41:19 +00:00
ozaki-r	a29d76a139	Fix use-after-free of mbuf by ip6flow_create This fixes recent failures of some ATF tests such as t_ipsec_tunnel_odd. XXX need pullup-[678]	2018-01-09 04:21:26 +00:00
knakahara	d88bfb301b	Committed debugging logs by mistake, sorry. Revert crypto.c:r.1.103 and ip6_flow.c:r.1.37.	2018-01-08 23:33:40 +00:00
knakahara	f2516b4ae6	Fix PR kern/52910. Reported and implemented a patch by Sevan Janiyan, thanks.	2018-01-08 23:23:25 +00:00
ozaki-r	3283da7f0f	Work around softnet_lock handling nd6_dad_stoptimer can be called with or without softnet_lock held. callout_halt has to take softnet_lock depending on the situation.	2017-12-26 02:26:45 +00:00
ozaki-r	d6af4f8075	Fix wrong usage of psref_held We can't use it for checking if a caller does NOT hold a given target. If you want to do it you should have psref_not_held or something.	2017-12-25 04:41:48 +00:00
ozaki-r	3bc7c9e607	Add missing curlwp_bindx	2017-12-22 09:53:06 +00:00
knakahara	decfa6a54a	fix mbuf leaks. pointed out and suggested by kre@n.o, thanks.	2017-12-18 03:21:44 +00:00
knakahara	5a6bcc75c1	backout wrong fix again, sorry.	2017-12-18 03:20:12 +00:00
knakahara	38abdefc72	Fix pullup'ed mbuf leaks. The match function just requires enough mbuf length. XXX need pullup-8	2017-12-15 05:01:16 +00:00
knakahara	ca9d7f2f69	backout wrong fix as it causes atf net/ipsec/t_ipsec_l2tp failures.	2017-12-15 04:58:31 +00:00
ozaki-r	bde7231efb	Ensure to call if_mcast_op with holding IFNET_LOCK Note that CARP doesn't deal with IFNET_LOCK yet.	2017-12-15 04:03:46 +00:00
knakahara	e56a093de8	fix pullup'ed mbuf leaks. pointed out by maxv@n.o, thanks. XXX need pullup-8	2017-12-11 02:17:35 +00:00
maxv	df4b74a91d	Fix use-after-free: if m_pullup fails the (freed) mbuf is pushed on the ip6_pktq queue and re-processed later. Return 1 to say "processed and freed".	2017-12-10 09:06:46 +00:00
roy	195d1af85f	Treat unvalidated addresses as deprecated in rule 3.	2017-12-06 14:17:42 +00:00
knakahara	753ed3406c	IFF_RUNNING checking in Rx and Tx processing is unnecessary now. Because the configs of gif (members of gif_var) are protected by psref(9).	2017-11-27 05:05:50 +00:00
knakahara	493e35e35d	preserve gif(4) configs by psref(9) like vlan(4) and l2tp(4). After Tx side does not use softint, gif(4) can use psref(9) for config preservation like vlan(4) and l2tp(4). update locking notes later.	2017-11-27 05:02:22 +00:00
kre	814f99dcb6	Attempt to restore v6 networking. Not 100% certain that these changes are all that is needed, but they're certainly a big part of it (especially the ip6_input.c change.)	2017-11-25 13:18:02 +00:00
roy	fbb6c0f8fa	Allow local communication over DETACHED addresses. Allow binding to DETACHED or TENTATIVE addresses as we deny sending upstream from them anyway. Prefer non DETACHED or TENTATIVE addresses.	2017-11-24 14:03:25 +00:00
ozaki-r	f9ba03bb1c	Fix a race condition of in6_ifinit in6_ifinit checks the number of IPv6 addresses on a given interface and if it's zero (i.e., an IPv6 address being assigned to the interface is the first one), call if_addr_init. However, the actual assignment of the address (ifa_insert) is out of in6_ifinit. The check and the assignment must be done atomically. Fix it by holding in6_ifaddr_lock during in6_ifinit and ifa_insert. And also add missing pserialize to IFADDR_READER_FOREACH.	2017-11-23 07:09:20 +00:00
ozaki-r	97d0399779	Tweak a condition; we don't need to care ifacount to be negative	2017-11-23 07:06:14 +00:00
ozaki-r	70c2bde852	Remove unnecessary goto because there is no cleanup code to share (NFC)	2017-11-23 07:05:02 +00:00
ozaki-r	9faa031948	Mention IPv6 address selection policy isn't MP-safe yet Though it's not a problem until a policy is set.	2017-11-20 09:01:20 +00:00
ozaki-r	cead3b8854	Provide macros for softnet_lock and KERNEL_LOCK hiding NET_MPSAFE switch It reduces C&P codes such as "#ifndef NET_MPSAFE KERNEL_LOCK(1, NULL); ..." scattered all over the source code and makes it easy to identify remaining KERNEL_LOCK and/or softnet_lock that are held even if NET_MPSAFE. No functional change	2017-11-17 07:37:12 +00:00
knakahara	fb23bb2cff	Add argument to encapsw->pr_input() instead of m_tag.	2017-11-15 10:42:41 +00:00
ozaki-r	b9e3a5a1e9	Use psref instead of pserialize because that code is sleepable	2017-11-10 07:25:39 +00:00
ozaki-r	1c27f64d6f	Fix a deadlock between a route update and lltable It happens because rtalloc1 is called from lltable with holding IF_AFDATA_WLOCK. If a route update is in action, rtalloc1 would wait for its completion with holding IF_AFDATA_WLOCK. At the same moment, a softint (e.g., arpintr) may try to take IF_AFDATA_WLOCK and get stuck on it. Unfortunately the stuck softint prevents the route update from progressing because the route update calls psref_target_destroy that needs the softint to complete. A resource allocation graph of the scenario looks like this: route update =(psref_target_destroy)=> softint => IF_AFDATA_WLOCK =(rt_update_wait)=> route update Fix the deadlock by pulling rtalloc1 out of the lltable codes inside IF_AFDATA_WLOCK. Note that the deadlock happens only if NET_MPSAFE is enabled.	2017-11-10 07:24:28 +00:00
ozaki-r	faf7d24046	Remove redundant KASSERTMSG The function is static, has just one caller and the caller does the same check.	2017-11-10 07:15:32 +00:00
ozaki-r	048933fa32	Fix usages of ipsec_used If IPsec isn't used, we must go back to the normal path. PR kern/52659	2017-11-05 07:03:37 +00:00
rjs	01b82b52c9	Make SCTP work when IPSEC is also defined.	2017-10-17 19:23:42 +00:00
ozaki-r	6bf0e671a0	Add missing NULL check PR kern/52554	2017-10-05 03:42:14 +00:00
ozaki-r	bbda3ec76e	Take softnet_lock on pr_input properly if NET_MPSAFE Currently softnet_lock is taken unnecessarily in some cases, e.g., icmp_input and encap4_input from ip_input, or not taken even if needed, e.g., udp_input and tcp_input from ipsec4_common_input_cb. Fix them. NFC if NET_MPSAFE is disabled (default).	2017-09-27 10:05:04 +00:00
knakahara	56188c2a5a	add lock for percpu route like l2tp(4).	2017-09-21 09:42:03 +00:00
ozaki-r	0092eb7df6	Invalidate rtcache based on a global generation counter The change introduces a global generation counter that is incremented when any routes have been added or deleted. When a rtcache caches a rtentry into itself, it also stores a snapshot of the generation counter. If the snapshot equals the global counter, the cache is still valid, otherwise it is invalidated. One drawback of the change is that all rtcaches of all protocol families are invalidated when any routes of any protocol families are added or deleted. If that matters, we should have separate generation counters based on protocol families. This change removes LIST_ENTRY from struct route, which fixes a part of PR kern/52515.	2017-09-21 07:15:34 +00:00
christos	da66cce8cd	explain why in6_setscope fails...	2017-09-17 17:36:06 +00:00
christos	548324cb09	Skip the scope test for loopback addresses in non-loopback interfaces. While this test is also done in in6_setscope, testing here allows us to log an error for other callers.	2017-09-17 17:35:10 +00:00
christos	b9baaeeb55	PR/52382: BERTRAND Joel: Fix mapped IPv4 source selection; this got broken in the last code refactoring. in6_selectif failing is not fatal. XXX: pullup-8	2017-08-27 12:34:21 +00:00
christos	a2c4fad4b4	PR/52472: Edgar Fuss: Document handling of scoped IPv6 addresses by embedding ASCII art from: IPv6 Core Protocols Implementation By Qing Li, Tatuya Jinmei, Keiichi Shima Page 56, Figure 2.12	2017-08-09 17:20:44 +00:00
ozaki-r	ae6fc59569	Add missing IPsec policy checks to icmp6_rip6_input icmp6_rip6_input is quite similar to rip6_input and the same checks exist in rip6_input.	2017-08-02 02:18:17 +00:00
ozaki-r	0c084e85e9	Make IPsec SPD MP-safe We use localcount(9), not psref(9), to make the sptree and secpolicy (SP) entries MP-safe because SPs need to be referenced over opencrypto processing that executes a callback in a different context. SPs on sockets aren't managed by the sptree and can be destroyed in softint. localcount_drain cannot be used in softint so we delay the destruction of such SPs to a thread context. To do so, a list to manage such SPs is added (key_socksplist) and key_timehandler_spd deletes dead SPs in the list. For more details please read the locking notes in key.c. Proposed on tech-kern@ and tech-net@	2017-08-02 01:28:02 +00:00
ozaki-r	e1c9808fed	Don't acquire global locks for IPsec if NET_MPSAFE Note that the change is just to make testing easy and IPsec isn't MP-safe yet.	2017-07-27 06:59:28 +00:00
knakahara	a431af6bf4	l2tp(4): fix mbuf leak when tunnel nested over the limit XXX need pullup -8 branch	2017-07-11 05:03:45 +00:00
knakahara	25917aa5e0	fix PR kern/52353. implemented by ozaki-r@n.o. I just commit by proxy. XXX need to pullup to -8.	2017-07-07 00:55:15 +00:00
christos	1922edaa5b	remove unnecessary casts; use sizeof(var) instead of sizeof(type).	2017-07-06 17:14:35 +00:00
christos	2b50acc97b	Merge the two copies SO_TIMESTAMP/SO_OTIMESTAMP processing to a single function, and add a SOOPT_TIMESTAMP define reducing compat pollution from 5 places to 1.	2017-07-06 17:08:57 +00:00
ozaki-r	50558ab0df	Fix usage of ip6_get_membership It may set nothing to ifp even if returning 0. So we need to NULL-clear ifp before calling it. Fix PR kern/52324	2017-06-26 08:01:53 +00:00
ozaki-r	d59e7b9e71	Purge ARP/NDP entries on an interface when the interface is down Fix PR kern/51179	2017-06-22 09:53:24 +00:00
ozaki-r	a4910d9c60	Allow in6_lltable_free_entry to be called without holding the afdata lock of ifp as well as in_lltable_free_entry This behavior is a bit odd and should be fixed in the future...	2017-06-22 09:29:23 +00:00
ozaki-r	e765209802	Remove unused function (nd6_rem_ifa_lle)	2017-06-22 09:24:02 +00:00
ozaki-r	dc9233b94b	Don't create a permanent L2 cache entry on adding an address to an interface It was created to copy FreeBSD, however actually the cache isn't necessary. Remove it to simplify the code and reduce the cost to maintain it (e.g., keep a consistency with a corresponding local route).	2017-06-21 09:05:31 +00:00
ozaki-r	5ecc1e1d8c	Sending a routing message (RTM_ADD) on adding an llentry A message used to be sent on adding a cloned route. Restore the behavior for backward compatibility. Requested by ryo@	2017-06-16 02:24:54 +00:00
chs	fd34ea77eb	remove checks for failure after memory allocation calls that cannot fail: kmem_alloc() with KM_SLEEP kmem_zalloc() with KM_SLEEP percpu_alloc() pserialize_create() psref_class_create() all of these paths include an assertion that the allocation has not failed, so callers should not assert that again.	2017-06-01 02:45:05 +00:00
kardel	acf7aadada	avoid a double ifa_release() and thus a panic when e. g. running ifmcstat	2017-05-13 20:13:26 +00:00
ozaki-r	808b116a48	Add missing KEY_FREESP to ip6_forward	2017-05-09 04:24:10 +00:00
ozaki-r	c33d80e3e4	Don't output debugging logs just if DIAGNOSTIC Also make log messages informative.	2017-04-28 05:56:33 +00:00
ozaki-r	5cfcce1f60	Check if solock of PCB is held when SP caches in the PCB are accessed To this end, a back pointer from inpcbpolicy to inpcb_hdr is added.	2017-04-25 05:44:11 +00:00
ozaki-r	c5b713b4e3	Fix build of kernel with SCTP	2017-04-20 09:19:19 +00:00
ozaki-r	ed8b1986a9	Remove unnecessary NULL checks for inp_socket and in6p_socket They cannot be NULL except for programming errors.	2017-04-20 08:46:07 +00:00
ozaki-r	c4cc9034cb	Simplify logic of udp4_sendup and udp6_sendup They are always passed a socket with the same protocol family as its own: AF_INET for udp4_sendup and AF_INET6 for udp6_sendup.	2017-04-20 08:45:09 +00:00
ozaki-r	469c0f099a	Rumpify netipsec Note that we should modularize netipsec and reduce reverse symbol references (referencing symbols of netipsec from net, netinet and netinet6) though, the task needs lots of code changes. Prior to doing so, rumpifying it and having ATF tests should be useful.	2017-04-14 02:43:27 +00:00
knakahara	685eeb51f1	fix module build	2017-04-04 23:49:17 +00:00
sevan	cb2085f041	Revert change to allow builds to continue until the missing vlan.h file is committed. https://mail-index.netbsd.org/source-changes/2017/04/04/msg083283.html	2017-04-04 16:49:15 +00:00
knakahara	6f4f1b05e1	remove unnecessary if_vlanvar.h. add missing include "vlan.h". pointed out by s-yamaguchi@IIJ, thanks.	2017-04-04 10:25:38 +00:00
knakahara	d35df4a96d	remove duplicated validation. That is already done in l2tp_lookup_session_ref(). pointed out by s-yamaguchi@IIJ, thanks.	2017-03-30 23:13:54 +00:00
ozaki-r	07a4b673ca	Replace DIAGNOSTIC + panic with KASSERT	2017-03-14 04:25:10 +00:00
ozaki-r	4ea7185a98	Replace DIAGNOSTIC + panic with CTASSERT	2017-03-14 04:24:04 +00:00
ozaki-r	752d3b8752	Remove unnecessary NULL check	2017-03-14 04:21:38 +00:00
ozaki-r	2495e7a0c7	Pass inpcb/in6pcb instead of socket to ip_output/ip6_output - Passing a socket to Layer 3 is layer violation and even unnecessary - The change makes codes of callers and IPsec a bit simple	2017-03-03 07:13:06 +00:00
msaitoh	f71865e18b	Add missing opt_net_mpsafe.h.	2017-03-03 06:27:20 +00:00
ozaki-r	f27f4e283c	Plug a race condition on accessing i6mm_maddr	2017-03-02 09:48:20 +00:00
ozaki-r	362a23cbc0	Fix racy in6m_sol Relook up the entry instead of reusing it, which makes locking simple.	2017-03-02 09:16:46 +00:00
ozaki-r	549f799fbf	Protect ia6_memberships by in6_ifaddr_lock	2017-03-02 05:27:39 +00:00
ozaki-r	3e6e186e8a	Make sure im6o_memberships is protected by in6p's lock (solock)	2017-03-02 05:26:24 +00:00
ozaki-r	36ae5d22b0	Make usages of ifp MP-safe in some functions of IP multicast	2017-03-02 05:24:23 +00:00
ozaki-r	0b2f4040ea	Use LIST_* macros No functional change.	2017-03-02 01:05:02 +00:00
ozaki-r	73c95a6a4c	Make IPv6 multicast MP-safe partially To complete the task, we need to make users of IPv6 multicast MP-safe, for example socket/PCB and CARP.	2017-03-01 09:09:37 +00:00
ozaki-r	ef30413ffd	Provide in6_multi_group Use it when checking if we belong to the group, instead of in6_lookup_multi. No functional change.	2017-03-01 08:54:12 +00:00
ozaki-r	2496195667	Restore/add some softnet_lock for nd6_rt_flush and defrouter_addreq May help PR kern/52015	2017-03-01 03:02:35 +00:00
ozaki-r	2d60fd0074	Separate the code of joining multicast groups No functional change.	2017-02-28 04:07:11 +00:00
ozaki-r	b44b24fe31	Prevent ia6 from being freed in in6_ifinit It fixes a panic (diagnostic assertion "entry->ple_prevp != NULL" failed) on: ifconfig lo1 create ifconfig lo1 127.0.0.2 reported by ryo@	2017-02-28 02:56:49 +00:00
ozaki-r	00a9cf741d	Remove mkludge stuffs For unknown reasons, IPv6 multicast addresses are linked to a first IPv6 address assigned to an interface. Due to the design, when removing a first address having multicast addresses, we need to save them to somewhere and later restore them once a new IPv6 address is activated. mkludge stuffs support the operations. This change links multicast addresses to an interface directly and throws the kludge away. Note that as usual some obsolete member variables remain for kvm(3) users. And also sysctl net.inet6.multicast_kludge remains to avoid breaking old ifmcstat. TODO: currently ifnet has a list of in6_multi but obviously the list should be protocol independent. Provide a common structure (if_multi or something) to handle in6_multi and in_multi together as well as ifaddr does for in_ifaddr and in6_ifaddr.	2017-02-23 07:57:09 +00:00
ozaki-r	40914f019e	Stop using useless IN6_*_MULTI macros	2017-02-22 07:46:00 +00:00
ozaki-r	fcf7d70e3a	Get rid of unnecessary splsoftnet	2017-02-22 07:05:47 +00:00
ozaki-r	559b831490	Add assertions and comments for lock states of socket and pcb	2017-02-22 07:05:04 +00:00
ozaki-r	66d96cc093	Use kmem instead of malloc	2017-02-22 03:41:54 +00:00
ozaki-r	93f6b1d8be	Fix prefix invalidation via nd6_timer We cannot remove a prefix there. Instead just invalidate it; the prefix will be removed when purging an associated address. This is the same as the original behavior.	2017-02-22 03:02:55 +00:00
ozaki-r	1f29833ec6	Sweep unnecessary malloc.h inclusions	2017-02-21 03:59:31 +00:00
ozaki-r	67412bb47f	Replace malloc for DAD with kmem and move them out of the lock for DAD	2017-02-21 03:58:23 +00:00
ozaki-r	c5696d3c25	Rename if_acquire_NOMPSAFE to if_acquire It can be used in MP-safe ways. So let's remove the confusing postfix. If it's used in an unsafe way, warn NOMPSAFE in a comment.	2017-02-17 03:57:17 +00:00
knakahara	706b73f634	add missing files.	2017-02-16 08:23:35 +00:00
knakahara	939a415a7d	add l2tp(4) L2TPv3 interface. originally implemented by IIJ SEIL team.	2017-02-16 08:12:43 +00:00
ozaki-r	3f909d1769	Do ND in L2_output in the same manner as arpresolve The benefits of this change are: - The flow is consistent with IPv4 (and FreeBSD and OpenBSD) - old: ip6_output => nd6_output (do ND if needed) => L2_output (lookup a stored cache) - new: ip6_output => L2_output (lookup a cache. Do ND if cache not found) - We can remove some workarounds in nd6_output - We can move L2 specific operations to their own place - The performance slightly improves because one cache lookup is reduced	2017-02-14 03:05:06 +00:00
ozaki-r	19c4d830db	Protect mtudisc and redirect stuffs of icmp/icmp6 with mutex We have to run pr_init of icmp and icmp6 prior to tcp and tcp6 ones for mutex initialization.	2017-02-13 07:18:20 +00:00
ozaki-r	b070ee09f7	Replace splnet with splsoftnet	2017-02-13 04:05:58 +00:00
ozaki-r	57c38b2894	Add missing NULL checks for m_get_rcvif	2017-02-07 02:38:08 +00:00
ozaki-r	589739056f	Defer some pr_input to workqueue pr_input is currently called in softint. Some pr_input such as ICMP, ICMPv6 and CARP can add/delete/update IP addresses and routing table entries. For example, icmp6_redirect_input updates a routing table entry and nd6_ra_input may delete an IP address. Basically such operations shouldn't be done in softint. That aside, we have a reason to avoid the situation; psz/psref waits cannot be used in softint, however they are required to work in such pr_input in the MP-safe world. The change implements the workqueue pr_input framework called wqinput which provides a means to defer pr_input of a protocol to workqueue easily. Currently icmp_input, icmp6_input, carp_proto_input and carp6_proto_input are deferred to workqueue by the framework. Proposed and discussed on tech-kern and tech-net	2017-02-02 02:52:10 +00:00
ozaki-r	9e8d969cf0	Tweak softnet_lock and NET_MPSAFE - Don't hold softnet_lock in some functions if NET_MPSAFE - Add softnet_lock to sysctl_net_inet_icmp_redirtimeout - Add softnet_lock to expire_upcalls of ip_mroute.c - Restore softnet_lock for in{,6}_pcbpurgeif{,0} if NET_MPSAFE - Mark some softnet_lock for future work	2017-01-24 07:09:24 +00:00
ozaki-r	c26964ba3f	Replace some splnet with splsoftnet	2017-01-23 10:19:03 +00:00
ozaki-r	14cc93cb28	Get rid of splnet for pool(9) We don't need it anymore.	2017-01-23 09:14:24 +00:00
christos	35561f6b22	ip6_sprintf -> IN6_PRINT so that we pass the size.	2017-01-16 15:44:46 +00:00
ozaki-r	e4b1e1923a	Remove KASSERT (revert in6.c,v 1.232) We don't need it (it's harmless though).	2017-01-16 08:26:30 +00:00
ryo	28f4c24cc2	Make ip6_sprintf(), in_fmtaddr(), lla_snprintf() and icmp6_redirect_diag() mpsafe. Reviewed by ozaki-r@	2017-01-16 07:33:36 +00:00
ozaki-r	e4f13796f3	Tweak icmp6_input; always use off, not *offp	2017-01-13 10:38:37 +00:00
ozaki-r	046e2eafb0	Prevent in6_ifaddr from being freed with holding its psref This is a possible fix for PR kern/51828.	2017-01-12 04:43:59 +00:00
christos	88f657f139	Add KASSERT.	2017-01-11 18:25:46 +00:00
ozaki-r	2b82ef9b8f	Get rid of unnecessary header inclusions	2017-01-11 13:08:29 +00:00
ozaki-r	a94a205118	Enable some sysctl knobs on rump kernels for ifmcstat	2017-01-10 05:42:34 +00:00
knakahara	cc189cdb90	remove unnecessary conversion. gif_softc->gif_pdst is already valid sockaddr.	2017-01-06 03:25:13 +00:00
christos	92bd3d8559	- kill NULL argument from in6_update_ifa - amend in6_update_ifa1 to return the ia, so that we can use it in pfil hooks to avoid NULL pointer crash.	2017-01-04 19:37:14 +00:00
christos	042834e792	simplify, and call the hooks after the address has been deleted like we did for the ipv4 case.	2017-01-03 15:14:31 +00:00
ryo	30456e82a3	In the case of SIOCDIFADDR, call pfil_run_addrhooks before release ia.	2016-12-31 09:41:05 +00:00
ozaki-r	12da772ecc	Fix panic in pfil_run_hooks on bootup XXX a kernel with pf still fails to boot up. Please someone fix it.	2016-12-27 10:53:11 +00:00
ozaki-r	ec260ed075	Remove assertion that the lock isn't held It's useless in this case, because without it we can know that the lock is held or not on a next lock acquisition and even more if LOCKDEBUG is enabled a failure on the acquisition will provide useful information for debugging while an assertion failure will provide just the fact that the assertion failed.	2016-12-22 03:46:51 +00:00
ozaki-r	6261537b3d	Fix deadlock between llentry timers and destruction of llentry llentry timer (of nd6) holds both llentry's lock and softnet_lock. A caller also holds them and calls callout_halt to wait for the timer to quit. However we can pass only one lock to callout_halt, so passing either of them can cause a deadlock. Fix it by avoid calling callout_halt without holding llentry's lock. BTW in the first place we cannot pass llentry's lock to callout_halt because it's a rwlock...	2016-12-21 08:47:02 +00:00
ozaki-r	3f765c212e	Hold the big locks only where they are needed	2016-12-21 04:08:47 +00:00
ozaki-r	3adf4b3b3e	Protect IPv6 default router and prefix lists with coarse-grained rwlock in6_purgeaddr (in6_unlink_ifa) itself unreferences a prefix entry and calls nd6_prelist_remove if the counter becomes 0, so callers don't need to handle the reference counting. Performance-sensitive paths (sending/forwarding packets) take just one reader lock. This is a trade-off between performance impact and the amount of effort; if we want to remove the reader lock, we need a huge amount of work including destroying objects with psz/psref in softint, for example.	2016-12-19 07:51:34 +00:00
ozaki-r	834995ee17	Kill pr->ndpr_refcnt = 0 The reference counter represents the number of references from IPv6 addresses to a prefix entry. If all IPv6 addresses assigned to an interface are purged, all references to a prefix for the interface are also released. For now nd6_purge is always called after purging all IPv6 addresses, so we can get rid of clearing pr->ndpr_refcnt from nd6_purge and instead we can assert it's 0 there. Note that nd6_ifdetach is only called via dom_ifdetach when processing if_detach where dom_ifdetach is called after pr_purgeif that eventually calls in6_ifdetach. So in the call path nd6_purge in nd6_ifdetach does nothing. That said, we should explicitly make it sure to purge all IPv6 addresses before nd6_purge for future changes (or the case I missed something). So if_purgeaddrs is added to nd6_ifdetach.	2016-12-19 04:52:17 +00:00
ozaki-r	1ce4a727f8	Get rid of extra nd6_purge from in6_ifdetach There were two nd6_purge in in6_ifdetach for some reason, but at least now we don't need extra nd6_purge. Remove it and instead add assertions that check if surely purged.	2016-12-19 03:32:54 +00:00
ozaki-r	dd8638eea5	Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input The benefits of the change are: - We can reduce codes - We can provide the same behavior between drivers - Where/When if_ipackets is counted up - Note that some drivers still update packet statistics in their own way (periodical update) - Moved bpf_mtap run in softint - This makes it easy to MP-ify bpf Proposed on tech-kern and tech-net	2016-12-15 09:28:02 +00:00
knakahara	237f476937	fix race of gif_softc->gif_ro when we send multiple flows over gif on NET_MPSAFE enabled kernel. make gif_softc->gif_ro percpu as well as ipforward_rt to resolve this race. and add future TODO comment for etherip(4).	2016-12-14 11:19:15 +00:00
ozaki-r	8e7c1da780	Reduce return points No functional change intended.	2016-12-14 06:33:01 +00:00
ozaki-r	0195cee9ea	Use macro to iterate on the nd_prefix list	2016-12-14 04:13:50 +00:00
ozaki-r	9d79cf8c86	Make functions static	2016-12-14 04:05:11 +00:00
ozaki-r	44375ea93d	Remove unnecessary inclusions of nd6.h	2016-12-13 08:29:03 +00:00
ozaki-r	6fb8880601	Make the routing table and rtcaches MP-safe See the following descriptions for details. Proposed on tech-kern and tech-net Overview -------- We protect the routing table with a rwock and protect rtcaches with another rwlock. Each rtentry is protected from being freed or updated via reference counting and psref. Global rwlocks -------------- There are two rwlocks; one for the routing table (rt_lock) and the other for rtcaches (rtcache_lock). rtcache_lock covers all existing rtcaches; there may have room for optimizations (future work). The locking order is rtcache_lock first and rt_lock is next. rtentry references ------------------ References to an rtentry is managed with reference counting and psref. Either of the two mechanisms is used depending on where a rtentry is obtained. Reference counting is used when we obtain a rtentry from the routing table directly via rtalloc1 and rtrequest{,1} while psref is used when we obtain a rtentry from a rtcache via rtcache_* APIs. In both cases, a caller can sleep/block with holding an obtained rtentry. The reasons why we use two different mechanisms are (i) only using reference counting hurts the performance due to atomic instructions (rtcache case) (ii) ease of implementation; applying psref to APIs such rtaloc1 and rtrequest{,1} requires additional works (adding a local variable and an argument). We will finally migrate to use only psref but we can do it when we have a lockless routing table alternative. Reference counting for rtentry ------------------------------ rt_refcnt now doesn't count permanent references such as for rt_timers and rtcaches, instead it is used only for temporal references when obtaining a rtentry via rtalloc1 and rtrequest{,1}. We can do so because destroying a rtentry always involves removing references of rt_timers and rtcaches to the rtentry and we don't need to track such references. 
This also makes it easy to wait for readers to release references on deleting or updating a rtentry, i.e., we can simply wait until the reference counter is 0 or 1. (If there are permanent references the counter can be arbitrary.) rt_ref increments a reference counter of a rtentry and rt_unref decrements it. rt_ref is called inside APIs (rtalloc1 and rtrequest{,1} so users don't need to care about it while users must call rt_unref to an obtained rtentry after using it. rtfree is removed and we use rt_unref and rt_free instead. rt_unref now just decrements the counter of a given rtentry and rt_free just tries to destroy a given rtentry. See the next section for destructions of rtentries by rt_free. Destructions of rtentries ------------------------- We destroy a rtentry only when we call rtrequst{,1}(RTM_DELETE); the original implementation can destroy in any rtfree where it's the last reference. If we use reference counting or psref, it's easy to understand if the place that a rtentry is destroyed is fixed. rt_free waits for references to a given rtentry to be released before actually destroying the rtentry. rt_free uses a condition variable (cv_wait) (and psref_target_destroy for psref) to wait. Unfortunately rtrequst{,1}(RTM_DELETE) can be called in softint that we cannot use cv_wait. In that case, we have to defer the destruction to a workqueue. rtentry#rt_cv, rtentry#rt_psref and global variables (see rt_free_global) are added to conduct the procedure. Updates of rtentries -------------------- One difficulty to use refcnt/psref instead of rwlock for rtentry is updates of rtentries. We need an additional mechanism to prevent readers from seeing inconsistency of a rtentry being updated. We introduce RTF_UPDATING flag to rtentries that are updating. While the flag is set to a rtentry, users cannot acquire the rtentry. By doing so, we avoid users to see inconsistent rtentries. 
There are two options when a user tries to acquire a rtentry with the RTF_UPDATING flag; if a user runs in softint context the user fails to acquire a rtentry (NULL is returned). Otherwise a user waits until the update completes by waiting on cv. The procedure of a updater is simpler to destruction of a rtentry. Wait on cv (and psref) and after all readers left, proceed with the update. Global variables (see rt_update_global) are added to conduct the procedure. Currently we apply the mechanism to only RTM_CHANGE in rtsock.c. We would have to apply other codes. See "Known issues" section. psref for rtentry ----------------- When we obtain a rtentry from a rtcache via rtcache_* APIs, psref is used to reference to the rtentry. rtcache_ref acquires a reference to a rtentry with psref and rtcache_unref releases the reference after using it. rtcache_ref is called inside rtcache_* APIs and users don't need to take care of it while users must call rtcache_unref to release the reference. struct psref and int bound that is needed for psref is embedded into struct route. By doing so we don't need to add local variables and additional argument to APIs. However this adds another constraint to psref other than reference counting one's; holding a reference of an rtentry via a rtcache is allowed by just one caller at the same time. So we must not acquire a rtentry via a rtcache twice and avoid a recursive use of a rtcache. And also a rtcache must be arranged to be used by a LWP/softint at the same time somehow. For IP forwarding case, we have per-CPU rtcaches used in softint so the constraint is guaranteed. For a h rtcache of a PCB case, the constraint is guaranteed by the solock of each PCB. Any other cases (pf, ipf, stf and ipsec) are currently guaranteed by only the existence of the global locks (softnet_lock and/or KERNEL_LOCK). If we've found the cases that we cannot guarantee the constraint, we would need to introduce other rtcache APIs that use simple reference counting. 
The psref of a rtcache is created with IPL_SOFTNET, so a rtcache shouldn't be used at an IPL higher than IPL_SOFTNET.

Note that rtcache_free is used to invalidate a given rtcache. No additional care is needed with this change; just keep such calls as they are.

Performance impact
------------------

When NET_MPSAFE is disabled the performance drop is 3%, while when it's enabled the drop increases to 11%. The difference arises because currently we don't take any global locks and don't use psref if NET_MPSAFE is disabled.

We can optimize the NET_MPSAFE case by reducing lookups of rtcache that use psref; currently we do two lookups but we should be able to trim one of the two. This is future work.

Known issues
------------

There are two known issues to be solved. One is that a caller of rtrequest(RTM_ADD) may change a rtentry (see rtinit). We need to prevent new references during the update, or we may be able to remove the code (more investigation is needed).

The other is rtredirect, which updates a rtentry. We need to apply our update mechanism, but it's not easy because rtredirect is called in softint and we cannot simply apply our mechanism there. One solution is to defer rtredirect to a workqueue, but it requires some code restructuring.	2016-12-12 03:55:57 +00:00
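The acquire rule for entries marked RTF_UPDATING can be illustrated with a small single-threaded userland sketch. This is not the kernel code: every sk_* name and the flag value are hypothetical, the "must wait" outcome stands in for sleeping on a condition variable, and the updater is assumed to hold no reference of its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_RTF_UPDATING 0x1	/* hypothetical flag value, for illustration */

struct sk_rtentry {
	int flags;
	unsigned refcnt;
};

enum sk_acquire { SK_ACQUIRED, SK_FAILED, SK_MUST_WAIT };

/*
 * Try to take a reference. While an update is in progress, a softint
 * caller fails at once; any other caller would have to wait (here we
 * only report that, we do not actually sleep).
 */
enum sk_acquire
sk_rt_acquire(struct sk_rtentry *rt, bool in_softint)
{
	assert(rt != NULL);
	if (rt->flags & SK_RTF_UPDATING)
		return in_softint ? SK_FAILED : SK_MUST_WAIT;
	rt->refcnt++;
	return SK_ACQUIRED;
}

/* Drop a reference taken by sk_rt_acquire. */
void
sk_rt_unref(struct sk_rtentry *rt)
{
	assert(rt->refcnt > 0);
	rt->refcnt--;
}

/* The updater may proceed only after all readers have left. */
bool
sk_rt_update_may_proceed(const struct sk_rtentry *rt)
{
	return (rt->flags & SK_RTF_UPDATING) != 0 && rt->refcnt == 0;
}
```

A real implementation would protect the flag and the counter with a lock and block on a condition variable instead of returning SK_MUST_WAIT.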
ozaki-r	f2613b7f35	Introduce macros for the prefix list No functional change.	2016-12-12 03:14:01 +00:00
ozaki-r	f5a82c952d	Introduce macros for the default router list No functional change.	2016-12-12 03:13:14 +00:00
ozaki-r	95eeb954a9	Add nd6_ prefix to exported functions	2016-12-11 07:38:50 +00:00
ozaki-r	091c448c26	Move default interface things from nd6_rtr.c to nd6.c	2016-12-11 07:37:53 +00:00
ozaki-r	100c447a96	Make some functions static	2016-12-11 07:36:55 +00:00
ozaki-r	c43c0bf31c	Remove function declarations that have no actual definition	2016-12-11 07:36:20 +00:00
ozaki-r	eecbd48e68	Correct sanity checks of icmp6_redirect_output - rt->rt_ifp is always non-NULL - Checking RTF_UP here is just racy and meaningless - The arguments should be non-NULL (at least for now)	2016-12-11 07:35:42 +00:00
ozaki-r	4c25fb2f83	Add rtcache_unref to release points of rtentry stemming from rtcache In the MP-safe world, a rtentry stemming from a rtcache can be freed at any point. So we need to protect rtentries somehow, say by reference counting or passive references. Regardless of the method, we need to call some release function of a rtentry after using it. The change adds a new function rtcache_unref to release a rtentry. At this point, this function does nothing because for now we don't add a reference to a rtentry when we get one from a rtcache. We will add something useful in a further commit. This change is a part of changes for MP-safe routing table. It is separated to avoid one big change that makes it difficult to debug by bisecting.	2016-12-08 05:16:33 +00:00
knakahara	2126b5df9a	remove unnecessary extern declaration. inetsw has been declared since r1.1, however sctp6_usrreq.c can be built without the declaration. It must be removed.	2016-12-06 08:58:16 +00:00
ozaki-r	3de81a8881	CID 1396598, CID 1396634: Fix null pointer dereferences	2016-12-02 00:19:54 +00:00
ozaki-r	0645ba174f	Fix panic on destroying an interface with IPv6 addresses obtained with RA nd6_purge depends on that IPv6 addresses are purged. If addresses remain, pfxlist_onlink_check called from nd6_purge dereferences a dangling pointer (ia->ia6_ndpr) that is freed before calling pfxlist_onlink_check. Fix it by removing addresses before calling nd6_purge, which is the original behavior that was changed by in6.c,v 1.203 and in6_ifattach.c,v 1.99. Note that it seems the issue occurs because of a hack that forcibly destroys prefix list entries of a given interface in nd6_purge. We should tackle the hack in the future. Fix PR kern/51467	2016-11-30 02:08:57 +00:00
knakahara	2526d8f639	fix: "ifconfig destroy" can stall when "ifconfig" is run in parallel. This problem occurs only if NET_MPSAFE is on. ifconfig destroy side: kernel entry point is ifioctl => if_clone_destroy. pr_purgeif() acquires softnet_lock, and then ifa_remove() calls pserialize_perform() holding softnet_lock. ifconfig side: kernel entry point is socreate. pr_attach()(udp_attach_wrapper()) calls sosetlock(). In this call path, sosetlock() tries to acquire softnet_lock. These can cause a deadlock.	2016-11-18 06:50:04 +00:00
mlelstv	700032562a	nd6_dad_duplicated takes the lock itself. Move it out of the critical section.	2016-11-15 21:17:07 +00:00
mlelstv	845a599209	Enforce alignment requirements that are violated in some cases. For machines that don't need strict alignment (i386,amd64,vax,m68k) this is a no-op. Fixes PR kern/50766 but should be improved.	2016-11-15 20:50:28 +00:00
ozaki-r	5879478f65	Don't use rt_walktree to delete routes Some functions use rt_walktree to scan the routing table and delete matched routes. However, we shouldn't use rt_walktree to delete routes because rt_walktree is recursive to the routing table (radix tree) and isn't friendly to MP-ification. rt_walktree allows a caller to pass a callback function to delete a matched entry. The callback function is called from an API of the radix tree (rn_walktree) but also calls an API of the radix tree to delete an entry. This change adds a new API of the radix tree, rn_search_matched, which returns a matched entry that is selected by a callback function passed by a caller and the caller itself deletes the entry. By using the API, we can avoid the recursive form.	2016-11-15 01:50:06 +00:00
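The search-then-delete pattern described in that message can be sketched on a plain linked list. This is only an illustration of the shape of the loop: the sk_* names and signatures are hypothetical and do not reflect the real radix tree API or the actual signature of rn_search_matched.

```c
#include <assert.h>
#include <stddef.h>

struct sk_route {
	int ifindex;
	struct sk_route *next;
};

typedef int (*sk_match_fn)(const struct sk_route *, void *);

/* Return the first entry the callback selects; never modifies the list. */
struct sk_route *
sk_search_matched(struct sk_route *head, sk_match_fn match, void *arg)
{
	for (struct sk_route *r = head; r != NULL; r = r->next)
		if (match(r, arg))
			return r;
	return NULL;
}

/* Unlink one entry; called by the caller, not from inside the search. */
void
sk_delete(struct sk_route **headp, struct sk_route *victim)
{
	for (struct sk_route **pp = headp; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			victim->next = NULL;
			return;
		}
	}
}

/* Example callback: match routes bound to a given interface index. */
int
sk_match_ifindex(const struct sk_route *r, void *arg)
{
	return r->ifindex == *(int *)arg;
}

/* Search, delete, repeat: the walk never re-enters the container. */
unsigned
sk_delete_all_matched(struct sk_route **headp, sk_match_fn match, void *arg)
{
	unsigned n = 0;
	struct sk_route *r;

	while ((r = sk_search_matched(*headp, match, arg)) != NULL) {
		sk_delete(headp, r);
		n++;
	}
	return n;
}
```

Restarting the search after each deletion costs extra traversals, but the deletion always happens outside the walk, which is the property the message is after.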
ozaki-r	c2f4bfe351	Add missing rtfree	2016-11-14 02:34:19 +00:00
ozaki-r	d0432711b6	Tidy up in6_select* This change tidies up in6_select* functions, especially selectroute. selectroute is annoying because: - It returns both/either of a rtentry and/or an ifp - Yes, it may return only an ifp! - It is valid but selectroute shouldn't handle the case - Such conditional behavior makes it difficult to apply locking/psref thingy - It may return a rtentry even if error - It may use opt->ip6po_nextroute rtcache implicitly - The caller can know if it is used by rtcache_validate(&opt->ip6po_nextroute) but it's racy in MP-safe world - Even if it uses opt->ip6po_nextroute, it may return a rtentry that isn't derived from the rtcache The change includes: - Rename selectroute to in6_selectroute - Let a remaining caller of selectroute, in6_selectif, use in6_selectroute instead - Let in6_selectroute return only an rtentry - If error, it doesn't return an rtentry - A caller gets an ifp from a returned rtentry - Allow in6_selectroute to modify a passed rtcache and a caller can know if opt->ip6po_nextroute is used via the rtcache - Let callers (ip6_output and in6_selectif) handle the case that only an ifp is required Inspired by OpenBSD Proposed on tech-kern and tech-net LGTM by roy@	2016-11-10 04:13:53 +00:00
ozaki-r	fbb7e30d1e	Reduce the number of return points of frag6_input No functional change.	2016-11-09 03:49:38 +00:00
ozaki-r	29a46075c0	Pull routing header handling out of ip6_output No functional change.	2016-11-07 01:55:17 +00:00
ozaki-r	2ad0a3feca	Tidy up ip6_getpmtu Pull rtcache thing out of ip6_getpmtu; that isn't an essential of the function. Add comments inspired by FreeBSD. No functional change.	2016-11-07 01:05:39 +00:00
ozaki-r	ede76fae4b	Add missing pserialize_read_exit	2016-11-02 03:43:27 +00:00
ozaki-r	d5dc0960bb	Reduce the number of return points No functional change.	2016-11-01 10:32:57 +00:00
christos	e3992d8536	restore previous logic.	2016-10-31 14:34:32 +00:00
ozaki-r	c5224ffd07	Pull best address selection code out of in6_selectsrc No functional change.	2016-10-31 04:57:10 +00:00
ozaki-r	0f3a44863e	Fix race condition of in6_selectsrc in6_selectsrc returned a pointer to in6_addr that wasn't guaranteed to be safe by pserialize (or psref), which was racy. Let callers pass a pointer to in6_addr and in6_selectsrc copy a result to it inside pserialize critical sections.	2016-10-31 04:16:25 +00:00
ozaki-r	6e6136eaff	Remove unnecessary NULL checks	2016-10-31 02:50:31 +00:00
ozaki-r	cf96c34d79	Remove unnecessary argument No functional change.	2016-10-25 02:45:09 +00:00
ozaki-r	3be3142886	Don't hold global locks if NET_MPSAFE is enabled If NET_MPSAFE is enabled, don't hold KERNEL_LOCK and softnet_lock in part of the network stack such as IP forwarding paths. The aim of the change is to make it easy to test the network stack without the locks and reduce our local diffs. By default (i.e., if NET_MPSAFE isn't enabled), the locks are held as they used to be. Reviewed by knakahara@	2016-10-18 07:30:30 +00:00
ozaki-r	6cabf64625	Fix indentation	2016-10-18 02:46:50 +00:00
ozaki-r	5faeb64f4b	Remove unnecessary pserialize_read_enter	2016-10-18 02:46:21 +00:00
ozaki-r	48ec99bd49	Add missing pserialize_read_exit	2016-10-18 02:45:41 +00:00
roy	0dbee937df	Now that we disallow sending or receiving from invalid addresses, allow binding to tentative addresses.	2016-09-29 12:19:47 +00:00
roy	8066689d53	Drop UDP packets as well as TCP without error when sending from detached or tentative addresses.	2016-09-20 14:30:13 +00:00
roy	8c6871896f	Ensure that packets are sent from a valid address. If the packet is TCP and the address is detached or tentative then it's just dropped, otherwise an error is returned. This is needed because you can bind to a valid address and it can then become invalid. This satisfies RFC 4862 section 5.5.4.	2016-09-15 18:25:45 +00:00
christos	0d94f00ba4	fix typo	2016-09-14 16:17:17 +00:00
christos	959c247a60	revert previous, roy says it breaks DaD.	2016-09-13 15:57:50 +00:00
christos	acab31252a	When initializing addresses, reset the interface flags to 0. This fixes an issue where point to point addresses that started down, and then came up, were left with stale flags on one side of the point to point link.	2016-09-13 15:41:33 +00:00
christos	647765d084	remove trailing spaces. userland does not catch this?	2016-09-13 00:45:15 +00:00
christos	47afd135ed	add bits for address flags	2016-09-13 00:19:28 +00:00
roy	3e6930820d	Disallow input to detached addresses because they are not yet valid.	2016-09-07 15:41:44 +00:00
roy	c85195ff64	This comment no longer applies.	2016-09-02 15:57:54 +00:00
ozaki-r	ab06ed1240	Don't GC an NDP cache that is added just before GC This fixes unstable test results of ndp_neighborgcthresh.	2016-09-02 07:15:14 +00:00
ozaki-r	543e39c0d3	Make ipforward_rt and ip6_forward_rt percpu Sharing one rtcache between CPUs is just a bad idea. Reviewed by knakahara@	2016-08-31 09:14:47 +00:00
dholland	2df5f31439	PR 51434 David Binderman: remove redundant test.	2016-08-26 21:48:31 +00:00
roy	c63a839724	Simplify.	2016-08-26 20:29:31 +00:00
roy	333b0c4c48	Allow explicit binding to detached addresses. Fixes PR kern/51435.	2016-08-26 19:45:55 +00:00
roy	1893d82b49	White space police.	2016-08-23 19:39:57 +00:00
roy	da7a376e71	Sync denied flags.	2016-08-23 19:39:04 +00:00
knakahara	74c24413b3	improve fast-forward performance when the number of flows exceeds ip6_maxflows. This is a port of ip_flow.c:r1.76 In ip6flow case, the before degradation is about 45%, the after degradation is about 55%.	2016-08-23 09:59:20 +00:00
roy	dfadc24d64	Revert r1.148 IP6_EXTHDR_GET ensures that an icmp6 header can be fetched from the mbuf so m_pullup does not need to be called. While here, we can safely increment interface error stats even with an invalidated mbuf because we have a saved reference to the interface.	2016-08-19 12:26:01 +00:00
roy	e52094cac4	Revert part of the prior patch so loopback lladdr gets a working prefix route.	2016-08-18 09:34:43 +00:00
roy	fe4671807c	Separate ioctl address prefix management from RA prefix management as we have no API for controlling the latter. This fixes a long standing problem where addresses added with non /128 prefixes and non infinite address lifetimes would register a prefix route which would expire. Subsequent calls to set new lifetimes for the same address would not affect the prefix route management, so once expired, the prefix route would be impossible to add back as the kernel would remove it.	2016-08-16 10:31:57 +00:00
christos	fa02ef2c34	In rump (ifp)->if_afdata[AF_INET6] == NULL if we did not register netinet6 yet. Treat this like we don't have a scope, and make the sid tests consistent.	2016-08-12 11:44:24 +00:00
roy	e9c7e74884	Set RTF_CONNECTED instead of setting only RTF_CONNECTED.	2016-08-06 20:00:14 +00:00
ozaki-r	e8f81e31c2	CID 1364757: remove unnecessary branching	2016-08-05 00:51:14 +00:00
knakahara	48235e8230	ip6flow refactor like ipflow. - move ip6flow sysctls into ip6_flow.c like ip_flow.c:r1.64 - build ip6_flow.c only if GATEWAY kernel option is enabled	2016-08-02 04:50:16 +00:00
ozaki-r	466f21f0b9	Fix kernel builds (gcc 4.8)	2016-08-01 04:37:53 +00:00
ozaki-r	a403cbd4f5	Apply pserialize and psref to struct ifaddr and its variants This change makes struct ifaddr and its variants (in_ifaddr and in6_ifaddr) MP-safe by using pserialize and psref. At this moment, pserialize_perform and psref_target_destroy are disabled because (1) we don't need them because of softnet_lock (2) they cause a deadlock because of softnet_lock. So we'll enable them when we remove softnet_lock in the future.	2016-08-01 03:15:30 +00:00
ozaki-r	efee6976a2	Avoid memset and rtcache_free if unnecessary It's the same as ip_output.	2016-07-29 06:02:03 +00:00
ozaki-r	c68a77bc1d	Fix panic on adding/deleting IP addresses under network load Adding and deleting IP addresses aren't serialized with other network operations, e.g., forwarding packets. So if we add or delete an IP address under network load, a kernel panic may happen on manipulating network-related shared objects such as rtentry and rtcache. To avoid such panics, we still need to hold softnet_lock in in_control and in6_control that are called via ioctl and do network-related operations including IP address additions/deletions. Fix PR kern/51356	2016-07-28 09:03:50 +00:00
ozaki-r	e449cc85bc	Simplify by using atomic_swap instead of mutex Suggested by kefren@	2016-07-26 05:53:30 +00:00
ozaki-r	a3625f4d7b	Make DAD of ARP/NDP MP-safe with coarse-grained locks The change also prevents arp_dad_timer/nd6_dad_timer from running if arp_dad_stop/nd6_dad_stop is called, which makes sure that callout_reset won't be called during callout_halt.	2016-07-25 04:21:19 +00:00
ozaki-r	6b3e3b4814	Use KASSERT for checking non-NULL of ifa->ifa_ifp ifa->ifa_ifp should be always non-NULL, so doing the check only if DIAGNOSTIC is ok.	2016-07-25 01:52:21 +00:00
ozaki-r	1f39eeaeb9	Get rid of extra ifafree It was wrongly imported from FreeBSD.	2016-07-20 07:56:10 +00:00
ozaki-r	4f21a42704	Apply pserialize to some iterations of IP address lists	2016-07-20 07:37:51 +00:00
ozaki-r	8759207c83	Use sin6tosa and sin6tocsa macros No functional change.	2016-07-15 07:40:09 +00:00
ozaki-r	328b3c6b85	Use ifatoia6 macro No functional change.	2016-07-15 07:33:41 +00:00
ozaki-r	dca032f9f4	Run timers in workqueue Timers (such as nd6_timer) typically free/destroy some data in callout (softint). If we apply psz/psref for such data, we cannot do free/destroy process in there because synchronization of psz/psref cannot be used in softint. So run timer callbacks in workqueue works (normal LWP context). Doing workqueue_enqueue a work twice (i.e., call workqueue_enqueue before a previous task is scheduled) isn't allowed. For nd6_timer and rt_timer_timer, this doesn't happen because callout_reset is called only from workqueue's work. OTOH, ip{,6}flow_slowtimo's callout can be called before its work starts and completes because the callout is periodically called regardless of completion of the work. To avoid such a situation, add a flag for each protocol; the flag is set true when a work is enqueued and set false after the work finished. workqueue_enqueue is called only if the flag is false. Proposed on tech-net and tech-kern.	2016-07-11 07:37:00 +00:00
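The per-protocol flag that guards against enqueueing the same work twice can be sketched as follows. This is a single-threaded userland model with hypothetical sk_* names; the counter stands in for the call to workqueue_enqueue, and concurrent code would need an atomic operation or a lock around the flag.

```c
#include <assert.h>
#include <stdbool.h>

struct sk_timer {
	bool work_enqueued;	/* true while a work item is pending or running */
	unsigned enqueue_count;	/* how many times we really enqueued */
};

/* Called from the periodic callout, regardless of work completion. */
void
sk_slowtimo(struct sk_timer *t)
{
	if (t->work_enqueued)
		return;		/* previous work has not finished: skip */
	t->work_enqueued = true;
	t->enqueue_count++;	/* stands in for workqueue_enqueue() */
}

/* Called at the end of the work function, in normal LWP context. */
void
sk_work_done(struct sk_timer *t)
{
	assert(t->work_enqueued);
	t->work_enqueued = false;
}
```

If the callout fires twice before the work runs, only the first call enqueues; the second is dropped and the next period picks the work up again.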
ozaki-r	94dba1b837	CID 1363345: remove unreachable code and cleanup returns	2016-07-08 06:18:29 +00:00
ozaki-r	4133a8eca8	Replace macros to get an IP address with proper inline functions The inline functions are more friendly for applying psz/psref; they consist of only simple iterations.	2016-07-08 04:33:30 +00:00
ozaki-r	75a23513d7	Kill remaining use of the old lists of IP addresses	2016-07-08 03:40:34 +00:00
ozaki-r	9e4c2bda8a	Switch the address list of interfaces to pslist(9) As usual, we leave the old list to avoid breaking kvm(3) users.	2016-07-07 09:32:01 +00:00
ozaki-r	6106c473fc	Move in6_ifaddr_list to a more proper place (from ip6_input.c to in6.c) It's a similar place as the IPv4 address list, i.e., in.c. More variables will join together.	2016-07-06 10:49:49 +00:00
ozaki-r	806a31cb3c	Add missing IN6_ADDRLIST_ENTRY_DESTROY	2016-07-06 07:52:53 +00:00
ozaki-r	d04ff44ad6	Apply m_get_rcvif_psref (kill m_get_rcvif_NOMPSAFE)	2016-07-06 00:30:55 +00:00
ozaki-r	ff67da833b	Constify an argument of regen_tmpaddr	2016-07-05 06:32:18 +00:00
ozaki-r	2a3c249748	KNF	2016-07-05 04:25:23 +00:00
ozaki-r	c30ba26977	Use ia6 or ia instead of ifa as a variable name of struct in6_ifaddr We conventionally use ifa for struct ifaddr and use ia6 or ia for struct in6_ifaddr. No functional change.	2016-07-05 03:40:52 +00:00
ozaki-r	d961591ee9	Fix userland compilations of those including in6_var.h	2016-07-04 07:32:18 +00:00
ozaki-r	6cf9fce745	Use pslist(9) for the global in6_ifaddr list psz and psref will be applied in another commit. No functional change intended.	2016-07-04 06:48:14 +00:00
knakahara	a6d7586724	fix: gif(4) receive side race A panic cause in rn_match() called by encap[46]_lookup(). The reason is that gif(4) does not suspend receive packet processing in spite of suspending transmit packet processing while anyone is doing gif(4) ioctl.	2016-07-04 04:22:47 +00:00
knakahara	d81cd78ed7	let gif(4) promise softint(9) contract (1/2) : gif(4) side To prevent calling softint_schedule() after called softint_disestablish(), the following modifications are added + ioctl (writing configuration) side - off IFF_RUNNING flag before changing configuration - wait softint handler completion before changing configuration + packet processing (reading configuration) side - if IFF_RUNNING flag is on, do nothing + in whole - add gif_list_lock_{enter,exit} to prevent the same configuration is set to other gif(4) interfaces	2016-07-04 04:14:47 +00:00
ozaki-r	feeae45125	Remove redundant codes purging IPv6 addresses Proposed on tech-net and tech-kern.	2016-07-04 02:41:18 +00:00
ozaki-r	17b4eb5edd	Make sure to free all interface addresses in if_detach Addresses of an interface (struct ifaddr) have a (reverse) pointer of an interface object (ifa->ifa_ifp). If the addresses are surely freed when their interface is destroyed, the pointer is always valid and we don't need a tweak of replacing the pointer to if_index like mbuf. In order to make sure the assumption, the following changes are required: - Deactivate the interface at the firstish of if_detach. This prevents in6_unlink_ifa from saving multicast addresses (wrongly) - Invalidate rtcache(s) and clear a rtentry referencing an address on RTM_DELETE. rtcache(s) may delay freeing an address - Replace callout_stop with callout_halt of DAD timers to ensure stopping such timers in if_detach	2016-07-01 05:22:33 +00:00
ozaki-r	d4c71b34a8	Make sure that ifaddr is published after its initialization finished Basically we should insert an item to a collection (say a list) after item's initialization has been completed to avoid accessing an item that is initialized halfway. ifaddr (in{,6}_ifaddr) isn't processed like so and needs to be fixed. In order to do so, we need to tweak {arp,nd6}_rtrequest that depend on that an ifaddr is inserted during its initialization; they explore interface's address list to determine that rt_getkey(rt) of a given rtentry is in the list to know whether the route's interface should be a loopback, which doesn't work after the change. To make it work, first check RTF_LOCAL flag that is set in rt_ifa_addlocal that calls {arp,nd6}_rtrequest eventually. Note that we still need the original code for the case to remove and re-add a local interface route.	2016-06-30 01:34:53 +00:00
ozaki-r	a577cf2aa0	Introduce if_is_deactivated Checking ifp->if_output == if_nulloutput is too implicit. No functional change.	2016-06-28 02:36:54 +00:00
ozaki-r	ca4ea29d93	Add missing NULL checks for m_get_rcvif_psref	2016-06-28 02:02:56 +00:00
christos	9471dccf97	CID 1362905: Initialize ifp early, so that we don't if_put garbage in the IPSEC case.	2016-06-27 18:35:54 +00:00
ozaki-r	4b54d200aa	Remove unnecessary NULL checks of ifa->ifa_addr If it's NULL, it should be a bug. There are many IFADDR_FOREACH that don't do NULL check. If it can be NULL, they should fire already.	2016-06-22 07:48:17 +00:00
ozaki-r	4badfc204a	Make sure returning ifp from in6_select* functions psref-ed To this end, callers need to pass struct psref to the functions and the functions acquire a reference of ifp with it. In some cases, we can simply use if_get_byindex, however, in other cases (say rt->rt_ifp and ia->ifa_ifp), we have no MP-safe way for now. In order to take a reference anyway we use non MP-safe function if_acquire_NOMPSAFE for the latter cases. They should be fixed in the future somehow.	2016-06-21 10:25:27 +00:00
ozaki-r	f7107c248e	Protect if_byindex with pserialize	2016-06-21 10:21:04 +00:00
ozaki-r	43c5ab376f	Replace ifp of ip_moptions and ip6_moptions with if_index The motivation is the same as the mbuf's rcvif case; avoid having a pointer of an ifnet object in ip_moptions and ip6_moptions, which is not MP-safe. ip_moptions and ip6_moptions can be stored in a PCB for inet or inet6 whose lifetime is different from that of the ifnet, so an ifnet object can disappear at any time while we access it via them. Thus we need to look up an ifnet object by if_index every time for safety.	2016-06-21 03:28:27 +00:00
ozaki-r	51db7a24e2	Fix nd6_output (if_output_lock conversion mistake)	2016-06-21 02:14:11 +00:00
knakahara	95fc145695	apply if_output_lock() to L3 callers which call ifp->if_output() of L2(or L3 tunneling).	2016-06-20 06:46:37 +00:00
ozaki-r	f0423d34e6	Use if_get_byindex instead of if_byindex for MP-safe	2016-06-16 03:03:33 +00:00
ozaki-r	e1135cd9b9	Use curlwp_bind and curlwp_bindx instead of open-coding LP_BOUND	2016-06-16 02:38:40 +00:00
ozaki-r	c7e18ccbde	Protect if_byindex by pserialize	2016-06-15 06:01:21 +00:00
knakahara	a6f4292e65	eliminate unnecessary splnet	2016-06-13 08:37:15 +00:00
knakahara	e4ff09f05d	MP-ify fastforward to support GATEWAY kernel option. I add "ipflow_lock" mutex in ip_flow.c and "ip6flow_lock" mutex in ip6_flow.c to protect all data in each file. Of course, this is not MP-scalable. However, it is sufficient as tentative workaround. We should make it scalable somehow in the future. ok by ozaki-r@n.o.	2016-06-13 08:34:23 +00:00
ozaki-r	fe6d427551	Avoid storing a pointer of an interface in a mbuf Having a pointer of an interface in a mbuf isn't safe if we remove big kernel locks; an interface object (ifnet) can be destroyed anytime in any packet processing and accessing such object via a pointer is racy. Instead we have to get an object from the interface collection (ifindex2ifnet) via an interface index (if_index) that is stored to a mbuf instead of a pointer. The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9) for sleep-able critical sections and m_{get,put}_rcvif that use pserialize(9) for other critical sections. The change also adds another API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition moratorium, i.e., it is intended to be used for places that are not planned to be MP-ified soon. The change adds some overhead due to psref to performance sensitive paths, however the overhead is not serious, 2% down at worst. Proposed on tech-kern and tech-net.	2016-06-10 13:31:43 +00:00
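The idea of storing an index in the packet and looking the interface up on each use can be sketched in userland like this. The sk_* names, the table size and the convention that index 0 means "no interface" are assumptions for illustration; the sketch omits the psref(9)/pserialize(9) protection that the real APIs provide.

```c
#include <assert.h>
#include <stddef.h>

#define SK_MAXIF 8	/* hypothetical table size */

struct sk_ifnet {
	unsigned index;
};

struct sk_mbuf {
	unsigned rcvif_index;	/* an index, not a pointer to the interface */
};

/* Stand-in for the interface collection indexed by if_index. */
static struct sk_ifnet *sk_ifindex2ifnet[SK_MAXIF];

void
sk_if_attach(struct sk_ifnet *ifp)
{
	assert(ifp->index > 0 && ifp->index < SK_MAXIF);
	sk_ifindex2ifnet[ifp->index] = ifp;
}

void
sk_if_detach(struct sk_ifnet *ifp)
{
	assert(ifp->index > 0 && ifp->index < SK_MAXIF);
	sk_ifindex2ifnet[ifp->index] = NULL;
}

/* Returns NULL when the receiving interface no longer exists. */
struct sk_ifnet *
sk_m_get_rcvif(const struct sk_mbuf *m)
{
	if (m->rcvif_index == 0 || m->rcvif_index >= SK_MAXIF)
		return NULL;
	return sk_ifindex2ifnet[m->rcvif_index];
}
```

A packet that outlives its interface then yields NULL on lookup rather than a dangling pointer, which is why callers of such a lookup need a NULL check.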
ozaki-r	d938d837b3	Introduce m_set_rcvif and m_reset_rcvif The API is used to set (or reset) a received interface of a mbuf. They are counterpart of m_get_rcvif, which will come in another commit, hide internal of rcvif operation, and reduce the diff of the upcoming change. No functional change.	2016-06-10 13:27:10 +00:00
ozaki-r	9f10dc7910	Get rcvif once and reuse it No functional change.	2016-05-19 08:53:25 +00:00
ozaki-r	348f728f8e	Replace DIAGNOSTIC & panic with KASSERT	2016-05-19 03:11:42 +00:00
ozaki-r	894d037bc1	Get rid of unnecessary assignment	2016-05-18 11:28:44 +00:00
ozaki-r	9f595a90fa	Get rid of unnecessary NULL check It's already checked just some lines above.	2016-05-18 09:32:05 +00:00
ozaki-r	27df9b11fc	Don't try to get outif unnecessarily from in6_selectsrc The got outif is unused.	2016-05-18 08:40:51 +00:00
ozaki-r	842c4ed6c1	Get rcvif once and reuse it No functional change.	2016-05-17 03:27:02 +00:00
ozaki-r	31da384114	Make sure icmp6_redirect_input frees mbuf before return	2016-05-17 03:24:46 +00:00
ozaki-r	040205ae93	Protect ifnet list with psz and psref The change ensures that ifnet objects in the ifnet list aren't freed during list iterations by using pserialize(9) and psref(9). Note that the change adds a pslist(9) for ifnet but doesn't remove the original ifnet list (ifnet_list) to avoid breaking kvm(3) users. We shouldn't use the original list in the kernel anymore.	2016-05-12 02:24:16 +00:00
is	142ff9d692	Let non-neighbor NS/NA debug error message include useful information.	2016-04-29 11:46:17 +00:00
ozaki-r	ad0fbab4d2	Get rid of unused argument from get_rand_ifid	2016-04-27 07:51:14 +00:00
ozaki-r	9e0f6c5e36	Stop using rt_gwroute on packet sending paths rt_gwroute of rtentry is a reference to a rtentry of the gateway for a rtentry with RTF_GATEWAY. That was used by L2 (arp and ndp) to look up L2 addresses. By separating L2 nexthop caches, we don't need a route for the purpose and we can stop using rt_gwroute. By doing so, we can reduce referencing and modifying rtentries, which makes it easy to apply a lock (and/or psref) to the routing table and rtentries. One issue to do this is to keep RTF_REJECT behavior. It seems it was broken when we moved rtalloc1 things from L2 output routines (e.g., ether_output) to ip_hresolv_output, but (fortunately?) it works unexpectedly. What we mistook are: - RTF_REJECT was checked for any routes in L2 output routines, but in ip_hresolv_output it is checked only when the route is RTF_GATEWAY - The RTF_REJECT check wasn't copied to IPv6 (nd6_output) It seems that rt_gwroute checks hid the mistakes so that it appeared to work (unexpectedly), and removing rt_gwroute checks unveiled the issue. So we need to fix RTF_REJECT checks in ip_hresolv_output and also add them to nd6_output. One more point we have to care about is returning an errno; we need to mimic looutput behavior. Originally the RTF_REJECT check was done either in L2 output routines or in looutput. The latter is applied when a reject route directs to a loopback interface. However, now the RTF_REJECT check is done before looutput, so to keep the original behavior we need to return an errno which looutput chooses. The added rt_check_reject_route does such tweaks.	2016-04-26 09:30:01 +00:00


1913 Commits