NetBSD

Commit Graph

Author	SHA1	Message	Date
ozaki-r	8392d61dfb	Fill rmx_locks too Otherwise userland sees garbage in it. This should fix t_mtudisc6 failing on babylon5.	2017-02-17 02:56:53 +00:00
knakahara	706b73f634	add missing files.	2017-02-16 08:23:35 +00:00
knakahara	9795bc8884	support interface name which includes digit.	2017-02-16 08:13:43 +00:00
knakahara	939a415a7d	add l2tp(4) L2TPv3 interface. originally implemented by IIJ SEIL team.	2017-02-16 08:12:43 +00:00
ozaki-r	d2f7edf8ff	Avoid if_dl and if_sadl to be NULL Calling if_deactivate_sadl and then if_sadl_setrefs exposes NULL-ed if_dl and if_sadl to users for a moment. It's harmful because users expect that they're always non-NULL. Fix it. Note that a race condition still remains; if_dl and if_sadl aren't updated atomically so a user can see different data from if_dl and if_sadl. Fortunately nothing uses both if_dl and if_sadl at the same time, so the race condition doesn't hurt anybody for now. (In the first place exposing one data with two ways is problematic?)	2017-02-15 01:48:44 +00:00
ozaki-r	3f909d1769	Do ND in L2_output in the same manner as arpresolve The benefits of this change are: - The flow is consistent with IPv4 (and FreeBSD and OpenBSD) - old: ip6_output => nd6_output (do ND if needed) => L2_output (lookup a stored cache) - new: ip6_output => L2_output (lookup a cache. Do ND if cache not found) - We can remove some workarounds in nd6_output - We can move L2 specific operations to their own place - The performance slightly improves because one cache lookup is reduced	2017-02-14 03:05:06 +00:00
ozaki-r	fc8fa9f575	Remove unnecessary splnet ok @knakahara	2017-02-13 04:05:19 +00:00
ozaki-r	e7ec750b6d	Update comments to reflect bpf MP-ification	2017-02-13 03:44:45 +00:00
skrll	4c9468e5b1	Whitespace	2017-02-12 09:47:31 +00:00
skrll	fdfc876dd7	Remove redundant splnet/splx calls - ec_lock is IPL_NET.	2017-02-12 09:36:05 +00:00
skrll	18a70850e5	Convert to kmem(9)	2017-02-12 08:51:45 +00:00
skrll	3cc4428ad4	Typo in comment	2017-02-12 08:47:12 +00:00
skrll	a7ecd2081f	KNF (sort #include <sys/...>) and remove a duplicate	2017-02-12 08:40:19 +00:00
christos	e344807275	make attach and detach locking symmetric (detaching cloners failed)	2017-02-10 20:56:21 +00:00
ozaki-r	39a63b6b81	Ensure that nobody references a rtentry that is passed to rt_setgate	2017-02-10 13:48:06 +00:00
ozaki-r	f1b2f01b14	Fix locking against myself in ifa_ifwithroute_psref It happened on the path: rtrequest1 => rt_getifa => ifa_ifwithroute_psref. Reported by ryo@	2017-02-10 13:44:47 +00:00
kre	4fbae14d9b	PR kern/51280 This allows srt devices to work for IPv6. srt still needs work (particularly #ifdef INET6 but also general efficiency and similar.)	2017-02-09 11:43:32 +00:00
ozaki-r	f66c9ca3fd	Make bpf MP-safe By the change, bpf_mtap can run without any locks as long as its bpf filter doesn't match a target packet. Pushing data to a bpf buffer still needs a lock. Removing the lock requires big changes and it's a future work. Another known issue is that we need to retain some obsolete variables to avoid breaking kvm(3) users such as netstat and fstat. One problem for MP-ification is that in order to keep statistic counters of bpf_d we need to use atomic operations for them. Once we retire the kvm(3) users, we should make the counters per-CPU and remove the atomic operations.	2017-02-09 09:30:26 +00:00
skrll	3a6718d674	KNF and trailing whitespace. No functional change.	2017-02-07 11:17:50 +00:00
ozaki-r	ecf8174acc	Use m_get_rcvif_psref instead of m_get_rcvif Because the critical sections are now sleepable. Reviewed by knakahara@	2017-02-07 02:33:54 +00:00
ozaki-r	589739056f	Defer some pr_input to workqueue pr_input is currently called in softint. Some pr_input such as ICMP, ICMPv6 and CARP can add/delete/update IP addresses and routing table entries. For example, icmp6_redirect_input updates a routing table entry and nd6_ra_input may delete an IP address. Basically such operations shouldn't be done in softint. That aside, we have a reason to avoid the situation; psz/psref waits cannot be used in softint, however they are required to work in such pr_input in the MP-safe world. The change implements the workqueue pr_input framework called wqinput which provides a means to defer pr_input of a protocol to workqueue easily. Currently icmp_input, icmp6_input, carp_proto_input and carp6_proto_input are deferred to workqueue by the framework. Proposed and discussed on tech-kern and tech-net	2017-02-02 02:52:10 +00:00
maxv	1fca26e9a0	Not sure what we are trying to achieve here, but there are two issues; error can be printed while it is not initialized, and if m_pulldown fails m is freed and reused. Quickly reviewed by christos and martin	2017-02-01 17:58:47 +00:00
ozaki-r	c66c595b80	Reduce return points	2017-02-01 08:18:33 +00:00
ozaki-r	23e72bfbc4	Kill tsleep/wakeup and use cv	2017-02-01 08:16:42 +00:00
ozaki-r	bbe8ead203	Make bpf_gstats percpu	2017-02-01 08:15:15 +00:00
ozaki-r	2fec859db2	Use pslist(9) instead of queue(9) for psz/psref As usual some member variables of struct bpf_d and bpf_if remain to avoid breaking kvm(3) users (netstat and fstat).	2017-02-01 08:13:45 +00:00
ozaki-r	b76d85bbe1	Use kmem(9) instead of malloc/free	2017-02-01 08:07:27 +00:00
ozaki-r	ddd60175a6	Make global variables static	2017-02-01 08:06:01 +00:00
maxv	1880bea337	Correctly handle the return value of arpresolve, otherwise we either leak memory or use some we already freed. Sent on tech-net, ok christos	2017-01-31 17:13:36 +00:00
maya	8c70f41783	Most error paths that goto out; don't hold tun_lock. so don't mutex_exit(tun_lock) in them, but only in the one that needs it. ok skrll	2017-01-29 18:30:33 +00:00
christos	923e6ee286	- Increase copyin buffer size to 4M - Change log output format to be like the OpenBSD's pf including in the header the matching rule etc, and fill in the matching info.	2017-01-29 00:15:54 +00:00
maya	491605d47f	Switch agr(4) to use a workqueue. This is necessary because during a callout, it allocates memory with M_WAITOK, which triggers a DEBUG assert. XXX we should drain the workqueue. ok riastradh	2017-01-28 22:56:09 +00:00
ryo	672772a1a5	Don't hold softnet_lock if NET_MPSAFE. Some functions lock softnet_lock while waiting in pserialize_perform() in pfil_add_hook(). (e.g. key_timehandler(), etc)	2017-01-27 17:25:34 +00:00
skrll	5366c9f585	Fix logic inversion spotted by paulg	2017-01-26 21:38:11 +00:00
skrll	cf1b64a79e	Make MP-safe and use kmem(9) Mostly from rmind-smpnet	2017-01-26 21:13:19 +00:00
msaitoh	11b02bab64	ifmedia_removeall(): Clear ifm_cur and ifm_media after removing all ifmedia entries.	2017-01-25 07:19:24 +00:00
msaitoh	072c3839f8	ifmedia_init(): Clear ifm_media with IFM_NONE instead of 0.	2017-01-25 07:17:19 +00:00
christos	98c1dc9935	fix locking against myself in module autoload; module autoload calls if_clone_attach which takes the lock again.	2017-01-25 03:04:21 +00:00
ozaki-r	87e988a7d8	Use bpf_ops for bpf_mtap_softint By doing so we don't need to care whether a kernel enables bpfilter or not.	2017-01-25 01:04:23 +00:00
christos	0e249c9092	Sync with libpcap-1.8.1	2017-01-24 22:12:42 +00:00
maxv	a2073096ee	Don't forget to free the mbuf when we decide not to reply to an ARP request. This obviously is a terrible bug, since it allows a remote sender to DoS the system with specially-crafted requests sent in a loop.	2017-01-24 18:37:20 +00:00
ozaki-r	9674e2224b	Defer bpf_mtap in Rx interrupt context to softint bpf_mtap of some drivers is still called in hardware interrupt context. We want to run them in softint as well as bpf_mtap of most drivers (see if_percpuq_softint and if_input). To this end, bpf_mtap_softint mechanism is implemented; it defers bpf_mtap processing to a dedicated softint for a target driver. By using the mechanism, we can move bpf_mtap processing to softint without changing target drivers much while it adds some overhead on CPU and memory. Once target drivers are changed to softint-based, we should return to normal bpf_mtap. Proposed on tech-kern and tech-net	2017-01-24 09:05:27 +00:00
ozaki-r	6e6b9ceef2	Restore splnet for if_slowtimo if_slowtimo (== if_watchdog) still requires splnet for most drivers. Pointed out by nonaka@	2017-01-24 07:58:58 +00:00
skrll	021afb57b6	KNF. Same code before and after.	2017-01-23 15:32:04 +00:00
ozaki-r	c26964ba3f	Replace some splnet with splsoftnet	2017-01-23 10:19:03 +00:00
ozaki-r	e9b008839d	Make bpf_setf static	2017-01-23 10:17:36 +00:00
ozaki-r	1360192bca	Fix typo in a comment	2017-01-23 06:47:54 +00:00
ozaki-r	cf9d550252	Call pserialize_perform and psref_target_destroy only if NET_MPSAFE They shouldn't be used with holding softnet_lock.	2017-01-23 02:32:54 +00:00
ozaki-r	ab9446efae	Add curlwp_bind It is necessary for example when we use tun(4). Without it the following panic occurs: panic: kernel diagnostic assertion "(kpreempt_disabled() || cpu_softintr_p() || ISSET(curlwp->l_pflag, LP_BOUND))" failed: file "/usr/src/sys/kern/subr_psref.c", line 291 passive references are CPU-local, but preemption is enabled and the caller is not in a softint or CPU-bound LWP Backtrace: vpanic() ch_voltag_convert_in() psref_release() pfil_run_arg.isra.0() if_initialize() if_attach() tun_clone_create() tunopen() cdev_open() spec_open() VOP_OPEN() vn_open() do_open() do_sys_openat() sys_open() syscall()	2017-01-23 02:30:47 +00:00
ozaki-r	ac86ae25b9	Protect if_clone data with if_clone_mtx To this end, carpattach needs to be delayed from RUMP_COMPONENT_NET to RUMP_COMPONENT_NET_IF on rump_server. Otherwise mutex_enter via carpattach for if_clone_mtx is called before mutex_init for it in ifinit1.	2017-01-20 08:35:33 +00:00
ozaki-r	8d757e7177	Disable rt_update mechanism by default This is a workaround for PR kern/51877. Enable again once the issue is fixed.	2017-01-19 06:58:55 +00:00
ozaki-r	6e752b2732	Fix typo in comments	2017-01-17 07:53:06 +00:00
christos	35561f6b22	ip6_sprintf -> IN6_PRINT so that we pass the size.	2017-01-16 15:44:46 +00:00
ryo	28df50d7bb	Make pfil(9) MP-safe (applying psref(9))	2017-01-16 09:28:40 +00:00
ryo	28f4c24cc2	Make ip6_sprintf(), in_fmtaddr(), lla_snprintf() and icmp6_redirect_diag() mpsafe. Reviewed by ozaki-r@	2017-01-16 07:33:36 +00:00
maya	fe2925feed	appease coverity by using strlcpy instead of strncpy ok riastradh	2017-01-14 16:34:44 +00:00
msaitoh	98ff4b5f9f	Fix a bug that the parent interface's callback wasn't called when the vlan interface is configured. A callback function uses VLAN_ATTACHED() function which checks ec->ec_nvlans, the value should be incremented before calling the callback. This bug was added in if_vlan.c rev. 1.83 (2015/11/19).	2017-01-13 06:11:56 +00:00
ryo	de1b0d7b6e	* pfil_add_hook() no longer treats PFIL_IFADDR and PFIL_IFNET. delete them from pfil_flag_cases[]. * add/fix KASSERT * fix comment	2017-01-12 17:19:17 +00:00
ozaki-r	2b82ef9b8f	Get rid of unnecessary header inclusions	2017-01-11 13:08:29 +00:00
ozaki-r	93472e9c29	Don't call ifa_remove with holding psref	2017-01-11 07:03:59 +00:00
ozaki-r	7f1388a5d2	Add softnet_lock to if_link_state_change_si Fix panic: lock error: Mutex: mutex_vector_exit: assertion failed: MUTEX_OWNER(mtx->mtx_owner) == curthread at callout_halt <= arp_dad_stop <= in_if_link_down.	2017-01-10 08:45:45 +00:00
ozaki-r	a94a205118	Enable some sysctl knobs on rump kernels for ifmcstat	2017-01-10 05:42:34 +00:00
ozaki-r	9ac3cb6dea	Replace adaptive mutex for ethercom with spin one Unfortunately even wm(4) doesn't allow adaptive mutex because wm(4) tries to hold it with holding its own spin mutex.	2017-01-10 05:40:59 +00:00
ryo	7c72d62b0f	Not to use ph_inout[2]. dir (= PFIL_IN or PFIL_OUT) is 1 or 2, not 0 or 1.	2017-01-04 13:03:41 +00:00
rmind	c65c0a1d00	NPF: fix the interface table initialisation on load.	2017-01-03 00:58:05 +00:00
christos	2cc136e9e2	make this compile as a module.	2017-01-02 23:02:04 +00:00
rmind	0b635e7c1d	NPF: implement dynamic handling of interface addresses (the kernel part).	2017-01-02 21:49:51 +00:00
ozaki-r	0e2548254a	Use kmem_intr_alloc instead of kmem_alloc ether_addmulti still can be called in softint. Fix PR kern/51755	2016-12-31 15:07:02 +00:00
christos	1913804a7a	export rprocs too so we don't lose them.	2016-12-28 21:55:04 +00:00
ozaki-r	bf5ce79b5b	Protect ec_multi* with mutex The data can be accessed from sysctl, ioctl, interface watchdog (if_slowtimo) and interrupt handlers. We need to protect the data against parallel accesses from them. Currently the mutex is applied to some drivers, we need to apply it to all drivers in the future. Note that the mutex is adaptive one for ease of implementation but some drivers access the data in interrupt context so we cannot apply the mutex to every driver as is. We have two options: one is to replace the mutex with a spin one, which requires some additional work (see ether_multicast_sysctl), and the other is to modify the drivers to access the data not in interrupt context somehow.	2016-12-28 07:32:16 +00:00
ozaki-r	b79bd95d27	Use ether_ifattach in carp_clone_create instead of C&P code carp_clone_destroy calls ether_ifdetach so not calling ether_ifattach is inconsistent. If we add something pair of initialization and destruction to ether_ifattach and ether_ifdetach (e.g., mutex_init/mutex_destroy), ether_ifdetach of carp_clone_destroy won't work. So use ether_ifattach. In order to do so, make ether_ifattach accept the 2nd argument (lla) as NULL to allow carp to initialize its link level address by itself.	2016-12-28 07:26:24 +00:00
christos	e3864be530	Another missed patch	2016-12-27 13:49:58 +00:00
christos	d687232cdf	fix merge conflict.	2016-12-27 01:31:06 +00:00
rmind	f3628a9718	Convert NPF to the latest pfil(9) changes.	2016-12-26 23:59:47 +00:00
rmind	ce5a9d0b35	Bump NPF_VERSION to 19.	2016-12-26 23:39:18 +00:00
christos	8dd9914047	pfil(9) improvements to handle address changes: Add: PFIL_IFADDR call on interface reconfig (mbuf is ioctl #) PFIL_IFNET call on interface attach/detach (mbuf is PFIL_IFNET_*) from rmind@	2016-12-26 23:21:49 +00:00
rmind	40a029b661	npf_tcp_fsm: fix for the NPF_TCPS_SYN_RECEIVED state. SYN re-transmission after SYN-ACK was seen by NPF should not terminate the connection. Thanks to: Alexander Kiselev <kiselev99 at gmail com>	2016-12-26 23:10:46 +00:00
christos	f75d79eb69	Sync NPF with the version on github: backport standalone NPF changes, which allow us to create and run separate NPF instances. Minor fixes. (from rmind@)	2016-12-26 23:05:05 +00:00
rmind	e8a8032d56	Fix kmem_free() in hashmap_remove().	2016-12-26 21:16:06 +00:00
rmind	50c1611937	Fix kmem_free() sizes in hashmap_rehash() and lpm_clear().	2016-12-26 12:44:10 +00:00
ozaki-r	78a5509e4b	Use psz/psref to hold ifa	2016-12-26 07:25:00 +00:00
ozaki-r	ec260ed075	Remove assertion that the lock isn't held It's useless in this case, because without it we can know that the lock is held or not on a next lock acquisition and even more if LOCKDEBUG is enabled a failure on the acquisition will provide useful information for debugging while an assertion failure will provide just the fact that the assertion failed.	2016-12-22 03:46:51 +00:00
ozaki-r	6261537b3d	Fix deadlock between llentry timers and destruction of llentry llentry timer (of nd6) holds both llentry's lock and softnet_lock. A caller also holds them and calls callout_halt to wait for the timer to quit. However we can pass only one lock to callout_halt, so passing either of them can cause a deadlock. Fix it by avoid calling callout_halt without holding llentry's lock. BTW in the first place we cannot pass llentry's lock to callout_halt because it's a rwlock...	2016-12-21 08:47:02 +00:00
ozaki-r	8be8a178cb	Don't call psref_target_destroy unless NET_MPSAFE We don't need it if NET_MPSAFE off and also it causes lockup sometimes because of calling it with holding softnet_lock.	2016-12-21 04:01:57 +00:00
ozaki-r	eceb88a68b	Fix kernel build with RT_DEBUG and !NET_MPSAFE	2016-12-21 00:33:49 +00:00
roy	996a7c47cf	Fix gcc complaining about int to unsigned long conversion issues by explicitly marking as unsigned in RT_ROUNDUP2.	2016-12-19 11:17:00 +00:00
christos	fb7054ae79	Can't hide stuff from userland, because struct route is embedded in other structures (like inpcb) and things like fstat stop working.	2016-12-16 20:11:52 +00:00
knakahara	7ae52b382e	fix unlock and splx inversion. Currently, this doesn't cause problem because either one is used.	2016-12-16 08:47:36 +00:00
ozaki-r	dd8638eea5	Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input The benefits of the change are: - We can reduce codes - We can provide the same behavior between drivers - Where/When if_ipackets is counted up - Note that some drivers still update packet statistics in their own way (periodical update) - Moved bpf_mtap run in softint - This makes it easy to MP-ify bpf Proposed on tech-kern and tech-net	2016-12-15 09:28:02 +00:00
knakahara	237f476937	fix race of gif_softc->gif_ro when we send multiple flows over gif on NET_MPSAFE enabled kernel. make gif_softc->gif_ro percpu as well as ipforward_rt to resolve this race. and add future TODO comment for etherip(4).	2016-12-14 11:19:15 +00:00
ozaki-r	81f2534aaa	Constify ifp of if_is_deactivated	2016-12-13 02:05:48 +00:00
knakahara	22bec9c1a0	MP-safe pppoe(4). Nearly all parts are implemented by Shoichi YAMAGUCHI<s-yamaguchi@IIJ>, thanks.	2016-12-13 00:35:11 +00:00
maya	21cc7f1b6b	acknowleg -> acknowledg, proceedure -> procedure. only comments were changed. from miod	2016-12-12 15:58:44 +00:00
ozaki-r	6fb8880601	Make the routing table and rtcaches MP-safe See the following descriptions for details. Proposed on tech-kern and tech-net Overview -------- We protect the routing table with a rwlock and protect rtcaches with another rwlock. Each rtentry is protected from being freed or updated via reference counting and psref. Global rwlocks -------------- There are two rwlocks; one for the routing table (rt_lock) and the other for rtcaches (rtcache_lock). rtcache_lock covers all existing rtcaches; there may be room for optimizations (future work). The locking order is rtcache_lock first and rt_lock is next. rtentry references ------------------ References to an rtentry are managed with reference counting and psref. Either of the two mechanisms is used depending on where a rtentry is obtained. Reference counting is used when we obtain a rtentry from the routing table directly via rtalloc1 and rtrequest{,1} while psref is used when we obtain a rtentry from a rtcache via rtcache_* APIs. In both cases, a caller can sleep/block with holding an obtained rtentry. The reasons why we use two different mechanisms are (i) only using reference counting hurts the performance due to atomic instructions (rtcache case) (ii) ease of implementation; applying psref to APIs such as rtalloc1 and rtrequest{,1} requires additional work (adding a local variable and an argument). We will finally migrate to use only psref but we can do it when we have a lockless routing table alternative. Reference counting for rtentry ------------------------------ rt_refcnt now doesn't count permanent references such as for rt_timers and rtcaches, instead it is used only for temporary references when obtaining a rtentry via rtalloc1 and rtrequest{,1}. We can do so because destroying a rtentry always involves removing references of rt_timers and rtcaches to the rtentry and we don't need to track such references.
This also makes it easy to wait for readers to release references on deleting or updating a rtentry, i.e., we can simply wait until the reference counter is 0 or 1. (If there are permanent references the counter can be arbitrary.) rt_ref increments a reference counter of a rtentry and rt_unref decrements it. rt_ref is called inside APIs (rtalloc1 and rtrequest{,1}) so users don't need to care about it while users must call rt_unref on an obtained rtentry after using it. rtfree is removed and we use rt_unref and rt_free instead. rt_unref now just decrements the counter of a given rtentry and rt_free just tries to destroy a given rtentry. See the next section for destructions of rtentries by rt_free. Destructions of rtentries ------------------------- We destroy a rtentry only when we call rtrequest{,1}(RTM_DELETE); the original implementation can destroy in any rtfree where it's the last reference. If we use reference counting or psref, it's easy to understand if the place that a rtentry is destroyed is fixed. rt_free waits for references to a given rtentry to be released before actually destroying the rtentry. rt_free uses a condition variable (cv_wait) (and psref_target_destroy for psref) to wait. Unfortunately rtrequest{,1}(RTM_DELETE) can be called in softint, where we cannot use cv_wait. In that case, we have to defer the destruction to a workqueue. rtentry#rt_cv, rtentry#rt_psref and global variables (see rt_free_global) are added to conduct the procedure. Updates of rtentries -------------------- One difficulty to use refcnt/psref instead of rwlock for rtentry is updates of rtentries. We need an additional mechanism to prevent readers from seeing inconsistency of a rtentry being updated. We introduce RTF_UPDATING flag to rtentries that are updating. While the flag is set to a rtentry, users cannot acquire the rtentry. By doing so, we prevent users from seeing inconsistent rtentries.
There are two options when a user tries to acquire a rtentry with the RTF_UPDATING flag; if a user runs in softint context the user fails to acquire a rtentry (NULL is returned). Otherwise a user waits until the update completes by waiting on cv. The procedure of an updater is similar to destruction of a rtentry. Wait on cv (and psref) and after all readers left, proceed with the update. Global variables (see rt_update_global) are added to conduct the procedure. Currently we apply the mechanism to only RTM_CHANGE in rtsock.c. We would have to apply it to other code. See "Known issues" section. psref for rtentry ----------------- When we obtain a rtentry from a rtcache via rtcache_* APIs, psref is used to reference the rtentry. rtcache_ref acquires a reference to a rtentry with psref and rtcache_unref releases the reference after using it. rtcache_ref is called inside rtcache_* APIs and users don't need to take care of it while users must call rtcache_unref to release the reference. struct psref and int bound that are needed for psref are embedded into struct route. By doing so we don't need to add local variables and additional arguments to APIs. However this adds another constraint to psref other than reference counting one's; holding a reference of an rtentry via a rtcache is allowed by just one caller at the same time. So we must not acquire a rtentry via a rtcache twice and avoid a recursive use of a rtcache. And also a rtcache must be arranged to be used by a single LWP/softint at a time somehow. For IP forwarding case, we have per-CPU rtcaches used in softint so the constraint is guaranteed. For a rtcache of a PCB case, the constraint is guaranteed by the solock of each PCB. Any other cases (pf, ipf, stf and ipsec) are currently guaranteed by only the existence of the global locks (softnet_lock and/or KERNEL_LOCK). If we've found the cases that we cannot guarantee the constraint, we would need to introduce other rtcache APIs that use simple reference counting.
psref of rtcache is created with IPL_SOFTNET and so rtcache shouldn't be used at an IPL higher than IPL_SOFTNET. Note that rtcache_free is used to invalidate a given rtcache. We don't need another care by my change; just keep them as they are. Performance impact ------------------ When NET_MPSAFE is disabled the performance drop is 3% while when it's enabled the drop is increased to 11%. The difference comes from the fact that currently we don't take any global locks and don't use psref if NET_MPSAFE is disabled. We can optimize the performance of the case of NET_MPSAFE on by reducing lookups of rtcache that uses psref; currently we do two lookups but we should be able to trim one of two. This is a future work. Known issues ------------ There are two known issues to be solved; one is that a caller of rtrequest(RTM_ADD) may change rtentry (see rtinit). We need to prevent new references during the update. Or we may be able to remove the code (perhaps, need more investigations). The other is rtredirect that updates a rtentry. We need to apply our update mechanism, however it's not easy because rtredirect is called in softint and we cannot apply our mechanism simply. One solution is to defer rtredirect to a workqueue but it requires some code restructuring.	2016-12-12 03:55:57 +00:00
christos	8a52904684	revert dir hack.	2016-12-10 22:09:49 +00:00
christos	9543d3ecd6	Welcome to version 18: - Connection state keys are not stored and loaded using the logical key contents. - connection finder key is stored in a map that contains the key and the direction.	2016-12-10 19:05:45 +00:00
christos	8feed355cb	Add missing extcalls array. This is currently a no-op, but this is what userland does too. Allows npfctl save; npfctl load to work again.	2016-12-10 19:02:18 +00:00
kre	27d85bae08	Remove what looks like remnant (partly removed already) debug code, which could not possibly compile as it was.	2016-12-10 09:26:16 +00:00
christos	0ce32297f1	add functionality to lookup a nat entry from the connection list.	2016-12-10 05:41:10 +00:00
christos	8f6d079f97	This patch ditches the ptree(3) library, because it is broken (you can get missing entries!). Instead, as a temporary solution, we switch to a simple linear scan of the hash tables for the longest-prefix-match (lpm.c lpm.h) algorithm. In fact, with few unique prefixes in the set, on modern hardware this simple algorithm is pretty fast anyway!	2016-12-09 02:40:38 +00:00
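
Illustration for commit 6fb8880601 (routing table MP-safety). The commit message describes temporary reference counting (rt_ref / rt_unref), a destroyer that waits on a condition variable until readers have left (rt_free), and an RTF_UPDATING flag that makes softint callers fail while other callers wait. The following is a rough userland model of that idea in Python, not kernel code: the class RtEntry, its method names, and the can_sleep parameter are hypothetical, and it waits for the counter to reach 0 rather than "0 or 1" as the commit message allows. It omits psref, the workqueue deferral, and the global rwlocks entirely.

```python
import threading


class RtEntry:
    """Toy stand-in for a routing entry; not the kernel's struct rtentry."""

    def __init__(self, dest):
        self.dest = dest
        self.refcnt = 0          # temporary references only
        self.updating = False    # models the RTF_UPDATING flag
        self.destroyed = False
        self._cv = threading.Condition()

    def ref(self, can_sleep=True):
        """Take a temporary reference (models rt_ref).

        Returns True on success. While an update is in progress, a caller
        that cannot sleep (softint in the commit message) fails at once;
        other callers wait for the update to finish.
        """
        with self._cv:
            while self.updating:
                if not can_sleep:
                    return False
                self._cv.wait()
            if self.destroyed:
                return False
            self.refcnt += 1
            return True

    def unref(self):
        """Drop a temporary reference (models rt_unref) and wake waiters."""
        with self._cv:
            assert self.refcnt > 0
            self.refcnt -= 1
            self._cv.notify_all()

    def begin_update(self):
        """Block new readers, then wait until existing readers have left."""
        with self._cv:
            self.updating = True
            while self.refcnt > 0:
                self._cv.wait()

    def end_update(self):
        """Clear the updating flag and wake readers that were waiting."""
        with self._cv:
            self.updating = False
            self._cv.notify_all()

    def free(self):
        """Wait for all references to drain, then destroy (models rt_free)."""
        with self._cv:
            while self.refcnt > 0:
                self._cv.wait()
            self.destroyed = True
```

In this model a thread calling free() simply blocks until the last holder calls unref(), which mirrors the commit's point that destruction happens in one fixed place rather than in whichever caller drops the final reference.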

3105 Commits