NetBSD

Author	SHA1	Message	Date
knakahara	b71542e5bc	let gif(4) promise softint(9) contract (2/2) : ip_encap side The last commit does not care encaptab. This commit fixes encaptab race which is used not only gif(4).	2016-07-04 04:17:25 +00:00
knakahara	d81cd78ed7	let gif(4) promise softint(9) contract (1/2) : gif(4) side To prevent calling softint_schedule() after called softint_disestablish(), the following modifications are added + ioctl (writing configuration) side - off IFF_RUNNING flag before changing configuration - wait softint handler completion before changing configuration + packet processing (reading configuration) side - if IFF_RUNNING flag is on, do nothing + in whole - add gif_list_lock_{enter,exit} to prevent the same configuration is set to other gif(4) interfaces	2016-07-04 04:14:47 +00:00
ozaki-r	17b4eb5edd	Make sure to free all interface addresses in if_detach Addresses of an interface (struct ifaddr) have a (reverse) pointer of an interface object (ifa->ifa_ifp). If the addresses are surely freed when their interface is destroyed, the pointer is always valid and we don't need a tweak of replacing the pointer to if_index like mbuf. In order to make sure the assumption, the following changes are required: - Deactivate the interface at the firstish of if_detach. This prevents in6_unlink_ifa from saving multicast addresses (wrongly) - Invalidate rtcache(s) and clear a rtentry referencing an address on RTM_DELETE. rtcache(s) may delay freeing an address - Replace callout_stop with callout_halt of DAD timers to ensure stopping such timers in if_detach	2016-07-01 05:22:33 +00:00
ozaki-r	65634f4cd9	Tidy up goto labels No functional change.	2016-06-30 06:56:27 +00:00
ozaki-r	3fe6488e55	Fix error paths Some error paths did m_put_rcvif_psref twice.	2016-06-30 06:48:58 +00:00
ozaki-r	d4c71b34a8	Make sure that ifaddr is published after its initialization finished Basically we should insert an item to a collection (say a list) after item's initialization has been completed to avoid accessing an item that is initialized halfway. ifaddr (in{,6}_ifaddr) isn't processed like so and needs to be fixed. In order to do so, we need to tweak {arp,nd6}_rtrequest that depend on that an ifaddr is inserted during its initialization; they explore interface's address list to determine that rt_getkey(rt) of a given rtentry is in the list to know whether the route's interface should be a loopback, which doesn't work after the change. To make it work, first check RTF_LOCAL flag that is set in rt_ifa_addlocal that calls {arp,nd6}_rtrequest eventually. Note that we still need the original code for the case to remove and re-add a local interface route.	2016-06-30 01:34:53 +00:00
ozaki-r	ca4ea29d93	Add missing NULL checks for m_get_rcvif_psref	2016-06-28 02:02:56 +00:00
ozaki-r	e1b6735f05	Fix typo in a comment	2016-06-23 06:40:48 +00:00
ozaki-r	43c5ab376f	Replace ifp of ip_moptions and ip6_moptions with if_index The motivation is the same as the mbuf's rcvif case; avoid having a pointer of an ifnet object in ip_moptions and ip6_moptions, which is not MP-safe. ip_moptions and ip6_moptions can be stored in a PCB for inet or inet6 that's life time is different from ifnet one and so an ifnet object can be disappeared anytime we get it via them. Thus we need to look up an ifnet object by if_index every time for safe.	2016-06-21 03:28:27 +00:00
knakahara	bf1a57d3d3	fix: i386/ALL build failure	2016-06-20 08:08:13 +00:00
knakahara	95fc145695	apply if_output_lock() to L3 callers which call ifp->if_output() of L2(or L3 tunneling).	2016-06-20 06:46:37 +00:00
ozaki-r	e1135cd9b9	Use curlwp_bind and curlwp_bindx instead of open-coding LP_BOUND	2016-06-16 02:38:40 +00:00
knakahara	a6f4292e65	eliminate unnecessary splnet	2016-06-13 08:37:15 +00:00
knakahara	e4ff09f05d	MP-ify fastforward to support GATEWAY kernel option. I add "ipflow_lock" mutex in ip_flow.c and "ip6flow_lock" mutex in ip6_flow.c to protect all data in each file. Of course, this is not MP-scalable. However, it is sufficient as tentative workaround. We should make it scalable somehow in the future. ok by ozaki-r@n.o.	2016-06-13 08:34:23 +00:00
knakahara	14ea9af5f7	make ipflow_reap() static function.	2016-06-13 08:29:55 +00:00
knakahara	f2808ade1a	remove unnecessary splnet before pool_{get,put}	2016-06-13 08:04:44 +00:00
ozaki-r	fe6d427551	Avoid storing a pointer of an interface in a mbuf Having a pointer of an interface in a mbuf isn't safe if we remove big kernel locks; an interface object (ifnet) can be destroyed anytime in any packet processing and accessing such object via a pointer is racy. Instead we have to get an object from the interface collection (ifindex2ifnet) via an interface index (if_index) that is stored to a mbuf instead of an pointer. The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9) for sleep-able critical sections and m_{get,put}_rcvif that use pserialize(9) for other critical sections. The change also adds another API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition moratorium, i.e., it is intended to be used for places where are not planned to be MP-ified soon. The change adds some overhead due to psref to performance sensitive paths, however the overhead is not serious, 2% down at worst. Proposed on tech-kern and tech-net.	2016-06-10 13:31:43 +00:00
ozaki-r	d938d837b3	Introduce m_set_rcvif and m_reset_rcvif The API is used to set (or reset) a received interface of a mbuf. They are counterpart of m_get_rcvif, which will come in another commit, hide internal of rcvif operation, and reduce the diff of the upcoming change. No functional change.	2016-06-10 13:27:10 +00:00
christos	fdea3219c6	make hostzerobroadcast default to "no".	2016-05-27 16:44:15 +00:00
rjs	afd529313e	Use const for arguments to sctp_is_same_scope().	2016-05-22 23:04:27 +00:00
rjs	b65559a564	Remove rtcache reference to route before freeing the containing struct.	2016-05-22 22:18:41 +00:00
ozaki-r	1acd48af54	Get rid of unnecessary assignment	2016-05-17 09:00:24 +00:00
ozaki-r	040205ae93	Protect ifnet list with psz and psref The change ensures that ifnet objects in the ifnet list aren't freed during list iterations by using pserialize(9) and psref(9). Note that the change adds a pslist(9) for ifnet but doesn't remove the original ifnet list (ifnet_list) to avoid breaking kvm(3) users. We shouldn't use the original list in the kernel anymore.	2016-05-12 02:24:16 +00:00
ozaki-r	8e8364ddca	Fix compilation for ppc	2016-05-09 07:02:10 +00:00
christos	902487a7f3	fix compilation for ppc.	2016-05-04 15:42:32 +00:00
ozaki-r	2cf7873b92	Constify rtentry of if_output We no longer need to change rtentry below if_output. The change makes it clear where rtentries are changed (or not) and helps forthcoming locking (or psrefing) rtentries.	2016-04-28 00:16:56 +00:00
rjs	991b8746b6	Fix build when IPSEC enabled.	2016-04-26 11:02:57 +00:00
ozaki-r	9e0f6c5e36	Stop using rt_gwroute on packet sending paths rt_gwroute of rtentry is a reference to a rtentry of the gateway for a rtentry with RTF_GATEWAY. That was used by L2 (arp and ndp) to look up L2 addresses. By separating L2 nexthop caches, we don't need a route for the purpose and we can stop using rt_gwroute. By doing so, we can reduce referencing and modifying rtentries, which makes it easy to apply a lock (and/or psref) to the routing table and rtentries. One issue to do this is to keep RTF_REJECT behavior. It seems it was broken when we moved rtalloc1 things from L2 output routines (e.g., ether_output) to ip_hresolv_output, but (fortunately?) it works unexpectedly. What we mistook are: - RTF_REJECT was checked for any routes in L2 output routines, but in ip_hresolv_output it is checked only when the route is RTF_GATEWAY - The RTF_REJECT check wasn't copied to IPv6 (nd6_output) It seems that rt_gwroute checks hid the mistakes and it looked work (unexpectedly) and removing rt_gwroute checks unveil the issue. So we need to fix RTF_REJECT checks in ip_hresolv_output and also add them to nd6_output. One more point we have to care is returning an errno; we need to mimic looutput behavior. Originally RTF_REJECT check was done either in L2 output routines or in looutput. The latter is applied when a reject route directs to a loopback interface. However, now RTF_REJECT check is done before looutput so to keep the original behavior we need to return an errno which looutput chooses. Added rt_check_reject_route does such tweaks.	2016-04-26 09:30:01 +00:00
ozaki-r	a79dfa5db0	Sweep unnecessary route.h inclusions	2016-04-26 08:44:44 +00:00
rjs	505ea9765f	Fix build when IPSEC enabled.	2016-04-25 21:21:02 +00:00
ozaki-r	0c74cec625	Check error of rt_setgate and rt_settag	2016-04-25 14:38:08 +00:00
ozaki-r	5fd142cec8	Fix error path	2016-04-19 09:36:35 +00:00
ozaki-r	54748dcad2	Separate MPLS-related routines from ip_hresolv_output No functional changes.	2016-04-19 09:29:54 +00:00
ozaki-r	07d863c903	Constify rtentry of arpresolve We don't need to (rather shouldn't) modify rtentry in there.	2016-04-19 04:13:56 +00:00
ozaki-r	805fe96546	Fix panic on receiving an ARP request The panic happened if an ARP request has a spa (i.e., IP address) whose ARP entry already exists in the table as a static ARP entry.	2016-04-18 02:24:42 +00:00
ozaki-r	4ace575dc7	Get rid of meaningless RTF_UP check from ip_hresolv_output The check is meaningless because - An obtained rtentry is ensured that it's always RTF_UP by rtcache, rtalloc1 and rtlookup. If the rtentry isn't changed (i.e., RTF_UP gets dropped) during processing, the check should be unnecessary - Even if not, i.e., an obtained rtentry can be changed during processing, checking only at the point doesn't help; the rtentry can be changed after the check Instead we have to ensure that RTF_UP isn't dropped if someone is using it somehow. Note that we already ensure that a rtentry being used isn't freed by rt_refcnt. Proposed on tech-kern and tech-net.	2016-04-18 01:28:06 +00:00
rjs	b4a446b522	Remove stray debug printf().	2016-04-14 18:36:56 +00:00
ozaki-r	4f0eb37aac	ddb: rename show arptab to show routes show arptab command of ddb is now inappropriate because it actually dumps routes but arp entries aren't routes anymore. So rename it to show routes and move the code from if_arp.c to route.c. ok christos@	2016-04-13 00:47:01 +00:00
ozaki-r	322b6a238d	Sweep unnecessary radix.h inclusions	2016-04-11 08:56:16 +00:00
christos	b988d754df	- tidy up error messages - add a length argument to arpresolve() - add KASSERT for overflow	2016-04-07 03:22:15 +00:00
ozaki-r	09973b35ac	Separate nexthop caches from the routing table By this change, nexthop caches (IP-MAC address pair) are not stored in the routing table anymore. Instead nexthop caches are stored in each network interface; we already have lltable/llentry data structure for this purpose. This change also obsoletes the concept of cloning/cloned routes. Cloned routes no longer exist while cloning routes still exist with renamed to connected routes. Noticeable changes are: - Nexthop caches aren't listed in route show/netstat -r - sysctl(NET_RT_DUMP) doesn't return them - If RTF_LLDATA is specified, it returns nexthop caches - Several definitions of routing flags and messages are removed - RTF_CLONING, RTF_XRESOLVE, RTF_LLINFO, RTF_CLONED and RTM_RESOLVE - RTF_CONNECTED is added - It has the same value of RTF_CLONING for backward compatibility - route's -xresolve, -[no]cloned and -llinfo options are removed - -[no]cloning remains because it seems there are users - -[no]connected is introduced and recommended to be used instead of -[no]cloning - route show/netstat -r drops some flags - 'L' and 'c' are not seen anymore - 'C' now indicates a connected route - Gateway value of a route of an interface address is now not a L2 address but "link#N" like a connected (cloning) route - Proxy ARP: "arp -s ... pub" doesn't create a route You can know details of behavior changes by seeing diffs under tests/. Proposed on tech-net and tech-kern: http://mail-index.netbsd.org/tech-net/2016/03/11/msg005701.html	2016-04-04 07:37:07 +00:00
mlelstv	78f913b0b2	Replace generic queue macros with IFNET/IFADDR macros.	2016-04-03 09:57:40 +00:00
ozaki-r	35b18fbb1d	Remove unnecessary casts and do s/0/NULL/ for rtrequest	2016-04-01 09:16:02 +00:00
christos	6228dc517a	PR/50899: David Binderman: optimize memset	2016-03-06 19:46:05 +00:00
knakahara	9b7918b3ee	remove unnecessary declarations and fix KNF Thanks to riastradh@	2016-02-29 01:29:15 +00:00
knakahara	e80f101289	To eliminate gif_softc_list linear search, add extra argument to encapsw.pr_ctlinput().	2016-02-26 07:35:17 +00:00
ozaki-r	a143583fe0	Use callout_halt instead of callout_stop	2016-02-25 06:00:01 +00:00
rtr	0a0528fd0a	Fix building of IPv4-Mapped IPv6 addresses. As discussed on tech-net@ use in6_sin_2_v4mapsin6() to build mapped addresses.	2016-02-15 19:00:42 +00:00
rtr	e2a3307b85	Reduce code duplication. Split creation of IPv4-Mapped IPv6 addresses into its own function and use it. No functional change intended. As posted to tech-net@	2016-02-15 14:59:03 +00:00
rtr	f5c6d9772a	remove duplicated #include of <netinet/in.h>	2016-02-14 23:47:57 +00:00
ozaki-r	9c4cd06355	Introduce softint-based if_input This change intends to run the whole network stack in softint context (or normal LWP), not hardware interrupt context. Note that the work is still incomplete by this change; to that end, we also have to softint-ify if_link_state_change (and bpf) which can still run in hardware interrupt. This change softint-ifies at ifp->if_input that is called from each device driver (and ieee80211_input) to ensure Layer 2 runs in softint (e.g., ether_input and bridge_input). To this end, we provide a framework (called percpuq) that utilizes softint(9) and percpu ifqueues. With this patch, rxintr of most drivers just queues received packets and schedules a softint, and the softint dequeues packets and does rest packet processing. To minimize changes to each driver, percpuq is allocated in struct ifnet for now and that is initialized by default (in if_attach). We probably have to move percpuq to softc of each driver, but it's future work. At this point, only wm(4) has percpuq in its softc as a reference implementation. Additional information including performance numbers can be found in the thread at tech-kern@ and tech-net@: http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html Acknowledgment: riastradh@ greatly helped this work. Thank you very much!	2016-02-09 08:32:07 +00:00
knakahara	51f4870974	eliminate variable argument in encapsw	2016-01-26 06:00:10 +00:00
knakahara	b546d5277b	implement encapsw instead of protosw and uniform prototype. suggested and advised by riastradh@n.o, thanks. BTW, It seems in_stf_input() had bugs...	2016-01-26 05:58:05 +00:00
ozaki-r	07e20941bc	Remove unnecessary LLE_REMREF The code around it was copied from arptimer, but LLE_REMREF is unnecessary because it is needed only for arptimer that is called after LLE_ADDREF. This is a possible fix for PR#50548, PR#50702 and PR#50704.	2016-01-25 10:15:38 +00:00
riastradh	fa50b451d4	Those were local changes not meant to be part of the revert. SORRY!	2016-01-23 14:48:55 +00:00
christos	e1c6072fc4	fix compilation	2016-01-23 02:58:13 +00:00
riastradh	e588d95c25	Back out previous change to introduce struct encapsw. This change was intended, but Nakahara-san had already made a better one locally! So I'll let him commit that one, and I'll try not to step on anyone's toes again.	2016-01-22 23:27:12 +00:00
riastradh	87bc652e3d	Don't abuse struct protosw for ip_encap -- introduce struct encapsw. Mostly mechanical change to replace it, culling some now-needless boilerplate around all the users. This does not substantively change the ip_encap API or eliminate abuse of sketchy pointer casts -- that will come later, and will be easier now that it is not tangled up with struct protosw.	2016-01-22 05:15:10 +00:00
riastradh	7c7b1739c8	Revert previous: ran cvs commit when I meant cvs diff. Sorry! Hit up-arrow one too few times.	2016-01-21 15:41:29 +00:00
riastradh	b41d562bd0	Give proper prototype to ip_output.	2016-01-21 15:27:48 +00:00
riastradh	f8b0ac1cb4	Give proper prototype to ip_output.	2016-01-20 22:12:22 +00:00
riastradh	6439c4109a	Give proper prototype to rip_output.	2016-01-20 22:02:54 +00:00
riastradh	2880d69957	Give proper prototype to udp_output.	2016-01-20 22:01:18 +00:00
riastradh	65a8f527af	Eliminate struct protosw::pr_output. You can't use this unless you know what it is a priori: the formal prototype is variadic, and the different instances (e.g., ip_output, route_output) have different real prototypes. Convert the only user of it, raw_send in net/raw_cb.c, to take an explicit callback argument. Convert the only instances of it, route_output and key_output, to such explicit callbacks for raw_send. Use assertions to make sure the conversion to explicit callbacks is warranted. Discussed on tech-net with no objections: https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html	2016-01-20 21:43:59 +00:00
knakahara	2692d86ef7	remove unused variable.	2016-01-20 05:58:49 +00:00
knakahara	d7b9bb29c0	Refactor protosw codes in gif(4). No functional change. - remove unnecessary include - reduce scopes	2016-01-18 06:08:26 +00:00
christos	bd37d539ab	PR/50670: David Binderman: Tidy up debugging printfs to avoid if else confusion.	2016-01-17 15:08:10 +00:00
knakahara	1c5d304e9c	eliminate ip_input.c and ip6_input.c dependency on gif(4)	2016-01-08 03:55:39 +00:00
ozaki-r	d52244fae3	Make revarprequest static	2016-01-05 05:37:06 +00:00
knakahara	6d50f36d54	use satosin{,6} macros instead of casts.	2015-12-25 06:47:56 +00:00
ozaki-r	66d9895f20	Fix memory leak of llentry#la_opaque llentry#la_opaque which is for token ring is allocated in arp.c and freed in arp.c when freeing llentry. However, llentry can be freed from other places, e.g., lltable_free. In such cases, la_opaque is never freed. To fix that, add a new callback (lle_ll_free) to llentry and register a destruction function of la_opaque to it. On freeing a llentry, we can surely free la_opaque via the callback.	2015-12-17 02:38:33 +00:00
ozaki-r	213b8d3cc6	Fix token_rif extractions from llentry	2015-12-16 05:44:59 +00:00
christos	7273e27bf7	PR/50529: David Binderman: Remove double sizeof	2015-12-13 18:58:13 +00:00
christos	f2d1d0f2f7	PR/50528: David Binderman: remove sizeof(sizeof(x))	2015-12-13 18:53:57 +00:00
knakahara	a00e94f4ff	PR kern/50522: gif(4) ioctl causes panic while someone is using the gif(4) interface. It is required to wait other CPU's softint completion before disestablishing the softint handler.	2015-12-11 07:59:14 +00:00
ozaki-r	871888c540	Introduce arp_settimer No functional change.	2015-12-11 01:15:00 +00:00
knakahara	0072297ac8	ip_encap uses kmem_alloc APIs instead of malloc.	2015-12-09 06:00:51 +00:00
ozaki-r	cefec21119	Get rid of a big block in in_arpinput No functional change.	2015-11-30 06:45:38 +00:00
ozaki-r	f373fa78e6	Fix build dependency of if_llatbl.c if_llatbl.c is required if inet or inet6 is enabled. Depending on ether doesn't suit for NDP case.	2015-11-26 01:41:20 +00:00
ozaki-r	53e3e4714d	Restore softnet_lock and KERNEL_LOCK for rtrequest and rtfree We still need them for rt operations.	2015-11-19 03:03:04 +00:00
ozaki-r	17001ea619	Add missing rtfree	2015-11-16 05:39:39 +00:00
ozaki-r	e72fec577e	Fix db_print_llinfo rt_llinfo is now struct llentry.	2015-11-06 08:55:49 +00:00
ozaki-r	60defe31a6	Fix inappropriate rt_flags check It depended on either RTF_CLONED or RTF_CLONING must be set, however, the assumption didn't meet for userland problems that create a route via RTM_ADD. This fixes an issue that running rarpd causes the following kernel panic reported by nonaka@: panic: kernel diagnostic assertion "(la->la_flags & LLE_STATIC) == 0" failed: file "/usr/src/sys/netinet/if_arp.c", line 1339	2015-11-06 08:38:43 +00:00
ozaki-r	847c251da6	Stop callout in arp_rtrequest(RTM_DELETE) This change fixes arptimer panic after removing an interface (say by drvctl -d), which is reported by Takahiro Hayashi. This change also fixes llentry's reference counting; we have to take into account rtentry#rt_llinfo as well as arptimer.	2015-10-20 07:46:59 +00:00
ozaki-r	e4a5751875	Stop using softnet_lock (fix possible deadlock) Using softnet_lock for mutual exclusion between lltable_free and arptimer was wrong and had an issue causing a deadlock between them; lltable_free waits arptimer completion by calling callout_halt with softnet_lock that is held in arptimer, however lltable_free also holds llentry's lock that is also held in arptimer so arptimer never obtain the lock and both never go forward eventually. We have to pass llentry's lock to callout_halt instead.	2015-10-20 07:35:15 +00:00
roy	a2d314543b	In the event of an error within arpresolve(), delete the cloned route otherwise it would never be deleted.	2015-10-14 11:22:55 +00:00
roy	b0f4622d81	Save and clear the la route while we have a write lock	2015-10-14 11:17:57 +00:00
rjs	8c2654abca	Add core networking support for SCTP.	2015-10-13 21:28:34 +00:00
roy	222d6fab6a	arpresolve() now returns 0 on success otherwise an error code. Callers of arpresolve() now pass the error code back to their caller, masking out EWOULDBLOCK. This allows applications such as ping(8) to display a suitable error condition.	2015-10-13 12:33:07 +00:00
roy	9ba2bef003	Move the NOARP check up a bit so that it works when an la is created but hasn't been resolved yet. Fixes PR kern/17611.	2015-10-13 11:13:37 +00:00
roy	c47c3c3042	Include arp.h to restore the sysctl net.inet.ip.dad_count. Fixes PR kern/49883 thanks to HITOSHI Osada.	2015-10-13 09:46:42 +00:00
roy	b61ebcc9c7	Simplify la handling in arpresolve() by asking arplookup() not to create a la. If a la is needed arpresolve() will then create it or mark the current la as writable.	2015-10-13 09:33:35 +00:00
roy	620387577c	Create a temporary define involving IFF_STATICARP if we have it instead of just testing for __FreeBSD__. No functional change. ok: ozaki-r@	2015-10-08 08:17:37 +00:00
ozaki-r	98c468dd77	Create an llentry after fixing an interface to store In case of RTF_LOCAL routes, we change an output interface of a route from original one to lo0ifp. An llentry also has to be stored to lo0ifp in such cases. Problem reported by roy@	2015-10-07 00:33:27 +00:00
ozaki-r	a7ed97a295	Fix arplookup logic It should first lookup and then create an entry if not found (and if creation is requested).	2015-10-05 08:17:31 +00:00
skrll	e22ecac88e	Make this compile again	2015-09-21 13:32:26 +00:00
roy	f3b0c038a1	If, for whatever reason, a local interface route is removed and then re-added, mark it as a local route. While here, if changing the route to go via the loopback interface remove any inherited MTU value.	2015-09-11 10:33:32 +00:00
ozaki-r	4c03bf20c9	Remove wrong KASSERT in arptfree la_rt can be NULL because arptimer that calls arptfree doesn't always free llentry so llentry can remain with la_rt == NULL. So we instead check whether la_rt is NULL or not and do arptfree if not. This fixes PR kern/50184 (confirmed by martin@) and PR kern/50186 (maybe).	2015-09-09 01:24:01 +00:00
ozaki-r	c2ed920b63	Revert v1.176 for further proper fix	2015-09-09 01:22:28 +00:00
ozaki-r	cac0e9c370	Refactor tcp_mtudisc No functional change.	2015-09-07 01:56:50 +00:00
ozaki-r	6c5982b876	CID 1322880: remove unnecessary m != NULL checks	2015-09-07 01:18:27 +00:00
ozaki-r	d5ad433b2c	CID 1322878: simplify log output flow	2015-09-07 01:17:37 +00:00
ozaki-r	54c4f3b688	Do rt_refcnt++ when set a rtentry to another rtentry's rt_gwroute And also do rtfree when deref a rtentry from rt_gwroute.	2015-09-02 11:35:11 +00:00
christos	00158cce6d	XXX: Disable KASSERT for now since locking is broken for interface removals.	2015-09-02 09:28:13 +00:00
ozaki-r	13b8e486ae	Fix building kernels w/o ether	2015-08-31 16:46:14 +00:00
ozaki-r	ac75483513	Fix building kernels w/o DIAGNOSTIC	2015-08-31 09:21:55 +00:00
ozaki-r	7dc37e542b	Remove obsolete global variables and sysctl MIBs	2015-08-31 08:06:30 +00:00
ozaki-r	8997ac8f09	Replace ARP cache (llinfo) with lltable/llentry Highlights of the change are: - Use llentry instead of llinfo to manage ARP caches - ARP specific data are stored in the hashed list of an interface instead of the global list (llinfo_arp) - Fine-grain locking on llentry - arptimer (callout) per ARP cache - the global timer callout with the big locks can be removed (though softnet_lock is still required for now) - net.inet.arp.prune is now obsoleted - it was the interval of the global timer callout - net.inet.arp.refresh is now obsoleted - it was a parameter that prevents expiration of active caches - Removed to simplify the timer logic, but we may be able to restore the feature if really needed Proposed on tech-kern and tech-net.	2015-08-31 08:05:20 +00:00
ozaki-r	879526da38	Hook up lltable/llentry with the kernel (and rumpkernel) It is built and initialized on bootup, but there is no user for now. Most codes in in.c are imported from FreeBSD as well as lltable/llentry.	2015-08-31 08:02:44 +00:00
ozaki-r	3aedc74443	Make rt_refcnt take into account rt_timer	2015-08-31 06:25:15 +00:00
pooka	1c4a50f192	sprinkle _KERNEL_OPT	2015-08-24 22:21:26 +00:00
christos	e7ae23fd9e	include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.	2015-08-20 14:40:16 +00:00
ozaki-r	f818671bf4	Move insane goto label	2015-08-12 07:13:14 +00:00
ozaki-r	55140c1926	Use time_uptime instead of time_second to avoid time leaps Some codes in sys/net* use time_second to manage time periods such as cache expirations. However, time_second doesn't increase monotonically and can leap by say settimeofday(2) according to time_second(9). We should use time_uptime instead of it to avoid such time leaps. This change replaces time_second with time_uptime. Additionally it converts a time based on time_uptime to a time based on time_second when the kernel passes the time to userland programs that expect the latter, and vice versa. Note that we shouldn't leak time_uptime to other hosts over the network. My investigation shows there is no such leak: http://mail-index.netbsd.org/tech-net/2015/08/06/msg005332.html Discussed on tech-kern and tech-net.	2015-08-07 08:11:33 +00:00
matt	49cb8763aa	If we are sending a window probe and there's unacked data in the socket, make sure at least the persist timer is running.	2015-07-24 04:33:50 +00:00
matt	6de0fc0ff8	Make sure that snd_win doesn't go negative.	2015-07-24 04:31:20 +00:00
ozaki-r	9eae87d0c8	Reform use of rt_refcnt rt_refcnt of rtentry was used in bad manners, for example, direct rt_refcnt++ and rt_refcnt-- outside route.c, "rt->rt_refcnt++; rtfree(rt);" idiom, and touching rt after rt->rt_refcnt--. These abuses seem to be needed because rt_refcnt manages only references between rtentry and doesn't take care of references during packet processing (IOW references from local variables). In order to reduce the above abuses, the latter cases should be counted by rt_refcnt as well as the former cases. This change improves consistency of use of rt_refcnt: - rtentry is always accessed with rt_refcnt incremented - rtentry's rt_refcnt is decremented after use (rtfree is always used instead of rt_refcnt--) - functions returning rtentry increment its rt_refcnt (and caller rtfree it) Note that rt_refcnt prevents rtentry from being freed but doesn't prevent rtentry from being updated. Toward MP-safe, we need to provide another protection for rtentry, e.g., locks. (Or introduce a better data structure allowing concurrent readers during updates.)	2015-07-17 02:21:08 +00:00
ozaki-r	fcda92b6be	Remove unused arguments and the associated code from nd6_nud_hint() from OpenBSD	2015-07-15 09:20:18 +00:00
ozaki-r	bd4fe18031	Make global variables static	2015-07-15 08:49:15 +00:00
ozaki-r	f2abd6a2e3	Move rt_gwroute operation out of stripoutput We should do it in ip_hresolv_needed.	2015-07-14 08:44:59 +00:00
ozaki-r	f81368b844	Use ip_hresolv_output for if_token as well I thought we cannot apply ip_hresolv_output to if_token because rt0 looked being needed by arpresolve in token_output. However, rt0 is actually not used by arpresolve in NetBSD (see obsolete ARPRESOLVE macro).	2015-07-01 03:39:36 +00:00
roy	938235dc97	errno -> error, spotted by the hawk skrll	2015-06-08 08:19:20 +00:00
roy	17aa5fe87d	It's possible we could not have any ready addresses.	2015-06-08 08:02:43 +00:00
roy	be5b0a3a89	Don't set errno. Thanks to skrll@	2015-06-08 07:59:54 +00:00
ozaki-r	6ea8c2e666	Pull out route lookups from L2 output routines Route lookups for routes of RTF_GATEWAY were done in L2 output routines such as ether_output, but they should be done in L3 i.e., before L2 output routines. This change places the lookups between L3 output routines (say ip_output) and the L2 output routines. The change is based on dyoung's patch submitted in the thread: https://mail-index.netbsd.org/tech-net/2013/02/01/msg003847.html You can find out detailed investigations by dyoung about the issue in there. Note that the change introduces a workaround for MPLS. ether_output knew that it needs to fill the ethertype of a frame as MPLS, based on a tag of an original route (rtentry), but now we don't pass it to ether_output. So we have to tell that in another way. We use mtag to do so for now, which introduces some overhead. We should fix it somehow in the future. Discussed on tech-kern and tech-net.	2015-06-04 09:19:59 +00:00
rtr	e7083d7a4b	remove transitional functions in{,6}_pcbconnect_m() that were used in converting protocol user requests to accept sockaddr instead of mbufs. remove tcp_input copy in to mbuf from sockaddr and just copy to sockaddr to make it possible for the transitional functions to go away. no version bump since these functions only existed for a short time and were commented as adapters (they appeared in 7.99.15).	2015-05-24 15:43:45 +00:00
ozaki-r	423491c235	Replace NARC with NARCNET to follow renaming at 2007 Hmm, is anyone using this?	2015-05-22 07:44:46 +00:00
ozaki-r	b41c75c271	Use LIST_FOREACH{,_SAFE} The first loop doesn't remove any items in it, so we can use LIST_FOREACH instead of LIST_FOREACH_SAFE.	2015-05-21 09:29:51 +00:00
ozaki-r	442b227d9f	Use NULL instead of 0 for pointers	2015-05-21 09:27:10 +00:00
ozaki-r	2c0e34375a	Make arp_init, in_revarpinput and revarprequest static	2015-05-21 09:26:18 +00:00
kefren	f3bd20e96c	Use RUN_ONCE to initialize iss secret. Suggested by riastradh@	2015-05-19 17:33:43 +00:00
roy	f45d868787	Separate ARP handling DAD from inet. This is done by signalling the intent to try tentative addresses and then clearing the intent once the address is setup. When the ARP handler is installed (arp_ifinit) then it adds dad start and stop functions to the address which are used instead of calling ARP directly.	2015-05-16 12:12:46 +00:00
kefren	56d130b58b	Don't overexpose tcp_iss_secret and don't bother compute it unless RFC1948 compliance is activated	2015-05-16 10:09:20 +00:00
kefren	a6fab82126	Don't put segment on the wire if security request can't be fulfilled	2015-05-16 01:15:34 +00:00
kefren	110f4b05db	Don't try to do PCB lookup for bad checksummed segments Fixes PR/43510 and PR/48452	2015-05-15 18:03:45 +00:00
christos	37fd390ec4	if no address was found, don't check if it is tentative (hi Roy)	2015-05-09 18:47:26 +00:00
christos	28383371f1	assign sin only when it is needed	2015-05-09 18:46:25 +00:00
roy	bdb2ef03d5	If we don't have ARP, don't set IN_IFF_TENTATIVE.	2015-05-05 08:52:51 +00:00
justin	f5df4fc799	Rename delay variable as it shadows a global on arm.	2015-05-03 10:44:04 +00:00
joerg	5cad40c933	Fix !ARP build.	2015-05-02 20:22:12 +00:00
rtr	fd12cf39ee	make connect syscall use sockaddr_big and modify pr_{send,connect} nam parameter type from buf * to sockaddr . final commit for parameter type changes to protocol user requests bump kernel version to 7.99.15 for parameter type changes to pr_{send,connect}	2015-05-02 17:18:03 +00:00
roy	866e96fa79	Appease gcc.	2015-05-02 15:22:03 +00:00
roy	505639d2f3	Add IPv4 address flags IN_IFF_TENTATIVE, IN_IFF_DUPLICATED and IN_IFF_DETACHED to mimic the IPv6 address behaviour. Add SIOCGIFAFLAG_IN ioctl to retrieve the address flag via the ifreq structure. Add IPv4 DAD detection via the ARP methods described in RFC 5227. Add sysctls net.inet.ip.dad_count and net.inet.arp.debug. Discussed on tech-net@	2015-05-02 14:41:32 +00:00
christos	ffe2b84e28	Apply Revision 220794 from FreeBSD to avoid dup ACKs: When checking to see if a window update should be sent to the remote peer, don't force a window update if the window would not actually grow due to window scaling. Specifically, if the window scaling factor is larger than 2 * MSS, then after the local reader has drained 2 * MSS bytes from the socket, a window update can end up advertising the same window. If this happens, the supposed window update actually ends up being a duplicate ACK. This can result in an excessive number of duplicate ACKs when using a higher maximum socket buffer size. Pointed out by Ricky Charlet, in tech-net.	2015-04-27 16:50:17 +00:00
ozaki-r	5f21075b8f	Add missing error checks on rtcache_setdst It can fail with ENOMEM.	2015-04-27 10:14:44 +00:00
ozaki-r	2373b55abc	Introduce in6_selecthlim_rt to consolidate an idiom for rt->rt_ifp It consolidates a scattered routine: (rt = rtcache_validate(&in6p->in6p_route)) != NULL ? rt->rt_ifp : NULL	2015-04-27 02:59:44 +00:00
rtr	d2aa9dd71f	remove pr_generic from struct pr_usrreqs and all implementations of pr_generic in protocols. bump to 7.99.13 approved by rmind@	2015-04-26 21:40:48 +00:00
rtr	89539c0d5f	return EINVAL if sin{,6}_len != sizeof(sockaddr_in{,6}) respectively in in{,6}_pcbconnect(). checking just m->m_len isn't enough because there are various places that assume sa_len has been properly populated.	2015-04-26 16:45:50 +00:00
rtr	69b4af1034	make rip_connect_pcb take sockaddr_in * instead of mbuf * make rip_connect_pcb static since it appears to be used only in raw_ip.c moves m_len check to callers which is a small duplication of code that will go away when the callers are converted to receive sockaddr *.	2015-04-25 15:19:54 +00:00
rtr	eddf3af3c6	make accept, getsockname and getpeername syscalls use sockaddr_big and modify pr_{accept,sockname,peername} nam parameter type from mbuf * to sockaddr *. * retained use of mbuftypes[MT_SONAME] for now. * bump to netbsd version 7.99.12 for parameter type change. patch posted to tech-net@ 2015/04/19	2015-04-24 22:32:37 +00:00
ozaki-r	06f4ab5ebf	Use KASSERT instead of if & panic rt can be NULL only when programming error (and we sure it cannot for now), so we can use KASSERT here (i.e., check only if DIAGNOSTIC).	2015-04-24 03:20:41 +00:00
ozaki-r	840cc553d7	Replace 0 with NULL for pointer variables	2015-04-24 02:56:51 +00:00
ozaki-r	2af3302ac0	KNF	2015-04-24 00:48:47 +00:00
ozaki-r	d18817a22a	Remove non-USE_RADIX case and USE_RADIX switch It seems that we have been using ip_encap only with USE_RADIX for long years. Let's remove unused non-USE_RADIX case. No objection on tech-kern and tech-net. Double-checked by knakahara@	2015-04-20 07:34:48 +00:00
ozaki-r	cefb9995f4	Remove garbage undef	2015-04-16 06:50:16 +00:00
riastradh	691129c8c5	KASSERT x then y, not x && y, to give more specific errors.	2015-04-15 13:02:16 +00:00
ozaki-r	e952648134	Use LIST_FOREACH_SAFE We have to use LIST_FOREACH_SAFE because LIST_REMOVE is used inside the loop through encap_remove.	2015-04-15 08:47:28 +00:00
ozaki-r	73c17c4a13	Replace DIAGNOSTIC & panic with KASSERT/KASSERTMSG	2015-04-15 03:38:50 +00:00
ozaki-r	dcfb08075f	Add $NetBSD$ at the top of the file	2015-04-15 03:32:23 +00:00
riastradh	556fc62b15	cprng_strong(kern_cprng, ...) never blocks, pass 0 for flags. FASYNC was wrong anyway! It's FNONBLOCK.	2015-04-13 15:51:00 +00:00
rtr	80ea8ccc7c	* update dccp_bind for struct mbuf * to struct sockaddr * parameter change * pass NULL instead of casting 0 to a pointer when calling in_pcbbind()	2015-04-04 04:33:38 +00:00
rtr	a2ba5e69ab	* change pr_bind to accept struct sockaddr * instead of struct mbuf * * update protocol bind implementations to use/expect sockaddr * instead of mbuf * * introduce sockaddr_big struct for storage of addr data passed via sys_bind; sockaddr_big is of sufficient size and alignment to accommodate all addr data sizes received. * modify sys_bind to allocate sockaddr_big instead of using an mbuf. * bump kernel version to 7.99.9 for change to pr_bind() parameter type. Patch posted to tech-net@ http://mail-index.netbsd.org/tech-net/2015/03/15/msg005004.html The choice to use a new structure sockaddr_big has been retained since changing sockaddr_storage size would lead to unnecessary ABI change. The use of the new structure does not preclude future work that increases the size of sockaddr_storage and at that time sockaddr_big may be trivially replaced. Tested by mrg@ and myself, discussed with rmind@, posted to tech-net@	2015-04-03 20:01:07 +00:00
ozaki-r	9817ed1a76	Don't grab KERNEL_LOCK during if_output when NET_MPSAFE The change makes L3 MP-safe work easy. At this point we deal with only IP forwarding. No functional change when NET_MPSAFE isn't enabled.	2015-04-03 07:55:18 +00:00
ozaki-r	71b1eb47ca	Remove unnecessary opt_ipsec.h inclusions	2015-03-31 08:47:01 +00:00
ozaki-r	7f0bd664ae	Add missing ifdef IPSEC	2015-03-31 08:44:43 +00:00
ozaki-r	50468f9be7	Tidy up the regular path of ip_forward No functional change is intended.	2015-03-26 04:05:58 +00:00
roy	a37502b2b6	Add RTF_BROADCAST to mark routes used for the broadcast address when they are created on the fly. This makes it clear what the route is for and allows an optimisation in ip_output() by avoiding a call to in_broadcast() because most of the time we do talk to a host. It also avoids a needless allocation for the storage of llinfo_arp and thus vanishes from arp(8) - it showed as incomplete anyway so this is a nice side effect. Guard against this and routes marked with RTF_BLACKHOLE in ip_fastforward(). While here, guard against routes marked with RTF_BLACKHOLE in ip6_fastforward(). RTF_BROADCAST is IPv4 only, so don't bother checking that here.	2015-03-23 18:33:17 +00:00
rtr	8699b912c3	Move code that is conditional on options INET6 into #ifdef INET6. * Re-organize some variable declarations to limit #ifdef's. * Move INET and INET6 code into respective switch cases to simplify #ifdef INET6. No intended functional change.	2015-03-14 02:08:16 +00:00
roy	5170946304	Don't add local routes for the any address or p2p addresses where the address matches the destination.	2015-02-26 12:58:36 +00:00
roy	42900924fd	Introduce the routing flag RTF_LOCAL to track local address routes. Add functions rt_ifa_addlocal() and rt_ifa_remlocal() to add and remove local routes for the address and announce the new address and route to the routing socket. Add in_ifaddlocal() and in_ifremlocal() to use these functions. Rename in6_if{add,rem}loop() to in6_if{add,rem}local() and use these functions. rtinit() no longer announces the address, just the network route for the address. As such, calls to rt_newaddrmsg() have been removed from in_addprefix() and in_scrubprefix(). This solves the problem of potentially more than one announcement, or no announcement at all for the address in certain situations.	2015-02-26 09:54:46 +00:00
christos	31fb02278a	PR/49676: Ryo Shimizu: ICMP_STATINC() buffer overflows XXX: pullup-7	2015-02-18 17:00:15 +00:00
he	de7f57fda9	Change the new counter variables in struct tcpcb to uint32_t, as per christos' comments.	2015-02-14 22:09:53 +00:00
he	1d14d02249	Port over the TCP_INFO socket option from FreeBSD, originally from the Linux 2.6 TCP API. This permits the caller to query certain information about a TCP connection, and is used by pkgsrc's net/iperf3 test program if available. This extends struct tcpcb with three fields to count retransmits, out-of-sequence receives and zero window announcements, and will therefore warrant a kernel revision bump (done separately).	2015-02-14 12:57:52 +00:00
rjs	652788239c	Add DCCP protocol support from KAME.	2015-02-10 19:11:52 +00:00
christos	f89df58b37	use the new printing code.	2014-12-02 20:25:47 +00:00
christos	dfbbb8d8b5	add routines to print in_addr and sockaddr_in (in_print and sin_print)	2014-12-02 19:35:27 +00:00
christos	da48f144c9	Don't pass junk in sin_family and sin_len for SIOCGIFNETMASK, and explain why. XXX: pullup 7?	2014-12-01 17:07:43 +00:00
christos	40d7a68275	Only check that the offset < sizeof(struct ip) if nxt != 0, i.e. in the tcp and udp cases. From kre. XXX: pullup 7	2014-11-30 18:15:41 +00:00
ozaki-r	fb797ebb35	Call looutput with holding KERNEL_LOCK This fixes diagnostic assertion "KERNEL_LOCKED_P()" in if_loop.c. PR kern/49410	2014-11-26 10:18:37 +00:00
seanb	ae36e3e5b1	Really make SO_REUSEPORT and SO_REUSEADDR equivalent for multicast sockets. From FreeBSD.	2014-11-25 19:09:13 +00:00
seanb	56c6664a5c	Clean up any dangling ifp references in (struct in6pcb *)->in6p_v4moptions (v4 multicast options off v4 mapped v6 socket) on interface destruction. The code to clean this up in a true v4 socket was moved to its own function which is now also called in the corresponding place for v6 sockets on interface destruction.	2014-11-25 15:04:37 +00:00
christos	cb8dda3c0e	Add sysctl to selectively log arp packets from unknown network. (Adrien URBAN).	2014-11-13 16:11:18 +00:00
maxv	fcc99ce60e	Do not uselessly include <sys/malloc.h>.	2014-11-10 18:46:33 +00:00
christos	828d274251	Avoid stack overflow when SACK and TCP_SIGNATURE are both present. Thanks to Jonathan Looney for pointing this out.	2014-10-25 15:07:13 +00:00
hikaru	62fa1e32f7	Fix wrong condition checking TSO capability. ipsec_used is not necessary condition. IPsec outbound policy will not be checked when ipsec_used is false.	2014-10-21 13:44:47 +00:00
snj	f0a7346d21	src is too big these days to tolerate superfluous apostrophes. It's "its", people!	2014-10-18 08:33:23 +00:00
christos	01fcb35dc2	document that we depend on the option numbers matching.	2014-10-12 19:02:18 +00:00
christos	4f85a755f8	Refactor the multicast membership code so that we can handle v4 mapped addresses using the v6 membership ioctls.	2014-10-12 19:00:21 +00:00
christos	07d9441357	expose multicast option functions which are used by the v6 code now.	2014-10-11 21:12:51 +00:00
rmind	436f757159	Eliminate IFAREF() and IFAFREE() macros in favour of functions.	2014-09-09 20:16:12 +00:00
joerg	8403248f23	Always use cprng_fast32, even during initialisation. No point in using random(9).	2014-09-08 17:40:02 +00:00
rmind	2082db2d3c	in_pcbdetach: move ip_freemoptions() under softnet_lock for now (this will be changed back once other IP paths become MP-safe). Same for IPv6 routine. This partially reverts 1.150 of in_pcb.c and 1.127 of in6_pcb.c changes.	2014-09-07 00:50:56 +00:00
matt	a63dc570e9	Don't use C++ keyword (template) as variable.	2014-09-05 06:04:43 +00:00
matt	6c3d985231	Don't use C++ keywords (class, template) as variables	2014-09-05 06:03:51 +00:00
matt	8f413cecf4	Deanonymize structure for llinfo_arp.	2014-09-05 06:02:11 +00:00
rtr	8cf67cc6d5	split PRU_CONNECT2 & PRU_PURGEIF function out of pr_generic() usrreq switches and put into separate functions - always KASSERT(solocked(so)) even if not implemented (for PRU_CONNECT2 only) - replace calls to pr_generic() with req = PRU_CONNECT2 with calls to pr_connect2() - replace calls to pr_generic() with req = PRU_PURGEIF with calls to pr_purgeif() put common code from unp_connect2() (used by unp_connect()) into unp_connect1() and call out to it when needed patch only briefly reviewed by rmind@	2014-08-09 05:33:00 +00:00
rtr	822872eada	split PRU_RCVD function out of pr_generic() usrreq switches and put into separate functions - always KASSERT(solocked(so)) even if not implemented - replace calls to pr_generic() with req = PRU_RCVD with calls to pr_rcvd()	2014-08-08 03:05:44 +00:00
rtr	651e5bd3f8	split PRU_SEND function out of pr_generic() usrreq switches and put into separate functions xxx_send(struct socket *, struct mbuf *, struct mbuf *, struct mbuf *, struct lwp *) - always KASSERT(solocked(so)) even if not implemented - replace calls to pr_generic() with req = PRU_SEND with calls to pr_send() rename existing functions that operate on PCB for consistency (and to free up their names for xxx_send() PRUs) - l2cap_send() -> l2cap_send_pcb() - sco_send() -> sco_send_pcb() - rfcomm_send() -> rfcomm_send_pcb() patch reviewed by rmind	2014-08-05 07:55:31 +00:00
rtr	8e80ae3c97	get_tcppcb() is nearly always called upon entry to usrreqs so KASSERT(solocked(so)) inside it and remove the redundant KASSERT everywhere we are using tcp_getpcb()	2014-08-05 07:10:41 +00:00
rtr	ce6a5ff64f	revert the removal of struct lwp * parameter from bind, listen and connect user requests. this should resolve the issue relating to nfs client hangs presented recently by wiz on current-users@	2014-08-05 05:24:26 +00:00


2555 Commits