1467 Commits

Author SHA1 Message Date
ozaki-r
040205ae93 Protect ifnet list with psz and psref
The change ensures that ifnet objects in the ifnet list aren't freed during
list iterations by using pserialize(9) and psref(9).

Note that the change adds a pslist(9) for ifnet but doesn't remove the
original ifnet list (ifnet_list) to avoid breaking kvm(3) users. We
shouldn't use the original list in the kernel anymore.
2016-05-12 02:24:16 +00:00
is
142ff9d692 Let non-neighbor NS/NA debug error message include useful information. 2016-04-29 11:46:17 +00:00
ozaki-r
ad0fbab4d2 Get rid of unused argument from get_rand_ifid 2016-04-27 07:51:14 +00:00
ozaki-r
9e0f6c5e36 Stop using rt_gwroute on packet sending paths
rt_gwroute of rtentry is a reference to a rtentry of the gateway
for a rtentry with RTF_GATEWAY. That was used by L2 (arp and ndp)
to look up L2 addresses. By separating L2 nexthop caches, we don't
need a route for the purpose and we can stop using rt_gwroute.
By doing so, we can reduce referencing and modifying rtentries,
which makes it easy to apply a lock (and/or psref) to the
routing table and rtentries.

One issue in doing this is keeping the RTF_REJECT behavior. It seems it
was broken when we moved rtalloc1 things from L2 output routines
(e.g., ether_output) to ip_hresolv_output, but (fortunately?)
it still worked, unexpectedly. The mistakes were:
- RTF_REJECT was checked for any routes in L2 output routines,
  but in ip_hresolv_output it is checked only when the route
  is RTF_GATEWAY
- The RTF_REJECT check wasn't copied to IPv6 (nd6_output)

It seems that the rt_gwroute checks hid the mistakes, so it appeared
to work (unexpectedly), and removing the rt_gwroute checks unveils the
issue. So we need to fix the RTF_REJECT checks in ip_hresolv_output
and also add them to nd6_output.

One more point we have to take care of is returning an errno; we need
to mimic the looutput behavior. Originally the RTF_REJECT check was
done either in L2 output routines or in looutput. The latter is
applied when a reject route directs to a loopback interface.
However, the RTF_REJECT check is now done before looutput, so to keep
the original behavior we need to return the errno that looutput
would choose. The newly added rt_check_reject_route does these tweaks.
2016-04-26 09:30:01 +00:00
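The errno selection described in the commit above can be sketched as follows. This is a minimal illustration, not the committed rt_check_reject_route: the flag constants, the function name and signature, and the specific errno values chosen for each case are all assumptions made for the example.

```c
#include <errno.h>

/* Hypothetical flag values for the sketch; not the kernel's definitions. */
#define SK_RTF_GATEWAY	0x1
#define SK_RTF_HOST	0x2
#define SK_RTF_REJECT	0x4

/*
 * Return 0 if the route is usable, otherwise an errno.  When the
 * reject route points at a loopback interface, pick the errno that
 * the loopback output path would have picked (assumed values), since
 * the check now runs before that path is reached.
 */
static int
sketch_check_reject(int rt_flags, int if_is_loopback)
{
	if ((rt_flags & SK_RTF_REJECT) == 0)
		return 0;
	if (if_is_loopback)
		return (rt_flags & SK_RTF_HOST) ? EHOSTUNREACH : ENETUNREACH;
	/* Non-loopback case: assumed errno choice for illustration. */
	return (rt_flags & SK_RTF_GATEWAY) ? EHOSTUNREACH : EHOSTDOWN;
}
```

The point of the sketch is only that the check applies to any route with the reject flag, not just gateway routes, and that the caller sees the same errno regardless of where the check is performed.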
ozaki-r
a79dfa5db0 Sweep unnecessary route.h inclusions 2016-04-26 08:44:44 +00:00
rjs
505ea9765f Fix build when IPSEC enabled. 2016-04-25 21:21:02 +00:00
ozaki-r
0c74cec625 Check error of rt_setgate and rt_settag 2016-04-25 14:38:08 +00:00
ozaki-r
c325d0ca4f Fix RTF_{REJECT,BLACKHOLE} behavior for IPv6 routes
We still need a nexthop route to reflect RTF_{REJECT,BLACKHOLE}.
In the future, we would do it w/o looking up a route.
2016-04-21 05:07:50 +00:00
ozaki-r
322b6a238d Sweep unnecessary radix.h inclusions 2016-04-11 08:56:16 +00:00
ozaki-r
dd3c4fc3e5 Don't call pfxlist_onlink_check while holding llentry lock
From FreeBSD (as of 2016-04-11).

Should fix PR kern/51060.
2016-04-11 01:16:20 +00:00
ozaki-r
f0071d85a1 Don't call pfxlist_onlink_check while holding llentry lock
Sync nd6_free with FreeBSD (as of 2016-04-10).

Should fix PR kern/51056.
2016-04-10 08:15:52 +00:00
roy
60a5a4a8a7 all1_sa is no longer used. 2016-04-04 12:05:40 +00:00
ozaki-r
09973b35ac Separate nexthop caches from the routing table
By this change, nexthop caches (IP-MAC address pair) are not stored
in the routing table anymore. Instead nexthop caches are stored in
each network interface; we already have lltable/llentry data structure
for this purpose. This change also obsoletes the concept of cloning/cloned
routes. Cloned routes no longer exist, while cloning routes still exist
but are renamed to connected routes.

Noticeable changes are:
- Nexthop caches aren't listed in route show/netstat -r
  - sysctl(NET_RT_DUMP) doesn't return them
  - If RTF_LLDATA is specified, it returns nexthop caches
- Several definitions of routing flags and messages are removed
  - RTF_CLONING, RTF_XRESOLVE, RTF_LLINFO, RTF_CLONED and RTM_RESOLVE
- RTF_CONNECTED is added
  - It has the same value as RTF_CLONING for backward compatibility
- route's -xresolve, -[no]cloned and -llinfo options are removed
  - -[no]cloning remains because it seems there are users
  - -[no]connected is introduced and recommended
    to be used instead of -[no]cloning
- route show/netstat -r drops some flags
  - 'L' and 'c' are not seen anymore
  - 'C' now indicates a connected route
- Gateway value of a route of an interface address is now not
  an L2 address but "link#N" like a connected (cloning) route
- Proxy ARP: "arp -s ... pub" doesn't create a route

Details of the behavior changes can be seen in the diffs under tests/.

Proposed on tech-net and tech-kern:
  http://mail-index.netbsd.org/tech-net/2016/03/11/msg005701.html
2016-04-04 07:37:07 +00:00
ozaki-r
35b18fbb1d Remove unnecessary casts and do s/0/NULL/ for rtrequest 2016-04-01 09:16:02 +00:00
ozaki-r
103bd8df24 Refine nd6log
Add __func__ to nd6log itself instead of adding it to callers.
2016-04-01 08:12:00 +00:00
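The idea of putting __func__ into the logging macro itself, rather than into every caller, can be sketched like this. The names sketch_nd6log, fake_log and check_prefix are hypothetical; fake_log stands in for the kernel's log function so the example can run in userland, and the real nd6log definition may differ.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel logger; keeps the last message. */
static char last_msg[256];

static void
fake_log(int level, const char *fmt, ...)
{
	va_list ap;

	(void)level;
	va_start(ap, fmt);
	vsnprintf(last_msg, sizeof(last_msg), fmt, ap);
	va_end(ap);
}

/*
 * The macro prepends the caller's function name, so call sites no
 * longer pass __func__ themselves.  ##__VA_ARGS__ is a GNU extension
 * supported by gcc and clang.
 */
#define sketch_nd6log(level, fmt, ...) \
	fake_log((level), "%s: " fmt, __func__, ##__VA_ARGS__)

/* Example caller: the message is prefixed with "check_prefix: ". */
static const char *
check_prefix(int plen)
{
	sketch_nd6log(3, "invalid prefix length %d\n", plen);
	return last_msg;
}
```

Because __func__ is expanded at the point where the macro is used, it names the calling function, not the logging helper.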
ozaki-r
19fb0179dc Use __func__ in log messages 2016-04-01 06:25:51 +00:00
ozaki-r
acdecad069 Tidy up nd6_timer initialization 2016-04-01 05:11:38 +00:00
knakahara
9b7918b3ee remove unnecessary declarations and fix KNF
Thanks to riastradh@
2016-02-29 01:29:15 +00:00
knakahara
e80f101289 To eliminate gif_softc_list linear search, add extra argument to encapsw.pr_ctlinput(). 2016-02-26 07:35:17 +00:00
rtr
e2a3307b85 Reduce code duplication.
Split creation of IPv4-Mapped IPv6 addresses into its own function
and use it.

No functional change intended.  As posted to tech-net@
2016-02-15 14:59:03 +00:00
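As an illustration of what such a helper does, here is a minimal userland sketch that builds an IPv4-mapped IPv6 address (::ffff:a.b.c.d). The function name sketch_in4_to_mapped is hypothetical and is not the name used in the commit.

```c
#include <string.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: fill *in6 with the IPv4-mapped form of *in4.
 * Bytes 0-9 are zero, bytes 10-11 are 0xff, and bytes 12-15 carry
 * the IPv4 address in network byte order.
 */
static void
sketch_in4_to_mapped(const struct in_addr *in4, struct in6_addr *in6)
{
	memset(in6, 0, sizeof(*in6));
	in6->s6_addr[10] = 0xff;
	in6->s6_addr[11] = 0xff;
	memcpy(&in6->s6_addr[12], &in4->s_addr, sizeof(in4->s_addr));
}
```

Keeping this in one function means every caller produces the same byte layout, which is the duplication the commit removes.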
ozaki-r
9c4cd06355 Introduce softint-based if_input
This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!
2016-02-09 08:32:07 +00:00
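The queue-then-defer pattern described in the commit above can be sketched in plain C. This is a simplified, single-queue userland model with hypothetical names (sk_queue, sk_rxintr, sk_softint); it has no per-CPU state or locking and does not use the real percpuq or softint(9) interfaces.

```c
#include <stddef.h>

#define SK_QLEN 8	/* ring size; holds at most SK_QLEN - 1 packets */

struct sk_queue {
	int	pkts[SK_QLEN];
	size_t	head, tail;
	int	softint_pending;
};

static int sk_input_sum;	/* accumulates "processed" packets */

/* Stand-in for the deferred input routine (e.g. Layer 2 input). */
static void
sk_count_input(int pkt)
{
	sk_input_sum += pkt;
}

/* "Interrupt" side: only enqueue the packet and request the softint. */
static int
sk_rxintr(struct sk_queue *q, int pkt)
{
	size_t next = (q->tail + 1) % SK_QLEN;

	if (next == q->head)
		return -1;	/* queue full: drop */
	q->pkts[q->tail] = pkt;
	q->tail = next;
	q->softint_pending = 1;
	return 0;
}

/* "Softint" side: drain the queue and run the real input processing. */
static int
sk_softint(struct sk_queue *q, void (*input)(int))
{
	int n = 0;

	q->softint_pending = 0;
	while (q->head != q->tail) {
		input(q->pkts[q->head]);
		q->head = (q->head + 1) % SK_QLEN;
		n++;
	}
	return n;
}
```

The receive path does a bounded amount of work per packet, and all protocol processing happens later in the drain routine.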
riastradh
3bc04b00b8 Declare in6_tmpaddrtimer_ch in in6_var.h.
Do not declare extern variables in .c files!
2016-02-04 02:48:37 +00:00
knakahara
b546d5277b implement encapsw instead of protosw and unify prototypes.
suggested and advised by riastradh@n.o, thanks.

BTW, It seems in_stf_input() had bugs...
2016-01-26 05:58:05 +00:00
riastradh
fa50b451d4 Those were local changes not meant to be part of the revert. SORRY! 2016-01-23 14:48:55 +00:00
christos
e28df70b14 make this compile again 2016-01-23 14:03:04 +00:00
riastradh
e588d95c25 Back out previous change to introduce struct encapsw.
This change was intended, but Nakahara-san had already made a better
one locally!  So I'll let him commit that one, and I'll try not to
step on anyone's toes again.
2016-01-22 23:27:12 +00:00
riastradh
87bc652e3d Don't abuse struct protosw for ip_encap -- introduce struct encapsw.
Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.
2016-01-22 05:15:10 +00:00
riastradh
7c7b1739c8 Revert previous: ran cvs commit when I meant cvs diff. Sorry!
Hit up-arrow one too few times.
2016-01-21 15:41:29 +00:00
riastradh
b41d562bd0 Give proper prototype to ip_output. 2016-01-21 15:27:48 +00:00
riastradh
65a8f527af Eliminate struct protosw::pr_output.
You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.

Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument.  Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.

Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html
2016-01-20 21:43:59 +00:00
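The conversion from a variadic function-pointer slot to an explicit callback argument can be sketched as below. All names here (sketch_pkt, sketch_output_fn, sketch_raw_send, sketch_route_output) are hypothetical, and the signatures are much simpler than the kernel's raw_send and route_output.

```c
#include <stddef.h>

struct sketch_pkt {
	int len;
};

/* Fully typed callback: the compiler checks every argument. */
typedef int (*sketch_output_fn)(struct sketch_pkt *, void *);

/*
 * The sender takes the output routine as an explicit parameter
 * instead of fetching a variadic pointer from a protocol switch.
 */
static int
sketch_raw_send(struct sketch_pkt *pkt, void *ctx, sketch_output_fn output)
{
	if (pkt == NULL || output == NULL)
		return -1;
	return output(pkt, ctx);
}

/* One concrete output routine matching the callback type. */
static int
sketch_route_output(struct sketch_pkt *pkt, void *ctx)
{
	(void)ctx;
	return pkt->len;
}
```

With a variadic prototype, a mismatched call compiles silently; with the typed callback, passing a routine of the wrong shape is a compile-time diagnostic.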
knakahara
d7b9bb29c0 Refactor protosw codes in gif(4). No functional change.
- remove unnecessary include
- reduce scopes
2016-01-18 06:08:26 +00:00
ozaki-r
5c49460e3c Add missing RTF_LOCAL; sync with arp_setgate 2016-01-08 08:50:07 +00:00
knakahara
1c5d304e9c eliminate ip_input.c and ip6_input.c dependency on gif(4) 2016-01-08 03:55:39 +00:00
knakahara
6d50f36d54 use satosin{,6} macros instead of casts. 2015-12-25 06:47:56 +00:00
ozaki-r
9c1d124220 Add missing LLE_WUNLOCK to nd6_free 2015-12-18 09:04:33 +00:00
christos
5b5956f338 Hook up the addrctl stuff that's already there. 2015-12-12 23:34:25 +00:00
knakahara
a00e94f4ff PR kern/50522: gif(4) ioctl causes panic while someone is using the gif(4) interface.
It is required to wait for other CPUs' softint completion before disestablishing
the softint handler.
2015-12-11 07:59:14 +00:00
ozaki-r
c6e461ee0d CID 1341546: Fix integer handling issue (CONSTANT_EXPRESSION_RESULT)
n > INT_MAX, where n is a long integer variable, can never be true on 32-bit
architectures. Use time_t (int64_t) instead of long for the variable.
2015-12-07 06:19:13 +00:00
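The issue in the commit above can be illustrated with a small sketch. On an ILP32 platform, long has the same range as int, so a comparison of a long against INT_MAX is a constant false expression; with a 64-bit type the comparison is meaningful on every architecture. The helper name and the clamping behavior are hypothetical and are not taken from the changed code.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: clamp a 64-bit number of seconds into an int.
 * Because secs is int64_t, "secs > INT_MAX" can actually be true,
 * even where long is only 32 bits wide.
 */
static int
sketch_clamp_secs(int64_t secs)
{
	if (secs > INT_MAX)
		return INT_MAX;
	if (secs < 0)
		return 0;
	return (int)secs;
}
```

Had secs been declared long, a static analyzer would flag the first comparison on 32-bit targets, which is the kind of CONSTANT_EXPRESSION_RESULT report the commit addresses.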
ozaki-r
2c1e216cf8 Replace __debugused with __diagused
Declaring __debugused was just a mistake. This fixes builds of kernels with
DEBUG but without DIAGNOSTIC.
2015-11-27 02:54:22 +00:00
ozaki-r
ff97010dea Declare __debugused for no DIAGNOSTIC kernels
This unbreaks hpcsh GENERIC kernel build.
2015-11-25 07:06:19 +00:00
ozaki-r
ecd5b23eef Use lltable/llentry for NDP
lltable and llentry were introduced to replace ARP cache data structure
for further restructuring of the routing table: L2 nexthop cache
separation. This change replaces the NDP cache data structure
(llinfo_nd6) with them, as was already done for ARP.

One noticeable change is to the neighbor cache GC mechanism that was
introduced to prevent IPv6 DoS attacks. net.inet6.ip6.neighborgcthresh
was the max number of caches that we store in the system. After
introducing lltable/llentry, the value is changed to a per-interface
basis because lltable/llentry stores neighbor caches in each interface
separately. The change also brings one degradation: the old GC mechanism
dropped excess packets based on LRU, while the new implementation drops
packets in order from the beginning of lltable (a hash table + linked
lists). It would be improved in the future.

Added functions in in6.c come from FreeBSD (as of r286629) and are
tweaked for NetBSD.

Proposed on tech-kern and tech-net.
2015-11-25 06:21:26 +00:00
ozaki-r
0edb16352e Call icmp6_error2 after releasing ln
This is a restructuring for coming changes.

From FreeBSD
2015-11-19 03:02:10 +00:00
ozaki-r
5d81659a46 Stop passing llinfo_nd6 to nd6_ns_output
This is a restructuring for coming changes to nd6 (replacing
llinfo_nd6 with llentry). Once we have a lock of llinfo_nd6,
we need to pass it to nd6_ns_output while holding the lock.
However, in a function subsequent to nd6_ns_output, the llinfo_nd6
may be looked up, i.e., its lock would be acquired again.
To avoid such a situation, pass only required data (in6_addr) to
nd6_ns_output instead of passing whole llinfo_nd6.

Inspired by FreeBSD
2015-11-18 05:16:22 +00:00
ozaki-r
7cdf5bbe65 Unify nd6_ns_output calls in nd6_llinfo_timer
Inspired by FreeBSD
2015-11-18 02:51:11 +00:00
joerg
a3e166507d Ensure that the callout of the multicast address is valid before
hooking it up.
2015-11-12 15:01:06 +00:00
rjs
8c2654abca Add core networking support for SCTP. 2015-10-13 21:28:34 +00:00
ozaki-r
91afbd53fe Use satosin6 instead of its own macro 2015-10-05 04:15:42 +00:00
ozaki-r
4f92eb6d47 Update icmp6_redirect_timeout_q when changing net.inet6.icmp6.redirtimeout
We have to update icmp6_redirect_timeout_q as well as icmp6_redirtimeout
when changing net.inet6.icmp6.redirtimeout via sysctl. The updating logic
is copied from sysctl_net_inet_icmp_redirtimeout.

This change is from s-yamaguchi@IIJ (with KNF by ozaki-r) and fixes
PR kern/50240.
2015-09-14 05:34:28 +00:00
roy
f3b0c038a1 If, for whatever reason, a local interface route is removed and then
re-added, mark it as a local route.

While here, if changing the route to go via the loopback interface
remove any inherited MTU value.
2015-09-11 10:33:32 +00:00
dholland
1fbab01a93 More on PR 41200: headers that declare ioctls should include sys/ioccom.h.
This covers (I think) all the MI headers outside of external/ (and dist/).
2015-09-06 06:00:59 +00:00