196 Commits

Author SHA1 Message Date
ozaki-r
51db7a24e2 Fix nd6_output (if_output_lock conversion mistake) 2016-06-21 02:14:11 +00:00
knakahara
95fc145695 apply if_output_lock() to L3 callers which call ifp->if_output() of L2(or L3 tunneling). 2016-06-20 06:46:37 +00:00
ozaki-r
894d037bc1 Get rid of unnecessary assignment 2016-05-18 11:28:44 +00:00
ozaki-r
040205ae93 Protect ifnet list with psz and psref
The change ensures that ifnet objects in the ifnet list aren't freed during
list iterations by using pserialize(9) and psref(9).

Note that the change adds a pslist(9) for ifnet but doesn't remove the
original ifnet list (ifnet_list) to avoid breaking kvm(3) users. We
shouldn't use the original list in the kernel anymore.
2016-05-12 02:24:16 +00:00
ozaki-r
9e0f6c5e36 Stop using rt_gwroute on packet sending paths
rt_gwroute of rtentry is a reference to a rtentry of the gateway
for a rtentry with RTF_GATEWAY. That was used by L2 (arp and ndp)
to look up L2 addresses. By separating L2 nexthop caches, we don't
need a route for the purpose and we can stop using rt_gwroute.
By doing so, we can reduce referencing and modifying rtentries,
which makes it easy to apply a lock (and/or psref) to the
routing table and rtentries.

One issue in doing this is keeping the RTF_REJECT behavior. It seems it
was broken when we moved rtalloc1 things from L2 output routines
(e.g., ether_output) to ip_hresolv_output, but (fortunately?)
it happened to keep working. The mistakes were:
- RTF_REJECT was checked for any routes in L2 output routines,
  but in ip_hresolv_output it is checked only when the route
  is RTF_GATEWAY
- The RTF_REJECT check wasn't copied to IPv6 (nd6_output)

It seems that the rt_gwroute checks hid the mistakes, so it appeared
to work (unexpectedly), and removing the rt_gwroute checks unveils the
issue. So we need to fix the RTF_REJECT checks in ip_hresolv_output
and also add them to nd6_output.

One more point we have to take care of is returning an errno; we need
to mimic the looutput behavior. Originally the RTF_REJECT check was
done either in L2 output routines or in looutput. The latter is
applied when a reject route directs to a loopback interface.
However, the RTF_REJECT check is now done before looutput, so to keep
the original behavior we need to return the errno that looutput
would choose. The added rt_check_reject_route does such tweaks.
2016-04-26 09:30:01 +00:00
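The reject-route check described in the message above can be pictured with a small userland sketch. This is not the kernel's rt_check_reject_route: the struct, the DEMO_ flag names, the via_loopback field, and the specific errno chosen on each path are assumptions made for illustration. It only shows the two points the message makes: the check applies to any route (not just RTF_GATEWAY ones), and a reject route on a loopback interface returns the errno the loopback output path would have chosen.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits; the names mirror the route flags in the message. */
#define DEMO_RTF_GATEWAY 0x2
#define DEMO_RTF_HOST    0x4
#define DEMO_RTF_REJECT  0x8

/* Hypothetical, pared-down route: only the fields this check needs. */
struct demo_route {
	int  flags;
	bool via_loopback;	/* the route's interface is a loopback */
};

/*
 * Return 0 if the packet may be sent, or an errno if the route rejects it.
 * The check runs for every route, not only for DEMO_RTF_GATEWAY ones.
 */
int
demo_check_reject_route(const struct demo_route *rt)
{
	assert(rt != NULL);

	if ((rt->flags & DEMO_RTF_REJECT) == 0)
		return 0;
	if (rt->via_loopback) {
		/* Assumed to mirror what the loopback output path reports. */
		return (rt->flags & DEMO_RTF_HOST) ? EHOSTUNREACH : ENETUNREACH;
	}
	/* Assumed errno choice for the L2 output path. */
	return (rt->flags & DEMO_RTF_GATEWAY) ? EHOSTUNREACH : EHOSTDOWN;
}
```

An output path for either address family would call such a helper before handing the packet to the interface and return early on a non-zero result.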
ozaki-r
0c74cec625 Check error of rt_setgate and rt_settag 2016-04-25 14:38:08 +00:00
ozaki-r
c325d0ca4f Fix RTF_{REJECT,BLACKHOLE} behavior for IPv6 routes
We still need a nexthop route to reflect RTF_{REJECT,BLACKHOLE}.
In the future, we would do it w/o looking up a route.
2016-04-21 05:07:50 +00:00
ozaki-r
f0071d85a1 Don't call pfxlist_onlink_check with holding llentry lock
Sync nd6_free with FreeBSD (as of 2016-04-10).

Should fix PR kern/51056.
2016-04-10 08:15:52 +00:00
roy
60a5a4a8a7 all1_sa is no longer used. 2016-04-04 12:05:40 +00:00
ozaki-r
09973b35ac Separate nexthop caches from the routing table
By this change, nexthop caches (IP-MAC address pair) are not stored
in the routing table anymore. Instead nexthop caches are stored in
each network interface; we already have lltable/llentry data structure
for this purpose. This change also obsoletes the concept of cloning/cloned
routes. Cloned routes no longer exist, while cloning routes still exist
but are renamed to connected routes.

Noticeable changes are:
- Nexthop caches aren't listed in route show/netstat -r
  - sysctl(NET_RT_DUMP) doesn't return them
  - If RTF_LLDATA is specified, it returns nexthop caches
- Several definitions of routing flags and messages are removed
  - RTF_CLONING, RTF_XRESOLVE, RTF_LLINFO, RTF_CLONED and RTM_RESOLVE
- RTF_CONNECTED is added
  - It has the same value of RTF_CLONING for backward compatibility
- route's -xresolve, -[no]cloned and -llinfo options are removed
  - -[no]cloning remains because it seems there are users
  - -[no]connected is introduced and recommended
    to be used instead of -[no]cloning
- route show/netstat -r drops some flags
  - 'L' and 'c' are not seen anymore
  - 'C' now indicates a connected route
- Gateway value of a route of an interface address is now not
  a L2 address but "link#N" like a connected (cloning) route
- Proxy ARP: "arp -s ... pub" doesn't create a route

See the diffs under tests/ for details of the behavior changes.

Proposed on tech-net and tech-kern:
  http://mail-index.netbsd.org/tech-net/2016/03/11/msg005701.html
2016-04-04 07:37:07 +00:00
ozaki-r
103bd8df24 Refine nd6log
Add __func__ to nd6log itself instead of adding it to callers.
2016-04-01 08:12:00 +00:00
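The idea of putting __func__ into the logging macro, rather than into every caller, can be sketched in userland C as follows. The names demo_log, demo_nd6log, demo_logbuf, and demo_caller are hypothetical stand-ins and not the kernel's definitions; the sketch formats into a buffer instead of calling log(9) so the result can be inspected.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sink: the last formatted message, kept for inspection. */
static char demo_logbuf[256];

/* Format "<function>: <message>" into demo_logbuf. */
void
demo_log(const char *func, const char *fmt, ...)
{
	va_list ap;
	size_t used;

	assert(func != NULL && fmt != NULL);
	snprintf(demo_logbuf, sizeof(demo_logbuf), "%s: ", func);
	used = strlen(demo_logbuf);
	va_start(ap, fmt);
	vsnprintf(demo_logbuf + used, sizeof(demo_logbuf) - used, fmt, ap);
	va_end(ap);
}

/* The macro supplies __func__, so call sites pass only the message. */
#define demo_nd6log(...) demo_log(__func__, __VA_ARGS__)

void
demo_caller(int code)
{
	demo_nd6log("unexpected code %d", code);
}
```

Because __func__ expands at the call site of the macro, the prefix names the calling function without any caller spelling it out.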
ozaki-r
acdecad069 Tidy up nd6_timer initialization 2016-04-01 05:11:38 +00:00
riastradh
3bc04b00b8 Declare in6_tmpaddrtimer_ch in in6_var.h.
Do not declare extern variables in .c files!
2016-02-04 02:48:37 +00:00
ozaki-r
5c49460e3c Add missing RTF_LOCAL; sync with arp_setgate 2016-01-08 08:50:07 +00:00
ozaki-r
9c1d124220 Add missing LLE_WUNLOCK to nd6_free 2015-12-18 09:04:33 +00:00
ozaki-r
c6e461ee0d CID 1341546: Fix integer handling issue (CONSTANT_EXPRESSION_RESULT)
n > INT_MAX, where n is a long integer variable, can never be true on 32-bit
architectures. Use time_t (int64_t) instead of long for the variable.
2015-12-07 06:19:13 +00:00
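A minimal sketch of the issue, assuming a hypothetical helper demo_clamp_to_int that is not part of the source: when long is 32 bits wide it has the same range as int, so a comparison against INT_MAX is a constant false expression. With a 64-bit type the comparison is meaningful on every architecture.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Clamp a non-negative number of seconds to the range of int.
 * Taking int64_t keeps the "seconds > INT_MAX" test meaningful even
 * where long is only 32 bits wide.
 */
int
demo_clamp_to_int(int64_t seconds)
{
	assert(seconds >= 0);	/* negative values are out of scope here */

	if (seconds > INT_MAX)
		return INT_MAX;
	return (int)seconds;
}
```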
ozaki-r
ecd5b23eef Use lltable/llentry for NDP
lltable and llentry were introduced to replace ARP cache data structure
for further restructuring of the routing table: L2 nexthop cache
separation. This change replaces the NDP cache data structure
(llinfo_nd6) with them as well as ARP.

One noticeable change is for neighbor cache GC mechanism that was
introduced to prevent IPv6 DoS attacks. net.inet6.ip6.neighborgcthresh
was the max number of caches that we store in the system. After
introducing lltable/llentry, the value is changed to be per-interface
basis because lltable/llentry stores neighbor caches in each interface
separately. And the change brings one degradation; the old GC mechanism
dropped exceeded packets based on LRU while the new implementation drops
packets in order from the beginning of lltable (a hash table + linked
lists). It would be improved in the future.

Added functions in in6.c come from FreeBSD (as of r286629) and are
tweaked for NetBSD.

Proposed on tech-kern and tech-net.
2015-11-25 06:21:26 +00:00
ozaki-r
0edb16352e Call icmp6_error2 after releasing ln
This is a restructuring for coming changes.

From FreeBSD
2015-11-19 03:02:10 +00:00
ozaki-r
5d81659a46 Stop passing llinfo_nd6 to nd6_ns_output
This is a restructuring for coming changes to nd6 (replacing
llinfo_nd6 with llentry). Once we have a lock of llinfo_nd6,
we need to pass it to nd6_ns_output while holding the lock.
However, in a function subsequent to nd6_ns_output, the llinfo_nd6
may be looked up, i.e., its lock would be acquired again.
To avoid such a situation, pass only required data (in6_addr) to
nd6_ns_output instead of passing whole llinfo_nd6.

Inspired by FreeBSD
2015-11-18 05:16:22 +00:00
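The locking concern in the message above can be illustrated with a userland sketch. Everything here is hypothetical (demo_entry, the boolean "lock", demo_ns_output, demo_timer); it is not the nd6 code. It assumes a non-recursive lock and shows the pattern of copying the needed address under the lock, releasing it, and passing only the copy, so the callee may look up and lock the entry again safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical neighbor cache entry guarded by a non-recursive lock. */
struct demo_entry {
	bool    locked;
	uint8_t addr[16];	/* IPv6 address of the neighbor */
};

void
demo_lock(struct demo_entry *e)
{
	assert(!e->locked);	/* taking the lock twice would deadlock */
	e->locked = true;
}

void
demo_unlock(struct demo_entry *e)
{
	assert(e->locked);
	e->locked = false;
}

/*
 * Stand-in for the output routine: it receives only the address and is
 * free to look the entry up (and lock it) again on its own.
 * Returns true if the looked-up entry matches the address.
 */
bool
demo_ns_output(struct demo_entry *table_entry, const uint8_t addr[16])
{
	bool match;

	demo_lock(table_entry);
	match = memcmp(table_entry->addr, addr, 16) == 0;
	demo_unlock(table_entry);
	return match;
}

/* Stand-in for the timer: copy the address under the lock, then release. */
bool
demo_timer(struct demo_entry *e)
{
	uint8_t addr[16];

	demo_lock(e);
	memcpy(addr, e->addr, sizeof(addr));
	demo_unlock(e);

	return demo_ns_output(e, addr);
}
```

Had demo_timer passed the locked entry itself, the second demo_lock inside demo_ns_output would trip the assertion, which is the situation the change avoids.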
ozaki-r
7cdf5bbe65 Unify nd6_ns_output calls in nd6_llinfo_timer
Inspired by FreeBSD
2015-11-18 02:51:11 +00:00
roy
f3b0c038a1 If, for whatever reason, a local interface route is removed and then
re-added, mark it as a local route.

While here, if changing the route to go via the loopback interface
remove any inherited MTU value.
2015-09-11 10:33:32 +00:00
ozaki-r
30a9349144 Pull nexthop determination routine from nd6_output
It simplifies nd6_output and the nexthop determination routine slightly.
2015-09-04 05:33:23 +00:00
ozaki-r
6af5fcf207 Fix rtfree in nd6_output
We have to make sure not to rtfree the original rtentry passed to
nd6_output, even when manipulating gateway routes.

This fixes panic on assertion "ro->_ro_rt ==NULL || ro->_ro_rt->rt_refcnt > 0"
failure and probably PR kern/50161.
2015-09-03 00:54:39 +00:00
ozaki-r
54c4f3b688 Do rt_refcnt++ when set a rtentry to another rtentry's rt_gwroute
And also do rtfree when deref a rtentry from rt_gwroute.
2015-09-02 11:35:11 +00:00
ozaki-r
1231d10774 Use KASSERT to check programming errors 2015-09-02 08:03:10 +00:00
ozaki-r
04bf400967 Move a rtentry definition to reduce its scope
No functional change.
2015-09-01 08:52:02 +00:00
ozaki-r
31cbc4a715 Cleanup nd6_nud_hint
The deleted rtfree was never called.
2015-09-01 08:46:27 +00:00
ozaki-r
31874cd257 Remove leading whitespaces 2015-08-31 03:26:53 +00:00
pooka
1c4a50f192 sprinkle _KERNEL_OPT 2015-08-24 22:21:26 +00:00
ozaki-r
aade6ffbb3 Fix double rtfree 2015-08-11 09:30:32 +00:00
ozaki-r
aa2414a0f0 Free rtentry when we successfully obtain it but return NULL 2015-08-11 08:27:08 +00:00
ozaki-r
55140c1926 Use time_uptime instead of time_second to avoid time leaps
Some code in sys/net* uses time_second to manage time periods such as
cache expirations. However, time_second doesn't increase monotonically
and can leap, e.g., via settimeofday(2), according to time_second(9). We
should use time_uptime instead to avoid such time leaps.

This change replaces time_second with time_uptime. Additionally it
converts a time based on time_uptime to a time based on time_second
when the kernel passes the time to userland programs that expect
the latter, and vice versa.

Note that we shouldn't leak time_uptime to other hosts over the
network. My investigation shows there is no such leak:
http://mail-index.netbsd.org/tech-net/2015/08/06/msg005332.html

Discussed on tech-kern and tech-net.
2015-08-07 08:11:33 +00:00
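The conversion between the two time bases mentioned in the message above can be sketched as follows. The struct demo_clocks and the demo_ functions are hypothetical and only model the arithmetic: an expiry kept on the monotonic (uptime) base is shifted by the current offset between the clocks when it crosses the kernel/userland boundary, and the remaining lifetime is unaffected when the wall clock is stepped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of both clocks taken at the same instant. */
struct demo_clocks {
	int64_t uptime;		/* monotonic seconds since boot */
	int64_t wallclock;	/* seconds since the epoch; may be stepped */
};

/* Convert an uptime-based timestamp for a consumer expecting wall-clock time. */
int64_t
demo_uptime_to_wallclock(const struct demo_clocks *now, int64_t t)
{
	assert(now != NULL);
	return t + (now->wallclock - now->uptime);
}

/* Convert a wall-clock timestamp received from outside to the uptime base. */
int64_t
demo_wallclock_to_uptime(const struct demo_clocks *now, int64_t t)
{
	assert(now != NULL);
	return t - (now->wallclock - now->uptime);
}

/* Remaining lifetime of an expiry stored on the uptime base. */
int64_t
demo_remaining(const struct demo_clocks *now, int64_t expire_uptime)
{
	assert(now != NULL);
	return expire_uptime > now->uptime ? expire_uptime - now->uptime : 0;
}
```

For example, with uptime 100 and wall clock 1000000, an expiry stored as 160 is reported as 1000060; if the wall clock is then stepped to 2000000, demo_remaining still yields 60 seconds.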
ozaki-r
9eae87d0c8 Reform use of rt_refcnt
rt_refcnt of rtentry was misused in several ways, for example, direct rt_refcnt++
and rt_refcnt-- outside route.c, the "rt->rt_refcnt++; rtfree(rt);" idiom, and
touching rt after rt->rt_refcnt--.

These abuses seem to be needed because rt_refcnt manages only references
between rtentry and doesn't take care of references during packet processing
(IOW references from local variables). In order to reduce the above abuses,
the latter cases should be counted by rt_refcnt as well as the former cases.

This change improves consistency of use of rt_refcnt:
- rtentry is always accessed with rt_refcnt incremented
- rtentry's rt_refcnt is decremented after use (rtfree is always used instead
  of rt_refcnt--)
- functions returning rtentry increment its rt_refcnt (and caller rtfree it)

Note that rt_refcnt prevents rtentry from being freed but doesn't prevent
rtentry from being updated. To become MP-safe, we need to provide another
protection for rtentry, e.g., locks. (Or introduce a better data structure
allowing concurrent readers during updates.)
2015-07-17 02:21:08 +00:00
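The reference-counting discipline listed in the message above can be sketched in userland C. This is a simplified model under stated assumptions, not the kernel's route.c: demo_rtentry, demo_rt_create, demo_rt_lookup, demo_rtfree, and demo_free_count are hypothetical, and the real rtfree has additional conditions that are not modeled here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical route entry carrying only its reference count. */
struct demo_rtentry {
	int refcnt;
};

static int demo_free_count;	/* number of entries actually released */

/* Create an entry; the initial reference belongs to the (imaginary) table. */
struct demo_rtentry *
demo_rt_create(void)
{
	struct demo_rtentry *rt = calloc(1, sizeof(*rt));

	if (rt != NULL)
		rt->refcnt = 1;
	return rt;
}

/* Any function handing out an entry takes a reference for the caller. */
struct demo_rtentry *
demo_rt_lookup(struct demo_rtentry *rt)
{
	assert(rt != NULL && rt->refcnt > 0);
	rt->refcnt++;
	return rt;
}

/* The only way to drop a reference; the last drop frees the entry. */
void
demo_rtfree(struct demo_rtentry *rt)
{
	assert(rt != NULL && rt->refcnt > 0);
	if (--rt->refcnt == 0) {
		free(rt);
		demo_free_count++;
	}
}
```

A caller obtains the entry through demo_rt_lookup, uses it, and calls demo_rtfree exactly once when done; it never touches refcnt directly and never uses the pointer after its own demo_rtfree.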
ozaki-r
fcda92b6be Remove unused arguments and the associated code from nd6_nud_hint()
from OpenBSD
2015-07-15 09:20:18 +00:00
ozaki-r
452d01ddfd Use KASSERT for argument NULL checks 2015-06-30 08:31:42 +00:00
ozaki-r
36d424c9ec Don't take KERNEL_LOCK for if_output when NET_MPSAFE 2015-04-30 10:00:04 +00:00
ozaki-r
f35c2148c2 Tidy up opt_ipsec.h inclusions 2015-03-30 04:25:26 +00:00
roy
1d0df6e404 Rename nd6_rtmsg() to rt_newmsg() and move into the generic routing code
as it's not IPv6 specific and will be used elsewhere.
2015-02-25 12:45:34 +00:00
roy
1777c2ee4b Retire nd6_newaddrmsg and use rt_newaddrmsg directly instead so that
we don't spam route changes when the route hasn't changed.
2015-02-25 00:26:58 +00:00
martin
94a27aa4e3 Rearrange interface detachment slightly: before we free the INET6 specific
per-interface data, make sure to call nd6_purge() with it to remove
routing entries pointing to the going interface.
If we happen to call this function again later, with the data
already gone, just return.
Fixes PR kern/49682, ok: christos.
2015-02-23 19:15:59 +00:00
christos
c4bbd62988 "something odd happens" is not a useful error message. 2015-02-17 15:14:28 +00:00
roy
24c1397228 Report route additions/changes/deletions for cached neighbours to userland. 2014-12-16 11:42:27 +00:00
christos
99c363a8a2 more debugging info... 2014-12-03 01:32:11 +00:00
snj
f0a7346d21 src is too big these days to tolerate superfluous apostrophes. It's
"its", people!
2014-10-18 08:33:23 +00:00
roy
15d73271e1 Tests for neighbour now work correctly on bridge(4) and carp(4) interfaces. 2014-10-14 15:29:43 +00:00
rmind
32293d340f - Eliminate RTFREE() macro in favour of rtfree() function.
- Make rtcache() function static.
2014-06-06 01:02:47 +00:00
roy
0398025216 Add IPV6CTL_AUTO_LINKLOCAL and ND6_IFF_AUTO_LINKLOCAL toggles which
control the automatic creation of IPv6 link-local addresses when an
interface is brought up.

Taken from FreeBSD.
2014-06-05 16:06:49 +00:00
bouyer
8ec9289dda Sync with the ipv4 code and call ifp->if_output() with KERNEL_LOCK
held.
Problem reported and fix tested by njoly@ on current-users@
2014-05-20 20:23:56 +00:00
rmind
f7741dab17 - Move IFNET_*() macros under #ifdef _KERNEL.
- Replace TAILQ_FOREACH on ifnet with IFNET_FOREACH().
2014-05-17 20:44:24 +00:00
roy
263486c97b If IPv6 is disabled for an interface, mark all addresses as tentative.
If enabled, check for a duplicated link-local address and abort enabling
as per RFC 4862, section 5.4.5. If allowed to enable, perform DAD
on the tentative addresses.

Taken from FreeBSD.
2014-03-20 13:34:35 +00:00