Commit Graph

1603 Commits

Author SHA1 Message Date
christos
e3992d8536 restore previous logic. 2016-10-31 14:34:32 +00:00
ozaki-r
c5224ffd07 Pull best address selection code out of in6_selectsrc
No functional change.
2016-10-31 04:57:10 +00:00
ozaki-r
0f3a44863e Fix race condition of in6_selectsrc
in6_selectsrc returned a pointer to in6_addr that wan't guaranteed to be
safe by pserialize (or psref), which was racy. Let callers pass a pointer
to in6_addr and in6_selectsrc copy a result to it inside pserialize
critical sections.
2016-10-31 04:16:25 +00:00
ozaki-r
6e6136eaff Remove unnecessary NULL checks 2016-10-31 02:50:31 +00:00
ozaki-r
cf96c34d79 Remove unnecessary argument
No functional change.
2016-10-25 02:45:09 +00:00
ozaki-r
3be3142886 Don't hold global locks if NET_MPSAFE is enabled
If NET_MPSAFE is enabled, don't hold KERNEL_LOCK and softnet_lock in
part of the network stack such as IP forwarding paths. The aim of the
change is to make it easy to test the network stack without the locks
and reduce our local diffs.

By default (i.e., if NET_MPSAFE isn't enabled), the locks are held
as they used to be.

Reviewed by knakahara@
2016-10-18 07:30:30 +00:00
ozaki-r
6cabf64625 Fix indentation 2016-10-18 02:46:50 +00:00
ozaki-r
5faeb64f4b Remove unnecessary pserialize_read_enter 2016-10-18 02:46:21 +00:00
ozaki-r
48ec99bd49 Add missing pserialize_read_exit 2016-10-18 02:45:41 +00:00
roy
0dbee937df Now that we disallow sending or receiving from invalid addresses,
allow binding to tentative addresses.
2016-09-29 12:19:47 +00:00
roy
8066689d53 Drop UDP packets as well as TCP without error when sending from detached or
tentative addresses.
2016-09-20 14:30:13 +00:00
roy
8c6871896f Ensure that packets are sent from a valid address.
If the packet is TCP and the address is detached or tentative then
it's just dropped, otherwise an error is returned.

This is needed because you can bind to a valid address and it can then
become invalid.

This satisfies RFC 4862 section 5.5.4.
2016-09-15 18:25:45 +00:00
christos
0d94f00ba4 fix typo 2016-09-14 16:17:17 +00:00
christos
959c247a60 revert previous, roy says it breaks DaD. 2016-09-13 15:57:50 +00:00
christos
acab31252a When initializing addresses, reset the interface flags to 0. This fixes
an issue where point to point addresses that started down, and then came
up, were left with stale flags on one side of the point to point link.
2016-09-13 15:41:33 +00:00
christos
647765d084 remove trailing spaces. userland does not catch this? 2016-09-13 00:45:15 +00:00
christos
47afd135ed add bits for address flags 2016-09-13 00:19:28 +00:00
roy
3e6930820d Disallow input to detached addresses because they are not yet valid. 2016-09-07 15:41:44 +00:00
roy
c85195ff64 This comment no longer applies. 2016-09-02 15:57:54 +00:00
ozaki-r
ab06ed1240 Don't GC an NDP cache that is added just before GC
This fixes unstable test results of ndp_neighborgcthresh.
2016-09-02 07:15:14 +00:00
ozaki-r
543e39c0d3 Make ipforward_rt and ip6_forward_rt percpu
Sharing one rtcache between CPUs is just a bad idea.

Reviewed by knakahara@
2016-08-31 09:14:47 +00:00
dholland
2df5f31439 PR 51434 David Binderman: remove redundant test. 2016-08-26 21:48:31 +00:00
roy
c63a839724 Simplify. 2016-08-26 20:29:31 +00:00
roy
333b0c4c48 Allow explicit binding to detached addresss.
Fixes PR kern/51435.
2016-08-26 19:45:55 +00:00
roy
1893d82b49 White space police. 2016-08-23 19:39:57 +00:00
roy
da7a376e71 Sync denied flags. 2016-08-23 19:39:04 +00:00
knakahara
74c24413b3 improve fast-forward performance when the number of flows exceeds ip6_maxflows.
This is porting of ip_flow.c:r1.76

In ip6flow case, the before degradation is about 45%, the after degradation is
bout 55%.
2016-08-23 09:59:20 +00:00
roy
dfadc24d64 Revert r1.148
IP6_EXTHDR_GET ensures that a icmp6 header can be fetched from the mbuf
so m_pullup does not need to be called.

While here, we can safely increament interface error stats even with an
invalidated mbuf because we have a saved reference to the interface.
2016-08-19 12:26:01 +00:00
roy
e52094cac4 Revert part of the prior patch so loopback lladdr gets a working prefix route. 2016-08-18 09:34:43 +00:00
roy
fe4671807c Separate ioctl address prefix management from RA prefix management
as we have no API for controlling the latter.

This fixes a long standing problem where addresses added with non /128
prefixes and non infinte address lifetimes would register a prefix route
which would expire. Subsequent calls set new lifetimes for the same address
would not affect the prefix route management, so once expired, the
prefix route would be impossible to add back as the kernel would remove it.
2016-08-16 10:31:57 +00:00
christos
fa02ef2c34 In rump (ifp)->if_afdata[AF_INET6] == NULL if we did not register netinet6
yet. Treat this like we don't have a scope, and make the sid tests consistent.
2016-08-12 11:44:24 +00:00
roy
e9c7e74884 Set RTF_CONNECTED instead of setting only RTF_CONNECTED. 2016-08-06 20:00:14 +00:00
ozaki-r
e8f81e31c2 CID 1364757: remove unnecessary branching 2016-08-05 00:51:14 +00:00
knakahara
48235e8230 ip6flow refactor like ipflow.
- move ip6flow sysctls into ip6_flow.c like ip_flow.c:r1.64
    - build ip6_flow.c only if GATEWAY kernel option is enabled
2016-08-02 04:50:16 +00:00
ozaki-r
466f21f0b9 Fix kernel builds (gcc 4.8) 2016-08-01 04:37:53 +00:00
ozaki-r
a403cbd4f5 Apply pserialize and psref to struct ifaddr and its variants
This change makes struct ifaddr and its variants (in_ifaddr and in6_ifaddr)
MP-safe by using pserialize and psref. At this moment, pserialize_perform
and psref_target_destroy are disabled because (1) we don't need them
because of softnet_lock (2) they cause a deadlock because of softnet_lock.
So we'll enable them when we remove softnet_lock in the future.
2016-08-01 03:15:30 +00:00
ozaki-r
efee6976a2 Avoid memset and rtcache_free if unnecessary
It's the same as ip_output.
2016-07-29 06:02:03 +00:00
ozaki-r
c68a77bc1d Fix panic on adding/deleting IP addresses under network load
Adding and deleting IP addresses aren't serialized with other network
opeartions, e.g., forwarding packets. So if we add or delete an IP
address under network load, a kernel panic may happen on manipulating
network-related shared objects such as rtentry and rtcache.

To avoid such panicks, we still need to hold softnet_lock in in_control
and in6_control that are called via ioctl and do network-related operations
including IP address additions/deletions.

Fix PR kern/51356
2016-07-28 09:03:50 +00:00
ozaki-r
e449cc85bc Simplify by using atomic_swap instead of mutex
Suggested by kefren@
2016-07-26 05:53:30 +00:00
ozaki-r
a3625f4d7b Make DAD of ARP/NDP MP-safe with coarse-grained locks
The change also prevents arp_dad_timer/nd6_dad_timer from running if
arp_dad_stop/nd6_dad_stop is called, which makes sure that callout_reset
won't be called during callout_halt.
2016-07-25 04:21:19 +00:00
ozaki-r
6b3e3b4814 Use KASSERT for checking non-NULL of ifa->ifa_ifp
ifa->ifa_ifp should be always non-NULL, so doing the check only if
DIAGNOSTIC is ok.
2016-07-25 01:52:21 +00:00
ozaki-r
1f39eeaeb9 Get rid of extra ifafree
It was wrongly imported from FreeBSD.
2016-07-20 07:56:10 +00:00
ozaki-r
4f21a42704 Apply pserialize to some iterations of IP address lists 2016-07-20 07:37:51 +00:00
ozaki-r
8759207c83 Use sin6tosa and sin6tocsa macros
No functional change.
2016-07-15 07:40:09 +00:00
ozaki-r
328b3c6b85 Use ifatoia6 macro
No functional change.
2016-07-15 07:33:41 +00:00
ozaki-r
dca032f9f4 Run timers in workqueue
Timers (such as nd6_timer) typically free/destroy some data in callout
(softint). If we apply psz/psref for such data, we cannot do free/destroy
process in there because synchronization of psz/psref cannot be used in
softint. So run timer callbacks in workqueue works (normal LWP context).

Doing workqueue_enqueue a work twice (i.e., call workqueue_enqueue before
a previous task is scheduled) isn't allowed. For nd6_timer and
rt_timer_timer, this doesn't happen because callout_reset is called only
from workqueue's work. OTOH, ip{,6}flow_slowtimo's callout can be called
before its work starts and completes because the callout is periodically
called regardless of completion of the work. To avoid such a situation,
add a flag for each protocol; the flag is set true when a work is
enqueued and set false after the work finished. workqueue_enqueue is
called only if the flag is false.

Proposed on tech-net and tech-kern.
2016-07-11 07:37:00 +00:00
ozaki-r
94dba1b837 CID 1363345: remove unreachable code and cleanup returns 2016-07-08 06:18:29 +00:00
ozaki-r
4133a8eca8 Replace macros to get an IP address with proper inline functions
The inline functions are more friendly for applying psz/psref;
they consist of only simple interations.
2016-07-08 04:33:30 +00:00
ozaki-r
75a23513d7 Kill remaining use of the old lists of IP addresses 2016-07-08 03:40:34 +00:00
ozaki-r
9e4c2bda8a Switch the address list of intefaces to pslist(9)
As usual, we leave the old list to avoid breaking kvm(3) users.
2016-07-07 09:32:01 +00:00
ozaki-r
6106c473fc Move in6_ifaddr_list to a more proper place (from ip6_input.c to in6.c)
It's a similar place as the IPv4 address list, i.e., in.c.

More varibles will join together.
2016-07-06 10:49:49 +00:00
ozaki-r
806a31cb3c Add missing IN6_ADDRLIST_ENTRY_DESTROY 2016-07-06 07:52:53 +00:00
ozaki-r
d04ff44ad6 Apply m_get_rcvif_psref (kill m_get_rcvif_NOMPSAFE) 2016-07-06 00:30:55 +00:00
ozaki-r
ff67da833b Constify an argument of regen_tmpaddr 2016-07-05 06:32:18 +00:00
ozaki-r
2a3c249748 KNF 2016-07-05 04:25:23 +00:00
ozaki-r
c30ba26977 Use ia6 or ia instead of ifa as a variable name of struct in6_ifaddr
We conventionally use ifa for struct ifaddr and use ia6 or ia for
struct in6_ifaddr.

No functional change.
2016-07-05 03:40:52 +00:00
ozaki-r
d961591ee9 Fix userland compilations of those including in6_var.h 2016-07-04 07:32:18 +00:00
ozaki-r
6cf9fce745 Use pslist(9) for the global in6_ifaddr list
psz and psref will be applied in another commit.

No functional change intended.
2016-07-04 06:48:14 +00:00
knakahara
a6d7586724 fix: gif(4) receive side race
A panic cause in rn_match() called by encap[46]_lookup(). The reason is that
gif(4) does not suspend receive packet processing in spite of suspending
transmit packet processing while anyone is doing gif(4) ioctl.
2016-07-04 04:22:47 +00:00
knakahara
d81cd78ed7 let gif(4) promise softint(9) contract (1/2) : gif(4) side
To prevent calling softint_schedule() after called softint_disestablish(),
the following modifications are added
    + ioctl (writing configuration) side
      - off IFF_RUNNING flag before changing configuration
      - wait softint handler completion before changing configuration
    + packet processing (reading configuraiotn) side
      - if IFF_RUNNING flag is on, do nothing
    + in whole
      - add gif_list_lock_{enter,exit} to prevent the same configuration is
        set to other gif(4) interfaces
2016-07-04 04:14:47 +00:00
ozaki-r
feeae45125 Remove redundant codes purging IPv6 addresses
Proposed on tech-net and tech-kern.
2016-07-04 02:41:18 +00:00
ozaki-r
17b4eb5edd Make sure to free all interface addresses in if_detach
Addresses of an interface (struct ifaddr) have a (reverse) pointer of an
interface object (ifa->ifa_ifp). If the addresses are surely freed when
their interface is destroyed, the pointer is always valid and we don't
need a tweak of replacing the pointer to if_index like mbuf.

In order to make sure the assumption, the following changes are required:
- Deactivate the interface at the firstish of if_detach. This prevents
  in6_unlink_ifa from saving multicast addresses (wrongly)
- Invalidate rtcache(s) and clear a rtentry referencing an address on
  RTM_DELETE. rtcache(s) may delay freeing an address
- Replace callout_stop with callout_halt of DAD timers to ensure stopping
  such timers in if_detach
2016-07-01 05:22:33 +00:00
ozaki-r
d4c71b34a8 Make sure that ifaddr is published after its initialization finished
Basically we should insert an item to a collection (say a list) after
item's initialization has been completed to avoid accessing an item
that is initialized halfway. ifaddr (in{,6}_ifaddr) isn't processed
like so and needs to be fixed.

In order to do so, we need to tweak {arp,nd6}_rtrequest that depend
on that an ifaddr is inserted during its initialization; they explore
interface's address list to determine that rt_getkey(rt) of a given
rtentry is in the list to know whether the route's interface should
be a loopback, which doesn't work after the change. To make it work,
first check RTF_LOCAL flag that is set in rt_ifa_addlocal that calls
{arp,nd6}_rtrequest eventually. Note that we still need the original
code for the case to remove and re-add a local interface route.
2016-06-30 01:34:53 +00:00
ozaki-r
a577cf2aa0 Introduce if_is_deactivated
Checking ifp->if_output == if_nulloutput is too implicit.

No functional change.
2016-06-28 02:36:54 +00:00
ozaki-r
ca4ea29d93 Add missing NULL checks for m_get_rcvif_psref 2016-06-28 02:02:56 +00:00
christos
9471dccf97 CID 1362905: Initialize ifp early, so that we don't if_put garbage in the
IPSEC case.
2016-06-27 18:35:54 +00:00
ozaki-r
4b54d200aa Remove unnecessary NULL checks of ifa->ifa_addr
If it's NULL, it should be a bug. There many IFADDR_FOREACH that don't do
NULL check. If it can be NULL, they should fire already.
2016-06-22 07:48:17 +00:00
ozaki-r
4badfc204a Make sure returning ifp from in6_select* functions psref-ed
To this end, callers need to pass struct psref to the functions
and the fuctions acquire a reference of ifp with it. In some cases,
we can simply use if_get_byindex, however, in other cases
(say rt->rt_ifp and ia->ifa_ifp), we have no MP-safe way for now.
In order to take a reference anyway we use non MP-safe function
if_acquire_NOMPSAFE for the latter cases. They should be fixed in
the future somehow.
2016-06-21 10:25:27 +00:00
ozaki-r
f7107c248e Protect if_byindex with pserialize 2016-06-21 10:21:04 +00:00
ozaki-r
43c5ab376f Replace ifp of ip_moptions and ip6_moptions with if_index
The motivation is the same as the mbuf's rcvif case; avoid having a pointer
of an ifnet object in ip_moptions and ip6_moptions, which is not MP-safe.

ip_moptions and ip6_moptions can be stored in a PCB for inet or inet6
that's life time is different from ifnet one and so an ifnet object can be
disappeared anytime we get it via them. Thus we need to look up an ifnet
object by if_index every time for safe.
2016-06-21 03:28:27 +00:00
ozaki-r
51db7a24e2 Fix nd6_output (if_output_lock conversion mistake) 2016-06-21 02:14:11 +00:00
knakahara
95fc145695 apply if_output_lock() to L3 callers which call ifp->if_output() of L2(or L3 tunneling). 2016-06-20 06:46:37 +00:00
ozaki-r
f0423d34e6 Use if_get_byindex instead of if_byindex for MP-safe 2016-06-16 03:03:33 +00:00
ozaki-r
e1135cd9b9 Use curlwp_bind and curlwp_bindx instead of open-coding LP_BOUND 2016-06-16 02:38:40 +00:00
ozaki-r
c7e18ccbde Protect if_byindex by pserialize 2016-06-15 06:01:21 +00:00
knakahara
a6f4292e65 eliminate unnecessary splnet 2016-06-13 08:37:15 +00:00
knakahara
e4ff09f05d MP-ify fastforward to support GATEWAY kernel option.
I add "ipflow_lock" mutex in ip_flow.c and "ip6flow_lock" mutex in ip6_flow.c
to protect all data in each file. Of course, this is not MP-scalable. However,
it is sufficient as tentative workaround. We should make it scalable somehow
in the future.

ok by ozaki-r@n.o.
2016-06-13 08:34:23 +00:00
ozaki-r
fe6d427551 Avoid storing a pointer of an interface in a mbuf
Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.
2016-06-10 13:31:43 +00:00
ozaki-r
d938d837b3 Introduce m_set_rcvif and m_reset_rcvif
The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.
2016-06-10 13:27:10 +00:00
ozaki-r
9f10dc7910 Get rcvif once and reuse it
No functional change.
2016-05-19 08:53:25 +00:00
ozaki-r
348f728f8e Replace DIAGNOSTIC & panic with KASSERT 2016-05-19 03:11:42 +00:00
ozaki-r
894d037bc1 Get rid of unnecessary assignment 2016-05-18 11:28:44 +00:00
ozaki-r
9f595a90fa Get rid of unnecessary NULL check
It's already checked just some lines above.
2016-05-18 09:32:05 +00:00
ozaki-r
27df9b11fc Don't try to get outif unnecessarily from in6_selectsrc
The got outif is unused.
2016-05-18 08:40:51 +00:00
ozaki-r
842c4ed6c1 Get rcvif once and reuse it
No functional change.
2016-05-17 03:27:02 +00:00
ozaki-r
31da384114 Make sure icmp6_redirect_input frees mbuf before return 2016-05-17 03:24:46 +00:00
ozaki-r
040205ae93 Protect ifnet list with psz and psref
The change ensures that ifnet objects in the ifnet list aren't freed during
list iterations by using pserialize(9) and psref(9).

Note that the change adds a pslist(9) for ifnet but doesn't remove the
original ifnet list (ifnet_list) to avoid breaking kvm(3) users. We
shouldn't use the original list in the kernel anymore.
2016-05-12 02:24:16 +00:00
is
142ff9d692 Let non-neighbor NS/NA debug error message include useful information. 2016-04-29 11:46:17 +00:00
ozaki-r
ad0fbab4d2 Get rid of unused argument from get_rand_ifid 2016-04-27 07:51:14 +00:00
ozaki-r
9e0f6c5e36 Stop using rt_gwroute on packet sending paths
rt_gwroute of rtentry is a reference to a rtentry of the gateway
for a rtentry with RTF_GATEWAY. That was used by L2 (arp and ndp)
to look up L2 addresses. By separating L2 nexthop caches, we don't
need a route for the purpose and we can stop using rt_gwroute.
By doing so, we can reduce referencing and modifying rtentries,
which makes it easy to apply a lock (and/or psref) to the
routing table and rtentries.

One issue to do this is to keep RTF_REJECT behavior. It seems it
was broken when we moved rtalloc1 things from L2 output routines
(e.g., ether_output) to ip_hresolv_output, but (fortunately?)
it works unexpectedly. What we mistook are:
- RTF_REJECT was checked for any routes in L2 output routines,
  but in ip_hresolv_output it is checked only when the route
  is RTF_GATEWAY
- The RTF_REJECT check wasn't copied to IPv6 (nd6_output)

It seems that rt_gwroute checks hid the mistakes and it looked
work (unexpectedly) and removing rt_gwroute checks unveil the
issue. So we need to fix RTF_REJECT checks in ip_hresolv_output
and also add them to nd6_output.

One more point we have to care is returning an errno; we need
to mimic looutput behavior. Originally RTF_REJECT check was
done either in L2 output routines or in looutput. The latter is
applied when a reject route directs to a loopback interface.
However, now RTF_REJECT check is done before looutput so to keep
the original behavior we need to return an errno which looutput
chooses. Added rt_check_reject_route does such tweaks.
2016-04-26 09:30:01 +00:00
ozaki-r
a79dfa5db0 Sweep unnecessary route.h inclusions 2016-04-26 08:44:44 +00:00
rjs
505ea9765f Fix build when IPSEC enabled. 2016-04-25 21:21:02 +00:00
ozaki-r
0c74cec625 Check error of rt_setgate and rt_settag 2016-04-25 14:38:08 +00:00
ozaki-r
c325d0ca4f Fix RTF_{REJECT,BLACKHOLE} behavior for IPv6 routes
We still need a nexthop route to reflect RTF_{REJECT,BLACKHOLE}.
In the future, we would do it w/o looking up a route.
2016-04-21 05:07:50 +00:00
ozaki-r
322b6a238d Sweep unncessary radix.h inclusions 2016-04-11 08:56:16 +00:00
ozaki-r
dd3c4fc3e5 Don't call pfxlist_onlink_check with holding llentry lock
From FreeBSD (as of 2016-04-11).

Should fix PR kern/51060.
2016-04-11 01:16:20 +00:00
ozaki-r
f0071d85a1 Don't call pfxlist_onlink_check with holding llentry lock
Sync nd6_free with FreeBSD (as of 2016-04-10).

Should fix PR kern/51056.
2016-04-10 08:15:52 +00:00
roy
60a5a4a8a7 all1_sa is no longer used. 2016-04-04 12:05:40 +00:00
ozaki-r
09973b35ac Separate nexthop caches from the routing table
By this change, nexthop caches (IP-MAC address pair) are not stored
in the routing table anymore. Instead nexthop caches are stored in
each network interface; we already have lltable/llentry data structure
for this purpose. This change also obsoletes the concept of cloning/cloned
routes. Cloned routes no longer exist while cloning routes still exist
with renamed to connected routes.

Noticeable changes are:
- Nexthop caches aren't listed in route show/netstat -r
  - sysctl(NET_RT_DUMP) doesn't return them
  - If RTF_LLDATA is specified, it returns nexthop caches
- Several definitions of routing flags and messages are removed
  - RTF_CLONING, RTF_XRESOLVE, RTF_LLINFO, RTF_CLONED and RTM_RESOLVE
- RTF_CONNECTED is added
  - It has the same value of RTF_CLONING for backward compatibility
- route's -xresolve, -[no]cloned and -llinfo options are removed
  - -[no]cloning remains because it seems there are users
  - -[no]connected is introduced and recommended
    to be used instead of -[no]cloning
- route show/netstat -r drops some flags
  - 'L' and 'c' are not seen anymore
  - 'C' now indicates a connected route
- Gateway value of a route of an interface address is now not
  a L2 address but "link#N" like a connected (cloning) route
- Proxy ARP: "arp -s ... pub" doesn't create a route

You can know details of behavior changes by seeing diffs under tests/.

Proposed on tech-net and tech-kern:
  http://mail-index.netbsd.org/tech-net/2016/03/11/msg005701.html
2016-04-04 07:37:07 +00:00
ozaki-r
35b18fbb1d Remove unnecessary casts and do s/0/NULL/ for rtrequest 2016-04-01 09:16:02 +00:00