Commit Graph

2394 Commits

Author SHA1 Message Date
ozaki-r
e1135cd9b9 Use curlwp_bind and curlwp_bindx instead of open-coding LP_BOUND 2016-06-16 02:38:40 +00:00
knakahara
a6f4292e65 eliminate unnecessary splnet 2016-06-13 08:37:15 +00:00
knakahara
e4ff09f05d MP-ify fastforward to support GATEWAY kernel option.
I add "ipflow_lock" mutex in ip_flow.c and "ip6flow_lock" mutex in ip6_flow.c
to protect all data in each file. Of course, this is not MP-scalable. However,
it is sufficient as tentative workaround. We should make it scalable somehow
in the future.

ok by ozaki-r@n.o.
2016-06-13 08:34:23 +00:00
knakahara
14ea9af5f7 make ipflow_reap() static function. 2016-06-13 08:29:55 +00:00
knakahara
f2808ade1a remove unnecessary splnet before pool_{get,put} 2016-06-13 08:04:44 +00:00
ozaki-r
fe6d427551 Avoid storing a pointer of an interface in a mbuf
Having a pointer to an interface in an mbuf isn't safe if we remove the big
kernel lock; an interface object (ifnet) can be destroyed at any time during
packet processing, and accessing such an object via a pointer is racy. Instead
we have to get the object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored in the mbuf instead of a
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, which is NOT MP-safe and exists for the
transition period, i.e., it is intended for places that are not planned
to be MP-ified soon.

The change adds some overhead due to psref on performance-sensitive paths;
however, the overhead is not serious: 2% down at worst.

Proposed on tech-kern and tech-net.
2016-06-10 13:31:43 +00:00
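
The index-based lookup described in the commit above can be illustrated with a small userland sketch. This is not the kernel code: every name here (ex_iface, ex_packet, ex_iface_table and the ex_* functions) is hypothetical, and the sketch leaves out the psref(9)/pserialize(9) protection that the real m_{get,put}_rcvif APIs provide. It only shows why a stored index fails safely after an interface goes away, whereas a stored pointer would dangle.

```c
#include <stddef.h>

#define EX_MAX_IFACES 8

/* Hypothetical stand-in for the kernel's ifnet. */
struct ex_iface {
    unsigned int index;      /* 1-based; 0 means "no interface" */
    const char *name;
};

/* Hypothetical stand-in for an mbuf packet header. */
struct ex_packet {
    unsigned int rcvif_index; /* stored instead of a struct ex_iface pointer */
};

/* Model of the interface collection, indexed by interface index. */
static struct ex_iface *ex_iface_table[EX_MAX_IFACES];

/* Register an interface; returns 0 on success, -1 on a bad index. */
static int ex_iface_attach(struct ex_iface *ifp)
{
    if (ifp->index == 0 || ifp->index >= EX_MAX_IFACES)
        return -1;
    ex_iface_table[ifp->index] = ifp;
    return 0;
}

/* Unregister an interface; later lookups by its index return NULL. */
static void ex_iface_detach(const struct ex_iface *ifp)
{
    if (ifp->index != 0 && ifp->index < EX_MAX_IFACES)
        ex_iface_table[ifp->index] = NULL;
}

/* Resolve the receiving interface of a packet via its stored index. */
static struct ex_iface *ex_packet_get_rcvif(const struct ex_packet *p)
{
    if (p->rcvif_index == 0 || p->rcvif_index >= EX_MAX_IFACES)
        return NULL;
    return ex_iface_table[p->rcvif_index];
}
```

In this model a packet queued before ex_iface_detach() resolves to NULL afterwards, so the caller can drop it instead of dereferencing a stale pointer.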
ozaki-r
d938d837b3 Introduce m_set_rcvif and m_reset_rcvif
The APIs set (or reset) the receiving interface of an mbuf.
They are counterparts of m_get_rcvif, which will come in another
commit; they hide the internals of rcvif operations and reduce the
diff of the upcoming change.

No functional change.
2016-06-10 13:27:10 +00:00
christos
fdea3219c6 make hostzerobroadcast default to "no". 2016-05-27 16:44:15 +00:00
rjs
afd529313e Use const for arguments to sctp_is_same_scope(). 2016-05-22 23:04:27 +00:00
rjs
b65559a564 Remove rtcache reference to route before freeing the containing struct. 2016-05-22 22:18:41 +00:00
ozaki-r
1acd48af54 Get rid of unnecessary assignment 2016-05-17 09:00:24 +00:00
ozaki-r
040205ae93 Protect ifnet list with psz and psref
The change ensures that ifnet objects in the ifnet list aren't freed during
list iterations by using pserialize(9) and psref(9).

Note that the change adds a pslist(9) for ifnet but doesn't remove the
original ifnet list (ifnet_list) to avoid breaking kvm(3) users. We
shouldn't use the original list in the kernel anymore.
2016-05-12 02:24:16 +00:00
ozaki-r
8e8364ddca Fix compilation for ppc 2016-05-09 07:02:10 +00:00
christos
902487a7f3 fix compilation for ppc. 2016-05-04 15:42:32 +00:00
ozaki-r
2cf7873b92 Constify rtentry of if_output
We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (or psrefing) rtentries.
2016-04-28 00:16:56 +00:00
rjs
991b8746b6 Fix build when IPSEC enabled. 2016-04-26 11:02:57 +00:00
ozaki-r
9e0f6c5e36 Stop using rt_gwroute on packet sending paths
rt_gwroute of rtentry is a reference to a rtentry of the gateway
for a rtentry with RTF_GATEWAY. That was used by L2 (arp and ndp)
to look up L2 addresses. By separating L2 nexthop caches, we don't
need a route for the purpose and we can stop using rt_gwroute.
By doing so, we can reduce referencing and modifying rtentries,
which makes it easy to apply a lock (and/or psref) to the
routing table and rtentries.

One issue in doing this is keeping the RTF_REJECT behavior. It seems it
was broken when we moved rtalloc1 things from L2 output routines
(e.g., ether_output) to ip_hresolv_output, but (fortunately?)
it kept working unexpectedly. The mistakes were:
- RTF_REJECT was checked for any routes in L2 output routines,
  but in ip_hresolv_output it is checked only when the route
  is RTF_GATEWAY
- The RTF_REJECT check wasn't copied to IPv6 (nd6_output)

It seems that the rt_gwroute checks hid the mistakes, so it appeared
to work (unexpectedly), and removing the rt_gwroute checks unveils the
issue. So we need to fix the RTF_REJECT checks in ip_hresolv_output
and also add them to nd6_output.

One more point we have to take care of is the returned errno; we need
to mimic looutput behavior. Originally the RTF_REJECT check was
done either in L2 output routines or in looutput. The latter
applies when a reject route points to a loopback interface.
However, the RTF_REJECT check is now done before looutput, so to keep
the original behavior we need to return the errno that looutput
would choose. The newly added rt_check_reject_route does such tweaks.
2016-04-26 09:30:01 +00:00
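
The first mistake listed in the commit above (RTF_REJECT checked only for gateway routes rather than for any route) can be shown with a short sketch. The flag names and bit values below are hypothetical stand-ins, not copied from route.h, and the errno choice (EHOSTUNREACH for host routes, ENETUNREACH otherwise) is an assumption made for illustration; the commit itself does not spell out which errno rt_check_reject_route returns.

```c
#include <errno.h>

/* Illustrative flag bits; hypothetical names and values. */
#define EX_RTF_GATEWAY 0x2
#define EX_RTF_HOST    0x4
#define EX_RTF_REJECT  0x8

/* errno for a rejected route; the host/net split is an assumption. */
static int ex_reject_errno(int flags)
{
    return (flags & EX_RTF_HOST) ? EHOSTUNREACH : ENETUNREACH;
}

/* Mistaken variant: REJECT is honored only when the route is a gateway route. */
static int ex_check_gateway_only(int flags)
{
    if ((flags & EX_RTF_GATEWAY) && (flags & EX_RTF_REJECT))
        return ex_reject_errno(flags);
    return 0;
}

/* Intended variant: REJECT is honored for any route. */
static int ex_check_any(int flags)
{
    if (flags & EX_RTF_REJECT)
        return ex_reject_errno(flags);
    return 0;
}
```

A non-gateway reject route passes ex_check_gateway_only() with 0 (the packet would be sent) but is refused by ex_check_any(), which is the difference the commit message describes.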
ozaki-r
a79dfa5db0 Sweep unnecessary route.h inclusions 2016-04-26 08:44:44 +00:00
rjs
505ea9765f Fix build when IPSEC enabled. 2016-04-25 21:21:02 +00:00
ozaki-r
0c74cec625 Check error of rt_setgate and rt_settag 2016-04-25 14:38:08 +00:00
ozaki-r
5fd142cec8 Fix error path 2016-04-19 09:36:35 +00:00
ozaki-r
54748dcad2 Separate MPLS-related routines from ip_hresolv_output
No functional changes.
2016-04-19 09:29:54 +00:00
ozaki-r
07d863c903 Constify rtentry of arpresolve
We don't need to (rather shouldn't) modify rtentry in there.
2016-04-19 04:13:56 +00:00
ozaki-r
805fe96546 Fix panic on receiving an ARP request
The panic happened if an ARP request has a spa (i.e., IP address) whose
ARP entry already exists in the table as a static ARP entry.
2016-04-18 02:24:42 +00:00
ozaki-r
4ace575dc7 Get rid of meaningless RTF_UP check from ip_hresolv_output
The check is meaningless because
- rtcache, rtalloc1 and rtlookup ensure that an obtained rtentry is always
  RTF_UP. If the rtentry can't be changed (i.e., RTF_UP can't be dropped)
  during processing, the check is unnecessary
- Even if not, i.e., an obtained rtentry can be changed during processing,
  checking only at the point doesn't help; the rtentry can be changed after
  the check

Instead we have to ensure that RTF_UP isn't dropped if someone is using it
somehow. Note that we already ensure that a rtentry being used isn't freed
by rt_refcnt.

Proposed on tech-kern and tech-net.
2016-04-18 01:28:06 +00:00
rjs
b4a446b522 Remove stray debug printf(). 2016-04-14 18:36:56 +00:00
ozaki-r
4f0eb37aac ddb: rename show arptab to show routes
show arptab command of ddb is now inappropriate because it actually dumps
routes but arp entries aren't routes anymore. So rename it to show routes
and move the code from if_arp.c to route.c.

ok christos@
2016-04-13 00:47:01 +00:00
ozaki-r
322b6a238d Sweep unnecessary radix.h inclusions 2016-04-11 08:56:16 +00:00
christos
b988d754df - tidy up error messages
- add a length argument to arpresolve()
- add KASSERT for overflow
2016-04-07 03:22:15 +00:00
ozaki-r
09973b35ac Separate nexthop caches from the routing table
By this change, nexthop caches (IP-MAC address pair) are not stored
in the routing table anymore. Instead nexthop caches are stored in
each network interface; we already have lltable/llentry data structure
for this purpose. This change also obsoletes the concept of cloning/cloned
routes. Cloned routes no longer exist, while cloning routes still exist,
renamed to connected routes.

Noticeable changes are:
- Nexthop caches aren't listed in route show/netstat -r
  - sysctl(NET_RT_DUMP) doesn't return them
  - If RTF_LLDATA is specified, it returns nexthop caches
- Several definitions of routing flags and messages are removed
  - RTF_CLONING, RTF_XRESOLVE, RTF_LLINFO, RTF_CLONED and RTM_RESOLVE
- RTF_CONNECTED is added
  - It has the same value as RTF_CLONING for backward compatibility
- route's -xresolve, -[no]cloned and -llinfo options are removed
  - -[no]cloning remains because it seems there are users
  - -[no]connected is introduced and recommended
    to be used instead of -[no]cloning
- route show/netstat -r drops some flags
  - 'L' and 'c' are not seen anymore
  - 'C' now indicates a connected route
- Gateway value of a route of an interface address is now not
  a L2 address but "link#N" like a connected (cloning) route
- Proxy ARP: "arp -s ... pub" doesn't create a route

Details of the behavior changes can be seen in the diffs under tests/.

Proposed on tech-net and tech-kern:
  http://mail-index.netbsd.org/tech-net/2016/03/11/msg005701.html
2016-04-04 07:37:07 +00:00
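
As a rough illustration of keeping nexthop caches (IP-MAC address pairs) per interface rather than in the routing table, the following userland sketch stores a small fixed-size cache inside each interface object. It does not reproduce the kernel's lltable/llentry structures: all ex_* names, the fixed table size and the linear search are assumptions chosen to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_NCACHE 4
#define EX_MACLEN 6

/* One cached IP-to-MAC mapping (hypothetical layout). */
struct ex_nexthop {
    int      valid;
    uint32_t ip;
    uint8_t  mac[EX_MACLEN];
};

/* Each interface owns its own nexthop cache. */
struct ex_iface {
    struct ex_nexthop cache[EX_NCACHE];
};

/* Insert or update a mapping; returns 0 on success, -1 if the cache is full. */
static int ex_nexthop_set(struct ex_iface *ifp, uint32_t ip, const uint8_t *mac)
{
    struct ex_nexthop *slot = NULL;

    for (size_t i = 0; i < EX_NCACHE; i++) {
        struct ex_nexthop *e = &ifp->cache[i];
        if (e->valid && e->ip == ip) {
            slot = e;           /* existing entry: update in place */
            break;
        }
        if (!e->valid && slot == NULL)
            slot = e;           /* remember the first free slot */
    }
    if (slot == NULL)
        return -1;
    slot->valid = 1;
    slot->ip = ip;
    memcpy(slot->mac, mac, EX_MACLEN);
    return 0;
}

/* Look up the MAC address for an IP on this interface; NULL if not cached. */
static const uint8_t *ex_nexthop_lookup(const struct ex_iface *ifp, uint32_t ip)
{
    for (size_t i = 0; i < EX_NCACHE; i++) {
        if (ifp->cache[i].valid && ifp->cache[i].ip == ip)
            return ifp->cache[i].mac;
    }
    return NULL;
}
```

With this split, the L2 output path only consults the outgoing interface's cache and never needs a route entry to find the link-layer address.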
mlelstv
78f913b0b2 Replace generic queue macros with IFNET/IFADDR macros. 2016-04-03 09:57:40 +00:00
ozaki-r
35b18fbb1d Remove unnecessary casts and do s/0/NULL/ for rtrequest 2016-04-01 09:16:02 +00:00
christos
6228dc517a PR/50899: David Binderman: optimize memset 2016-03-06 19:46:05 +00:00
knakahara
9b7918b3ee remove unnecessary declarations and fix KNF
Thanks to riastradh@
2016-02-29 01:29:15 +00:00
knakahara
e80f101289 To eliminate gif_softc_list linear search, add extra argument to encapsw.pr_ctlinput(). 2016-02-26 07:35:17 +00:00
ozaki-r
a143583fe0 Use callout_halt instead of callout_stop 2016-02-25 06:00:01 +00:00
rtr
0a0528fd0a Fix building of IPv4-Mapped IPv6 addresses.
As discussed on tech-net@ use in6_sin_2_v4mapsin6() to build mapped
addresses.
2016-02-15 19:00:42 +00:00
rtr
e2a3307b85 Reduce code duplication.
Split creation of IPv4-Mapped IPv6 addresses into its own function
and use it.

No functional change intended.  As posted to tech-net@
2016-02-15 14:59:03 +00:00
rtr
f5c6d9772a remove duplicated #include of <netinet/in.h> 2016-02-14 23:47:57 +00:00
ozaki-r
9c4cd06355 Introduce softint-based if_input
This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!
2016-02-09 08:32:07 +00:00
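
The queue-then-defer pattern described in the commit above can be modeled in a few lines of single-threaded userland C. This is only a sketch of the idea, not the percpuq implementation: the ex_* names, the fixed-size ring and the softint_pending flag are hypothetical, and there is no per-CPU state or real softint(9) scheduling here.

```c
#include <stdbool.h>
#include <stddef.h>

#define EX_QLEN 4

/* Hypothetical packet queue standing in for a per-CPU ifqueue. */
struct ex_pktq {
    int    pkts[EX_QLEN];
    size_t head;
    size_t count;
    bool   softint_pending;   /* models "a softint has been scheduled" */
};

/* Running total used by the example input routine below. */
static int ex_input_sum;

/* Example upper-layer input routine (stands in for ether_input etc.). */
static void ex_stack_input(int pkt)
{
    ex_input_sum += pkt;
}

/* Receive interrupt: only enqueue and schedule deferred work. */
static bool ex_rx_intr(struct ex_pktq *q, int pkt)
{
    if (q->count == EX_QLEN)
        return false;         /* queue full: drop the packet */
    q->pkts[(q->head + q->count) % EX_QLEN] = pkt;
    q->count++;
    q->softint_pending = true;
    return true;
}

/* Deferred handler: drain the queue and run the rest of the processing. */
static unsigned ex_softint_run(struct ex_pktq *q, void (*input)(int))
{
    unsigned n = 0;

    q->softint_pending = false;
    while (q->count > 0) {
        int pkt = q->pkts[q->head];
        q->head = (q->head + 1) % EX_QLEN;
        q->count--;
        input(pkt);
        n++;
    }
    return n;
}
```

The point of the split is that ex_rx_intr() does a bounded, small amount of work, while all protocol processing happens later in ex_softint_run().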
knakahara
51f4870974 eliminate variable argument in encapsw 2016-01-26 06:00:10 +00:00
knakahara
b546d5277b implement encapsw instead of protosw and uniform prototype.
suggested and advised by riastradh@n.o, thanks.

BTW, it seems in_stf_input() had bugs...
2016-01-26 05:58:05 +00:00
ozaki-r
07e20941bc Remove unnecessary LLE_REMREF
The code around it was copied from arptimer, but LLE_REMREF
is unnecessary because it is needed only for arptimer that
is called after LLE_ADDREF.

This is a possible fix for PR#50548, PR#50702 and PR#50704.
2016-01-25 10:15:38 +00:00
riastradh
fa50b451d4 Those were local changes not meant to be part of the revert. SORRY! 2016-01-23 14:48:55 +00:00
christos
e1c6072fc4 fix compilation 2016-01-23 02:58:13 +00:00
riastradh
e588d95c25 Back out previous change to introduce struct encapsw.
This change was intended, but Nakahara-san had already made a better
one locally!  So I'll let him commit that one, and I'll try not to
step on anyone's toes again.
2016-01-22 23:27:12 +00:00
riastradh
87bc652e3d Don't abuse struct protosw for ip_encap -- introduce struct encapsw.
Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.

This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.
2016-01-22 05:15:10 +00:00
riastradh
7c7b1739c8 Revert previous: ran cvs commit when I meant cvs diff. Sorry!
Hit up-arrow one too few times.
2016-01-21 15:41:29 +00:00
riastradh
b41d562bd0 Give proper prototype to ip_output. 2016-01-21 15:27:48 +00:00
riastradh
f8b0ac1cb4 Give proper prototype to ip_output. 2016-01-20 22:12:22 +00:00