introduced in the prior patch.
The queue has capacity to store 8 link state changes, if it overflows then
the oldest state change is lost, but the oldest DOWN state change is
preserved to ensure any subsequent UP state changes reflect properly.
Because there are only 3 states to queue, the queue itself is implemented
by storing 2-bit numbers in a bigger one.
To increase the size of the queue, just increase the size of the backing
store to a bigger number.
The workaround was introduced because lltable/llentry uses rwlock
but it may be executed in hardware interrupt due to fast forward.
Now we don't run fast forward in hardware interrupt anymore, so
we can remove the workaround.
if_link_state_change can execute the network stack that is expected to
not run in hardware interrupt (at least now), however network drivers
may call it in hardware interrupt. Avoid that by introducing a new
softint for if_link_state_change.
The original patch is provided by mlelstv@ and tweaked a bit by me.
Should fix PR kern/50602.
Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.
- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers
This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.
This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.
To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.
Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html
Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!
This change was intended, but Nakahara-san had already made a better
one locally! So I'll let him commit that one, and I'll try not to
step on anyone's toes again.
Mostly mechanical change to replace it, culling some now-needless
boilerplate around all the users.
This does not substantively change the ip_encap API or eliminate
abuse of sketchy pointer casts -- that will come later, and will be
easier now that it is not tangled up with struct protosw.
You can't use this unless you know what it is a priori: the formal
prototype is variadic, and the different instances (e.g., ip_output,
route_output) have different real prototypes.
Convert the only user of it, raw_send in net/raw_cb.c, to take an
explicit callback argument. Convert the only instances of it,
route_output and key_output, to such explicit callbacks for raw_send.
Use assertions to make sure the conversion to explicit callbacks is
warranted.
Discussed on tech-net with no objections:
https://mail-index.netbsd.org/tech-net/2016/01/16/msg005484.html
llentry#la_opaque which is for token ring is allocated in arp.c
and freed in arp.c when freeing llentry. However, llentry can be
freed from other places, e.g., lltable_free. In such cases,
la_opaque is never freed.
To fix that, add a new callback (lle_ll_free) to llentry and
register a destruction function of la_opque to it. On freeing a
llentry, we can surely free la_opque via the callback.
lltable and llentry were introduced to replace ARP cache data structure
for further restructuring of the routing table: L2 nexthop cache
separation. This change replaces the NDP cache data structure
(llinfo_nd6) with them as well as ARP.
One noticeable change is for neighbor cache GC mechanism that was
introduced to prevent IPv6 DoS attacks. net.inet6.ip6.neighborgcthresh
was the max number of caches that we store in the system. After
introducing lltable/llentry, the value is changed to be per-interface
basis because lltable/llentry stores neighbor caches in each interface
separately. And the change brings one degradation; the old GC mechanism
dropped exceeded packets based on LRU while the new implementation drops
packets in order from the beginning of lltable (a hash table + linked
lists). It would be improved in the future.
Added functions in in6.c come from FreeBSD (as of r286629) and are
tweaked for NetBSD.
Proposed on tech-kern and tech-net.