This change intends to run the whole network stack in softint context
(or a normal LWP), not in hardware interrupt context. Note that the work
is still incomplete after this change; to finish it, we also have to
softint-ify if_link_state_change (and bpf), which can still run in
hardware interrupt context.
This change softint-ifies at ifp->if_input, which is called from
each device driver (and ieee80211_input), to ensure that Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, the rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues the packets and does the rest of the packet processing.
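The split between rxintr and softint described above can be sketched in
userland C. This is only a model under stated assumptions: every name
below (struct pktq, model_rxintr, model_softint, model_l2_input) is
hypothetical and is not the percpuq API, a boolean flag stands in for
scheduling a softint, and there is a single queue rather than one per CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland model; none of these names exist in the kernel. */
#define PKTQ_LEN 8

struct pkt {
	int id;
};

struct pktq {
	struct pkt *slots[PKTQ_LEN];	/* fixed-size ring of queued packets */
	size_t head, tail, count;
	bool softint_pending;		/* stands in for scheduling a softint */
	unsigned drops;			/* packets dropped because the ring was full */
};

static unsigned model_processed;	/* packets seen by the Layer 2 stand-in */

/* Stand-in for Layer 2 input (e.g. ether_input); it only counts packets. */
static void
model_l2_input(struct pkt *p)
{
	(void)p;
	model_processed++;
}

/* "Hardware interrupt" side: only enqueue and request the softint. */
static bool
model_rxintr(struct pktq *q, struct pkt *p)
{
	if (q->count == PKTQ_LEN) {
		q->drops++;
		return false;
	}
	q->slots[q->tail] = p;
	q->tail = (q->tail + 1) % PKTQ_LEN;
	q->count++;
	q->softint_pending = true;
	return true;
}

/* "Softint" side: drain the queue and do the rest of the processing. */
static unsigned
model_softint(struct pktq *q, void (*input)(struct pkt *))
{
	unsigned n = 0;

	assert(input != NULL);
	q->softint_pending = false;
	while (q->count > 0) {
		struct pkt *p = q->slots[q->head];

		q->head = (q->head + 1) % PKTQ_LEN;
		q->count--;
		input(p);
		n++;
	}
	return n;
}
```

In this model the interrupt side never calls the input routine itself,
which is the property the change is after; the real framework also has to
deal with per-CPU queues and interrupt priority levels, which the sketch
leaves out.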
To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and is initialized by default (in if_attach).
We probably have to move percpuq to the softc of each driver, but that
is future work. At this point, only wm(4) has percpuq in its softc,
as a reference implementation.
Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html
Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.
Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.
Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
IFF_UP and IFF_RUNNING before running the 'disable' step, instead
of after. Soon I will handle the 'disable' step by calling into
PMF, which may call if_stop(, 0). Ordinarily, that is harmless.
This change lets the if_stop() routines exit early when they find
on entry that IFF_RUNNING is not set.
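A minimal sketch of the early exit, assuming a made-up interface
structure: struct model_ifnet, model_if_stop, the hw_stops counter and the
flag values are all hypothetical, chosen only to show why a second stop
call becomes harmless.

```c
#include <assert.h>

/* Hypothetical flag values, for illustration only. */
#define MODEL_IFF_UP		0x01u
#define MODEL_IFF_RUNNING	0x40u

struct model_ifnet {
	unsigned if_flags;
	unsigned hw_stops;	/* how many times the hardware was really stopped */
};

/*
 * Stop routine that returns early when the interface is not running,
 * so a repeated call (e.g. one made later from a power-management
 * hook) does not touch the hardware again.
 */
static void
model_if_stop(struct model_ifnet *ifp, int disable)
{
	(void)disable;		/* the 'disable' step is outside this sketch */

	if ((ifp->if_flags & MODEL_IFF_RUNNING) == 0)
		return;

	ifp->hw_stops++;	/* stand-in for the actual hardware shutdown */
	ifp->if_flags &= ~MODEL_IFF_RUNNING;
}
```

Calling model_if_stop() twice on a running interface stops the hardware
once; the second call finds IFF_RUNNING already clear and returns.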
the opportunity to handle an ioctl before generic ifioctl handling
occurs. This will ease extending the kernel and sharing of code
between drivers.
First steps: Make the signature of ifioctl_common() match struct
ifnet->if_ioctl. Convert SIOCSIFCAP and SIOCSIFMTU to the new
ifioctl() regime, throughout the kernel.
transaction to 8. A value of 32 triggers an occasional watchdog() Tx
timeout under higher system load. So far this symptom has been observed
in the ipforwarding across two PCI devices case, and it remains
unidentified what really happens to Tx DMA activity. 16 seems
ok; 8 is a conservative, heuristic value. More adjustment work may be
needed in other parts.
- distinguish 8842 from 8841. 8842 now keeps media selection "auto"
and indicates "up 100baseTX-FDX flow" when either of the two ports has
a valid link. There is no provision to see and control the two ports at
this moment, and their media selections remain "auto" all the time. This
arrangement is considered acceptable since 8842's external ports are
connected to the internal EMAC via a managed 3-port Ethernet switch.
- 8841 behaves as a plain standard 10/100 EMAC with the standard media
selection feature.
- gather MIB statistics counter values with the evcnt(9) framework.
- increase Tx/Rx DMA burst transfer size from 16 to 32.
No functional change intended.
IHAE "IP Header Alignment Enable" feature of RXC register;
- careful cross-referencing of the KSZ8692P/8841P/8842P PDFs indicates
that the IHAE bit works as follows: when the feature is turned on, the
Rx DMA engine populates Rx frame data in Rx memory shifted by +2 or +3
bytes to make its IP header 32-bit aligned. The shift amount is recorded
inside RDES0 as 0, 2 or 3. The automatic alignment is done only
when IHAE is enabled _and_ the Rx frame is an IP frame. In other cases,
the RDES0 shift amount field stays 0.
- the KSZ8841P document mentions the IHAE bit, but its reference link to
the description of the new RDES0 field is broken. KSZ8842P lacks both.
The bit usage of RDES0 23:20 seems different from KSZ8692P, which gives
me a vague suspicion of a documentation error.
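The IHAE behaviour described above can be sketched as follows. This is an
illustration only: the field position and width (MODEL_RDES0_SHIFT_POS,
MODEL_RDES0_SHIFT_MASK) are placeholders and are NOT taken from any
datasheet, which is exactly the part the documents leave unclear, and
model_rx_shift and model_frame_start are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * ASSUMPTION: placeholder location of the shift amount inside RDES0.
 * The real bit layout must come from the chip documentation.
 */
#define MODEL_RDES0_SHIFT_POS	20
#define MODEL_RDES0_SHIFT_MASK	0x3u

/*
 * Return the number of bytes by which the Rx DMA engine shifted the
 * frame: 2 or 3 for an aligned IP frame, 0 otherwise.
 */
static unsigned
model_rx_shift(uint32_t rdes0, int ihae_enabled)
{
	unsigned s;

	/* With IHAE off the field should read 0; check the flag anyway. */
	if (!ihae_enabled)
		return 0;

	s = (rdes0 >> MODEL_RDES0_SHIFT_POS) & MODEL_RDES0_SHIFT_MASK;
	return (s == 2 || s == 3) ? s : 0;
}

/* Skip the shift bytes to find where the frame starts in the Rx buffer. */
static const uint8_t *
model_frame_start(const uint8_t *buf, uint32_t rdes0, int ihae_enabled)
{
	return buf + model_rx_shift(rdes0, ihae_enabled);
}
```

A driver using the feature would apply the returned offset before handing
the frame to the stack; any value other than 2 or 3 is treated as "no
shift" in this sketch.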