- make DMA descriptors volatile to avoid possible unintended reordering
which could cause race conditions between the driver and the hardware
- process interrupts until all NFE_IRQ_WANTED bits are handled
(see the interrupt handler sketch after this list)
and also apply misc fixes:
- return 1 and call nfe_start() in nfe_intr() only if any of its own
interrupts are actually handled
- use bus_dmamap_load_mbuf(9) for RX mbufs rather than bus_dmamap_load(9)
with mtod(9) and MCLBYTES (see the RX buffer sketch after this list)
- check sc->txq.queued to see if TX descriptors are queued or handled
in nfe_start() and nfe_txeof()
- use proper BUS_DMASYNC_{PRE,POST} ops
- prepare and use NFE_[RT]X_NEXTDESC() macro
- rename NFE_TX_TCP_CSUM to NFE_TX_TCP_UDP_CSUM since it also enables
hardware udp4csum-tx for UDP4 packets
- some minor optimizations
- misc KNF
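
Below, a minimal sketch of the interrupt-loop and "only claim handled
interrupts" ideas, assuming the driver's usual softc and register helpers
(NFE_READ/NFE_WRITE, NFE_IRQ_STATUS, the NFE_IRQ_* bits, nfe_rxeof());
this is illustrative only, not the actual nfe_intr() code:

  static int
  nfe_intr(void *arg)
  {
          struct nfe_softc *sc = arg;
          struct ifnet *ifp = &sc->sc_ethercom.ec_if;
          uint32_t r;
          int handled = 0;

          /* keep servicing until no NFE_IRQ_WANTED bits remain pending */
          for (;;) {
                  r = NFE_READ(sc, NFE_IRQ_STATUS) & NFE_IRQ_WANTED;
                  if (r == 0)
                          break;
                  NFE_WRITE(sc, NFE_IRQ_STATUS, r);   /* ack what we saw */
                  handled = 1;

                  if (r & NFE_IRQ_RX)
                          nfe_rxeof(sc);
                  if (r & NFE_IRQ_TX)
                          nfe_txeof(sc);
          }

          /* claim the interrupt and kick TX only if we handled something */
          if (handled && !IFQ_IS_EMPTY(&ifp->if_snd))
                  nfe_start(ifp);
          return handled;
  }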
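
And a hedged sketch of the RX buffer handling: mapping the mbuf with
bus_dmamap_load_mbuf(9), an explicit PREREAD sync, and one possible
NFE_RX_NEXTDESC() definition.  The ring-count constant, the rxq/data
field names and the function name nfe_newbuf() are assumptions here:

  /* assumed ring size; the real driver defines its own */
  #define NFE_RX_RING_COUNT       128
  #define NFE_RX_NEXTDESC(i)      (((i) + 1) % NFE_RX_RING_COUNT)

  static int
  nfe_newbuf(struct nfe_softc *sc, int idx)
  {
          struct nfe_rx_data *data = &sc->rxq.data[idx];
          struct mbuf *m;
          int error;

          MGETHDR(m, M_DONTWAIT, MT_DATA);
          if (m == NULL)
                  return ENOBUFS;
          MCLGET(m, M_DONTWAIT);
          if ((m->m_flags & M_EXT) == 0) {
                  m_freem(m);
                  return ENOBUFS;
          }
          m->m_len = m->m_pkthdr.len = MCLBYTES;

          /* map the mbuf itself instead of bus_dmamap_load() on
           * mtod(m, void *) with a hardcoded MCLBYTES length */
          error = bus_dmamap_load_mbuf(sc->sc_dmat, data->map, m,
              BUS_DMA_READ | BUS_DMA_NOWAIT);
          if (error != 0) {
                  m_freem(m);
                  return error;
          }

          /* the device DMAs into this buffer: PREREAD before handing it
           * over; the matching POSTREAD is done before the CPU reads the
           * data, and the ring index then advances via NFE_RX_NEXTDESC() */
          bus_dmamap_sync(sc->sc_dmat, data->map, 0, data->map->dm_mapsize,
              BUS_DMASYNC_PREREAD);
          data->m = m;
          return 0;
  }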
Tested and confirmed by matthew green, who was able
"to send >25MB/sec to nfe0 for over one hour,"
and also tested by me for a month
(though only with light TX/RX load on 100baseTX).
Additional fixes, tested together with other users
who have been experiencing watchdog timeouts:
* Mask all interrupts while servicing a tx or rx interrupt.
* On init, clear IRQ status registers (workaround for buggy netbooters).
> Defer setting of the valid bit in the first TX descriptor after
> all descriptors have been setup. Otherwise, hardware may start
> processing descriptors faster than us and crap out.
> Fixes "watchdog timeout" errors.
>
> Original idea from Matthew Dillon @DragonFly.
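
A hedged sketch of the deferred-VALID idea (the other fix, masking the
interrupt sources while servicing, is just a register write around the
service loop).  The descriptor layout (desc32[], flags, NFE_TX_VALID),
the txq field names and the function name are assumptions, not the
driver's actual structures:

  static void
  nfe_tx_arm(struct nfe_softc *sc, int first, int last)
  {
          int i = first;

          /* set VALID on every descriptor in the chain except the first */
          for (;;) {
                  if (i != first)
                          sc->txq.desc32[i].flags |= htole16(NFE_TX_VALID);
                  if (i == last)
                          break;
                  i = NFE_TX_NEXTDESC(i);
          }

          /* flush the ring so the device sees only fully built descriptors */
          bus_dmamap_sync(sc->sc_dmat, sc->txq.map, 0,
              sc->txq.map->dm_mapsize,
              BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);

          /* only now validate the first descriptor (and sync again), so the
           * hardware cannot start processing a half-built chain */
          sc->txq.desc32[first].flags |= htole16(NFE_TX_VALID);
          bus_dmamap_sync(sc->sc_dmat, sc->txq.map, 0,
              sc->txq.map->dm_mapsize, BUS_DMASYNC_PREWRITE);
  }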
- print PCI device name and revision
- print interrupt and Ethernet address like other devices
(see the attach-time printing sketch after the output below)
Before:
---
nfe0 at pci0 dev 5 function 0LKLN: Picked IRQ 20 with weight 1
: ioapic0 pin 20 (irq 9), address xx:xx:xx:xx:xx:xx
After:
---
nfe0 at pci0 dev 5 function 0: NVIDIA nForce3 ethernet #4 (rev. 0xa2)
LKLN: Picked IRQ 20 with weight 1
nfe0: interrupting at ioapic0 pin 20 (irq 9)
nfe0: Ethernet address xx:xx:xx:xx:xx:xx
(note "Picked IRQ" message is logged by aprint_verbose(9) in acpi(4))