For ARMEB, peripheral is configured to little-endian mode, even if
CPU itself is in big-endian mode. Therefore, we need to configure
the device to little-endian mode, and byte-swap descriptor fields
(unlike the case of powerpc).
Simplify and make extensible the config_search() / config_found() /
config_attach() interfaces: rather than having different variants for
which arguments you want pass along, just have a single call that
takes a variadic list of tag-value arguments.
Adjust all call sites:
- Simplify wherever possible; don't pass along arguments that aren't
actually needed.
- Don't be explicit about what interface attribute is attaching if
the device only has one. (More simplification.)
- Add a config_probe() function to be used in indirect configuiration
situations, making is visibly easier to see when indirect config is
in play, and allowing for future change in semantics. (As of now,
this is just a wrapper around config_match(), but that is an
implementation detail.)
Remove unnecessary or redundant interface attributes where they're not
needed.
There are currently 5 "cfargs" defined:
- CFARG_SUBMATCH (submatch function for direct config)
- CFARG_SEARCH (search function for indirect config)
- CFARG_IATTR (interface attribte)
- CFARG_LOCATORS (locators array)
- CFARG_DEVHANDLE (devhandle_t - wraps OFW, ACPI, etc. handles)
...and a sentinel value CFARG_EOL.
Add some extra sanity checking to ensure that interface attributes
aren't ambiguous.
Use CFARG_DEVHANDLE in MI FDT, OFW, and ACPI code, and macppc and shark
ports to associate those device handles with device_t instance. This
will trickle trough to more places over time (need back-end for pre-OFW
Sun OBP; any others?).
ifmedia_ioctl(), the hook is not required because ether_ioctl has it
(if_ethersubr.c rev. 1.160). These drivers don't return ENETRESET in
ifmedia_ioctl(), so no functional change.
int (*mii_readreg_t)(device_t, int, int);
void (*mii_writereg_t)(device_t, int, int, int);
to:
int (*mii_readreg_t)(device_t, int, int, uint16_t *);
int (*mii_writereg_t)(device_t, int, int, uint16_t);
Now we can test if a read/write operation failed or not by the return value.
In 802.3 spec says that the PHY shall not respond to read/write transaction
to the unimplemented register(22.2.4.3). Detecting timeout can be used to
check whether a register is implemented or not (if the register conforms to
the spec). ukphy(4) can be used this for MII_MMDACR and MII_MMDAADR.
Note that I noticed that the following code do infinite loop in the
read/wirte function. If it accesses unimplemented PHY register, it will hang.
It should be fixed:
arm/at91/at91emac.c
arm/ep93xx/epe.c
arm/omap/omapl1x_emac.c
mips/ralink/ralink_eth.c
arch/powerpc/booke/dev/pq3etsec.c(read)
dev/cadence/if_cemac.c <- hkenken
dev/ic/lan9118.c
Tested with the following device:
axe+ukphy
axe+rgephy
axen+rgephy (tested by Andrius V)
wm+atphy
wm+ukphy
wm+igphy
wm+ihphy
wm+makphy
sk+makphy
sk+brgphy
sk+gentbi
msk+makphy
sip+icsphy
sip+ukphy
re+rgephy
bge+brgphy
bnx+brgphy
gsip+gphyter
rtk+rlphy
fxp+inphy (tested by Andrius V)
tlp+acphy
ex+exphy
epic+qsphy
vge+ciphy (tested by Andrius V)
vr+ukphy (tested by Andrius V)
vte+ukphy (tested by Andrius V)
Not tested (MAC):
arm:at91emac
arm:cemac
arm:epe
arm:geminigmac
arm:enet
arm:cpsw
arm:emac(omac)
arm:emac(sunxi)
arm:npe
evbppc:temac
macppc:bm
macppc:gm
mips:aumac
mips:ae
mips:cnmac
mips:reth
mips:sbmac
playstation2:smap
powerpc:tsec
powerpc:emac(ibm4xx)
sgimips:mec
sparc:be
sf
ne(ax88190, dl10019)
awge
ep
gem
hme
smsh
mtd
sm
age
alc
ale
bce
cas
et
jme
lii
nfe
pcn
ste
stge
tl
xi
aue
mue
smsc
udav
url
Not tested (PHY):
amhphy
bmtphy
dmphy
etphy
glxtphy
ikphy
iophy
lxtphy
nsphyter
pnaphy
rdcphy
sqphy
tlphy
tqphy
urlphy
These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.
HOWEVER! Some subsystems have
#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))
even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.
To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.
I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:
cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))
It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.
Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.
This change also fixes a bug that the direction is misunderstand on some
environment by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.
kmem_alloc() with KM_SLEEP
kmem_zalloc() with KM_SLEEP
percpu_alloc()
pserialize_create()
psref_class_create()
all of these paths include an assertion that the allocation has not failed,
so callers should not assert that again.
The benefits of the change are:
- We can reduce codes
- We can provide the same behavior between drivers
- Where/When if_ipackets is counted up
- Note that some drivers still update packet statistics in their own
way (periodical update)
- Moved bpf_mtap run in softint
- This makes it easy to MP-ify bpf
Proposed on tech-kern and tech-net
The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.
No functional change.
This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.
This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.
To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.
Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html
Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!
returns other than 0, the TX ring might not full. Check whether
the TX ring has one or more packets. If the ring is empty,
dont' set IFF_OACTIVE because an TX complete interrupt never
occur and IFF_OACTIVE flags is left. The interface's timer
isn't reset, so a device timeout desn't occur.
Fixes a bug that IFF_OACTIVE flag is left on heavy load.
Part of PR#48568.
- both TX side an RX side.
- different setting for each port
- TX side is hw.mvgbe.mvgbe*.ipginttx
- RX side is hw.mvgbe.mvgbe*.ipgintrx
- The default value is 768.
- The lowest value is 0
- For highest value, 0x3777 is used for V1, and 0xffff is used for V2.
is fragmented or not, so define new MVGBE_RX_IP_FRAGMENT with the same
value and use it.
- Remove the checking whether a packet length is lower than 72 octet.
This check is not used in Linux and FreeBSD. Tested with me (for Kirkwood)
and Kiyohara (for DiscoveryII).
- Fix the usage of a local variable for csum_flags.
- It seemd that sometimes MVGBE_RX_L4_CHECKSUM_OK bit were set to 0
even if the checksum is correct and the packet was not fragmented.
So we don't set M_CSUM_TCP_UDP_BAD even if csum bit is 0.
a real bug when HW checksum offload function is used. It was easy to
reproduce with NFS UDP mount.
- Fix a potential bug that a packet other than TCP and UDP is marked as bad
checksum.
- Change the synching order of descriptors. First, sync descriptors except
first and then sync the first descriptor.
- To recover from an race condition, reduce the if_timer from 5 to 1 and
when timeout occur write MVGBE_TQC_ENQ bit again.