Commit Graph

455 Commits

Author SHA1 Message Date
msaitoh 22d7e3fc82 KNF. No functional change. 2019-05-21 09:18:37 +00:00
msaitoh bf354a0797 The max subtype of the ifmedia word is 31. It's too small for Ethernet now.
We currently use it up to 30. We should extend the limit to be able to use
speeds above 10Gbps. Our ifmedia(4) is inconvenient and has some problems,
so we should redesign the interface, but it's too late for netbsd-9 to do it.
So, we keep the data structure size and modify the structure a bit. The
strategy is almost the same as FreeBSD's. Many bits of IFM_OMASK for Ethernet
are unused, so use some of them for Ethernet's subtype.

The differences against FreeBSD are:
 - We use NetBSD style compat code (i.e. no SIOCGIFXMEDIA).
 - FreeBSD's IFM_ETH_XTYPE's bit location is from 11 to "14" even though
   IFM_OMASK is from 8 to "15". We use _IFM_ETH_XTMASK from bit 13 to "15".
 - FreeBSD changed the meaning of IFM_TYPE_MATCH(). I think we should
   not do it. We keep it unchanged and add a new IFM_TYPE_SUBTYPE_MATCH()
   macro for matching both TYPE and SUBTYPE.
 - Added up to 400GBASE-SR16.

New layout of the media word is as follows (from ifmedia_h):

 * if_media Options word:
 *	Bits	Use
 *	----	-------
 *	0-4	Media subtype	MAX SUBTYPE == 255 for ETH and 31 for others
 *	5-7	Media type
 *	8-15	Type specific options
 *	16-18	Mode (for multi-mode devices)
 *	19	(Reserved for Future Use)
 *	20-27	Shared (global) options
 *	28-31	Instance
 *
 *   3                     2                   1
 *   1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
 *  +-------+---------------+-+-----+---------------+-----+---------+
 *  |       |               |R|     |               |     |         |
 *  | IMASK |     GMASK     |F|MMASK+-----+ OMASK   |NMASK|  TMASK  |
 *  |       |               |U|     |XTMSK|         |     |         |
 *  +-------+---------------+-+-----+-----+---------+-----+---------+
 *   <----->                   <--->                 <--->
 *  IFM_INST()               IFM_MODE()            IFM_TYPE()
 *
 *                              IFM_SUBTYPE(other than ETH)<------->
 *
 *                                   <---> IFM_SUBTYPE(ETH)<------->
 *
 *
 *           <------------->         <------------->
 *                        IFM_OPTIONS()
2019-05-17 07:37:11 +00:00
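The layout above can be illustrated with a small C sketch. The macro and function names are hypothetical (they are not the ones in the header); only the bit positions come from the table: subtype in bits 0-4, type in bits 5-7, and three extra Ethernet subtype bits in 13-15. That the extra bits form the upper part of an 8-bit subtype is an assumption inferred from the stated maximum of 255, and the Ethernet type value is assumed as well.

```c
#include <stdint.h>

/* Hypothetical names; bit positions follow the layout in the commit message. */
#define X_TMASK      0x0000001fU	/* bits 0-4: media subtype */
#define X_NMASK      0x000000e0U	/* bits 5-7: media type */
#define X_ETH_XTMASK 0x0000e000U	/* bits 13-15: extra Ethernet subtype bits */
#define X_TYPE_ETHER 0x00000020U	/* assumed type value for Ethernet */

/* Return the subtype as a small integer: 0-255 for Ethernet, 0-31 otherwise. */
static unsigned
x_subtype(uint32_t word)
{
	unsigned low = word & X_TMASK;

	if ((word & X_NMASK) != X_TYPE_ETHER)
		return low;
	/* Move bits 13-15 down to bits 5-7 and combine with the low five bits. */
	return low | ((word & X_ETH_XTMASK) >> 8);
}

/* Build an Ethernet media word from an 8-bit subtype (inverse of the above). */
static uint32_t
x_ether_word(unsigned subtype)
{
	return X_TYPE_ETHER | (subtype & 0x1fU) |
	    ((uint32_t)(subtype & 0xe0U) << 8);
}
```

Because the extra bits live inside the option field, the size of the media word stays the same and non-Ethernet types keep their 5-bit subtype.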
ozaki-r 7fc219a5ee Implement an aggressive psref leak detector
It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a leak
has occurred.

Investigating psref leaks is hard because once a leak occurs, the percpu list of
psrefs that tracks references can be corrupted.  A reference to a tracked object
is recorded in the list via an intermediate object (struct psref) that is
normally allocated on the stack of a thread.  Thus, the intermediate object can be
overwritten on a leak, resulting in corruption of the list.

The tracker makes a shadow entry for each intermediate object and stores some hints
in it (currently the caller address of psref_acquire).  We can detect a
leak by checking the entries at certain points where all references should have been
released, such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern
2019-05-17 03:34:26 +00:00
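The shadow-entry idea can be modeled in plain C. This is a minimal user-space sketch, not the kernel code: the table, its size, and all function names are hypothetical, and a single global table stands in for whatever per-LWP or per-CPU storage the real detector uses.

```c
#include <stddef.h>

#define SHADOW_MAX 8	/* hypothetical table size */

struct shadow_entry {
	const void *ref;	/* tracked intermediate object, NULL if free */
	const void *caller;	/* hint: who acquired the reference */
};

static struct shadow_entry shadow[SHADOW_MAX];

/* Record a shadow entry when a reference is acquired. */
static int
shadow_acquire(const void *ref, const void *caller)
{
	for (size_t i = 0; i < SHADOW_MAX; i++) {
		if (shadow[i].ref == NULL) {
			shadow[i].ref = ref;
			shadow[i].caller = caller;
			return 0;
		}
	}
	return -1;	/* table full */
}

/* Clear the shadow entry when the reference is released. */
static void
shadow_release(const void *ref)
{
	for (size_t i = 0; i < SHADOW_MAX; i++) {
		if (shadow[i].ref == ref) {
			shadow[i].ref = NULL;
			shadow[i].caller = NULL;
			return;
		}
	}
}

/*
 * Called at a point where no references should remain.
 * Returns the recorded caller of the first leaked reference, or NULL.
 */
static const void *
shadow_check(void)
{
	for (size_t i = 0; i < SHADOW_MAX; i++) {
		if (shadow[i].ref != NULL)
			return shadow[i].caller;
	}
	return NULL;
}
```

Since the shadow table lives apart from the stack-allocated object, the hint survives even if the leaked object itself is overwritten.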
ozaki-r 99ec0af5eb Store IFF_ALLMULTI in ec_flags instead of if_flags to avoid data races
IFF_ALLMULTI is set/unset in if_flags via if_mcast_op.  To avoid data races on
if_flags, IFNET_LOCK was added for if_mcast_op.  Unfortunately it produces
a deadlock, so we want to remove the added IFNET_LOCK and avoid the data races by
another approach.

This fix introduces ec_flags to struct ethercom and stores IFF_ALLMULTI to it.
ec_flags is protected by ETHER_LOCK and thus IFNET_LOCK is no longer necessary
for if_mcast_op.  Note that the fix is applied only to MP-safe drivers, for
which the data races matter.

In the kernel, IFF_ALLMULTI is set by a driver and used by the driver itself.
So changing the storing place doesn't break anything.  One exception is
ioctl(SIOCGIFFLAGS); we have to include IFF_ALLMULTI in the result when it is
set so that the flag is exported as before.

An upcoming commit will remove IFNET_LOCK.

PR kern/54189
2019-05-15 02:56:47 +00:00
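The SIOCGIFFLAGS exception can be sketched as follows. The structures and constants are hypothetical stand-ins for struct ifnet and struct ethercom, the flag values are assumed, and locking (ETHER_LOCK in the commit) is left out to keep the sketch self-contained.

```c
/* Hypothetical stand-ins for the kernel structures; only the idea is modeled. */
#define X_IFF_UP       0x0001U
#define X_IFF_ALLMULTI 0x0200U

struct x_ifnet {
	unsigned short if_flags;
};

struct x_ethercom {
	struct x_ifnet ec_if;
	unsigned ec_flags;	/* now holds ALLMULTI instead of if_flags */
};

/* Flags as reported to userland: if_flags plus ALLMULTI kept in ec_flags. */
static unsigned short
x_export_flags(const struct x_ethercom *ec)
{
	unsigned short flags = ec->ec_if.if_flags;

	if (ec->ec_flags & X_IFF_ALLMULTI)
		flags |= X_IFF_ALLMULTI;
	return flags;
}
```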
pgoyette 35c8bc0f3b Typos in comments. NFCI. 2019-04-20 22:16:47 +00:00
msaitoh d6117c1651 Rename ifreqo2n() and ifreqn2o() to IFREQO2N_43() and IFREQN2O_43():
 - ifreqo2n() and ifreqn2o() are for COMPAT_43, so add _43 to the name.
 - Uppercase to make it clear those are macros.
2019-04-16 04:31:42 +00:00
christos c9d0acaf5e Zero out the ifreq struct for SIOCGIFCONF to avoid up to 127 bytes of stack
disclosure. From Andy Nguyen, many thanks!
2019-04-15 20:51:46 +00:00
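The general pattern behind this fix is to clear a stack-allocated reply before filling it, so unused bytes carry no stale stack contents when the structure is copied out. A minimal sketch, using a hypothetical reply record rather than the real struct ifreq:

```c
#include <string.h>

/* Hypothetical reply record; the real struct ifreq differs. */
struct x_ifreq {
	char name[16];
	char addr[128];
};

/* Fill a reply; clearing first ensures unused bytes carry no stale data. */
static void
x_fill_ifreq(struct x_ifreq *ifr, const char *name, const void *addr,
    size_t addrlen)
{
	memset(ifr, 0, sizeof(*ifr));
	strncpy(ifr->name, name, sizeof(ifr->name) - 1);
	if (addrlen > sizeof(ifr->addr))
		addrlen = sizeof(ifr->addr);
	memcpy(ifr->addr, addr, addrlen);
}
```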
msaitoh 4f0d5c60d3 Remove inclusion of compat/sys/socket.h. It's not required anymore. 2019-04-11 03:07:11 +00:00
pgoyette 327f2c734c Replace compile-time checking for vlan code with a module hook.
Should resolve the errors reported on irc when booting a kernel which
has agr without vlan:


 [   1.0000000] WARNING: module error: built-in module if_agr can't find builtin dependency `if_vlan'
 [   1.0000000] WARNING: module error: built-in module if_agr prerequisite if_vlan failed, error 2
2019-03-23 09:48:04 +00:00
pgoyette 8c2f80f160 Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.
2019-03-01 11:06:55 +00:00
pgoyette c1283e70fa Normalize all the compat hooks' names to the form
<subsystem>_<function>_<version>_hook

NFCI

XXX Note that although this introduces a change in the kernel-to-
XXX module interface, we are NOT bumping the kernel version number.
XXX We will bump the version number once the interface stabilizes.
2019-01-29 09:28:50 +00:00
pgoyette d91f98a871 Merge the [pgoyette-compat] branch 2019-01-27 02:08:33 +00:00
msaitoh 7b54e0066d Add SIOCSETHERCAP. It's used to change ec_capenable. 2018-12-21 08:58:08 +00:00
rin 73240bb1c3 PR kern/53562
Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
    specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
    TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is the
logical AND of the TX offload capabilities of the member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej
2018-12-12 01:46:47 +00:00
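Case (3) and the "send as is" test can be sketched in C. The flag bits and function names are hypothetical stand-ins for the M_CSUM_* values and the bridge code; treating an empty bridge as supporting nothing is an assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical flag bits standing in for the M_CSUM_* values. */
#define X_CSUM_IPv4  0x01U
#define X_CSUM_TCPv4 0x02U
#define X_CSUM_TSOv4 0x04U

/* Offloads every member supports: logical AND over the members' flags. */
static unsigned
x_bridge_csum_flags_tx(const unsigned *member_flags, size_t nmembers)
{
	unsigned common = ~0U;

	if (nmembers == 0)
		return 0;	/* assumption: an empty bridge offloads nothing */
	for (size_t i = 0; i < nmembers; i++)
		common &= member_flags[i];
	return common;
}

/* A packet may be sent as is only if all requested offloads are supported. */
static int
x_can_send_as_is(unsigned pkt_flags, unsigned supported)
{
	return (pkt_flags & ~supported) == 0;
}
```

Recomputing the AND only when membership or a member's flags change keeps the per-packet check to a single mask test.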
maxv 5c98710094 Remove the 't' argument from m_tag_find(). 2018-11-15 10:23:55 +00:00
ozaki-r 973496ef18 Avoid double rt_replace_ifa on rtrequest1(RTM_ADD)
Some callers of rtrequest1(RTM_ADD) adjust rt_ifa of an rtentry created by
rtrequest1, because rtrequest1 may replace rt_ifa (in ifa_rtrequest) with an ifa
that is different from the requested one.  That is wasteful and, even worse,
introduces a race condition.  rtrequest1 should just use the passed ifa as is
if the caller wants that.
2018-10-30 05:54:41 +00:00
ozaki-r 334ceb81c9 Use atomic operations for ifa_refcnt 2018-10-30 05:29:21 +00:00
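An atomically updated reference count can be sketched with C11 atomics. Note the swap: the kernel has its own atomic operations API, so `<stdatomic.h>` is used here only to make the sketch self-contained, and the structure and function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct x_ifaddr {
	atomic_uint refcnt;	/* hypothetical stand-in for ifa_refcnt */
};

/* Take a reference. */
static void
x_ifaref(struct x_ifaddr *ifa)
{
	atomic_fetch_add(&ifa->refcnt, 1);
}

/* Drop a reference; returns true when the caller released the last one. */
static bool
x_ifafree(struct x_ifaddr *ifa)
{
	/* atomic_fetch_sub returns the value held before the decrement. */
	return atomic_fetch_sub(&ifa->refcnt, 1) == 1;
}
```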
ozaki-r 9b83640c45 Remove a wrong assertion in ifaref
Doing ifaref on an ifa with IFA_DESTROYING is not a problem; the reference should
be dropped during the destruction of the ifa.
2018-10-30 05:27:51 +00:00
knakahara eecb6bd8af Fix panic when doing ifconfig -vlanif and then ifconfig vlanif again. Advised by ozaki-r@.
E.g., run the following commands.
    ====================
    # ifconfig vlan0 create
    # ifconfig vlan0 vlan 100 vlanif wm0
    # ifconfig vlan0 -vlanif wm0
    # ifconfig vlan0 vlan 100 vlanif wm0
    ====================

ATF net/if_vlan runs this type of test; however, it cannot detect this bug
because shmif(4)'s ifp->if_hwdl is always NULL, as the U/L bit is set in
shmif(4)'s Ethernet address.
See: https://nxr.netbsd.org/xref/src/sys/net/if_ethersubr.c#997
2018-10-18 11:34:54 +00:00
christos 68d92d47f7 Flip the order of free'ing things to avoid crash (from ozaki-r). Tested
with a month's uptime. Used to crash once a week.
2018-09-07 13:24:14 +00:00
maxv f922b0f6bd Remove the network ATM code. 2018-09-06 06:41:59 +00:00
ozaki-r 07f0937270 Restore splx removed accidentally at v1.406
Pointed out by k-goda@IIJ
2018-08-27 04:53:24 +00:00
knakahara 726424d6e0 fix if_snd_is_used(), ifp->if_snd is also used by if.c::if_transmit(). 2018-08-10 10:31:01 +00:00
msaitoh 80bf5cccde - Fix a bug where the drop counter shows an incorrect value like
"net.inet.ip.ifq.drops = 72059810241052672"
- Change pktq's length sysctl to uint64_t.
2018-08-10 07:24:09 +00:00
msaitoh 68ccef5262 Change pktq's drops count sysctl from CTLTYPE_INT to CTLTYPE_QUAD. 2018-08-06 06:54:40 +00:00
christos 14ff979601 Calling rtinit(sa_family = AF_LINK, RTM_DELETE, 0) is guaranteed not to
work. Remove bogus call leaving a KASSERT behind.
2018-07-09 14:54:01 +00:00
ozaki-r 1350b04367 Fix missing net.inet6.ip6.ifq node
The node (and its child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false, which indicates that the
netinet6 component isn't loaded on rump kernels.  However, the flag is
accidentally always false at that point because it is turned on in in6_dom_init,
which is called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels).  This fix is ad-hoc but good enough for netbsd-8.  We should refine
the initialization order of network components in the future.

Pointed out by hikaru@
2018-07-03 03:37:03 +00:00
msaitoh 3cd62456f9 Implement the BPF direction filter (BIOC[GS]DIRECTION). It provides backward
compatibility with BIOC[GS]SEESENT ioctl. The userland interface is the same
as FreeBSD.

This change also fixes a bug where the direction is misidentified in some
environments, by passing the direction to bpf_mtap*() instead of checking
m->m_pkthdr.rcvif.
2018-06-26 06:47:57 +00:00
ozaki-r d72e3ba12d Make sure to remove all AF_LINK addresses in if_detach 2018-06-01 07:16:23 +00:00
ozaki-r 20ec6b197e Make sure to not change if_hwdl once set 2018-06-01 07:14:13 +00:00
ozaki-r 122cb40f7f Relax a lock check in if_mcast_op unless NET_MPSAFE
It seems that there remain some paths that don't satisfy the constraint that is
required only if NET_MPSAFE.  So don't check it by default.

One known path is nd6_rtrequest => in6_addmulti => if_mcast_op, which is not
easy to address.
2018-05-31 02:10:23 +00:00
msaitoh 38c7fb59b5 Print "NET_MPSAFE enabled" if it's enabled. 2018-05-24 05:27:29 +00:00
ozaki-r 146c9cdc07 Protect if_deferred_start_softint with KERNEL_LOCK if the interface isn't MP-safe 2018-05-14 02:55:46 +00:00
ozaki-r 4f93a8de81 Protect packet input routines with KERNEL_LOCK and splsoftnet
if_input, i.e., ether_input and friends, now runs in softint without any
protection.  That's OK for ether_input itself because it's already MP-safe;
however, subsequent routines called from it, such as carp_input and agr_input,
aren't MP-safe.  Protect if_input with KERNEL_LOCK.

if_input can also be called from a normal LWP context.  In that case we need to
prevent interrupts (softint) from running by using splsoftnet to protect
non-MP-safe code (e.g., carp_input and agr_input).

Pointed out by mlelstv@
2018-05-14 02:55:03 +00:00
ozaki-r efe7f42bce Use if_is_mpsafe (NFC) 2018-05-14 02:53:29 +00:00
christos b10d32c2eb disentangle a bit more the compat ioctl code. 2018-04-12 18:44:59 +00:00
ozaki-r ad8c8ec2d4 Destroy ifq_lock at the end of if_detach
It still can be used in if_detach.
2018-01-30 10:40:02 +00:00
ozaki-r 9aa00be0ba Check MP-safety in ifa_insert and ifa_remove only for IFEF_MPSAFE drivers
Eventually the assertions should pass for all drivers, however, at this point
it's too eager.

Fix PR kern/52895
2018-01-10 01:22:26 +00:00
ozaki-r c00d5e8f6f Suppress the assertion of IFNET_LOCK in if_mcast_op if MROUTING
MROUTING doesn't deal with IFNET_LOCK yet.

Reported by kardel@
2017-12-26 02:01:35 +00:00
ozaki-r 81e23d33f5 Remove IFNET_GLOBAL_LOCK where it's unnecessary because IFNET_LOCK is held 2017-12-15 04:04:58 +00:00
ozaki-r bde7231efb Ensure if_mcast_op is called with IFNET_LOCK held
Note that CARP doesn't deal with IFNET_LOCK yet.
2017-12-15 04:03:46 +00:00
ozaki-r 5ff08fa39c Reorder some destruction routines in if_detach
- Destroy if_ioctl_lock at the end of the if_detach because it's used in various
  destruction routines
- Move psref_target_destroy after pr_purgeif because we want to use psref in
  pr_purgeif (otherwise destruction procedures can be tricky)
2017-12-14 05:46:54 +00:00
ozaki-r cb1c111a7d Wrap if_ioctl_lock with IFNET_* macros (NFC)
Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...
2017-12-11 03:29:20 +00:00
ozaki-r 61be422f1a Rename IFNET_LOCK to IFNET_GLOBAL_LOCK
IFNET_LOCK will be used in another lock, if_ioctl_lock (might be renamed then).
2017-12-11 03:25:45 +00:00
ozaki-r 330068d1ed Revert "Make if_timer MP-safe if IFEF_MPSAFE"
Because it has decreased the performance of wm. I also found that
wm_watchdog doesn't work well with the if_watchdog framework at all. Sharing one
counter (if_timer) among multiple instances (hardware multi-queues) can't detect
a stall of one (or some) of them, because the other instances reset the counter
even if the stalled one wants the watchdog to fire.

Interfaces without IFEF_MPSAFE work safely with the original if_watchdog thanks
to KERNEL_LOCK. OTOH, interfaces with IFEF_MPSAFE shouldn't use if_watchdog and
should implement their own watchdog timer that works with multiple instances.
2017-12-08 05:22:23 +00:00
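The problem with one shared counter, and the per-instance alternative the message recommends, can be sketched as follows. This is a minimal user-space model with hypothetical names, not the wm driver's watchdog: each queue gets its own timer, so activity on one queue cannot reset the timer of a stalled one.

```c
#include <stddef.h>

#define X_NQUEUES 4	/* hypothetical number of hardware queues */

struct x_txq {
	int timer;	/* ticks left before the queue counts as stalled; 0 = idle */
};

/* Arm (or re-arm) one queue's timer when it makes progress on pending work. */
static void
x_txq_arm(struct x_txq *q, int ticks)
{
	q->timer = ticks;
}

/*
 * Called once per tick. Decrements every armed timer and returns the index
 * of the first queue whose timer expired on this tick, or -1 if none did.
 */
static int
x_watchdog_tick(struct x_txq *qs, size_t n)
{
	int stalled = -1;

	for (size_t i = 0; i < n; i++) {
		if (qs[i].timer > 0 && --qs[i].timer == 0 && stalled < 0)
			stalled = (int)i;
	}
	return stalled;
}
```

With a single shared counter, the re-arm from a healthy queue would have postponed the expiry indefinitely; here the stalled queue is still reported.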
ozaki-r db9a3449c5 Fix build of kernels without ether
By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu, which
created an unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790
2017-12-08 04:03:51 +00:00
ozaki-r 1d72800a1d Get rid of outdated comments 2017-12-07 10:05:42 +00:00
ozaki-r 3c0d913f9e Ensure if_addr_init is called with if_ioctl_lock held 2017-12-07 03:16:24 +00:00
ozaki-r d6ed53e050 Use IFADDR_WRITER_FOREACH instead of IFADDR_READER_FOREACH
At that point nothing else modifies the list, so IFADDR_READER_FOREACH
is unnecessary. Using IFADDR_READER_FOREACH is harmless in general; however,
if we try to detect contract violations of pserialize, using it violates
the contract. So avoiding it makes life easier.
2017-12-07 01:23:53 +00:00
ozaki-r 7f08ab8c46 Make if_link_queue MP-safe if IFEF_MPSAFE
if_link_queue is a queue to store events of link state changes, which is
used to pass events from (typically) an interrupt handler to
if_link_state_change softint. The queue was protected by KERNEL_LOCK so far,
but if IFEF_MPSAFE is enabled, it becomes unsafe because (perhaps) an interrupt
handler of an interface with IFEF_MPSAFE doesn't take KERNEL_LOCK. Protect it
by a spin mutex.

Additionally with this change KERNEL_LOCK of if_link_state_change softint is
omitted if NET_MPSAFE is enabled.

Note that the spin mutex is now ifp->if_snd.ifq_lock, as in the case of
if_timer (see the comment).
2017-12-06 09:54:47 +00:00