Commit Graph

223 Commits

Author SHA1 Message Date
ozaki-r
6a3a8456d3 Abandon unnecessary softint
The softint was introduced to defer fownsignal that was called in bpf_wakeup to
softint at v1.139, but now bpf_wakeup always runs in softint so we don't need
the softint anymore.
2018-01-25 02:45:02 +00:00
ozaki-r
580fb70bf3 Make softint and callout MP-safe 2017-12-15 07:29:11 +00:00
ozaki-r
e0d574e4f8 Fix panic in callout_halt (fix typo)
Reported by wiz@
2017-12-12 06:26:57 +00:00
christos
ea05286d92 add fo_name so we can identify the fileops in a simple way. 2017-11-30 20:25:54 +00:00
ozaki-r
cead3b8854 Provide macros for softnet_lock and KERNEL_LOCK hiding NET_MPSAFE switch
It reduces copy-and-pasted code such as "#ifndef NET_MPSAFE KERNEL_LOCK(1, NULL); ..."
scattered all over the source code and makes it easy to identify the remaining
KERNEL_LOCK and/or softnet_lock uses that are held even if NET_MPSAFE is enabled.

No functional change
2017-11-17 07:37:12 +00:00
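
A minimal user-space sketch of the idea in the commit above: wrap the conditional locking in one macro pair so call sites no longer repeat the #ifndef. The macro names, the counter standing in for the kernel lock, and process_packet() are all hypothetical and are not the names used in the NetBSD tree.

```c
#include <assert.h>

/* Stand-in for the kernel lock: a depth counter, not a real lock. */
int kernel_lock_depth;

void kernel_lock(void)   { kernel_lock_depth++; }
void kernel_unlock(void) { kernel_lock_depth--; }

/*
 * Hypothetical wrappers (names are illustrative only).  With NET_MPSAFE
 * defined they expand to nothing; otherwise they take the big lock.
 */
#ifdef NET_MPSAFE
#define KLOCK_UNLESS_MPSAFE()   do { } while (0)
#define KUNLOCK_UNLESS_MPSAFE() do { } while (0)
#else
#define KLOCK_UNLESS_MPSAFE()   kernel_lock()
#define KUNLOCK_UNLESS_MPSAFE() kernel_unlock()
#endif

/* Returns the lock depth observed inside the critical section. */
int
process_packet(void)
{
	int depth;

	KLOCK_UNLESS_MPSAFE();
	depth = kernel_lock_depth;
	KUNLOCK_UNLESS_MPSAFE();
	return depth;
}
```

Built without NET_MPSAFE, process_packet() sees a depth of 1; built with it, the same call site compiles to no locking at all.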
maya
18b796d442 Use C99 initializer for filterops
Mostly done with spatch with touchups for indentation

@@
expression a;
identifier b,c,d;
identifier p;
@@
const struct filterops p =
- 	{ a, b, c, d
+ 	{
+ 	.f_isfd = a,
+ 	.f_attach = b,
+ 	.f_detach = c,
+ 	.f_event = d,
};
2017-10-25 08:12:37 +00:00
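
To illustrate what the semantic patch above does, here is a small sketch comparing a positional initializer with the C99 designated form. struct filterops_sketch and the sketch_* functions are simplified assumptions; the member types do not match the real struct filterops.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct filterops; member types are assumptions. */
struct filterops_sketch {
	int  f_isfd;
	int  (*f_attach)(void *);
	void (*f_detach)(void *);
	int  (*f_event)(void *, long);
};

int  sketch_attach(void *kn) { (void)kn; return 0; }
void sketch_detach(void *kn) { (void)kn; }
int  sketch_event(void *kn, long hint) { (void)kn; return hint != 0; }

/* Before: positional, silently depends on the member order. */
const struct filterops_sketch old_style =
	{ 1, sketch_attach, sketch_detach, sketch_event };

/* After: designated, each value is tied to a member name. */
const struct filterops_sketch new_style = {
	.f_isfd = 1,
	.f_attach = sketch_attach,
	.f_detach = sketch_detach,
	.f_event = sketch_event,
};
```

Both objects hold the same values; the designated form keeps working if members are reordered or added later.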
ozaki-r
7107584815 Turn on D_MPSAFE flag of bpf_cdevsw that is already MP-safe
Pointed out by k-goda@IIJ
2017-10-19 01:57:15 +00:00
ozaki-r
b7100b390b Reinit a pslist entry before inserting it to a pslist again
Fix PR kern/51984
Tested by nonaka@
2017-02-20 03:08:38 +00:00
christos
290ad1f61f typo 2017-02-19 13:58:42 +00:00
ozaki-r
e7ec750b6d Update comments to reflect bpf MP-ification 2017-02-13 03:44:45 +00:00
ozaki-r
f66c9ca3fd Make bpf MP-safe
By the change, bpf_mtap can run without any locks as long as its bpf filter
doesn't match a target packet. Pushing data to a bpf buffer still needs
a lock. Removing the lock requires big changes and it's a future work.

Another known issue is that we need to retain some obsolete variables to
avoid breaking kvm(3) users such as netstat and fstat. One problem for
MP-ification is that in order to keep statistic counters of bpf_d we need
to use atomic operations for them. Once we retire the kvm(3) users, we
should make the counters per-CPU and remove the atomic operations.
2017-02-09 09:30:26 +00:00
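
The trade-off described above (atomic operations on shared counters now, per-CPU counters later) can be sketched in user space as follows. This is only an illustration with C11 atomics and a plain array indexed by a made-up CPU number; it is not the bpf_d layout or the kernel's percpu(9) interface, and a real per-CPU counter also needs the caller to stay on its CPU while updating.

```c
#include <assert.h>
#include <stdatomic.h>

#define NCPU_SKETCH 4	/* assumed CPU count for the illustration */

/* Shared counter: every CPU updates the same word atomically. */
atomic_ulong shared_count;

/* Per-CPU counters: each CPU updates its own slot; readers sum them. */
unsigned long percpu_count[NCPU_SKETCH];

void count_shared(void)    { atomic_fetch_add(&shared_count, 1); }
void count_percpu(int cpu) { percpu_count[cpu]++; }

unsigned long
percpu_total(void)
{
	unsigned long sum = 0;
	int i;

	for (i = 0; i < NCPU_SKETCH; i++)
		sum += percpu_count[i];
	return sum;
}

/* Returns 1 if both schemes report the same total after three events. */
int
demo_counters_agree(void)
{
	count_shared(); count_shared(); count_shared();
	count_percpu(0); count_percpu(1); count_percpu(1);
	return atomic_load(&shared_count) == 3 && percpu_total() == 3;
}
```

The shared counter keeps a single readable value, which is what an external reader of the structure expects; the per-CPU version avoids contended atomic updates at the cost of summing on read.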
ozaki-r
c66c595b80 Reduce return points 2017-02-01 08:18:33 +00:00
ozaki-r
23e72bfbc4 Kill tsleep/wakeup and use cv 2017-02-01 08:16:42 +00:00
ozaki-r
bbe8ead203 Make bpf_gstats percpu 2017-02-01 08:15:15 +00:00
ozaki-r
2fec859db2 Use pslist(9) instead of queue(9) for psz/psref
As usual some member variables of struct bpf_d and bpf_if remain to avoid
breaking kvm(3) users (netstat and fstat).
2017-02-01 08:13:45 +00:00
ozaki-r
b76d85bbe1 Use kmem(9) instead of malloc/free 2017-02-01 08:07:27 +00:00
ozaki-r
ddd60175a6 Make global variables static 2017-02-01 08:06:01 +00:00
ozaki-r
87e988a7d8 Use bpf_ops for bpf_mtap_softint
By doing so we don't need to care whether a kernel enables bpfilter or not.
2017-01-25 01:04:23 +00:00
ozaki-r
9674e2224b Defer bpf_mtap in Rx interrupt context to softint
bpf_mtap of some drivers is still called in hardware interrupt context.
We want to run them in softint, like bpf_mtap of most drivers
(see if_percpuq_softint and if_input).

To this end, bpf_mtap_softint mechanism is implemented; it defers
bpf_mtap processing to a dedicated softint for a target driver.
By using the mechanism, we can move bpf_mtap processing to softint
without changing target drivers much while it adds some overhead
on CPU and memory. Once target drivers are changed to softint-based,
we should return to normal bpf_mtap.

Proposed on tech-kern and tech-net
2017-01-24 09:05:27 +00:00
ozaki-r
e9b008839d Make bpf_setf static 2017-01-23 10:17:36 +00:00
pgoyette
7c20c5d3bb Fix regression introduced in tests/net/bpf and tests/net/bpfilter
The rump code needs to call devsw_attach() in order to assign a dev_major
for bpf;  it then uses this to create rump's /dev/bpf node.  Unfortunately,
this leaves the devsw attached, so when the bpf module tries to initialize
itself, it gets an EEXIST error and fails.

So, once rump has figured what the dev_major should be, call devsw_detach()
to remove the devsw.  Then, when the module initialization code calls
devsw_attach() it will succeed.
2016-07-19 02:47:45 +00:00
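
The failure and the fix described above can be modelled with a tiny registry. This is a sketch under assumptions: reg_attach(), reg_detach() and the table are hypothetical stand-ins and do not reproduce the real devsw_attach()/devsw_detach() signatures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAXDEV 8
const char *devtab[MAXDEV];	/* slot index plays the role of the major */

/* Returns 0 and stores the slot in *major, or EEXIST / ENOMEM. */
int
reg_attach(const char *name, int *major)
{
	int i, freeslot = -1;

	for (i = 0; i < MAXDEV; i++) {
		if (devtab[i] == NULL) {
			if (freeslot < 0)
				freeslot = i;
		} else if (strcmp(devtab[i], name) == 0)
			return EEXIST;
	}
	if (freeslot < 0)
		return ENOMEM;
	devtab[freeslot] = name;
	*major = freeslot;
	return 0;
}

void
reg_detach(const char *name)
{
	int i;

	for (i = 0; i < MAXDEV; i++)
		if (devtab[i] != NULL && strcmp(devtab[i], name) == 0)
			devtab[i] = NULL;
}

/* Returns the error seen by "module init" after the early attach. */
int
module_init_after_rump(int detach_first)
{
	int i, major;

	for (i = 0; i < MAXDEV; i++)
		devtab[i] = NULL;
	(void)reg_attach("bpf", &major);	/* early code learns the major */
	if (detach_first)
		reg_detach("bpf");		/* the fix: release it again */
	return reg_attach("bpf", &major);	/* module initialization */
}
```

Without the detach the second attach fails with EEXIST; with it, module initialization succeeds.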
pgoyette
b380080ebc Now that we're only calling devsw_attach() in the modular driver, it
is not ok for the driver/module to already exist.  So don't ignore
EEXIST.
2016-07-17 02:48:07 +00:00
pgoyette
3c6a976d2d Don't initialize variables that no longer exist in built-in module. 2016-07-17 01:16:30 +00:00
pgoyette
5233aa279b Don't try to call devsw_attach() for built-in driver code. 2016-07-17 01:03:46 +00:00
knakahara
95fc145695 apply if_output_lock() to L3 callers which call ifp->if_output() of L2(or L3 tunneling). 2016-06-20 06:46:37 +00:00
ozaki-r
fe6d427551 Avoid storing a pointer of an interface in a mbuf
Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored in a mbuf instead of a
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, which is NOT MP-safe and exists for the
transition period, i.e., it is intended for places that are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.
2016-06-10 13:31:43 +00:00
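
A simplified sketch of the index-based lookup described above. All types and functions here are hypothetical stand-ins, and the sketch leaves out the psref(9)/pserialize(9) protection that keeps the returned object alive in the real change; it only shows why a stored index cannot dangle the way a stored pointer can.

```c
#include <assert.h>
#include <stddef.h>

#define MAXIF 4

struct ifnet_sketch { int if_index; const char *if_xname; };
struct mbuf_sketch  { int rcvif_index; };	/* index, not a pointer */

/* Stand-in for the interface collection indexed by if_index. */
struct ifnet_sketch *ifindex_table[MAXIF];

void if_register(struct ifnet_sketch *ifp)   { ifindex_table[ifp->if_index] = ifp; }
void if_unregister(struct ifnet_sketch *ifp) { ifindex_table[ifp->if_index] = NULL; }

void
mbuf_set_rcvif(struct mbuf_sketch *m, const struct ifnet_sketch *ifp)
{
	m->rcvif_index = ifp->if_index;
}

/* Returns the interface, or NULL if it has gone away in the meantime. */
struct ifnet_sketch *
mbuf_get_rcvif(const struct mbuf_sketch *m)
{
	if (m->rcvif_index < 0 || m->rcvif_index >= MAXIF)
		return NULL;
	return ifindex_table[m->rcvif_index];
}

/* Returns 1 if a lookup after the interface is destroyed yields NULL. */
int
lookup_after_destroy_is_null(void)
{
	struct ifnet_sketch ifn = { 1, "sketch0" };
	struct mbuf_sketch m;

	if_register(&ifn);
	mbuf_set_rcvif(&m, &ifn);
	if (mbuf_get_rcvif(&m) != &ifn)
		return 0;
	if_unregister(&ifn);
	return mbuf_get_rcvif(&m) == NULL;
}
```

A packet that outlives its interface gets NULL from the lookup and can be dropped, instead of dereferencing freed memory.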
ozaki-r
d938d837b3 Introduce m_set_rcvif and m_reset_rcvif
The APIs are used to set (or reset) a received interface of a mbuf.
They are the counterpart of m_get_rcvif, which will come in another
commit; they hide the internals of rcvif operations and reduce the diff of
the upcoming change.

No functional change.
2016-06-10 13:27:10 +00:00
pgoyette
532241d269 Create separate modules for i2c_bitbang and bpf_filter so these files
can be included in kernels which need them without also duplicating
them in other modules.  Removes the duplicate symbols I found which
prevented loading i2c and bpf modules after having fixed PR 45125.
2016-06-07 01:06:27 +00:00
ozaki-r
9c4cd06355 Introduce softint-based if_input
This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!
2016-02-09 08:32:07 +00:00
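
The queue-then-softint pattern described above can be sketched in single-threaded user-space C. The structure and function names are hypothetical; the real percpuq keeps one queue per CPU and uses softint(9) for scheduling, neither of which is modelled here.

```c
#include <assert.h>
#include <stddef.h>

#define QLEN 8

struct pktq_sketch {
	int    pkts[QLEN];
	size_t head, tail;	/* dequeue at head, enqueue at tail */
	int    softint_pending;
	int    processed;
};

/* "Hardware interrupt" side: only queue the packet and schedule. */
int
rx_enqueue(struct pktq_sketch *q, int pkt)
{
	if (q->tail - q->head == QLEN)
		return -1;		/* queue full: drop */
	q->pkts[q->tail++ % QLEN] = pkt;
	q->softint_pending = 1;
	return 0;
}

/* "Softint" side: drain the queue and do the real processing. */
int
softint_run(struct pktq_sketch *q)
{
	int n = 0;

	if (!q->softint_pending)
		return 0;
	q->softint_pending = 0;
	while (q->head != q->tail) {
		int pkt = q->pkts[q->head++ % QLEN];

		(void)pkt;		/* Layer 2 input would run here */
		q->processed++;
		n++;
	}
	return n;
}

/* Returns the number of packets handled by the deferred pass, or -1. */
int
demo_rx_then_softint(void)
{
	struct pktq_sketch q = { .head = 0 };
	int i;

	for (i = 0; i < 3; i++)
		(void)rx_enqueue(&q, 100 + i);
	if (q.processed != 0)	/* nothing is processed at enqueue time */
		return -1;
	return softint_run(&q);
}
```

The receive path stays short because all per-packet work moves into the deferred pass.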
christos
654889b793 Do less work under the kernel lock, otherwise dhcpcd aborting causes us
to deadlock.
2016-02-01 16:32:28 +00:00
christos
43eac92e53 don't free mbuf twice.
XXX: pullup 7.
2015-12-16 23:14:42 +00:00
christos
d522fec9f5 PR/49386: Ryota Ozaki: Add a mutex for bpf creation/removal to avoid races.
Add M_CANFAIL to malloc.
2015-10-14 19:40:09 +00:00
joerg
adac2d746a Improve wording. 2015-05-30 19:14:46 +00:00
ozaki-r
9116f11456 Remove unnecessary variable bc 2014-12-29 13:38:13 +00:00
rmind
b891d5cdc7 PR/49190: bpf_deliver: set scratch memory store in bpf_args_t. 2014-09-13 17:18:45 +00:00
matt
45b1ec740d Try not to use f_data, use f_{vnode,socket,pipe,mqueue,kqueue,ksem} to get
a correctly typed pointer.
2014-09-05 09:20:59 +00:00
ozaki-r
5b4238682e Use NULL instead of 0 for pointers 2014-08-07 03:40:21 +00:00
alnsn
b67137b2bd Enable net.bpf.jit only if MODULAR and BPFJIT. Tweak a warning about postponed
jit activation.
2014-07-28 07:32:46 +00:00
dholland
f9228f4225 Add d_discard to all struct cdevsw instances I could find.
All have been set to "nodiscard"; some should get a real implementation.
2014-07-25 08:10:31 +00:00
christos
0e34796007 initialize args the same way we do in filter. 2014-07-10 15:32:09 +00:00
alnsn
19fed70d36 Implement copfuncs and external memory in bpfjit. 2014-06-24 10:53:30 +00:00
dholland
a68f9396b6 Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.
2014-03-16 05:20:22 +00:00
pooka
4f6fb3bf35 Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before
the sysctl link sets are processed, and remove redundancy.

Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate
lines of code.
2014-02-25 18:30:08 +00:00
christos
c16aecd187 It is silly to kill the system when an interface failed to clear promiscuous
mode. Some return EINVAL when they are dying, but others like USB return EIO.
Downgrade to a DIAGNOSTIC printf. Same should be done for the malloc/NOWAIT,
but this is rarely hit.
2013-12-05 15:55:35 +00:00
rmind
5bd8916144 bpf_deliver: convert to bpf_filter_ext(). 2013-11-16 01:13:52 +00:00
rmind
d0748eb941 - Add bpf_args_t and convert bpf_filter_ext() to use it. This allows the
caller to initialise (and re-use) the memory store.
- Add bpf_jit_generate() and bpf_jit_freecode() wrappers.
2013-11-15 00:12:44 +00:00
rmind
cb633e2d0c Add bpf_filter_ext() to use with BPF COP, restore bpf_filter() as it was
originally to preserve compatibility.  Similarly, add bpf_validate_ext()
which takes bpf_ctx_t.
2013-09-18 23:34:55 +00:00
christos
4a5538bfa8 PR/48198: Peter Bex: Avoid kernel panic caused by setting a very small bpf
buffer size.
XXX: Pullup -6
2013-09-09 20:53:51 +00:00
rmind
4c45c55542 bpf_filter: add a custom argument which can be passed to coprocessor routine. 2013-08-30 15:00:08 +00:00
rmind
1962fa8781 Implement BPF_COP/BPF_COPX instructions in the misc category (BPF_MISC)
which add a capability to call external functions in a predetermined way.

It can be thought of as a BPF "coprocessor" -- a generic mechanism to offload
more complex packet inspection operations.  There is no default coprocessor
and this functionality is not targeted to the /dev/bpf.  This is primarily
targeted to the kernel subsystems, therefore there is no way to set a custom
coprocessor at the userlevel.

Discussed on: tech-net@
OK: core@
2013-08-29 14:25:40 +00:00
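
As a rough illustration of the coprocessor idea above, the sketch below dispatches a call through a table of functions supplied by the caller, with an index selecting the function and the result replacing the accumulator. The types, names, function signature and the rejection behaviour for an out-of-range index are assumptions made for this example and are not the kernel's actual BPF_COP interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical coprocessor function type. */
typedef uint32_t (*copfunc_sketch_t)(const uint8_t *pkt, size_t len,
    uint32_t a);

/* Hypothetical context: the function table is set by kernel code only. */
struct cop_ctx_sketch {
	const copfunc_sketch_t *funcs;
	size_t nfuncs;
};

/* Example coprocessor routine: packet length plus the accumulator. */
uint32_t
cop_pktlen_plus_a(const uint8_t *pkt, size_t len, uint32_t a)
{
	(void)pkt;
	return (uint32_t)len + a;
}

/* Calls function k and stores its result in *a; -1 if k is not valid. */
int
cop_call(const struct cop_ctx_sketch *ctx, uint32_t k,
    const uint8_t *pkt, size_t len, uint32_t *a)
{
	if (ctx == NULL || k >= ctx->nfuncs)
		return -1;
	*a = ctx->funcs[k](pkt, len, *a);
	return 0;
}

/* Returns the accumulator after calling function k, or UINT32_MAX. */
uint32_t
demo_cop(uint32_t k)
{
	static const copfunc_sketch_t funcs[] = { cop_pktlen_plus_a };
	const struct cop_ctx_sketch ctx = { funcs, 1 };
	const uint8_t pkt[4] = { 0 };
	uint32_t a = 2;

	if (cop_call(&ctx, k, pkt, sizeof(pkt), &a) != 0)
		return UINT32_MAX;
	return a;
}
```

Because the table lives in a context built by the calling subsystem, a filter program can only reach functions that subsystem chose to expose.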