Commit Graph

195 Commits

Author SHA1 Message Date
ozaki-r
9c4cd06355 Introduce softint-based if_input
This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utilizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does the rest of the packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!
2016-02-09 08:32:07 +00:00
christos
654889b793 Do less work under the kernel lock, otherwise dhcpcd aborting causes us
to deadlock.
2016-02-01 16:32:28 +00:00
christos
43eac92e53 don't free mbuf twice.
XXX: pullup 7.
2015-12-16 23:14:42 +00:00
christos
d522fec9f5 PR/49386: Ryota Ozaki: Add a mutex for bpf creation/removal to avoid races.
Add M_CANFAIL to malloc.
2015-10-14 19:40:09 +00:00
joerg
adac2d746a Improve wording. 2015-05-30 19:14:46 +00:00
ozaki-r
9116f11456 Remove unnecessary variable bc 2014-12-29 13:38:13 +00:00
rmind
b891d5cdc7 PR/49190: bpf_deliver: set scratch memory store in bpf_args_t. 2014-09-13 17:18:45 +00:00
matt
45b1ec740d Try not to use f_data, use f_{vnode,socket,pipe,mqueue,kqueue,ksem} to get
a correctly typed pointer.
2014-09-05 09:20:59 +00:00
ozaki-r
5b4238682e Use NULL instead of 0 for pointers 2014-08-07 03:40:21 +00:00
alnsn
b67137b2bd Enable net.bpf.jit only if MODULAR and BPFJIT. Tweak a warning about postponed
jit activation.
2014-07-28 07:32:46 +00:00
dholland
f9228f4225 Add d_discard to all struct cdevsw instances I could find.
All have been set to "nodiscard"; some should get a real implementation.
2014-07-25 08:10:31 +00:00
christos
0e34796007 initialize args the same way we do in filter. 2014-07-10 15:32:09 +00:00
alnsn
19fed70d36 Implement copfuncs and external memory in bpfjit. 2014-06-24 10:53:30 +00:00
dholland
a68f9396b6 Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.
2014-03-16 05:20:22 +00:00
pooka
4f6fb3bf35 Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before
the sysctl link sets are processed, and remove redundancy.

Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate
lines of code.
2014-02-25 18:30:08 +00:00
christos
c16aecd187 It is silly to kill the system when an interface failed to clear promiscuous
mode. Some return EINVAL when they are dying, but others like USB return EIO.
Downgrade to a DIAGNOSTIC printf. Same should be done for the malloc/NOWAIT,
but this is rarely hit.
2013-12-05 15:55:35 +00:00
rmind
5bd8916144 bpf_deliver: convert to bpf_filter_ext(). 2013-11-16 01:13:52 +00:00
rmind
d0748eb941 - Add bpf_args_t and convert bpf_filter_ext() to use it. This allows the
caller to initialise (and re-use) the memory store.
- Add bpf_jit_generate() and bpf_jit_freecode() wrappers.
2013-11-15 00:12:44 +00:00
rmind
cb633e2d0c Add bpf_filter_ext() to use with BPF COP, restore bpf_filter() as it was
originally to preserve compatibility.  Similarly, add bpf_validate_ext()
which takes bpf_ctx_t.
2013-09-18 23:34:55 +00:00
christos
4a5538bfa8 PR/48198: Peter Bex: Avoid kernel panic caused by setting a very small bpf
buffer size.
XXX: Pullup -6
2013-09-09 20:53:51 +00:00
rmind
4c45c55542 bpf_filter: add a custom argument which can be passed to coprocessor routine. 2013-08-30 15:00:08 +00:00
rmind
1962fa8781 Implement BPF_COP/BPF_COPX instructions in the misc category (BPF_MISC)
which add a capability to call external functions in a predetermined way.

It can be thought of as a BPF "coprocessor" -- a generic mechanism to offload
more complex packet inspection operations.  There is no default coprocessor
and this functionality is not targeted at /dev/bpf.  It is primarily
targeted at kernel subsystems, therefore there is no way to set a custom
coprocessor from userlevel.

Discussed on: tech-net@
OK: core@
2013-08-29 14:25:40 +00:00
alnsn
e8c0d6c662 Add bpfjit and enable it for amd64. 2012-10-27 22:36:11 +00:00
alnsn
5c5a76d566 Remove bpf_jit which was ported from FreeBSD recently.
It will soon be replaced with the new bpfjit kernel module.
2012-09-27 18:28:53 +00:00
alnsn
55f9a36d99 Fix two bugs introduced by recent commit.
- When handling contiguous buffer in _bpf_tap(), pass its real size
  rather than 0 to avoid reading packet data as mbuf struct on
  out-of-bounds loads.
- Correctly pass pktlen and buflen arguments from bpf_deliver() to
  bpf_filter() to avoid reading mbuf struct as packet data.
  JIT case is still broken.

Also, test pointers against NULL.
2012-08-15 20:59:51 +00:00
rmind
24e587649b Build fix for some ports. 2012-08-02 00:40:51 +00:00
rmind
1f86dc56b4 Add BPF JIT compiler, currently supporting amd64 and i386. Code obtained
from FreeBSD.  Also, make a few BPF fixes and simplifications while here.
Note that bpf_jit_enable is false for now.

OK dyoung@, some feedback from matt@
2012-08-01 23:24:28 +00:00
christos
4bdfaa0aa3 make comment reflect reality 2011-12-16 03:05:23 +00:00
christos
811ac7bb4f don't leak mbufs. 2011-12-15 22:20:26 +00:00
bouyer
ccc8030189 Provide netbsd32 compat for bpf. Besides the ioctls, the structure
returned to userland by read(2) also needs to be converted.
For this, the bpf descriptor is flagged as compat32 (or not) in the
open and ioctl functions (where the user process's pid is also updated
in the descriptor). When the bpf buffer is filled in, the 32bits or native
header is used depending on the information stored in the descriptor.

This won't work if a 64bit binary does the open and ioctls, and then
execs a 32bit program which does the read. But this is very
unlikely to happen in real life ...

Tested on i386 and loongson; with these changes my loongson can run
dhclient and tcpdump with a n32 userland.
2011-08-30 14:22:22 +00:00
christos
eb8da70733 setting things once is enough. 2011-06-10 00:10:35 +00:00
christos
e826c9f234 lib/44807: something broken in stat(2), return that we are a character
device in st_mode.
2011-03-30 21:34:08 +00:00
bouyer
22637b9c37 Allocate buffers with (M_WAITOK | M_CANFAIL) instead of M_NOWAIT.
M_NOWAIT causes dhcpd on a low-memory server with lots of interfaces to
occasionally fail to start with ENOBUFS; (M_WAITOK | M_CANFAIL) seems to
fix this.
Tested on 3 different dhcp servers.
2011-03-30 18:04:27 +00:00
christos
87c238c4a3 undo previous. Read the diff wrong. 2011-01-22 19:12:58 +00:00
christos
6c793dc721 fix comment 2011-01-22 16:54:48 +00:00
christos
d232460a0a kern/44310: Alexander Nasonov: write to /dev/bpf truncates size_t to int 2011-01-02 21:03:45 +00:00
pooka
91a3d3404c linkset no more 2010-12-08 17:10:13 +00:00
pooka
735701ff27 Add a little comment on how bpf can be made unloadable, per pointer from ad. 2010-04-14 13:31:33 +00:00
joerg
58e867556f Push the bpf_ops usage back into bpf.h. Push the common ifp->if_bpf
check into the inline functions as well as the fourth argument for
bpf_attach.
2010-04-05 07:19:28 +00:00
christos
8bc5973709 add BIOC{G,S}FEEDBACK which allows one to receive injected outgoing packets
via bpf.
2010-03-13 20:38:48 +00:00
pooka
de4f105d4a Include sys/atomic.h now that it's used but gets stealth-included
only on some archs.
2010-01-26 01:06:23 +00:00
pooka
b2bb0f38d5 Make bpf dynamically loadable. 2010-01-25 22:18:17 +00:00
pooka
10fe49d72c Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client.  This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached.  However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff.  ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.
2010-01-19 22:06:18 +00:00
pooka
64da563d90 Forward declare struct bpf_if and use that as the type for bpf_if
instead of "void *".  Buys us oo times the type-safety for 0 times
the price.
(no functional change)
2010-01-17 19:45:06 +00:00
pooka
ec8068f5fb * remove just-for-kicks locking
* KNF
* remove outdated comment (quite a funny one to read in 2010, though)
2010-01-15 22:16:46 +00:00
dsl
2a54322c7b If a multithreaded app closes an fd while another thread is blocked in
read/write/accept, then the expectation is that the blocked thread will
exit and the close complete.
Since only one fd is affected, but many fd can refer to the same file,
the close code can only request the fs code unblock with ERESTART.
Fixed for pipes and sockets, ERESTART will only be generated after such
a close - so there should be no change for other programs.
Also rename fo_abort() to fo_restart() (this used to be fo_drain()).
Fixes PR/26567
2009-12-20 09:36:05 +00:00
dsl
7a42c833db Rename fo_drain() to fo_abort(), 'drain' is used to mean 'wait for output
to drain' in many places, whereas fo_drain() was called in order to force
blocking read()/write() etc calls to return to userspace so that a close()
call from a different thread can complete.
In the sockets code comment out the broken code in the inner function,
it was being called from compat code.
2009-12-09 21:32:58 +00:00
rmind
dbd9b86792 Remove some unnecessary includes of the sys/user.h header. 2009-11-23 02:13:44 +00:00
christos
14c3063365 add the error from ifpromisc to the panic. 2009-10-05 17:58:15 +00:00
christos
86ba58fd64 Fix locking as Andy explained. Also fill in uid and gid like sys_pipe did. 2009-04-11 23:05:26 +00:00