bpf(4): assorted markup tweaks

These are mostly non-controversial changes to the cargo-culted markup.
While here - add missing .It to BIOCLOCK so that it's not buried in
the text for the previous item and comment out a paragraph about an
ancient SunOS bug.
uwe 2023-02-11 02:52:52 +00:00
parent b089952b65
commit 7bd77f6ba6
1 changed files with 162 additions and 109 deletions


@ -1,6 +1,6 @@
.\" -*- nroff -*-
.\"
.\" $NetBSD: bpf.4,v 1.66 2023/02/07 01:17:41 gutteridge Exp $
.\" $NetBSD: bpf.4,v 1.67 2023/02/11 02:52:52 uwe Exp $
.\"
.\" Copyright (c) 1990, 1991, 1992, 1993, 1994
.\" The Regents of the University of California. All rights reserved.
@ -100,12 +100,12 @@ require
The (third) argument to the
.Xr ioctl 2
should be a pointer to the type indicated.
.Bl -tag -width indent -offset indent
.It Dv BIOCGBLEN ( u_int )
.Bl -tag -width Dv
.It Dv BIOCGBLEN Pq Vt u_int
Returns the required buffer length for reads on
.Nm
files.
.It Dv BIOCSBLEN ( u_int )
.It Dv BIOCSBLEN Pq Vt u_int
Sets the buffer length for reads on
.Nm
files.
@ -116,15 +116,15 @@ allowable size will be set and returned in the argument.
A read call will result in
.Er EINVAL
if it is passed a buffer that is not this size.
.It Dv BIOCGDLT ( u_int )
.It Dv BIOCGDLT Pq Vt u_int
Returns the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified.
The device types, prefixed with
.Dq DLT_ ,
.Ql DLT_ ,
are defined in
.In net/bpf.h .
.It Dv BIOCGDLTLIST ( struct bpf_dltlist )
.It Dv BIOCGDLTLIST Pq Vt struct bpf_dltlist
Returns an array of the available types of the data link layer
underlying the attached interface:
.Bd -literal -offset indent
@ -135,26 +135,29 @@ struct bpf_dltlist {
.Ed
.Pp
The available types are returned in the array pointed to by the
.Va bfl_list
field while their length in u_int is supplied to the
.Va bfl_len
.Fa bfl_list
field while their length in
.Vt u_int
is supplied to the
.Fa bfl_len
field.
.Er ENOMEM
is returned if there is not enough buffer space and
.Er EFAULT
is returned if a bad address is encountered.
The
.Va bfl_len
.Fa bfl_len
field is modified on return to indicate the actual length in u_int
of the array returned.
If
.Va bfl_list
.Fa bfl_list
is
.Dv NULL ,
the
.Va bfl_len
field is set to indicate the required length of an array in u_int.
.It Dv BIOCSDLT ( u_int )
.Fa bfl_len
field is set to indicate the required length of an array in
.Vt u_int .
.It Dv BIOCSDLT Pq Vt u_int
Changes the type of the data link layer underlying the attached interface.
.Er EINVAL
is returned if no interface has been specified or the specified
@ -173,30 +176,34 @@ promiscuously are closed.
Flushes the buffer of incoming packets,
and resets the statistics that are returned by
.Dv BIOCGSTATS .
.It Dv BIOCGETIF ( struct ifreq )
.It Dv BIOCGETIF Pq Vt struct ifreq
Returns the name of the hardware interface that the file is listening on.
The name is returned in the ifr_name field of
.Fa ifr .
The name is returned in the
.Fa ifr_name
field of
.Vt ifreq .
All other fields are undefined.
.It Dv BIOCSETIF ( struct ifreq )
.It Dv BIOCSETIF Pq Vt struct ifreq
Sets the hardware interface associated with the file.
This command must be performed before any packets can be read.
The device is indicated by name using the
.Dv ifr_name
.Fa ifr_name
field of the
.Fa ifreq .
.Vt ifreq .
Additionally, performs the actions of
.Dv BIOCFLUSH .
.It Dv BIOCSRTIMEOUT , BIOCGRTIMEOUT ( struct timeval )
Sets or gets the read timeout parameter.
.It Dv BIOCSRTIMEOUT , BIOCGRTIMEOUT Pq Vt struct timeval
Sets or gets the
.Dq Em read timeout
parameter.
The
.Fa timeval
.Vt timeval
specifies the length of time to wait before timing
out on a read request.
This parameter is initialized to zero by
.Xr open 2 ,
indicating no timeout.
.It Dv BIOCGSTATS ( struct bpf_stat )
.It Dv BIOCGSTATS Pq Vt struct bpf_stat
Returns the following structure of packet statistics:
.Bd -literal -offset indent
struct bpf_stat {
@ -208,21 +215,23 @@ struct bpf_stat {
.Ed
.Pp
The fields are:
.Bl -tag -width bs_recv -offset indent
.It Va bs_recv
.Bl -tag -width Fa -offset indent
.It Fa bs_recv
the number of packets received by the descriptor since opened or reset
(including any buffered since the last read call);
.It Va bs_drop
.Pq including any buffered since the last read call ;
.It Fa bs_drop
the number of packets which were accepted by the filter but dropped by the
kernel because of buffer overflows
(i.e., the application's reads aren't keeping up with the packet
traffic); and
.It Va bs_capt
.Po
i.e., the application's reads aren't keeping up with the packet traffic
.Pc ;
and
.It Fa bs_capt
the number of packets accepted by the filter.
.El
.It Dv BIOCIMMEDIATE ( u_int )
.It Dv BIOCIMMEDIATE Pq Vt u_int
Enables or disables
.Dq immediate mode ,
.Dq Em immediate mode ,
based on the truth value of the argument.
When immediate mode is enabled, reads return immediately upon packet
reception.
@ -232,11 +241,11 @@ This is useful for programs like
.Xr rarpd 8 ,
which must respond to messages in real time.
The default for a new file is off.
.Dv BIOCLOCK
.It Dv BIOCLOCK Pq Dv NULL
Set the locked flag on the bpf descriptor.
This prevents the execution of ioctl commands which could change the
underlying operating parameters of the device.
.It Dv BIOCSETF ( struct bpf_program )
.It Dv BIOCSETF Pq Vt struct bpf_program
Sets the filter program used by the kernel to discard uninteresting
packets.
An array of instructions and its length are passed in using the following structure:
@ -248,26 +257,26 @@ struct bpf_program {
.Ed
.Pp
The filter program is pointed to by the
.Va bf_insns
.Fa bf_insns
field while its length in units of
.Sq struct bpf_insn
.Vt struct bpf_insn
is given by the
.Va bf_len
.Fa bf_len
field.
Also, the actions of
.Dv BIOCFLUSH
are performed.
.Pp
See section
.Sy FILTER MACHINE
.Sx FILTER MACHINE
for an explanation of the filter language.
.It Dv BIOCSETWF ( struct bpf_program )
.It Dv BIOCSETWF Pq Vt struct bpf_program
Sets the write filter program used by the kernel to control what type
of packets can be written to the interface.
See the
.Dv BIOCSETF
command for more information on the bpf filter program.
.It Dv BIOCVERSION ( struct bpf_version )
.It Dv BIOCVERSION Pq Vt struct bpf_version
Returns the major and minor version numbers of the filter language currently
recognized by the kernel.
Before installing a filter, applications must check
@ -289,16 +298,19 @@ and
from
.In net/bpf.h .
An incompatible filter
may result in undefined behavior (most likely, an error returned by
may result in undefined behavior
.Po
most likely, an error returned by
.Xr ioctl 2
or haphazard packet matching).
.It Dv BIOCSRSIG , BIOCGRSIG ( u_int )
or haphazard packet matching
.Pc .
.It Dv BIOCSRSIG , BIOCGRSIG Pq Vt u_int
Sets or gets the receive signal.
This signal will be sent to the process or process group specified by
.Dv FIOSETOWN .
It defaults to
.Dv SIGIO .
.It Dv BIOCGHDRCMPLT , BIOCSHDRCMPLT ( u_int )
.It Dv BIOCGHDRCMPLT , BIOCSHDRCMPLT Pq Vt u_int
Sets or gets the status of the
.Dq header complete
flag.
@ -307,7 +319,7 @@ automatically by the interface output routine.
Set to one if the link level source address will be written,
as provided, to the wire.
This flag is initialized to zero by default.
.It Dv BIOCGSEESENT , BIOCSSEESENT ( u_int )
.It Dv BIOCGSEESENT , BIOCSSEESENT Pq Vt u_int
These commands are obsolete but left for compatibility.
Use
.Dv BIOCSDIRECTION
@ -319,7 +331,7 @@ interface should be returned by BPF.
Set to zero to see only incoming packets on the interface.
Set to one to see packets originating locally and remotely on the interface.
This flag is initialized to one by default.
.It Dv BIOCSDIRECTION , BIOCGDIRECTION Pq Li u_int
.It Dv BIOCSDIRECTION , BIOCGDIRECTION Pq Vt u_int
Set or get the setting determining whether incoming, outgoing, or all packets
on the interface should be returned by BPF.
Set to
@ -334,7 +346,7 @@ to see only outgoing packets on the interface.
This setting is initialized to
.Dv BPF_D_INOUT
by default.
.It Dv BIOCFEEDBACK , BIOCSFEEDBACK , BIOCGFEEDBACK ( u_int )
.It Dv BIOCFEEDBACK , BIOCSFEEDBACK , BIOCGFEEDBACK Pq Vt u_int
Set (or get)
.Dq packet feedback mode .
This allows injected packets to be fed back as input to the interface when
@ -352,19 +364,19 @@ This flag is initialized to zero by default.
.El
.Sh STANDARD IOCTLS
.Nm
now supports several standard
.Xr ioctl 2 Ns 's
supports several standard
.Xr ioctl 2 Ap s
which allow the user to do async and/or non-blocking I/O to an open
.Nm bpf
file descriptor.
.Bl -tag -width indent -offset indent
.It Dv FIONREAD ( int )
.Bl -tag -width Dv
.It Dv FIONREAD Pq Vt int
Returns the number of bytes that are immediately available for reading.
.It Dv FIONBIO ( int )
.It Dv FIONBIO Pq Vt int
Set or clear non-blocking I/O.
If arg is non-zero, then doing a
.Xr read 2
when no data is available will return -1 and
when no data is available will return \-1 and
.Va errno
will be set to
.Er EAGAIN .
@ -372,20 +384,22 @@ If arg is zero, non-blocking I/O is disabled.
Note: setting this
overrides the timeout set by
.Dv BIOCSRTIMEOUT .
.It Dv FIOASYNC ( int )
.It Dv FIOASYNC Pq Vt int
Enable or disable async I/O.
When enabled (arg is non-zero), the process or process group specified by
.Dv FIOSETOWN
will start receiving SIGIO's when packets
arrive.
will start receiving
.Dv SIGIO Ap s
when packets arrive.
Note that you must do an
.Dv FIOSETOWN
in order for this to take effect, as
the system will not default this for you.
The signal may be changed via
.Dv BIOCSRSIG .
.It Dv FIOSETOWN , FIOGETOWN ( int )
Set or get the process or process group (if negative) that should receive SIGIO
.It Dv FIOSETOWN , FIOGETOWN Pq Vt int
Set or get the process or process group (if negative) that should receive
.Dv SIGIO
when packets are available.
The signal may be changed using
.Dv BIOCSRSIG
@ -404,42 +418,44 @@ struct bpf_hdr {
.Ed
.Pp
The fields, whose values are stored in host order, are:
.Bl -tag -width bh_datalen -offset indent
.It Va bh_tstamp
.Bl -tag -width Fa -offset indent
.It Fa bh_tstamp
The time at which the packet was processed by the packet filter.
This structure differs from the standard
.Vt struct timeval
in that both members are of type
.Vt long .
.It Va bh_caplen
.It Fa bh_caplen
The length of the captured portion of the packet.
This is the minimum of
the truncation amount specified by the filter and the length of the packet.
.It Va bh_datalen
.It Fa bh_datalen
The length of the packet off the wire.
This value is independent of the truncation amount specified by the filter.
.It Va bh_hdrlen
.It Fa bh_hdrlen
The length of the BPF header, which may not be equal to
.Em sizeof(struct bpf_hdr) .
.Li sizeof(struct bpf_hdr) .
.El
.Pp
The
.Va bh_hdrlen
.Fa bh_hdrlen
field exists to account for
padding between the header and the link level protocol.
The purpose here is to guarantee proper alignment of the packet
data structures, which is required on alignment sensitive
architectures and improves performance on many other architectures.
The packet filter ensures that the
.Va bpf_hdr
.Vt bpf_hdr
and the
.Em network layer
header will be word aligned.
Suitable precautions must be taken when accessing the link layer
protocol fields on alignment restricted machines.
(This isn't a problem on an Ethernet, since
.Po
This isn't a problem on an Ethernet, since
the type field is a short falling on an even offset,
and the addresses are probably accessed in a bytewise fashion).
and the addresses are probably accessed in a bytewise fashion
.Pc .
.Pp
Additionally, individual packets are padded so that each starts
on a word boundary.
@ -451,12 +467,15 @@ is defined in
.In net/bpf.h
to facilitate this process.
It rounds up its argument
to the nearest word aligned value (where a word is
to the nearest word aligned value
.Po
where a word is
.Dv BPF_ALIGNMENT
bytes wide).
bytes wide
.Pc .
.Pp
For example, if
.Sq Va p
.Va p
points to the start of a packet, this expression
will advance it to the next packet:
.Pp
@ -469,9 +488,10 @@ must itself be word aligned.
.Xr malloc 3
will always return an aligned buffer.
.Sh FILTER MACHINE
A filter program is an array of instructions, with all branches forwardly
directed, terminated by a
.Sy return
A filter program is an array of instructions, with all branches
.Em forwardly directed ,
terminated by a
.Em return
instruction.
Each instruction performs some action on the pseudo-machine state,
which consists of an accumulator, index register, scratch memory store,
@ -488,42 +508,72 @@ struct bpf_insn {
.Ed
.Pp
The
.Va k
.Fa k
field is used in different ways by different instructions,
and the
.Va jt
.Fa jt
and
.Va jf
.Fa jf
fields are used as offsets
by the branch instructions.
The opcodes are encoded in a semi-hierarchical fashion.
There are eight classes of instructions: BPF_LD, BPF_LDX, BPF_ST, BPF_STX,
BPF_ALU, BPF_JMP, BPF_RET, and BPF_MISC.
There are eight classes of instructions:
.Dv BPF_LD ,
.Dv BPF_LDX ,
.Dv BPF_ST ,
.Dv BPF_STX ,
.Dv BPF_ALU ,
.Dv BPF_JMP ,
.Dv BPF_RET ,
and
.Dv BPF_MISC .
Various other mode and
operator bits are or'd into the class to give the actual instructions.
operator bits are
.Em or Ap d
into the class to give the actual instructions.
The classes and modes are defined in
.In net/bpf.h .
.Pp
Below are the semantics for each defined BPF instruction.
We use the convention that A is the accumulator, X is the index register,
P[] packet data, and M[] scratch memory store.
P[i:n] gives the data at byte offset
.Dq i
We use the convention that
.Ar A
is the accumulator,
.Ar X
is the index register,
.Ar P
packet data, and
.Ar M
scratch memory store.
.Sm off
.Ar P Li \&[ Ar i Li \&: Ar n\^ Li \&]
.Sm on
gives the data at byte offset
.Ar i
in the packet,
interpreted as a word (n=4),
unsigned halfword (n=2), or unsigned byte (n=1).
M[i] gives the i'th word in the scratch memory store, which is only
interpreted as a word
.Ar ( n No = 4 ) ,
unsigned halfword
.Ar ( n No = 2 ) ,
or unsigned byte
.Ar ( n No = 1 ) .
.Sm off
.Ar M\^ Li \&[ Ar i\^ Li \&]
.Sm on
gives the
.Ar i Ap th
word in the scratch memory store, which is only
addressed in word units.
The memory store is indexed from 0 to BPF_MEMWORDS-1.
.Va k ,
.Va jt ,
The memory store is indexed from 0 to
.Dv BPF_MEMWORDS Ns Li \&-1 .
.Fa k ,
.Fa jt ,
and
.Va jf
.Fa jf
are the corresponding fields in the
instruction definition.
.Dq len
.Ar len
refers to the length of the packet.
.Bl -tag -width indent -offset indent
.Bl -tag -width indent
.It Sy BPF_LD
These instructions copy a value into the accumulator.
The type of the source operand is specified by an
@ -683,7 +733,7 @@ array initializers:
The following sysctls are available when
.Nm
is enabled:
.Bl -tag -width "XnetXbpfXmaxbufsizeXX"
.Bl -tag -width ".Li net.bpf.maxbufsize"
.It Li net.bpf.maxbufsize
Sets the maximum buffer size available for
.Nm
@ -707,12 +757,12 @@ utility.
On architectures with
.Xr bpfjit 4
support, the additional sysctl is available:
.Bl -tag -width "XnetXbpfXjitXX"
.Bl -tag -width ".Li net.bpf.jit"
.It Li net.bpf.jit
Toggle
.Sy Just-In-Time
.Em just-in-time
compilation of new filter programs.
In order to enable Just-In-Time compilation,
In order to enable just-in-time compilation,
the bpfjit kernel module must be loaded.
Changing a value of this sysctl doesn't affect
existing filter programs.
@ -804,9 +854,12 @@ The design was in collaboration with
.An Van Jacobson ,
also of Lawrence Berkeley Laboratory.
.Sh BUGS
The read buffer must be of a fixed size (returned by the
The read buffer must be of a fixed size
.Po
returned by the
.Dv BIOCGBLEN
ioctl).
ioctl
.Pc .
.Pp
A file that does not request promiscuous mode may receive promiscuously
received packets as a side effect of another file requesting this
@ -815,16 +868,16 @@ This could be fixed in the kernel with additional processing overhead.
However, we favor the model where
all files must assume that the interface is promiscuous, and if
so desired, must use a filter to reject foreign packets.
.\" .Pp
.\" Under SunOS, if a BPF application reads more than 2^31 bytes of
.\" data, read will fail in
.\" .Er EINVAL .
.\" You can either fix the bug in SunOS,
.\" or lseek to 0 when read fails for this reason.
.Pp
Under SunOS, if a BPF application reads more than 2^31 bytes of
data, read will fail in
.Er EINVAL .
You can either fix the bug in SunOS,
or lseek to 0 when read fails for this reason.
.Pp
.Dq Immediate mode
.Dq Em Immediate mode
and the
.Dq read timeout
.Dq Em read timeout
are misguided features.
This functionality can be emulated with non-blocking mode and
.Xr select 2 .
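
Illustrative note, not part of the commit: the manual text touched by this diff describes how each packet in a read buffer is padded to a word boundary, and how rounding `bh_hdrlen + bh_caplen` up to that boundary advances to the next packet. The following is a minimal sketch of that stepping logic only. It assumes a word is `sizeof(long)` bytes, and every `sketch_`/`SKETCH_` name is a hypothetical stand-in for the real `BPF_ALIGNMENT`, `BPF_WORDALIGN` and `struct bpf_hdr` from `<net/bpf.h>`, so that it builds without BSD headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for BPF_ALIGNMENT and BPF_WORDALIGN (assumed word = long). */
#define SKETCH_ALIGNMENT sizeof(long)
#define SKETCH_WORDALIGN(x) \
	(((x) + (SKETCH_ALIGNMENT - 1)) & ~(SKETCH_ALIGNMENT - 1))

/* Hypothetical stand-in for struct bpf_hdr; only the length fields matter here. */
struct sketch_hdr {
	long     tstamp_sec;
	long     tstamp_usec;
	uint32_t caplen;   /* captured bytes that follow the header */
	uint32_t datalen;  /* length of the packet off the wire */
	uint16_t hdrlen;   /* header length including any padding */
};

/*
 * Write a zero-filled fake record at word-aligned offset `off` and
 * return the word-aligned offset at which the next record would start.
 */
size_t
sketch_put_packet(unsigned char *buf, size_t off, uint32_t caplen)
{
	struct sketch_hdr h;

	assert(off % SKETCH_ALIGNMENT == 0);
	memset(&h, 0, sizeof(h));
	h.caplen = caplen;
	h.datalen = caplen;
	h.hdrlen = (uint16_t)sizeof(h);
	memcpy(buf + off, &h, sizeof(h));
	return off + SKETCH_WORDALIGN((size_t)h.hdrlen + h.caplen);
}

/* Count the complete records in buf[0..len), stepping as the manual describes. */
size_t
sketch_count_packets(const unsigned char *buf, size_t len)
{
	size_t off = 0, n = 0;

	while (off + sizeof(struct sketch_hdr) <= len) {
		struct sketch_hdr h;
		size_t total;

		/* Copy the header out rather than casting, to avoid alignment traps. */
		memcpy(&h, buf + off, sizeof(h));
		total = (size_t)h.hdrlen + h.caplen;
		if (h.hdrlen < sizeof(h) || total > len - off)
			break;  /* malformed or truncated record */
		n++;
		off += SKETCH_WORDALIGN(total);
	}
	return n;
}
```

With the real header, the same loop would read `bh_hdrlen` and `bh_caplen` from `struct bpf_hdr` and use `BPF_WORDALIGN` directly. The final record is checked against its unpadded length on the assumption that the last packet in a buffer need not be followed by padding.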