NetBSD/share/man/man9/sysctl.9

683 lines
19 KiB
Groff

.\" $NetBSD: sysctl.9,v 1.24 2022/09/07 01:18:32 pgoyette Exp $
.\"
.\" Copyright (c) 2004 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Andrew Brown.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd September 6, 2022
.Dt SYSCTL 9
.Os
.Sh NAME
.Nm sysctl
.Nd system variable control interfaces
.Sh SYNOPSIS
.In sys/param.h
.In sys/sysctl.h
.Pp
Primary external interfaces:
.Ft void
.Fn sysctl_init void
.Ft int
.Fn sysctl_lock "struct lwp *l" "void *oldp" "size_t savelen"
.Ft int
.Fn sysctl_dispatch "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft void
.Fn sysctl_unlock "struct lwp *l"
.Ft int
.Fn sysctl_createv "struct sysctllog **log" "int cflags" \
"const struct sysctlnode **rnode" "const struct sysctlnode **cnode" \
"int flags" "int type" "const char *namep" "const char *desc" \
"sysctlfn func" "u_quad_t qv" "void *newp" "size_t newlen" ...
.Ft int
.Fn sysctl_destroyv "struct sysctlnode *rnode" ...
.Ft void
.Fn sysctl_free "struct sysctlnode *rnode"
.Ft void
.Fn sysctl_teardown "struct sysctllog **"
.Ft int
.Fn old_sysctl "int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "void *newp" "size_t newlen" "struct lwp *l"
.Pp
Core internal functions:
.Ft int
.Fn sysctl_locate "struct lwp *l" "const int *name" "u_int namelen" \
"const struct sysctlnode **rnode" "int *nip"
.Ft int
.Fn sysctl_lookup "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft int
.Fn sysctl_create "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft int
.Fn sysctl_destroy "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft int
.Fn sysctl_query "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Pp
Simple
.Dq helper
functions:
.Ft int
.Fn sysctl_needfunc "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft int
.Fn sysctl_notavail "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Ft int
.Fn sysctl_null "const int *name" "u_int namelen" "void *oldp" \
"size_t *oldlenp" "const void *newp" "size_t newlen" "const int *oname" \
"struct lwp *l" "const struct sysctlnode *rnode"
.Sh DESCRIPTION
The SYSCTL subsystem instruments a number of kernel tunables and other
data structures via a simple MIB-like interface, primarily for
consumption by userland programs, but also for use internally by the
kernel.
.Sh LOCKING
All operations on the SYSCTL tree must be protected by acquiring the
main SYSCTL lock.
The only functions that can be called when the lock is not held are
.Fn sysctl_lock ,
.Fn sysctl_createv ,
.Fn sysctl_destroyv ,
and
.Fn old_sysctl .
All other functions require the tree to be locked.
This is to prevent other users of the tree from moving nodes around
during an add operation, or from destroying nodes or subtrees that are
actively being used.
The lock is acquired by calling
.Fn sysctl_lock
with a pointer to the process's lwp
.Fa l
.Dv ( NULL
may be passed to all functions as the lwp pointer if no lwp is
appropriate, though any changes made via
.Fn sysctl_create ,
.Fn sysctl_destroy ,
.Fn sysctl_lookup ,
or by any helper function will be done with effective superuser
privileges).
.Pp
The
.Fa oldp
and
.Fa savelen
arguments are a pointer to and the size of the memory region the
caller will be using to collect data from SYSCTL.
These may also be
.Dv NULL
and 0, respectively.
.Pp
The memory region will be locked via
.Fn uvm_vslock
if it is a region in userspace.
The address and size of the region are recorded so that when the
SYSCTL lock is to be released via
.Fn sysctl_unlock ,
only the lwp pointer
.Fa l
is required.
.Sh LOOKUPS
Once the lock has been acquired, it is typical to call
.Fn sysctl_dispatch
to handle the request.
.Fn sysctl_dispatch
will examine the contents of
.Fa name ,
an array of integers at least
.Fa namelen
long, which is to be located in kernel space, in order to determine
which function to call to handle the specific request.
.Pp
The following algorithm is used by
.Fn sysctl_dispatch
to determine the function to call:
.Bl -bullet -offset indent
.It
Scan the tree using
.Fn sysctl_locate .
.It
If the node returned has a
.Dq helper
function, call it.
.It
If the requested node was found but has no function, call
.Fn sysctl_lookup .
.It
If the node was not found and
.Fa name
specifies one of
.Fn sysctl_query ,
.Fn sysctl_create ,
or
.Fn sysctl_destroy ,
call the appropriate function.
.It
If none of these options applies and no other error was yet recorded,
return
.Er EOPNOTSUPP .
.El
The
.Fa oldp
and
.Fa oldlenp
arguments to
.Fn sysctl_dispatch ,
as with all the other core functions, describe an area into which the
current or requested value may be copied.
.Fa oldp
may or may not be a pointer into userspace (as dictated by whether
.Fa l
is
.Dv NULL
or not).
.Fa oldlenp
is a
.No non- Ns Dv NULL
pointer to a size_t.
.Fa newp
and
.Fa newlen
describe an area where the new value for the request may be found;
.Fa newp
may also be a pointer into userspace.
The
.Fa oname
argument is a
.No non- Ns Dv NULL
pointer to the base of the request currently
being processed.
By simple arithmetic on
.Fa name ,
.Fa namelen ,
and
.Fa oname ,
one can easily determine the entire original request and
.Fa namelen
values, if needed.
The
.Fa rnode
value, as passed to
.Fn sysctl_dispatch
represents the root of the tree into which the current request is to
be dispatched.
If
.Dv NULL ,
the main tree will be used.
.Pp
The
.Fn sysctl_locate
function scans a tree for the node most specific to a request.
If the pointer referenced by
.Fa rnode
is not
.Dv NULL ,
the tree indicated is searched, otherwise the main tree
will be used.
The address of the most relevant node will be returned via
.Fa rnode
and the number of MIB entries consumed will be returned via
.Fa nip ,
if it is not
.Dv NULL .
.Pp
The
.Fn sysctl_lookup
function takes the same arguments as
.Fn sysctl_dispatch
with the caveat that the value for
.Fa namelen
must be zero in order to indicate that the node referenced by the
.Fa rnode
argument is the one to which the lookup is being applied.
.Sh CREATION AND DESTRUCTION OF NODES
New nodes are created and destroyed by the
.Fn sysctl_create
and
.Fn sysctl_destroy
functions.
These functions take the same arguments as
.Fn sysctl_dispatch
with the additional requirement that the
.Fa namelen
argument must be 1 and the
.Fa name
argument must point to an integer valued either
.Dv CTL_CREATE
or
.Dv CTL_CREATESYM
when creating a new node, or
.Dv CTL_DESTROY
when destroying
a node.
.Pp
The
.Fa newp
and
.Fa newlen
arguments should point to a copy of the node to be created or
destroyed.
If the create or destroy operation was successful, a copy of the node
created or destroyed will be placed in the space indicated by
.Fa oldp
and
.Fa oldlenp .
If the create operation fails because of a conflict with an existing
node, a copy of that node will be returned instead.
.Pp
In order to facilitate the creation and destruction of nodes from a
given tree by kernel subsystems, the functions
.Fn sysctl_createv
and
.Fn sysctl_destroyv
are provided.
These functions take care of the overhead of filling in the contents
of the create or destroy request, dealing with locking, locating the
appropriate parent node, etc.
.Pp
The arguments to
.Fn sysctl_createv
are used to construct the new node.
If the
.Fa log
argument is not
.Dv NULL ,
a
.Em sysctllog
structure will be allocated and the pointer referenced
will be changed to address it.
The same log may be used for any number of nodes, provided they are
all inserted into the same tree.
This allows for a series of nodes to be created and later removed from
the tree in a single transaction (via
.Fn sysctl_teardown )
without the need for any record
keeping on the caller's part.
.Pp
The
.Fa cflags
argument is currently unused and must be zero.
The
.Fa rnode
argument must either be
.Dv NULL
or a valid pointer to a reference to the root of the tree into which
the new node must be placed.
If it is
.Dv NULL ,
the main tree will be used.
It is illegal for
.Fa rnode
to refer to a
.Dv NULL
pointer.
If the
.Fa cnode
argument is not
.Dv NULL ,
on return it will be adjusted to point to the address of the new node.
.Pp
The
.Fa flags
and
.Fa type
arguments are combined into the
.Fa sysctl_flags
field, and the current value for
.Dv SYSCTL_VERSION
is added in.
The following types are defined:
.Bl -tag -width ".Dv CTLTYPE_STRING " -offset indent
.It Dv CTLTYPE_NODE
A node intended to be a parent for other nodes.
.It Dv CTLTYPE_INT
A signed integer.
.It Dv CTLTYPE_STRING
A NUL-terminated string.
.It Dv CTLTYPE_QUAD
An unsigned 64-bit integer.
.It Dv CTLTYPE_STRUCT
A structure.
.It Dv CTLTYPE_BOOL
A boolean.
.El
.Pp
The
.Fa namep
argument is copied into the
.Fa sysctl_name
field and must be less than
.Dv SYSCTL_NAMELEN
characters in length.
The string indicated by
.Fa desc
will be copied if the
.Dv CTLFLAG_OWNDESC
flag is set, and will be used as the node's description.
.Pp
Two additional remarks:
.Bl -enum -offset indent
.It
The
.Dv CTLFLAG_PERMANENT
flag can only be set from SYSCTL setup routines (see
.Sx SETUP FUNCTIONS )
as called by
.Fn sysctl_init .
.It
If
.Fn sysctl_destroyv
attempts to delete a node that does not own its own description (and
is not marked as permanent), but the deletion fails, the description
will be copied and
.Fn sysctl_destroyv
will set the
.Dv CTLFLAG_OWNDESC
flag.
.El
.Pp
The
.Fa func
argument is the name of a
.Dq helper
function (see
.Sx HELPER FUNCTIONS AND MACROS ) .
If the
.Dv CTLFLAG_IMMEDIATE
flag is set, the
.Fa qv
argument will be interpreted as the initial value for the new
.Dq bool ,
.Dq int
or
.Dq quad
node.
This flag does not apply to any other type of node.
The
.Fa newp
and
.Fa newlen
arguments describe the data external to SYSCTL that is to be
instrumented.
One of
.Fa func ,
.Fa qv
and the
.Dv CTLFLAG_IMMEDIATE
flag, or
.Fa newp
and
.Fa newlen
must be given for nodes that instrument data, otherwise an error is
returned.
.Pp
The remaining arguments are a list of integers specifying the path
through the MIB to the node being created.
The list must be terminated by the
.Dv CTL_EOL
value.
The penultimate value in the list may be
.Dv CTL_CREATE
if a dynamic MIB entry is to be made for this node.
.Fn sysctl_createv
specifically does not support
.Dv CTL_CREATESYM ,
since setup routines are
expected to be able to use the in-kernel
.Xr ksyms 4
interface to discover the location of the data to be instrumented.
If the node to be created matches a node that already exists, a return
code of 0 is given, indicating success.
.Pp
When using
.Fn sysctl_destroyv
to destroy a given node, the
.Fa rnode
argument, if not
.Dv NULL ,
is taken to be the root of the tree from which
the node is to be destroyed, otherwise the main tree is used.
The rest of the arguments are a list of integers specifying the path
through the MIB to the node being destroyed.
If the node being destroyed does not exist, a successful return code
is given.
Nodes marked with the
.Dv CTLFLAG_PERMANENT
flag cannot be destroyed.
.Sh HELPER FUNCTIONS AND MACROS
Helper functions are invoked with the same common argument set as
.Fn sysctl_dispatch
except that the
.Fa rnode
argument will never be
.Dv NULL .
It will be set to point to the node that corresponds most closely to
the current request.
Helpers are forbidden from modifying the node they are passed; they
should instead copy the structure if changes are required in order to
effect access control or other checks.
The
.Dq helper
prototype and function that needs to ensure that a newly assigned
value is within a certain range (presuming external data) would look
like the following:
.Pp
.Bd -literal -offset indent -compact
static int sysctl_helper(SYSCTLFN_PROTO);
static int
sysctl_helper(SYSCTLFN_ARGS)
{
struct sysctlnode node;
int t, error;
t = *(int *)rnode->sysctl_data;
node = *rnode;
node.sysctl_data = &t;
error = sysctl_lookup(SYSCTLFN_CALL(&node));
if (error || newp == NULL)
return (error);
if (t < 0 || t > 20)
return (EINVAL);
*(int *)rnode->sysctl_data = t;
return (0);
}
.Ed
.Pp
The use of the
.Dv SYSCTLFN_PROTO ,
.Dv SYSCTLFN_ARGS, and
.Dv SYSCTLFN_CALL
macros ensure that all arguments are passed properly.
The single argument to the
.Dv SYSCTLFN_CALL
macro is the pointer to the node being examined.
.Pp
Three basic helper functions are available for use.
.Fn sysctl_needfunc
will emit a warning to the system console whenever it is invoked and
provides a simplistic read-only interface to the given node.
.Fn sysctl_notavail
will forward
.Dq queries
to
.Fn sysctl_query
so that subtrees can be discovered, but will return
.Er EOPNOTSUPP
for any other condition.
.Fn sysctl_null
specifically ignores any arguments given, sets the value indicated by
.Fa oldlenp
to zero, and returns success.
.Sh SETUP FUNCTIONS
Although nodes can be added to the SYSCTL tree at any time, in order to
add nodes during the kernel bootstrap phase (and during loadable module
initialization), a proper
.Dq setup
function must be used.
Setup functions are declared using the
.Dv SYSCTL_SETUP
macro, which takes the name of the function and a short string
description of the function as arguments.
.Po
See the
.Dv SYSCTL_DEBUG_SETUP
kernel configuration in
.Xr options 4 .
.Pc
.Pp
The address of the function is added to a list of functions that
.Fn sysctl_init
traverses during initialization.
For loadable kernel modules (see
.Xr module 9 ) ,
the list of functions is called from the module loader before the module's
initialization routine.
Any sysctl nodes created for the loadable module are removed using
.Fn sysctl_teardown
after calling the module's termination code.
.Pp
Setup functions do not have to add nodes to the main tree, but can set
up their own trees for emulation or other purposes.
Emulations that require use of a main tree but with some nodes changed
to suit their own purposes can arrange to overlay a sparse private
tree onto their main tree by making the
.Fa e_sysctlovly
member of their struct emul definition point to the overlaid tree.
.Pp
Setup functions should take care to create all nodes from the root
down to the subtree they are creating, since the order in which setup
functions are called is arbitrary (the order in which setup functions
are called is only determined by the ordering of the object files as
passed to the linker when the kernel is built).
.Sh MISCELLANEOUS FUNCTIONS
.Fn sysctl_init
is called early in the kernel bootstrap process.
It initializes the SYSCTL lock, calls all the registered setup
functions, and marks the tree as permanent.
.Pp
.Fn sysctl_free
will unconditionally delete any and all nodes below the given node.
Its intended use is for the deletion of entire trees, not subtrees.
If a subtree is to be removed,
.Fn sysctl_destroy
or
.Fn sysctl_destroyv
should be used to ensure that nodes not owned by the sub-system being
deactivated are not mistakenly destroyed.
The SYSCTL lock must be held when calling this function.
.Pp
.Fn sysctl_teardown
unwinds a
.Em sysctllog
and deletes the nodes in the opposite order in
which they were created.
.Pp
.Fn old_sysctl
provides an interface similar to the old SYSCTL implementation, with
the exception that access checks on a per-node basis are performed if
the
.Fa l
argument is
.No non- Ns Dv NULL .
If called with a
.Dv NULL
argument, the values for
.Fa newp
and
.Fa oldp
are interpreted as kernel addresses, and access is performed as for
the superuser.
.Sh NOTES
It is expected that nodes will be added to (or removed from) the tree
during the following stages of a machine's lifetime:
.Pp
.Bl -bullet -compact
.It
initialization \(em when the kernel is booting
.It
autoconfiguration \(em when devices are being probed at boot time
.It
.Dq plug and play
device attachment \(em when a PC-Card, USB, or other device is plugged
in or attached
.It
module initialization \(em when a module is being loaded
.It
.Dq run-time
\(em when a process creates a node via the
.Xr sysctl 3
interface
.El
.Pp
Nodes marked with
.Dv CTLFLAG_PERMANENT
can only be added to a tree during the first or initialization phase,
and can never be removed.
The initialization phase terminates when the main tree's root is
marked with the
.Dv CTLFLAG_PERMANENT
flag.
Once the main tree is marked in this manner, no nodes can be added to
any tree that is marked with
.Dv CTLFLAG_READONLY
at its root, and no nodes can be added at all if the main tree's root
is so marked.
.Pp
Nodes added by device drivers, modules, and at device insertion time can
be added to (and removed from)
.Dq read-only
parent nodes.
.Pp
Nodes created by processes can only be added to
.Dq writable
parent nodes.
See
.Xr sysctl 3
for a description of the flags that are allowed to be used by
when creating nodes.
.Sh SEE ALSO
.Xr sysctl 3
.Sh HISTORY
The dynamic SYSCTL implementation first appeared in
.Nx 2.0 .
.Sh AUTHORS
.An Andrew Brown
.Aq atatat@NetBSD.org
designed and implemented the dynamic SYSCTL implementation.