279 lines
8.1 KiB
Groff
279 lines
8.1 KiB
Groff
.\" $NetBSD: kmem.9,v 1.7 2010/02/13 07:44:11 wiz Exp $
|
|
.\"
|
|
.\" Copyright (c)2006 YAMAMOTO Takashi,
|
|
.\" All rights reserved.
|
|
.\"
|
|
.\" Redistribution and use in source and binary forms, with or without
|
|
.\" modification, are permitted provided that the following conditions
|
|
.\" are met:
|
|
.\" 1. Redistributions of source code must retain the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer.
|
|
.\" 2. Redistributions in binary form must reproduce the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer in the
|
|
.\" documentation and/or other materials provided with the distribution.
|
|
.\"
|
|
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
|
|
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
|
|
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
.\" SUCH DAMAGE.
|
|
.\"
|
|
.\" ------------------------------------------------------------
|
|
.Dd February 11, 2010
|
|
.Dt KMEM 9
|
|
.Os
|
|
.\" ------------------------------------------------------------
|
|
.Sh NAME
|
|
.Nm kmem
|
|
.Nd kernel wired memory allocator
|
|
.\" ------------------------------------------------------------
|
|
.Sh SYNOPSIS
|
|
.In sys/kmem.h
|
|
.\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|
|
.Ft void *
|
|
.Fn kmem_alloc \
|
|
"size_t size" "km_flag_t kmflags"
|
|
.Ft void *
|
|
.Fn kmem_zalloc \
|
|
"size_t size" "km_flag_t kmflags"
|
|
.Ft void
|
|
.Fn kmem_free \
|
|
"void *p" "size_t size"
|
|
.Ft char *
|
|
.Fn kmem_asprintf \
|
|
"const char *fmt" "..."
|
|
.\" ------------------------------------------------------------
|
|
.Pp
|
|
.Cd "options DEBUG"
|
|
.Sh DESCRIPTION
|
|
.Fn kmem_alloc
|
|
allocates kernel wired memory.
|
|
It takes the following arguments.
|
|
.Bl -tag -width kmflags
|
|
.It Fa size
|
|
Specify the size of allocation in bytes.
|
|
.It Fa kmflags
|
|
Either of the following:
|
|
.Bl -tag -width KM_NOSLEEP
|
|
.It KM_SLEEP
|
|
If the allocation cannot be satisfied immediately, sleep until enough
|
|
memory is available.
|
|
.It KM_NOSLEEP
|
|
Don't sleep.
|
|
Immediately return
|
|
.Dv NULL
|
|
if there is not enough memory available.
|
|
It should only be used when failure to allocate will not have harmful,
|
|
user-visible effects.
|
|
.Pp
|
|
.Bf -symbolic
|
|
Use of
|
|
.Dv KM_NOSLEEP
|
|
is strongly discouraged as it can create transient, hard to debug failures
|
|
that occur when the system is under memory pressure.
|
|
.Ef
|
|
.Pp
|
|
In situations where it is not possible to sleep, for example because locks
|
|
are held by the caller, the code path should be restructured to allow the
|
|
allocation to be made in another place.
|
|
.El
|
|
.El
|
|
.Pp
|
|
The contents of allocated memory are uninitialized.
|
|
.Pp
|
|
Unlike Solaris, kmem_alloc(0, flags) is illegal.
|
|
.Pp
|
|
.\" ------------------------------------------------------------
|
|
.Fn kmem_zalloc
|
|
is the equivalent of
|
|
.Fn kmem_alloc ,
|
|
except that it initializes the memory to zero.
|
|
.Pp
|
|
.\" ------------------------------------------------------------
|
|
.Fn kmem_asprintf
|
|
functions as the well known
|
|
.Fn asprintf
|
|
function, but allocates memory using
|
|
.Fn kmem_alloc .
|
|
This routine can sleep during allocation.
|
|
The size of the allocated area is the length of the returned character string, plus one (for the NUL terminator).
|
|
This must be taken into consideration when freeing the returned area with
|
|
.Fn kmem_free .
|
|
.Pp
|
|
.\" ------------------------------------------------------------
|
|
.Fn kmem_free
|
|
frees kernel wired memory allocated by
|
|
.Fn kmem_alloc
|
|
or
|
|
.Fn kmem_zalloc
|
|
so that it can be used for other purposes.
|
|
It takes the following arguments.
|
|
.Bl -tag -width kmflags
|
|
.It Fa p
|
|
The pointer to the memory being freed.
|
|
It must be the one returned by
|
|
.Fn kmem_alloc
|
|
or
|
|
.Fn kmem_zalloc .
|
|
.It Fa size
|
|
The size of the memory being freed, in bytes.
|
|
It must be the same as the
|
|
.Fa size
|
|
argument used for
|
|
.Fn kmem_alloc
|
|
or
|
|
.Fn kmem_zalloc
|
|
when the memory was allocated.
|
|
.El
|
|
.Pp
|
|
Freeing
|
|
.Dv NULL
|
|
is illegal.
|
|
.\" ------------------------------------------------------------
|
|
.Sh NOTES
|
|
Making
|
|
.Dv KM_SLEEP
|
|
allocations while holding mutexes or reader/writer locks is discouraged, as the
|
|
caller can sleep for an unbounded amount of time in order to satisfy the
|
|
allocation.
|
|
This can in turn block other threads that wish to acquire locks held by the
|
|
caller.
|
|
.Pp
|
|
For some locks this is permissible or even unavoidable.
|
|
For others, particularly locks that may be taken from soft interrupt context,
|
|
it is a serious problem.
|
|
As a general rule it is better not to allow this type of situation to develop.
|
|
One way to circumvent the problem is to make allocations speculative and part
|
|
of a retryable sequence.
|
|
For example:
|
|
.Bd -literal
|
|
retry:
|
|
/* speculative unlocked check */
|
|
if (need to allocate) {
|
|
new_item = kmem_alloc(sizeof(*new_item), KM_SLEEP);
|
|
} else {
|
|
new_item = NULL;
|
|
}
|
|
mutex_enter(lock);
|
|
/* check while holding lock for true status */
|
|
if (need to allocate) {
|
|
if (new_item == NULL) {
|
|
mutex_exit(lock);
|
|
goto retry;
|
|
}
|
|
consume(new_item);
|
|
new_item = NULL;
|
|
}
|
|
mutex_exit(lock);
|
|
if (new_item != NULL) {
|
|
/* did not use it after all */
|
|
kmem_free(new_item, sizeof(*new_item));
|
|
}
|
|
.Ed
|
|
.\" ------------------------------------------------------------
|
|
.Sh OPTIONS
|
|
Kernels compiled with the
|
|
.Dv DEBUG
|
|
option perform CPU intensive sanity checks on kmem operations,
|
|
and include the
|
|
.Dv kmguard
|
|
facility which can be enabled at runtime.
|
|
.Pp
|
|
.Dv kmguard
|
|
adds additional, very high overhead runtime verification to kmem operations.
|
|
To enable it, boot the system with the
|
|
.Fl d
|
|
option, which causes the debugger to be entered early during the kernel
|
|
boot process.
|
|
Issue commands such as the following:
|
|
.Bd -literal
|
|
db\*[Gt] w kmem_guard_depth 0t30000
|
|
db\*[Gt] c
|
|
.Ed
|
|
.Pp
|
|
This instructs
|
|
.Dv kmguard
|
|
to queue up to 60000 (30000*2) pages of unmapped KVA to catch
|
|
use-after-free type errors.
|
|
When
|
|
.Fn kmem_free
|
|
is called, memory backing a freed item is unmapped and the kernel VA
|
|
space pushed onto a FIFO.
|
|
The VA space will not be reused until another 30k items have been freed.
|
|
Until reused the kernel will catch invalid accesses and panic with a page fault.
|
|
Limitations:
|
|
.Bl -bullet
|
|
.It
|
|
It has a severe impact on performance.
|
|
.It
|
|
It is best used on a 64-bit machine with lots of RAM.
|
|
.It
|
|
Allocations larger than PAGE_SIZE bypass the
|
|
.Dv kmguard
|
|
facility.
|
|
.El
|
|
.Pp
|
|
kmguard tries to catch the following types of bugs:
|
|
.Bl -bullet
|
|
.It
|
|
Overflow at time of occurrence, by means of a guard page.
|
|
.It
|
|
Underflow at
|
|
.Fn kmem_free ,
|
|
by using a canary value.
|
|
.It
|
|
Invalid pointer or size passed, at
|
|
.Fn kmem_free .
|
|
.El
|
|
.Sh RETURN VALUES
|
|
On success,
|
|
.Fn kmem_alloc
|
|
and
|
|
.Fn kmem_zalloc
|
|
return a pointer to allocated memory.
|
|
Otherwise,
|
|
.Dv NULL
|
|
is returned.
|
|
.\" ------------------------------------------------------------
|
|
.Sh CODE REFERENCES
|
|
This section describes places within the
|
|
.Nx
|
|
source tree where actual code implementing the
|
|
.Nm
|
|
subsystem
|
|
can be found.
|
|
All pathnames are relative to
|
|
.Pa /usr/src .
|
|
.Pp
|
|
The
|
|
.Nm
|
|
subsystem is implemented within the file
|
|
.Pa sys/kern/subr_kmem.c .
|
|
.\" ------------------------------------------------------------
|
|
.Sh SEE ALSO
|
|
.Xr intro 9 ,
|
|
.Xr memoryallocators 9 ,
|
|
.Xr percpu 9 ,
|
|
.Xr pool_cache 9
|
|
.\" ------------------------------------------------------------
|
|
.Sh CAVEATS
|
|
.Fn kmem_alloc
|
|
cannot be used from interrupt context, from a soft interrupt, or from
|
|
a callout.
|
|
Use
|
|
.Xr pool_cache 9
|
|
in these situations.
|
|
.\" ------------------------------------------------------------
|
|
.Sh SECURITY CONSIDERATION
|
|
As the memory allocated by
|
|
.Fn kmem_alloc
|
|
is uninitialized, it can contain security-sensitive data left by its
|
|
previous user.
|
|
It is the caller's responsibility not to expose it to the world.
|