300 lines
8.6 KiB
Groff
300 lines
8.6 KiB
Groff
.\" $NetBSD: kmem.9,v 1.14 2013/11/26 20:47:26 rmind Exp $
|
|
.\"
|
|
.\" Copyright (c)2006 YAMAMOTO Takashi,
|
|
.\" All rights reserved.
|
|
.\"
|
|
.\" Redistribution and use in source and binary forms, with or without
|
|
.\" modification, are permitted provided that the following conditions
|
|
.\" are met:
|
|
.\" 1. Redistributions of source code must retain the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer.
|
|
.\" 2. Redistributions in binary form must reproduce the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer in the
|
|
.\" documentation and/or other materials provided with the distribution.
|
|
.\"
|
|
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
|
|
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
|
|
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
.\" SUCH DAMAGE.
|
|
.\"
|
|
.\" ------------------------------------------------------------
|
|
.Dd November 26, 2013
|
|
.Dt KMEM 9
|
|
.Os
|
|
.\" ------------------------------------------------------------
|
|
.Sh NAME
|
|
.Nm kmem
|
|
.Nd kernel wired memory allocator
|
|
.\" ------------------------------------------------------------
|
|
.Sh SYNOPSIS
|
|
.In sys/kmem.h
|
|
.\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|
|
.Ft void *
|
|
.Fn kmem_alloc \
|
|
"size_t size" "km_flag_t kmflags"
|
|
.Ft void *
|
|
.Fn kmem_zalloc \
|
|
"size_t size" "km_flag_t kmflags"
|
|
.Ft void
|
|
.Fn kmem_free \
|
|
"void *p" "size_t size"
|
|
.\" ---
|
|
.Ft void *
|
|
.Fn kmem_intr_alloc \
|
|
"size_t size" "km_flag_t kmflags"
|
|
.Ft void *
|
|
.Fn kmem_intr_zalloc \
|
|
"size_t size" "km_flag_t kmflags"
|
|
.Ft void
|
|
.Fn kmem_intr_free \
|
|
"void *p" "size_t size"
|
|
.\" ---
|
|
.Ft char *
|
|
.Fn kmem_asprintf \
|
|
"const char *fmt" "..."
|
|
.\" ------------------------------------------------------------
|
|
.Pp
|
|
.Cd "options DEBUG"
|
|
.Sh DESCRIPTION
|
|
.Fn kmem_alloc
|
|
allocates kernel wired memory.
|
|
It takes the following arguments.
|
|
.Bl -tag -width kmflags
|
|
.It Fa size
|
|
Specify the size of allocation in bytes.
|
|
.It Fa kmflags
|
|
Either of the following:
|
|
.Bl -tag -width KM_NOSLEEP
|
|
.It KM_SLEEP
|
|
If the allocation cannot be satisfied immediately, sleep until enough
|
|
memory is available.
|
|
.It KM_NOSLEEP
|
|
Don't sleep.
|
|
Immediately return
|
|
.Dv NULL
|
|
if there is not enough memory available.
|
|
It should only be used when failure to allocate will not have harmful,
|
|
user-visible effects.
|
|
.Pp
|
|
.Bf -symbolic
|
|
Use of
|
|
.Dv KM_NOSLEEP
|
|
is strongly discouraged as it can create transient, hard to debug failures
|
|
that occur when the system is under memory pressure.
|
|
.Ef
|
|
.Pp
|
|
In situations where it is not possible to sleep, for example because locks
|
|
are held by the caller, the code path should be restructured to allow the
|
|
allocation to be made in another place.
|
|
.El
|
|
.El
|
|
.Pp
|
|
The contents of allocated memory are uninitialized.
|
|
.Pp
|
|
Unlike Solaris, kmem_alloc(0, flags) is illegal.
|
|
.Pp
|
|
.\" ------------------------------------------------------------
|
|
.Fn kmem_zalloc
|
|
is the equivalent of
|
|
.Fn kmem_alloc ,
|
|
except that it initializes the memory to zero.
|
|
.Pp
|
|
.\" ------------------------------------------------------------
|
|
.Fn kmem_asprintf
|
|
functions as the well known
|
|
.Fn asprintf
|
|
function, but allocates memory using
|
|
.Fn kmem_alloc .
|
|
This routine can sleep during allocation.
|
|
The size of the allocated area is the length of the returned character string, plus one (for the NUL terminator).
|
|
This must be taken into consideration when freeing the returned area with
|
|
.Fn kmem_free .
|
|
.Pp
|
|
.\" ------------------------------------------------------------
|
|
.Fn kmem_free
|
|
frees kernel wired memory allocated by
|
|
.Fn kmem_alloc
|
|
or
|
|
.Fn kmem_zalloc
|
|
so that it can be used for other purposes.
|
|
It takes the following arguments.
|
|
.Bl -tag -width kmflags
|
|
.It Fa p
|
|
The pointer to the memory being freed.
|
|
It must be the one returned by
|
|
.Fn kmem_alloc
|
|
or
|
|
.Fn kmem_zalloc .
|
|
.It Fa size
|
|
The size of the memory being freed, in bytes.
|
|
It must be the same as the
|
|
.Fa size
|
|
argument used for
|
|
.Fn kmem_alloc
|
|
or
|
|
.Fn kmem_zalloc
|
|
when the memory was allocated.
|
|
.El
|
|
.Pp
|
|
Freeing
|
|
.Dv NULL
|
|
is illegal.
|
|
.Pp
|
|
.\" ------------------------------------------------------------
|
|
.Fn kmem_intr_alloc ,
|
|
.Fn kmem_intr_zalloc
|
|
and
|
|
.Fn kmem_intr_free
|
|
are the equivalents of the above kmem routines which can be called
|
|
from the interrupt context.
|
|
These routines are for the special cases.
|
|
Normally,
|
|
.Xr pool_cache 9
|
|
should be used for memory allocation from interrupt context.
|
|
.\" ------------------------------------------------------------
|
|
.Sh NOTES
|
|
Making
|
|
.Dv KM_SLEEP
|
|
allocations while holding mutexes or reader/writer locks is discouraged, as the
|
|
caller can sleep for an unbounded amount of time in order to satisfy the
|
|
allocation.
|
|
This can in turn block other threads that wish to acquire locks held by the
|
|
caller.
|
|
It should be noted that
|
|
.Fn kmem_free
|
|
may also block.
|
|
.Pp
|
|
For some locks this is permissible or even unavoidable.
|
|
For others, particularly locks that may be taken from soft interrupt context,
|
|
it is a serious problem.
|
|
As a general rule it is better not to allow this type of situation to develop.
|
|
One way to circumvent the problem is to make allocations speculative and part
|
|
of a retryable sequence.
|
|
For example:
|
|
.Bd -literal
|
|
retry:
|
|
/* speculative unlocked check */
|
|
if (need to allocate) {
|
|
new_item = kmem_alloc(sizeof(*new_item), KM_SLEEP);
|
|
} else {
|
|
new_item = NULL;
|
|
}
|
|
mutex_enter(lock);
|
|
/* check while holding lock for true status */
|
|
if (need to allocate) {
|
|
if (new_item == NULL) {
|
|
mutex_exit(lock);
|
|
goto retry;
|
|
}
|
|
consume(new_item);
|
|
new_item = NULL;
|
|
}
|
|
mutex_exit(lock);
|
|
if (new_item != NULL) {
|
|
/* did not use it after all */
|
|
kmem_free(new_item, sizeof(*new_item));
|
|
}
|
|
.Ed
|
|
.\" ------------------------------------------------------------
|
|
.Sh OPTIONS
|
|
Kernels compiled with the
|
|
.Dv DEBUG
|
|
option perform CPU intensive sanity checks on kmem operations,
|
|
and include the
|
|
.Dv kmguard
|
|
facility which can be enabled at runtime.
|
|
.Pp
|
|
.Dv kmguard
|
|
adds additional, very high overhead runtime verification to kmem operations.
|
|
To enable it, boot the system with the
|
|
.Fl d
|
|
option, which causes the debugger to be entered early during the kernel
|
|
boot process.
|
|
Issue commands such as the following:
|
|
.Bd -literal
|
|
db\*[Gt] w kmem_guard_depth 0t30000
|
|
db\*[Gt] c
|
|
.Ed
|
|
.Pp
|
|
This instructs
|
|
.Dv kmguard
|
|
to queue up to 60000 (30000*2) pages of unmapped KVA to catch
|
|
use-after-free type errors.
|
|
When
|
|
.Fn kmem_free
|
|
is called, memory backing a freed item is unmapped and the kernel VA
|
|
space pushed onto a FIFO.
|
|
The VA space will not be reused until another 30k items have been freed.
|
|
Until reused the kernel will catch invalid accesses and panic with a page fault.
|
|
Limitations:
|
|
.Bl -bullet
|
|
.It
|
|
It has a severe impact on performance.
|
|
.It
|
|
It is best used on a 64-bit machine with lots of RAM.
|
|
.It
|
|
Allocations larger than PAGE_SIZE bypass the
|
|
.Dv kmguard
|
|
facility.
|
|
.El
|
|
.Pp
|
|
kmguard tries to catch the following types of bugs:
|
|
.Bl -bullet
|
|
.It
|
|
Overflow at time of occurrence, by means of a guard page.
|
|
.It
|
|
Underflow at
|
|
.Fn kmem_free ,
|
|
by using a canary value.
|
|
.It
|
|
Invalid pointer or size passed, at
|
|
.Fn kmem_free .
|
|
.El
|
|
.Sh RETURN VALUES
|
|
On success,
|
|
.Fn kmem_alloc
|
|
and
|
|
.Fn kmem_zalloc
|
|
return a pointer to allocated memory.
|
|
Otherwise,
|
|
.Dv NULL
|
|
is returned.
|
|
.\" ------------------------------------------------------------
|
|
.Sh CODE REFERENCES
|
|
The
|
|
.Nm
|
|
subsystem is implemented within the file
|
|
.Pa sys/kern/subr_kmem.c .
|
|
.\" ------------------------------------------------------------
|
|
.Sh SEE ALSO
|
|
.Xr intro 9 ,
|
|
.Xr memoryallocators 9 ,
|
|
.Xr percpu 9 ,
|
|
.Xr pool_cache 9 ,
|
|
.Xr uvm_km 9
|
|
.\" ------------------------------------------------------------
|
|
.Sh CAVEATS
|
|
Neither
|
|
.Fn kmem_alloc
|
|
nor
|
|
.Fn kmem_free
|
|
can be used from interrupt context, from a soft interrupt, or from
|
|
a callout.
|
|
Use
|
|
.Xr pool_cache 9
|
|
in these situations.
|
|
.\" ------------------------------------------------------------
|
|
.Sh SECURITY CONSIDERATIONS
|
|
As the memory allocated by
|
|
.Fn kmem_alloc
|
|
is uninitialized, it can contain security-sensitive data left by its
|
|
previous user.
|
|
It is the caller's responsibility not to expose it to the world.
|