5d2bff060a
run through copy-on-write. Call fscow_run() with valid data where possible. The LP_UFSCOW hack is no longer needed to protect ffs_copyonwrite() against endless recursion. - Add a flag B_MODIFY to bread(), breada() and breadn(). If set the caller intends to modify the buffer returned. - Always run copy-on-write on buffers returned from ffs_balloc(). - Add new function ffs_getblk() that gets a buffer, assigns a new blkno, may clear the buffer and runs copy-on-write. Process possible errors from getblk() or fscow_run(). Part of PR kern/38664. Welcome to 4.99.63 Reviewed by: YAMAMOTO Takashi <yamt@netbsd.org>
384 lines
13 KiB
Groff
384 lines
13 KiB
Groff
.\" $NetBSD: buffercache.9,v 1.23 2008/05/16 09:21:59 hannken Exp $
|
|
.\"
|
|
.\" Copyright (c)2003 YAMAMOTO Takashi,
|
|
.\" All rights reserved.
|
|
.\"
|
|
.\" Redistribution and use in source and binary forms, with or without
|
|
.\" modification, are permitted provided that the following conditions
|
|
.\" are met:
|
|
.\" 1. Redistributions of source code must retain the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer.
|
|
.\" 2. Redistributions in binary form must reproduce the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer in the
|
|
.\" documentation and/or other materials provided with the distribution.
|
|
.\"
|
|
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
|
|
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
|
|
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
.\" SUCH DAMAGE.
|
|
.\"
|
|
.\"
|
|
.\" following copyright notices are from sys/kern/vfs_bio.c.
|
|
.\" they are here because i took some comments from it. yamt@NetBSD.org
|
|
.\"
|
|
.\"
|
|
.\"/*-
|
|
.\" * Copyright (c) 1982, 1986, 1989, 1993
|
|
.\" * The Regents of the University of California. All rights reserved.
|
|
.\" * (c) UNIX System Laboratories, Inc.
|
|
.\" * All or some portions of this file are derived from material licensed
|
|
.\" * to the University of California by American Telephone and Telegraph
|
|
.\" * Co. or Unix System Laboratories, Inc. and are reproduced herein with
|
|
.\" * the permission of UNIX System Laboratories, Inc.
|
|
.\" *
|
|
.\" * Redistribution and use in source and binary forms, with or without
|
|
.\" * modification, are permitted provided that the following conditions
|
|
.\" * are met:
|
|
.\" * 1. Redistributions of source code must retain the above copyright
|
|
.\" * notice, this list of conditions and the following disclaimer.
|
|
.\" * 2. Redistributions in binary form must reproduce the above copyright
|
|
.\" * notice, this list of conditions and the following disclaimer in the
|
|
.\" * documentation and/or other materials provided with the distribution.
|
|
.\" * 3. Neither the name of the University nor the names of its contributors
|
|
.\" * may be used to endorse or promote products derived from this software
|
|
.\" * without specific prior written permission.
|
|
.\" *
|
|
.\" * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
|
|
.\" * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
.\" * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|
.\" * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
|
|
.\" * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
.\" * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
.\" * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
.\" * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
.\" * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
.\" * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
.\" * SUCH DAMAGE.
|
|
.\" *
|
|
.\" * @(#)vfs_bio.c 8.6 (Berkeley) 1/11/94
|
|
.\" */
|
|
.\"
|
|
.\"/*-
|
|
.\" * Copyright (c) 1994 Christopher G. Demetriou
|
|
.\" *
|
|
.\" * Redistribution and use in source and binary forms, with or without
|
|
.\" * modification, are permitted provided that the following conditions
|
|
.\" * are met:
|
|
.\" * 1. Redistributions of source code must retain the above copyright
|
|
.\" * notice, this list of conditions and the following disclaimer.
|
|
.\" * 2. Redistributions in binary form must reproduce the above copyright
|
|
.\" * notice, this list of conditions and the following disclaimer in the
|
|
.\" * documentation and/or other materials provided with the distribution.
|
|
.\" * 3. All advertising materials mentioning features or use of this software
|
|
.\" * must display the following acknowledgement:
|
|
.\" * This product includes software developed by the University of
|
|
.\" * California, Berkeley and its contributors.
|
|
.\" * 4. Neither the name of the University nor the names of its contributors
|
|
.\" * may be used to endorse or promote products derived from this software
|
|
.\" * without specific prior written permission.
|
|
.\" *
|
|
.\" * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
|
|
.\" * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
.\" * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|
.\" * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
|
|
.\" * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
.\" * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
.\" * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
.\" * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
.\" * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
.\" * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
.\" * SUCH DAMAGE.
|
|
.\" *
|
|
.\" * @(#)vfs_bio.c 8.6 (Berkeley) 1/11/94
|
|
.\" */
|
|
.\"
|
|
.\"
|
|
.\" ------------------------------------------------------------
|
|
.Dd May 16, 2008
|
|
.Dt BUFFERCACHE 9
|
|
.Os
|
|
.Sh NAME
|
|
.Nm buffercache ,
|
|
.Nm bread ,
|
|
.Nm breada ,
|
|
.Nm breadn ,
|
|
.Nm bwrite ,
|
|
.Nm bawrite ,
|
|
.Nm bdwrite ,
|
|
.Nm getblk ,
|
|
.Nm geteblk ,
|
|
.Nm incore ,
|
|
.Nm allocbuf ,
|
|
.Nm brelse ,
|
|
.Nm biodone ,
|
|
.Nm biowait
|
|
.Nd buffer cache interfaces
|
|
.\" ------------------------------------------------------------
|
|
.Sh SYNOPSIS
|
|
.In sys/buf.h
|
|
.Ft int
|
|
.Fn bread "struct vnode *vp" "daddr_t blkno" "int size" \
|
|
"struct kauth_cred *cred" "int flags" "struct buf **bpp"
|
|
.Ft int
|
|
.Fn breadn "struct vnode *vp" "daddr_t blkno" "int size" \
|
|
"daddr_t rablks[]" "int rasizes[]" "int nrablks" \
|
|
"struct kauth_cred *cred" "int flags" "struct buf **bpp"
|
|
.Ft int
|
|
.Fn breada "struct vnode *vp" "daddr_t blkno" "int size" \
|
|
"daddr_t rablkno" "int rabsize" \
|
|
"struct kauth_cred *cred" "int flags" "struct buf **bpp"
|
|
.Ft int
|
|
.Fn bwrite "struct buf *bp"
|
|
.Ft void
|
|
.Fn bawrite "struct buf *bp"
|
|
.Ft void
|
|
.Fn bdwrite "struct buf *bp"
|
|
.Ft struct buf *
|
|
.Fn getblk "struct vnode *vp" "daddr_t blkno" "int size" \
|
|
"int slpflag" "int slptimeo"
|
|
.Ft struct buf *
|
|
.Fn geteblk "int size"
|
|
.Ft struct buf *
|
|
.Fn incore "struct vnode *vp" "daddr_t blkno"
|
|
.Ft void
|
|
.Fn allocbuf "struct buf *bp" "int size" "int preserve"
|
|
.Ft void
|
|
.Fn brelse "struct buf *bp"
|
|
.Ft void
|
|
.Fn biodone "struct buf *bp"
|
|
.Ft int
|
|
.Fn biowait "struct buf *bp"
|
|
.\" ------------------------------------------------------------
|
|
.Sh DESCRIPTION
|
|
The
|
|
.Nm
|
|
interface is used by each filesystems to improve I/O performance using
|
|
in-core caches of filesystem blocks.
|
|
.Pp
|
|
The kernel memory used to cache a block is called a buffer and
|
|
described by a
|
|
.Em buf
|
|
structure.
|
|
In addition to describing a cached block, a
|
|
.Em buf
|
|
structure is also used to describe an I/O request as a part of
|
|
the disk driver interface.
|
|
.\" XXX struct buf, B_ flags, MP locks, etc
|
|
.\" ------------------------------------------------------------
|
|
.Sh FUNCTIONS
|
|
.Bl -tag -width compact
|
|
.It Fn bread "vp" "blkno" "size" "cred" "flags" "bpp"
|
|
Read a block corresponding to
|
|
.Fa vp
|
|
and
|
|
.Fa blkno .
|
|
The buffer is returned via
|
|
.Fa bpp .
|
|
The units of
|
|
.Fa blkno
|
|
are specifically the units used by the
|
|
.Fn VOP_STRATEGY
|
|
routine for the
|
|
.Fa vp
|
|
vnode.
|
|
For device special files,
|
|
.Fa blkno
|
|
is in units of
|
|
.Dv DEV_BSIZE
|
|
and both
|
|
.Fa blkno
|
|
and
|
|
.Fa size
|
|
must be multiples of the underlying device's block size.
|
|
For other files,
|
|
.Fa blkno
|
|
is in units chosen by the file system containing
|
|
.Fa vp .
|
|
.Pp
|
|
If the buffer is not found (i.e. the block is not cached in memory),
|
|
.Fn bread
|
|
allocates a buffer with enough pages for
|
|
.Fa size
|
|
and reads the specified disk block into it using
|
|
credential
|
|
.Fa cred .
|
|
.Pp
|
|
The buffer returned by
|
|
.Fn bread
|
|
is marked as busy.
|
|
(The
|
|
.Dv B_BUSY
|
|
flag is set.)
|
|
After manipulation of the buffer returned from
|
|
.Fn bread ,
|
|
the caller should unbusy it so that another thread can get it.
|
|
If the buffer contents are modified and should be written back to disk,
|
|
it should be unbusied using one of variants of
|
|
.Fn bwrite .
|
|
Otherwise, it should be unbusied using
|
|
.Fn brelse .
|
|
.\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|
|
.It Fn breadn "vp" "blkno" "size" "rablks" "rasizes" "nrablks" "cred" "flags" \
|
|
"bpp"
|
|
Get a buffer as
|
|
.Fn bread .
|
|
In addition,
|
|
.Fn breadn
|
|
will start read-ahead of blocks specified by
|
|
.Fa rablks ,
|
|
.Fa rasizes ,
|
|
.Fa nrablks .
|
|
.\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|
|
.It Fn breada "vp" "blkno" "size" "rablkno" "rabsize" "cred" "flags" "bpp"
|
|
Same as
|
|
.Fn breadn
|
|
with single block read-ahead.
|
|
This function is for compatibility with old filesystem code and
|
|
shouldn't be used by new ones.
|
|
.\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|
|
.It Fn bwrite "bp"
|
|
Write a block.
|
|
Start I/O for write using
|
|
.Fn VOP_STRATEGY .
|
|
Then, unless the
|
|
.Dv B_ASYNC
|
|
flag is set in
|
|
.Fa bp ,
|
|
.Fn bwrite
|
|
waits for the I/O to complete.
|
|
.\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|
|
.It Fn bawrite "bp"
|
|
Write a block asynchronously.
|
|
Set the
|
|
.Dv B_ASYNC
|
|
flag in
|
|
.Fa bp
|
|
and simply call
|
|
.Fn VOP_BWRITE ,
|
|
which results in
|
|
.Fn bwrite
|
|
for most filesystems.
|
|
.\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|
|
.It Fn bdwrite "bp"
|
|
Delayed write.
|
|
Unlike
|
|
.Fn bawrite ,
|
|
.Fn bdwrite
|
|
won't start any I/O.
|
|
It only marks the buffer as dirty
|
|
.Pq Dv B_DELWRI
|
|
and unbusy it.
|
|
.\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|
|
.It Fn getblk "vp" "blkno" "size" "slpflag" "slptimeo"
|
|
Get a block of requested size
|
|
.Fa size
|
|
that is associated with a given vnode and block
|
|
offset, specified by
|
|
.Fa vp
|
|
and
|
|
.Fa blkno .
|
|
If it is found in the block cache, make it busy and return it.
|
|
Otherwise, return an empty block of the correct size.
|
|
It is up to the caller to ensure that the cached blocks
|
|
are of the correct size.
|
|
.Pp
|
|
If
|
|
.Fn getblk
|
|
needs to sleep,
|
|
.Fa slpflag
|
|
and
|
|
.Fa slptimeo
|
|
are used as arguments for
|
|
.Fn cv_timedwait .
|
|
.\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|
|
.It Fn geteblk "size"
|
|
Allocate an empty, disassociated block of a given size
|
|
.Fa size .
|
|
.\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|
|
.It Fn incore "vp" "blkno"
|
|
Determine if a block associated to a given vnode and block offset
|
|
is in the cache.
|
|
If it is there, return a pointer to it.
|
|
Note that
|
|
.Fn incore
|
|
doesn't busy the buffer unlike
|
|
.Fn getblk .
|
|
.\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|
|
.It Fn allocbuf "bp" "size" "preserve"
|
|
Expand or contract the actual memory allocated to a buffer.
|
|
If
|
|
.Fa preserve
|
|
is zero, the entire data in the buffer will be lost.
|
|
Otherwise, if the buffer shrinks, the truncated part of the data
|
|
is lost, so it is up to the caller to have written
|
|
it out
|
|
.Em first
|
|
if needed; this routine will not start a write.
|
|
If the buffer grows, it is the callers responsibility to fill out
|
|
the buffer's additional contents.
|
|
.\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|
|
.It Fn brelse "bp"
|
|
Unbusy a buffer and release it to the free lists.
|
|
.\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|
|
.It Fn biodone "bp"
|
|
Mark I/O complete on a buffer.
|
|
If a callback has been requested by
|
|
.Dv B_CALL ,
|
|
do so.
|
|
Otherwise, wakeup waiters.
|
|
.\" - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|
|
.It Fn biowait "bp"
|
|
Wait for operations on the buffer to complete.
|
|
When they do, extract and return the I/O's error value.
|
|
.El
|
|
.\" ------------------------------------------------------------
|
|
.Sh CODE REFERENCES
|
|
This section describes places within the
|
|
.Nx
|
|
source tree where actual code implementing the buffer cache subsystem
|
|
can be found.
|
|
All pathnames are relative to
|
|
.Pa /usr/src .
|
|
.Pp
|
|
The buffer cache subsystem is implemented within the file
|
|
.Pa sys/kern/vfs_bio.c .
|
|
.Sh SEE ALSO
|
|
.Xr intro 9 ,
|
|
.Xr vnode 9
|
|
.Rs
|
|
.%A Maurice J. Bach
|
|
.%B "The Design of the UNIX Operating System"
|
|
.%I "Prentice Hall"
|
|
.%D 1986
|
|
.Re
|
|
.Rs
|
|
.%A Marshall Kirk McKusick
|
|
.%A Keith Bostic
|
|
.%A Michael J. Karels
|
|
.%A John S. Quarterman
|
|
.%B "The Design and Implementation of the 4.4BSD Operating System"
|
|
.%I "Addison Wesley"
|
|
.%D 1996
|
|
.Re
|
|
.\" ------------------------------------------------------------
|
|
.Sh BUGS
|
|
In the current implementation,
|
|
.Fn bread
|
|
and its variants
|
|
don't use a specified credential.
|
|
.Pp
|
|
Because
|
|
.Fn biodone
|
|
and
|
|
.Fn biowait
|
|
do not really belong to
|
|
.Nm ,
|
|
they shouldn't be documented here.
|