Document how system call versioning is done. From this summer's compat-linux
GSoC, by Theodore Preduta.
christos 2023-07-08 16:14:11 +00:00
parent 486fcaa047
commit e706571b76
3 changed files with 296 additions and 3 deletions

@ -1,4 +1,4 @@
-# $NetBSD: mi,v 1.2434 2023/07/04 16:23:15 riastradh Exp $
+# $NetBSD: mi,v 1.2435 2023/07/08 16:14:11 christos Exp $
#
# Note: don't delete entries from here - mark them as "obsolete" instead.
./etc/mtree/set.comp comp-sys-root
@ -12908,6 +12908,7 @@
./usr/share/man/cat9/vdead_check.0 comp-sys-catman .cat
./usr/share/man/cat9/vdevgone.0 comp-sys-catman .cat
./usr/share/man/cat9/veriexec.0 comp-sys-catman .cat
./usr/share/man/cat9/versioningsyscalls.0 comp-sys-catman .cat
./usr/share/man/cat9/vfinddev.0 comp-sys-catman .cat
./usr/share/man/cat9/vflush.0 comp-sys-catman .cat
./usr/share/man/cat9/vflushbuf.0 comp-sys-catman .cat
@ -21214,6 +21215,7 @@
./usr/share/man/html9/vdead_check.html comp-sys-htmlman html
./usr/share/man/html9/vdevgone.html comp-sys-htmlman html
./usr/share/man/html9/veriexec.html comp-sys-htmlman html
./usr/share/man/html9/versioningsyscalls.html comp-sys-htmlman html
./usr/share/man/html9/vfinddev.html comp-sys-htmlman html
./usr/share/man/html9/vflush.html comp-sys-htmlman html
./usr/share/man/html9/vflushbuf.html comp-sys-htmlman html
@ -29753,6 +29755,7 @@
./usr/share/man/man9/vdead_check.9 comp-sys-man .man
./usr/share/man/man9/vdevgone.9 comp-sys-man .man
./usr/share/man/man9/veriexec.9 comp-sys-man .man
./usr/share/man/man9/versioningsyscalls.9 comp-sys-man .man
./usr/share/man/man9/vfinddev.9 comp-sys-man .man
./usr/share/man/man9/vflush.9 comp-sys-man .man
./usr/share/man/man9/vflushbuf.9 comp-sys-man .man

@ -1,4 +1,4 @@
-# $NetBSD: Makefile,v 1.466 2023/03/06 00:49:31 uwe Exp $
+# $NetBSD: Makefile,v 1.467 2023/07/08 16:14:11 christos Exp $
# Makefile for section 9 (kernel function and variable) manual pages.
@ -64,7 +64,8 @@ MAN= accept_filter.9 accf_data.9 accf_http.9 acl.9 \
usbd_status.9 usbdi.9 usbnet.9 \
userret.9 ustore.9 \
uvm.9 uvm_hotplug.9 uvm_km.9 uvm_map.9 \
-vattr.9 veriexec.9 vcons.9 vfs.9 vfs_hooks.9 vfsops.9 vfssubr.9 \
+vattr.9 veriexec.9 vcons.9 versioningsyscalls.9 \
+vfs.9 vfs_hooks.9 vfsops.9 vfssubr.9 \
video.9 vme.9 vnfileops.9 vnode.9 vnodeops.9 vnsubr.9 vmem.9 \
wapbl.9 wdc.9 workqueue.9 \
wsbell.9 wscons.9 wsdisplay.9 wsfont.9 wskbd.9 wsmouse.9 \

@ -0,0 +1,289 @@
.\" $NetBSD: versioningsyscalls.9,v 1.1 2023/07/08 16:14:11 christos Exp $
.\"
.\" Copyright (c) 2023 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Theodore Preduta.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd June 23, 2023
.Dt versioningsyscalls 9
.Os
.Sh NAME
.Nm versioningsyscalls
.Nd guide on versioning syscalls
.Sh DESCRIPTION
.Nx
has the ability to change the ABI of a syscall whilst retaining backwards
compatibility with existing code.
This means that existing code keeps working the same way as before, and
new code can use new features and/or functionality.
In the past this has allowed
.Ft dev_t
to move from 16 bits to 32 bits,
.Ft ino_t
and
.Ft time_t
to move from 32 bits to 64 bits, and
fields to be added to
.Ft struct kevent
without disturbing existing binaries.
To achieve this, both kernel and userland changes are required.
.Pp
In the kernel, a new syscall is added with a new ABI, and the old syscall
is retained and moved to a new location that holds the compatibility syscalls
.Pa ( src/sys/compat ) .
Kernels can be compiled with or without backwards compatibility syscalls.
See the
.Li COMPAT_XX
options in
.Xr options 4 .
.Pp
In userland, the original syscall stub is moved into
.Pa src/lib/libc/compat
retaining the same symbol name and ABI.
The new stub is added to libc, and in the header file the syscall symbol is
made to point to the new name with the new ABI.
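.Pp
As a minimal sketch of this renaming, the following userland C fragment uses
the GCC/Clang assembler-label extension directly.
.Nx
headers express the same idea with the
.Li __RENAME
macro.
The function bodies, the argument types, and the name
.Li compat_my_syscall
are hypothetical and serve only to make the example self-contained.

```c
/* Header view: callers of my_syscall() now bind to the new symbol. */
int my_syscall(long long arg) __asm__("__my_syscallXYZ");

/*
 * Compat view: the old 32-bit ABI stays reachable under the original
 * symbol name, so binaries linked before the change keep resolving it.
 */
int compat_my_syscall(int arg) __asm__("my_syscall");

/* New implementation, emitted as the symbol __my_syscallXYZ. */
int
my_syscall(long long arg)
{
	/* Toy behaviour: report whether the value needs more than 32 bits. */
	return arg > 0x7fffffffLL ? 2 : 1;
}

/* Old implementation, emitted as the symbol my_syscall. */
int
compat_my_syscall(int arg)
{
	/* Implemented in terms of the new version. */
	return my_syscall((long long)arg);
}
```

Code compiled against the updated header calls the new symbol without any
source change, while the original symbol name keeps its old ABI.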
.Pp
This is done via symbol renaming instead of ELF versioned symbols for
historical reasons.
.Nx
has retained binary compatibility with most syscalls since
.Nx 0.9
with the exception of the Scheduler Activation syscalls, which are not
emulated because of the cost and safety concerns of doing so.
.Pp
To avoid confusion, the following words are used to disambiguate which version
of the system call is being described.
.Bl -tag -offset indent -width "current"
.It old
Any previous versions of the syscall, which have already been versioned and
superseded by the current version of the syscall.
.It current
The version of the syscall currently in use.
.It next
The version of the syscall that will become standard in the next release.
.El
.Pp
Additionally,
.Em XYZ
always represents the last
.Nx
release where the current
version of the system call is the default, multiplied by ten and retaining a
leading zero.
For example
.Nx 0.9
has
.Li COMPAT_09
whereas
.Nx 10.0
has
.Li COMPAT_100 .
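.Pp
The naming rule can be checked with a small userland C sketch.
The helper
.Fn compat_option_name
is hypothetical and not part of
.Nx ;
it assumes a single-digit minor release number.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the COMPAT_XYZ option name for release
 * major.minor.  Printing the two numbers back to back is the same as
 * multiplying the release by ten and keeping a leading zero.
 */
static int
compat_option_name(char *buf, size_t len, int major, int minor)
{
	/* Assumption of this sketch: minor releases are a single digit. */
	assert(major >= 0 && minor >= 0 && minor <= 9);
	return snprintf(buf, len, "COMPAT_%d%d", major, minor);
}
```

With these assumptions, release 0.9 yields
.Li COMPAT_09
and release 10.0 yields
.Li COMPAT_100 ,
matching the examples above.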
.Sh VERSIONING THE SYSCALL
This section describes what needs to be modified to add the new version of the
syscall.
It assumes the current version of the syscall is
.Fn my_syscall struct\ my_struct\ *ms
and that
.Li my_struct
will be versioned.
If not versioning a struct, passages that mention
.Li my_struct
can be ignored.
.Ss Versioning structs
To version
.Li struct my_struct ,
first make a copy of
.Li my_struct
renamed to
.Li my_structXYZ
in an equivalent header in
.Pa sys/compat/sys .
After that, you can freely modify
.Li my_struct
as desired.
.Ss Versioning the entry point
The stub for the next version of the syscall will be
.Fn __my_syscallXYZ ,
and will have entry point
.Fn sys___my_syscallXYZ .
.Ss Modifying syscalls.conf
.Pa sys/kern/syscalls.conf
may need to be modified to contain
.Li compat_XYZ
in the
.Li compatopts
field.
.Ss Modifying syscalls.master
First, add the next syscall to
.Pa sys/kern/syscalls.master
keeping
.Li my_syscall
as the name, and set the (optional) compat field of the declaration to XYZ.
.Pp
Next, modify the current version of the syscall, and replace the type
field (usually just STD) with
.Li COMPAT_XYZ MODULAR compat_XYZ .
.Pp
The keyword
.Li MODULAR
indicates that the system call can be part of a kernel module.
Even if the system call was not part of a module before, now it will be part
of the
.Li COMPAT_XYZ
module.
.Pp
Finally, if applicable, replace the types of the current and old versions of the
syscall with the compat type.
.Pp
Overall, the final diff should look like
.Bd -literal
- 123 STD { int|sys||my_syscall(struct my_struct *ms); }
+ 123 COMPAT_XYZ MODULAR compat_XYZ { int|sys||my_syscall(struct my_structXYZ *ms); }
\&.\&.\&.
+ 456 STD { int|sys|XYZ|my_syscall(struct my_struct *ms); }
.Ed
.Ss Modifying Makefile.rump
If the current syscall is rump,
.Pa sys/rump/Makefile.rump
must contain
.Li XYZ
in the
.Li RUMP_NBCOMPAT
variable.
.Ss Regenerating the system calls
If versioning structs, then modify
.Pa sys/kern/makesyscalls.sh
by adding an entry for the
.Li struct my_structXYZ
type to
.Li uncompattypes .
.Pp
The
.Li uncompattypes
map is used in
.Xr rump 7
system call table generation, to map from the versioned types to the original
names since
.Xr rump 7
wants to have a non-versioned copy of the system call table.
.Pp
Then regenerate the syscall tables in the usual way, first by running
.Pa sys/kern/makesyscalls.sh ,
then if the system call is rump, doing a build in
.Pa sys/rump
and then running
.Pa sys/rump/makerumpsyscalls.sh
passing it the path to the result of the build you just did as its first
argument.
.Sh KERNEL COMPATIBILITY
This section covers maintaining compatibility at the kernel level, by
adding an entry point for the current syscall in an appropriate compat
module.
For the purposes of this section, we assume the current
syscall has entry point
.Fn sys_my_syscall
and lives inside
.Pa sys/kern/my_file.c .
.Ss Creating the compat current syscall
The compat version of the current syscall has entry point
.Fn compat_XYZ_sys_my_syscall ,
and should be implemented in
.Pa sys/compat/common/my_file_XYZ.c
with the same semantics as the current syscall.
Often this involves translating the arguments to the next syscall,
and then calling that syscall's entry point.
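.Pp
The translation step might look like the following userland sketch.
It is a simulation under stated assumptions: the struct fields and the
function names
.Fn next_my_syscall
and
.Fn compat_my_syscall
are invented for illustration.
A real entry point would receive its arguments through the generated argument
structure and copy data to and from user space.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts; the field names are invented for this sketch. */
struct my_structXYZ {		/* frozen old ABI */
	int32_t	ms_time;
	int32_t	ms_flags;
};

struct my_struct {		/* next ABI */
	int64_t	ms_time;
	int32_t	ms_flags;
};

/* Stand-in for the implementation of the next syscall. */
static int
next_my_syscall(struct my_struct *ms)
{
	ms->ms_time += 1;
	return 0;
}

/*
 * Stand-in for the compat entry point: widen the old arguments, call
 * the next implementation, then narrow the result again.
 */
static int
compat_my_syscall(struct my_structXYZ *oms)
{
	struct my_struct ms;
	int error;

	if (oms == NULL)
		return EFAULT;

	ms.ms_time = oms->ms_time;	/* sign-extends 32 -> 64 bits */
	ms.ms_flags = oms->ms_flags;

	error = next_my_syscall(&ms);
	if (error != 0)
		return error;

	/* The old ABI cannot represent values outside 32 bits. */
	if (ms.ms_time > INT32_MAX || ms.ms_time < INT32_MIN)
		return EOVERFLOW;

	oms->ms_time = (int32_t)ms.ms_time;
	oms->ms_flags = ms.ms_flags;
	return 0;
}
```

Returning an error when the result does not fit the old layout is one possible
design choice; whether to fail or truncate depends on the syscall being
versioned.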
.Ss Adding it to the compat module
.Pa sys/compat/common/my_file_XYZ.c
must contain an array of
.Ft struct syscall_package
that declares the mapping between syscall number and entry point,
terminating in a zero element (see sample diff below).
.Pp
Additionally,
.Pa sys/compat/common/my_file_XYZ.c
must contain two functions,
.Fn my_file_XYZ_init
and
.Fn my_file_XYZ_fini
that are used to initialize/clean up anything related to this syscall.
At the minimum they must make calls to
.Fn syscall_establish
and
.Fn syscall_disestablish
respectively, adding and removing the syscalls.
The prototypes for these functions should be declared in
.Pa sys/compat/common/compat_mod.h .
.Pp
Overall,
.Pa sys/compat/common/my_file_XYZ.c
must at the minimum include
.Bd -literal
+ static const struct syscall_package my_file_XYZ_syscalls[] = {
+ { SYS_compat_XYZ_my_syscall, 0, (sy_call_t *)compat_XYZ_sys_my_syscall },
+ { 0, 0, NULL },
+ };
+
+ int
+ compat_XYZ_sys_my_syscall(...)
+ { /* Compat implementation goes here. */ }
+
+ int
+ my_file_XYZ_init(void)
+ { return syscall_establish(NULL, my_file_XYZ_syscalls); }
+
+ int
+ my_file_XYZ_fini(void)
+ { return syscall_disestablish(NULL, my_file_XYZ_syscalls); }
.Ed
.Pp
Finally,
.Pa sys/compat/common/compat_XYZ_mod.c
needs to be modified to have its
.Fn compat_XYZ_init
and
.Fn compat_XYZ_fini
functions call
.Fn my_file_XYZ_init
and
.Fn my_file_XYZ_fini .
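.Pp
A possible shape for these hooks is sketched below as a userland simulation.
The per-file hooks are stubs, and
.Fn other_file_XYZ_init
stands for a hypothetical file that is already part of the module.
The counter and the failure switch exist only so that the ordering and the
rollback on failure can be observed.

```c
#include <errno.h>

static int established;		/* number of hooks currently set up */
static int fail_my_file;	/* set to make my_file_XYZ_init() fail */

/* Stubs standing in for the per-file init/fini functions. */
static int other_file_XYZ_init(void) { established++; return 0; }
static int other_file_XYZ_fini(void) { established--; return 0; }
static int my_file_XYZ_fini(void) { established--; return 0; }

static int
my_file_XYZ_init(void)
{
	if (fail_my_file)
		return ENOMEM;
	established++;
	return 0;
}

int
compat_XYZ_init(void)
{
	int error;

	error = other_file_XYZ_init();
	if (error != 0)
		return error;

	error = my_file_XYZ_init();
	if (error != 0)
		(void)other_file_XYZ_fini();	/* undo the earlier step */
	return error;
}

int
compat_XYZ_fini(void)
{
	int error;

	/* Tear down in the reverse order of initialization. */
	error = my_file_XYZ_fini();
	if (error != 0)
		return error;
	return other_file_XYZ_fini();
}
```

If a later hook fails, the earlier ones are undone so that the module is left
either fully initialized or not initialized at all.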
.Ss Modifying old compat syscalls
If the current syscall has already been versioned, you might need to
modify the old compat syscalls in
.Pa sys/compat/common
to either use the next syscall or the current compat syscall.
Note that compat code can be made to depend on compat code for more
recent releases.
.Sh USERLAND COMPATIBILITY
With the exception of the libraries described below, making the rest of userland
work will just involve recompiling, and perhaps changing a constant or a
.Li #define .
.Ss libc
Userland versions of the old and current syscalls should be implemented in
.Pa lib/libc/compat/sys
in terms of the next syscall.
Each should contain an appropriate call to
.Fn __warn_references .