Right now for xattr remapping, we support types of "prefix", "ok" or "bad".
Type "bad" returns -EPERM on setxattr and hides xattr in listxattr. For
getxattr, mapping code returns -EPERM but getxattr code converts it to -ENODATA.
I need a new semantics where if an xattr is unsupported, then
getxattr()/setxattr() return -ENOTSUP and listxattr() should hide the xattr.
This is needed to simulate that security.selinux is not supported by
virtiofs filesystem and in that case client falls back to some default
label specified by policy.
So add a new type "unsupported" which returns -ENOTSUP on getxattr() and
setxattr() and hides xattrs in listxattr().
For example, one can use following mapping rule to not support
security.selinux xattr and allow others.
"-o xattrmap=/unsupported/all/security.selinux/security.selinux//ok/all///"
Suggested-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Message-Id: <YUt9qbmgAfCFfg5t@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Both qemu and qemu-img use writeback cache mode by default, which is
already documented in qemu(1). qemu-nbd uses writethrough cache mode by
default, and the default cache mode is not documented.
According to the qemu-nbd(8):
--cache=CACHE
The cache mode to be used with the file. See the
documentation of the emulator's -drive cache=... option for
allowed values.
qemu(1) says:
The default mode is cache=writeback.
So users have no reason to assume that qemu-nbd is using writethough
cache mode. The only hint is the painfully slow writing when using the
defaults.
Looking in git history, it seems that qemu used writethrough in the past
to support broken guests that did not flush data properly, or could not
flush due to limitations in qemu. But qemu-nbd clients can use
NBD_CMD_FLUSH to flush data, so using writethrough does not help anyone.
Change the default cache mode to writback, and document the default and
available values properly in the online help and manual.
With this change converting image via qemu-nbd is 3.5 times faster.
$ qemu-img create dst.img 50g
$ qemu-nbd -t -f raw -k /tmp/nbd.sock dst.img
Before this change:
$ hyperfine -r3 "./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock"
Benchmark #1: ./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock
Time (mean ± σ): 83.639 s ± 5.970 s [User: 2.733 s, System: 6.112 s]
Range (min … max): 76.749 s … 87.245 s 3 runs
After this change:
$ hyperfine -r3 "./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock"
Benchmark #1: ./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock
Time (mean ± σ): 23.522 s ± 0.433 s [User: 2.083 s, System: 5.475 s]
Range (min … max): 23.234 s … 24.019 s 3 runs
Users can avoid the issue by using --cache=writeback[1] but the defaults
should give good performance for the common use case.
[1] https://bugzilla.redhat.com/1990656
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Message-Id: <20210813205519.50518-1-nsoffer@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Although we have long supported 'qemu-img convert -o
backing_file=foo,backing_fmt=bar', the fact that we have a shortcut -B
for backing_file but none for backing_fmt has made it more likely that
users accidentally run into:
qemu-img: warning: Deprecated use of backing file without explicit backing format
when using -B instead of -o. For similarity with other qemu-img
commands, such as create and compare, add '-F $fmt' as the shorthand
for '-o backing_fmt=$fmt'. Update iotest 122 for coverage of both
spellings.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210913131735.1948339-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Use a standard heading format for the index.rst file in a directory.
Using overlines makes it clear that individual documents can use e.g.
=== for chapter titles and --- for section titles, as suggested in the
Linux kernel guidelines[1]. They could do it anyway, because documents
included in a toctree are parsed separately and therefore are not tied
to the same conventions for headings. However, keeping some consistency is
useful since sometimes files are included from multiple places.
[1] https://www.kernel.org/doc/html/latest/doc-guide/sphinx.html
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Documents within a Sphinx manual are separate files and therefore can use
different conventions for headings. However, keeping some consistency is
useful so that included files are easy to get right.
This patch uses a standard heading format for book titles, so that it is
obvious when a file sits at the top level toctree of a book or man page.
The heading is irrelevant for man pages, but keep it consistent as well.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The documentation of the posix_acl option has a stray backtick
at the end of the text (which is rendered literally into the HTML).
Delete it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 20210726142338.31872-11-peter.maydell@linaro.org
The point of 'qemu-img convert --bitmaps' is to be a convenience for
actions that are already possible through a string of smaller
'qemu-img bitmap' sub-commands. One situation not accounted for
already is that if a source image contains an inconsistent bitmap (for
example, because a qemu process died abruptly before flushing bitmap
state), the user MUST delete those inconsistent bitmaps before
anything else useful can be done with the image.
We don't want to delete inconsistent bitmaps by default: although a
corrupt bitmap is only a loss of optimization rather than a corruption
of user-visible data, it is still nice to require the user to opt in
to the fact that they are aware of the loss of the bitmap. Still,
requiring the user to check 'qemu-img info' to see whether bitmaps are
consistent, then use 'qemu-img bitmap --remove' to remove offenders,
all before using 'qemu-img convert', is a lot more work than just
adding a knob 'qemu-img convert --bitmaps --skip-broken-bitmaps' which
opts in to skipping the broken bitmaps.
After testing the new option, also demonstrate the way to manually fix
things (either deleting bad bitmaps, or re-creating them as empty) so
that it is possible to convert without the option.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1946084
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210709153951.2801666-4-eblake@redhat.com>
[eblake: warning message tweak, test enhancements]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Since the top-level subsections aren't self-contained manuals
any more, the "Contents:" lines at the top of each of their
index pages look a bit odd; remove them.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20210705095547.15790-4-peter.maydell@linaro.org
We merged our previous multiple-manual setup into a single Sphinx
manual, but we left some text in the various index.rst lines that
still calls the top level subsections separate 'manuals'. Update
them to talk about "this section of the manual" instead, and remove
now-obsolete comments about how the index.rst files are the "top
level page for the 'foo' manual".
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20210705095547.15790-3-peter.maydell@linaro.org
Reword the paragraphs to list the JSON key first, rather than in the
middle of prose.
Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210707184125.2551140-1-eblake@redhat.com>
Reviewed-by: Nir Soffer <nsoffer@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
The recently-added NBD context qemu:allocation-depth is able to
distinguish between locally-present data (even when that data is
sparse) [shown as depth 1 over NBD], and data that could not be found
anywhere in the backing chain [shown as depth 0]; and the libnbd
project was recently patched to give the human-readable name "absent"
to an allocation-depth of 0. But qemu-img map --output=json predates
that addition, and has the unfortunate behavior that all portions of
the backing chain that resolve without finding a hit in any backing
layer report the same depth as the final backing layer. This makes it
harder to reconstruct a qcow2 backing chain using just 'qemu-img map'
output, especially when using "backing":null to artificially limit a
backing chain, because it is impossible to distinguish between a
QCOW2_CLUSTER_UNALLOCATED (which defers to a [missing] backing file)
and a QCOW2_CLUSTER_ZERO_PLAIN cluster (which would override any
backing file), since both types of clusters otherwise show as
"data":false,"zero":true" (but note that we can distinguish a
QCOW2_CLUSTER_ZERO_ALLOCATED, which would also have an "offset":
listing).
The task of reconstructing a qcow2 chain was made harder in commit
0da9856851 (nbd: server: Report holes for raw images), because prior
to that point, it was possible to abuse NBD's block status command to
see which portions of a qcow2 file resulted in BDRV_BLOCK_ALLOCATED
(showing up as NBD_STATE_ZERO in isolation) vs. missing from the chain
(showing up as NBD_STATE_ZERO|NBD_STATE_HOLE); but now qemu reports
more accurate sparseness information over NBD.
An obvious solution is to make 'qemu-img map --output=json' add an
additional "present":false designation to any cluster lacking an
allocation anywhere in the chain, without any change to the "depth"
parameter to avoid breaking existing clients. The iotests have
several examples where this distinction demonstrates the additional
accuracy.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210701190655.2131223-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: fix more iotest fallout]
Signed-off-by: Eric Blake <eblake@redhat.com>
fuse has an option FUSE_POSIX_ACL which needs to be opted in by fuse
server to enable posix acls. As of now we are not opting in for this,
so posix acls are disabled on virtiofs by default.
Add virtiofsd option "-o posix_acl/no_posix_acl" to let users enable/disable
posix acl support. By default it is disabled as of now due to performance
concerns with cache=none.
Currently even if file server has not opted in for FUSE_POSIX_ACL, user can
still query acl and set acl, and system.posix_acl_access and
system.posix_acl_default xattrs show up listxattr response.
Miklos said this is confusing. So he said lets block and filter
system.posix_acl_access and system.posix_acl_default xattrs in
getxattr/setxattr/listxattr if user has explicitly disabled
posix acls using -o no_posix_acl.
As of now continuing to keeping the existing behavior if user did not
specify any option to disable acl support due to concerns about backward
compatibility.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Message-Id: <20210622150852.1507204-8-vgoyal@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Different guest xattr prefixes have distinct access control rules applied
by the guest. When remapping a guest xattr care must be taken that the
remapping does not allow the a guest user to bypass guest kernel access
control rules.
For example if 'trusted.*' which requires CAP_SYS_ADMIN is remapped
to 'user.virtiofs.trusted.*', an unprivileged guest user which can
write to 'user.*' can bypass the CAP_SYS_ADMIN control. Thus the
target of any remapping must be explicitly blocked from read/writes
by the guest, to prevent access control bypass.
The examples shown in the virtiofsd man page already do the right
thing and ensure safety, but the security implications of getting
this wrong were not made explicit. This could lead to host admins
and apps unwittingly creating insecure configurations.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210611120427.49736-1-berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
For literal blocks, there has to be an empty line after the two colons,
and the block itself should be indented.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210607180015.924571-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
In downstream, we want to use a different name for the QEMU binary,
and some people might also use the docs for non-x86 binaries, that's
why we already created the |qemu_system| placeholder in the past.
Use it now in the virtiofsd doc, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210607174250.920226-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
- drop block/io write notifiers
- qemu-iotests enhancements to make debugging easier
- rbd parsing fix
- HMP qemu-io fix (for iothreads)
- mirror job cancel relaxation (do not cancel in-flight requests when a
READY mirror job is canceled with force=false)
- document qcow2's data_file and data_file_raw features
- fix iotest 297 for pylint 2.8
- block/copy-on-read refactoring
-----BEGIN PGP SIGNATURE-----
iQFGBAABCAAwFiEEkb62CjDbPohX0Rgp9AfbAGHVz0AFAmCeqLwSHG1yZWl0ekBy
ZWRoYXQuY29tAAoJEPQH2wBh1c9A9pMIAKYIlLQfSSMdy0fZ+6AHiAjaTZAaDr4G
d6NDz/RONZEoxcl01LkUWJfvqH/IdCLx5q4cl9SU4+JzMdKW9K1xBLdAGousuhk/
geYqymbORj/VntJDYwp30KUlC0pLUBbuuzYN+QXrLp5qJvS9nPBcxEPjfSc6GX9z
Bt+GCRW08+C4WKJ3lGu9zNGe47gTFUE/VodUYG4tKg5xZFzsAWd/PZZaVOdW0fCz
/0tdxN4N82XT+cE/lA0Tm6B6L3ZueMAt8byu4BPz21M7kULNn2roVMiFKJELZlZQ
0RyDXH2jb/aH/ha6gJ4S+JhMvq45rH9GuQeAYl6IPngbta+NbZW+U4w=
=+Kha
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2021-05-14' into staging
Block patches:
- drop block/io write notifiers
- qemu-iotests enhancements to make debugging easier
- rbd parsing fix
- HMP qemu-io fix (for iothreads)
- mirror job cancel relaxation (do not cancel in-flight requests when a
READY mirror job is canceled with force=false)
- document qcow2's data_file and data_file_raw features
- fix iotest 297 for pylint 2.8
- block/copy-on-read refactoring
# gpg: Signature made Fri 14 May 2021 17:43:40 BST
# gpg: using RSA key 91BEB60A30DB3E8857D11829F407DB0061D5CF40
# gpg: issuer "mreitz@redhat.com"
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>" [full]
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* remotes/maxreitz/tags/pull-block-2021-05-14:
write-threshold: deal with includes
test-write-threshold: drop extra TestStruct structure
test-write-threshold: drop extra tests
block/write-threshold: drop extra APIs
test-write-threshold: rewrite test_threshold_(not_)trigger tests
block: drop write notifiers
block/write-threshold: don't use write notifiers
qemu-iotests: fix pylint 2.8 consider-using-with error
block/copy-on-read: use bdrv_drop_filter() and drop s->active
Document qemu-img options data_file and data_file_raw
qemu-iotests: fix case of SOCK_DIR already in the environment
qemu-iotests: let "check" spawn an arbitrary test command
qemu-iotests: move command line and environment handling from TestRunner to TestEnv
qemu-iotests: allow passing unittest.main arguments to the test scripts
qemu-iotests: do not buffer the test output
mirror: stop cancelling in-flight requests on non-force cancel in READY
monitor: hmp_qemu_io: acquire aio contex, fix crash
block/rbd: Add an escape-aware strchr helper
iotests/231: Update expected deprecation message
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The contents of this patch were initially developed and posted by Han
Han[1], however, it appears the original patch was not applied. Since
then, the relevant documentation has been moved and adapted to a new
format.
I've taken most of the original wording and tweaked it according to
some of the feedback from the original patch submission. I've also
adapted it to restructured text, which is the format the documentation
currently uses.
[1] https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01253.html
Fixes: https://bugzilla.redhat.com/1763105
Signed-off-by: Han Han <hhan@redhat.com>
Suggested-by: Max Reitz <mreitz@redhat.com>
[ Max: provided description of data_file_raw behavior ]
Signed-off-by: Connor Kuehl <ckuehl@redhat.com>
Message-Id: <20210505195512.391128-1-ckuehl@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
The default "alabaster" sphinx theme has a couple shortcomings:
- the navbar moves along the page
- the search bar is not always at the same place
- it lacks some contrast and colours
The "rtd" theme from readthedocs.org is a popular third party theme used
notably by the kernel, with a custom style sheet. I like it better,
perhaps others do too. It also simplifies the "Edit on Gitlab" links.
Tweak a bit the custom theme to match qemu.org style, use the
QEMU logo, and favicon etc.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210323115328.4146052-1-marcandre.lureau@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Now that we merged into one doc, it makes the nav looks nicer.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210323074704.4078381-1-marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Implementing FUSE exports required no changes to the storage daemon, so
we forgot to document them there. Considering that both NBD and
vhost-user-blk exports are documented in its man page (and NBD exports
in its --help text), we should probably do the same for FUSE.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210217115844.62661-1-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This switches qemu-img from a QemuOpts-based parser for --object to
user_creatable_process_cmdline() which uses a keyval parser and enforces
the QAPI schema.
Apart from being a cleanup, this makes non-scalar properties accessible.
As a side effect, fix wrong exit codes in the object parsing error path
of 'qemu-img compare'. This was broken in commit 334c43e2c3 because
&error_fatal exits with an exit code of 1, while it should have been 2.
Document that exit code 0 is also returned when just requested help was
printed instead of comparing images. This is preexisting behaviour that
isn't changed by this patch, though another instance of it is added with
'--object help'.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The 'name' option for NBD exports is optional. Add a note that the
default for the option is the node name (people could otherwise expect
that it's the empty string like for qemu-nbd).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210305094856.18964-1-kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
World-writeable directories have security issues. Avoid showing them in
the documentation since someone might accidentally use them in
situations where they are insecure.
There tend to be 3 security problems:
1. Denial of service. An adversary may be able to create the file
beforehand, consume all space/inodes, etc to sabotage us.
2. Impersonation. An adversary may be able to create a listen socket and
accept incoming connections that were meant for us.
3. Unauthenticated client access. An adversary may be able to connect to
us if we did not set the uid/gid and permissions correctly.
These can be prevented or mitigated with private /tmp, carefully setting
the umask, etc but that requires special action and does not apply to
all situations. Just avoid using /tmp in examples.
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210301172728.135331-3-stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The QMP monitor, NBD server, and vhost-user-blk export all support file
descriptor passing. This is a useful technique because it allows the
parent process to spawn and wait for qemu-storage-daemon without busy
waiting, which may delay startup due to arbitrary sleep() calls.
This Python example is inspired by the test case written for libnbd by
Richard W.M. Jones <rjones@redhat.com>:
89113f484e
Thanks to Daniel P. Berrangé <berrange@redhat.com> for suggestions on
how to get this working. Now let's document it!
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210301172728.135331-2-stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Daemons often have a --pidfile option where the pid is written to a file
so that scripts can stop the daemon by sending a signal.
The pid file also acts as a lock to prevent multiple instances of the
daemon from launching for a given pid file.
QEMU, qemu-nbd, qemu-ga, virtiofsd, and qemu-pr-helper all support the
--pidfile option. Add it to qemu-storage-daemon too.
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210302142746.170535-1-stefanha@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On Linux, the 'security.capability' xattr holds a set of
capabilities that can change when an executable is run, giving
a limited form of privilege escalation to those programs that
the writer of the file deemed worthy.
Any write causes the 'security.capability' xattr to be dropped,
stopping anyone from gaining privilege by modifying a blessed
file.
Fuse relies on the daemon to do this dropping, and in turn the
daemon relies on the host kernel to drop the xattr for it. However,
with the addition of -o xattrmap, the xattr that the guest
stores its capabilities in is now not the same as the one that
the host kernel automatically clears.
Where the mapping changes 'security.capability', explicitly clear
the remapped name to preserve the same behaviour.
This bug is assigned CVE-2021-20263.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
The preferred syntax is to use "foo=on|off", rather than a bare
"foo" or "nofoo".
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210216191027.595031-8-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This gives us better feature parity with QMP nbd-server-start, where
max-connections defaults to 0 for unlimited.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210209152759.209074-3-eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
When we first converted our documentation to Sphinx, we split it into
multiple manuals (system, interop, tools, etc), which are all built
separately. The primary driver for this was wanting to be able to
avoid shipping the 'devel' manual to end-users. However, this is
working against the grain of the way Sphinx wants to be used and
causes some annoyances:
* Cross-references between documents become much harder or
possibly impossible
* There is no single index to the whole documentation
* Within one manual there's no links or table-of-contents info
that lets you easily navigate to the others
* The devel manual doesn't get published on the QEMU website
(it would be nice to able to refer to it there)
Merely hiding our developer documentation from end users seems like
it's not enough benefit for these costs. Combine all the
documentation into a single manual (the same way that the readthedocs
site builds it) and install the whole thing. The previous manual
divisions remain as the new top level sections in the manual.
* The per-manual conf.py files are no longer needed
* The man_pages[] specifications previously in each per-manual
conf.py move to the top level conf.py
* docs/meson.build logic is simplified as we now only need to run
Sphinx once for the HTML and then once for the manpages5B
* The old index.html.in that produced the top-level page with
links to each manual is no longer needed
Unfortunately this means that we now have to build the HTML
documentation into docs/manual in the build tree rather than directly
into docs/; otherwise it is too awkward to ensure we install only the
built manual and not also the dependency info, stamp file, etc. The
manual still ends up in the same place in the final installed
directory, but anybody who was consulting documentation from within
the build tree will have to adjust where they're looking.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20210115154449.4801-1-peter.maydell@linaro.org
Document the qemu-storage-daemon tool. Most of the command-line options
are identical to their QEMU counterparts. Perhaps Sphinx hxtool
integration could be extended to extract documentation for individual
command-line options so they can be shared. For now the
qemu-storage-daemon simply refers to the qemu(1) man page where the
command-line options are identical.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20201209103802.350848-3-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Split the documentation of the qemu-pr-helper binary into the tools
manual, and give it a manpage like our other standalone executables.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Fix also a similar typo in a code comment.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20201117193448.393472-1-sw@weilnetz.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a link to the top of the sidebar in every docs page that takes the
user back to the source code in gitlab.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20201102130926.161183-5-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Allow the server to expose an additional metacontext to be requested
by savvy clients. qemu-nbd adds a new option -A to expose the
qemu:allocation-depth metacontext through NBD_CMD_BLOCK_STATUS; this
can also be set via QMP when using block-export-add.
qemu as client is hacked into viewing the key aspects of this new
context by abusing the already-experimental x-dirty-bitmap option to
collapse all depths greater than 2, which results in a tri-state value
visible in the output of 'qemu-img map --output=json' (yes, that means
x-dirty-bitmap is now a bit of a misnomer, but I didn't feel like
renaming it as it would introduce a needless break of back-compat,
even though we make no compat guarantees with x- members):
unallocated (depth 0) => "zero":false, "data":true
local (depth 1) => "zero":false, "data":false
backing (depth 2+) => "zero":true, "data":true
libnbd as client is probably a nicer way to get at the information
without having to decipher such hacks in qemu as client. ;)
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201027050556.269064-11-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
- qcow2: Skip copy-on-write when allocating a zero cluster
- qemu-img: add support for rate limit in qemu-img convert/commit
- Fix deadlock when deleting a block node during drain_all
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAl+YOT8RHGt3b2xmQHJl
ZGhhdC5jb20ACgkQfwmycsiPL9YBPw//eAhI/entLtuhhGNgOq4ULmRgL5qrA/jU
Cuel9/xoMsNSmh7hCa5ZSR39Wy4ZbqrTwnJZBSVGetg+v/6R1bvsDKpCKvAgFi14
bsZikgCHILB8oMWue80pTD+FF3SOoyv2C7uPjGM84EZNHRK5herW5qI2XXQWlV+G
PMW9ZgWyYLNoD+D23DZfTusmLVC1PkuVlDahhYj1mNHpLbEqYd9RQBkt+ONhbm1X
Ra6xcbRdL5jzLsb6DNRWUglO+mteNET58VVGdsB2Z4Qu0n+QQtnRl4SaatN0WsSY
5IEbHSOWC40NJQIhluxfcYHnSdPoNfTEyOC6wIb+BaFkZXV99sCYzZh1Gt+jRVM3
WEO47C1HCQ65c3aJcoVw0XC3Bm8MpfNU0DYa7O7hVEcmIA8de3N3yHDusHy552Pm
FMFThGjkuKaRnA59BLCkNBXBgsqXbAzP6TxroBFVcKCEmM89On8ejAJOP0uig6Fq
REqVnCny+0ESRPgw936IGwFHiL2MeRsbqam1TaHcvtgmUX3eMAOIW3xo85F1vziw
d8XxC7vVcDwL2xn6sCD/E2UVhFHQQnZAEs4OTsW0R9FkM2u0PAR4jiiPsKpm+VQb
/wJatDmAAQfVD8SJbsNDJEercWsUdtSEYrPq/rD0e2KaiEUQg6teyJxeVr2r1rq2
d5n4cwXpkYQ=
=ON5g
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- qcow2: Skip copy-on-write when allocating a zero cluster
- qemu-img: add support for rate limit in qemu-img convert/commit
- Fix deadlock when deleting a block node during drain_all
# gpg: Signature made Tue 27 Oct 2020 15:14:07 GMT
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
block: End quiescent sections when a BDS is deleted
qcow2: Skip copy-on-write when allocating a zero cluster
qcow2: Report BDRV_BLOCK_ZERO more accurately in bdrv_co_block_status()
qemu-img: add support for rate limit in qemu-img convert
qemu-img: add support for rate limit in qemu-img commit
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
add support for rate limit in qemu-img convert.
Signed-off-by: Zhengui <lizhengui@huawei.com>
Message-Id: <1603205264-17424-3-git-send-email-lizhengui@huawei.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
add support for rate limit in qemu-img commit.
Signed-off-by: Zhengui <lizhengui@huawei.com>
Message-Id: <1603205264-17424-2-git-send-email-lizhengui@huawei.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The mapping rule system implemented in the last few patches is
extremely flexible, but not easy to use. Add a simple
'map' type as a sprinkling of sugar to make it easy.
e.g.
-o xattrmap=":map::user.virtiofs.:"
would be sufficient to prefix all xattr's
or
-o xattrmap=":map:trusted.:user.virtiofs.:"
would just prefix 'trusted.' xattr's and leave
everything else alone.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20201023165812.36028-6-dgilbert@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Add a few examples of xattrmaps to the documentation.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20201023165812.36028-5-dgilbert@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Add an option to define mappings of xattr names so that
the client and server filesystems see different views.
This can be used to have different SELinux mappings as
seen by the guest, to run the virtiofsd with less privileges
(e.g. in a case where it can't set trusted/system/security
xattrs but you want the guest to be able to), or to isolate
multiple users of the same name; e.g. trusted attributes
used by stacking overlayfs.
A mapping engine is used with 3 simple rules; the rules can
be combined to allow most useful mapping scenarios.
The ruleset is defined by -o xattrmap='rules...'.
This patch doesn't use the rule maps yet.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20201023165812.36028-2-dgilbert@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
virtiofsd cannot run in a container because CAP_SYS_ADMIN is required to
create namespaces.
Introduce a weaker sandbox mode that is sufficient in container
environments because the container runtime already sets up namespaces.
Use chroot to restrict path traversal to the shared directory.
virtiofsd loses the following:
1. Mount namespace. The process chroots to the shared directory but
leaves the mounts in place. Seccomp rejects mount(2)/umount(2)
syscalls.
2. Pid namespace. This should be fine because virtiofsd is the only
process running in the container.
3. Network namespace. This should be fine because seccomp already
rejects the connect(2) syscall, but an additional layer of security
is lost. Container runtime-specific network security policies can be
used drop network traffic (except for the vhost-user UNIX domain
socket).
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20201008085534.16070-1-stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
If you like running QEMU as a normal user (very common for TCG runs)
but you have to run virtiofsd as a root user you run into connection
problems. Adding support for an optional --socket-group allows the
users to keep using the command line.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200925125147.26943-2-alex.bennee@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
dgilbert: Split long line
The virtiofsd --help output documents the cache=auto default value but
the man page does not. Fix this.
Signed-off-by: Harry G. Coin <hgcoin@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200916112250.760245-1-stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
I found that there are many spelling errors in the comments of qemu,
so I used the spellcheck tool to check the spelling errors
and finally found some spelling errors in the docs folder.
Signed-off-by: zhaolichang <zhaolichang@huawei.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20200917075029.313-4-zhaolichang@huawei.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Commit 93bb3d8d4c ("virtiofsd: remove symlink fallbacks") removed
the implementation of the "norace" option, so remove it from the
cmdline help and the documentation too.
Signed-off-by: Sergio Lopez <slp@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20200717121110.50580-1-slp@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Right now we enable remote posix locks by default. That means when guest
does a posix lock it sends request to server (virtiofsd). But currently
we only support non-blocking posix lock and return -EOPNOTSUPP for
blocking version.
This means that existing applications which are doing blocking posix
locks get -EOPNOTSUPP and fail. To avoid this, people have been
running virtiosd with option "-o no_posix_lock". For new users it
is still a surprise and trial and error takes them to this option.
Given posix lock implementation is not complete in virtiofsd, disable
it by default. This means that posix locks will work with-in applications
in a guest but not across guests. Anyway we don't support sharing
filesystem among different guests yet in virtiofs so this should
not lead to any kind of surprise or regression and will make life
little easier for virtiofs users.
Reported-by: Aa Aa <jimbothom@yandex.com>
Suggested-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The use of 'qemu-img amend' to change qcow2 backing files is not
tested very well. In particular, our implementation has a bug where
if a new backing file is provided without a format, then the prior
format is blindly reused, even if this results in data corruption, but
this is not caught by iotests.
There are also situations where amending other options needs access to
the original backing file (for example, on a downgrade to a v2 image,
knowing whether a v3 zero cluster must be allocated or may be left
unallocated depends on knowing whether the backing file already reads
as zero), but the command line does not have a nice way to tell us
both the backing file to use for opening the image as well as the
backing file to install after the operation is complete.
Even if we do allow changing the backing file, it is redundant with
the existing ability to change backing files via 'qemu-img rebase -u'.
It is time to deprecate this support (leaving the existing behavior
intact, even if it is buggy), and at a point in the future, require
the use of only 'qemu-img rebase' for adjusting backing chain
relations, saving 'qemu-img amend' for changes unrelated to the
backing chain.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200706203954.341758-8-eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
'force' option will be used for some unsafe amend operations.
This includes things like erasing last keyslot in luks based formats
(which destroys the data, unless the master key is backed up
by external means), but that _might_ be desired result.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200608094030.670121-4-mlevitsk@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Allow capabilities to be added or removed from the allowed set for the
daemon; e.g.
default:
CapPrm: 00000000880000df
CapEff: 00000000880000df
-o modcaps=+sys_admin
CapPrm: 00000000882000df
CapEff: 00000000882000df
-o modcaps=+sys_admin:-chown
CapPrm: 00000000882000de
CapEff: 00000000882000de
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200629115420.98443-4-dgilbert@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Prefer a consistent naming for the --merge argument.
Fixes: 3b51ab4bf
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200529144527.1943527-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>