Commit Graph

56 Commits

Author SHA1 Message Date
Nir Soffer
0961525705 qemu-nbd: Change default cache mode to writeback
Both qemu and qemu-img use writeback cache mode by default, which is
already documented in qemu(1). qemu-nbd uses writethrough cache mode by
default, and the default cache mode is not documented.

According to the qemu-nbd(8):

   --cache=CACHE
          The  cache  mode  to be used with the file.  See the
          documentation of the emulator's -drive cache=... option for
          allowed values.

qemu(1) says:

    The default mode is cache=writeback.

So users have no reason to assume that qemu-nbd is using writethough
cache mode. The only hint is the painfully slow writing when using the
defaults.

Looking in git history, it seems that qemu used writethrough in the past
to support broken guests that did not flush data properly, or could not
flush due to limitations in qemu. But qemu-nbd clients can use
NBD_CMD_FLUSH to flush data, so using writethrough does not help anyone.

Change the default cache mode to writback, and document the default and
available values properly in the online help and manual.

With this change converting image via qemu-nbd is 3.5 times faster.

    $ qemu-img create dst.img 50g
    $ qemu-nbd -t -f raw -k /tmp/nbd.sock dst.img

Before this change:

    $ hyperfine -r3 "./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock"
    Benchmark #1: ./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock
      Time (mean ± σ):     83.639 s ±  5.970 s    [User: 2.733 s, System: 6.112 s]
      Range (min … max):   76.749 s … 87.245 s    3 runs

After this change:

    $ hyperfine -r3 "./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock"
    Benchmark #1: ./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock
      Time (mean ± σ):     23.522 s ±  0.433 s    [User: 2.083 s, System: 5.475 s]
      Range (min … max):   23.234 s … 24.019 s    3 runs

Users can avoid the issue by using --cache=writeback[1] but the defaults
should give good performance for the common use case.

[1] https://bugzilla.redhat.com/1990656

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Message-Id: <20210813205519.50518-1-nsoffer@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-09-29 13:46:31 -05:00
Eric Blake
1899bf4737 qemu-img: Add -F shorthand to convert
Although we have long supported 'qemu-img convert -o
backing_file=foo,backing_fmt=bar', the fact that we have a shortcut -B
for backing_file but none for backing_fmt has made it more likely that
users accidentally run into:

qemu-img: warning: Deprecated use of backing file without explicit backing format

when using -B instead of -o.  For similarity with other qemu-img
commands, such as create and compare, add '-F $fmt' as the shorthand
for '-o backing_fmt=$fmt'.  Update iotest 122 for coverage of both
spellings.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210913131735.1948339-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2021-09-15 18:42:38 +02:00
Paolo Bonzini
06905f6402 docs: standardize directory index to --- with overline
Use a standard heading format for the index.rst file in a directory.
Using overlines makes it clear that individual documents can use e.g.
=== for chapter titles and --- for section titles, as suggested in the
Linux kernel guidelines[1].  They could do it anyway, because documents
included in a toctree are parsed separately and therefore are not tied
to the same conventions for headings.  However, keeping some consistency is
useful since sometimes files are included from multiple places.

[1] https://www.kernel.org/doc/html/latest/doc-guide/sphinx.html

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-13 13:56:26 +02:00
Paolo Bonzini
8a1f7d299c docs: standardize book titles to === with overline
Documents within a Sphinx manual are separate files and therefore can use
different conventions for headings.  However, keeping some consistency is
useful so that included files are easy to get right.

This patch uses a standard heading format for book titles, so that it is
obvious when a file sits at the top level toctree of a book or man page.
The heading is irrelevant for man pages, but keep it consistent as well.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-13 13:56:26 +02:00
Peter Maydell
4d6646c7de docs/tools/virtiofsd.rst: Delete stray backtick
The documentation of the posix_acl option has a stray backtick
at the end of the text (which is rendered literally into the HTML).
Delete it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 20210726142338.31872-11-peter.maydell@linaro.org
2021-08-02 12:55:51 +01:00
Eric Blake
955171e441 qemu-img: Add --skip-broken-bitmaps for 'convert --bitmaps'
The point of 'qemu-img convert --bitmaps' is to be a convenience for
actions that are already possible through a string of smaller
'qemu-img bitmap' sub-commands.  One situation not accounted for
already is that if a source image contains an inconsistent bitmap (for
example, because a qemu process died abruptly before flushing bitmap
state), the user MUST delete those inconsistent bitmaps before
anything else useful can be done with the image.

We don't want to delete inconsistent bitmaps by default: although a
corrupt bitmap is only a loss of optimization rather than a corruption
of user-visible data, it is still nice to require the user to opt in
to the fact that they are aware of the loss of the bitmap.  Still,
requiring the user to check 'qemu-img info' to see whether bitmaps are
consistent, then use 'qemu-img bitmap --remove' to remove offenders,
all before using 'qemu-img convert', is a lot more work than just
adding a knob 'qemu-img convert --bitmaps --skip-broken-bitmaps' which
opts in to skipping the broken bitmaps.

After testing the new option, also demonstrate the way to manually fix
things (either deleting bad bitmaps, or re-creating them as empty) so
that it is possible to convert without the option.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1946084
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210709153951.2801666-4-eblake@redhat.com>
[eblake: warning message tweak, test enhancements]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2021-07-21 14:14:41 -05:00
Peter Maydell
21b6c26d63 docs: Remove "Contents:" lines from top-level subsections
Since the top-level subsections aren't self-contained manuals
any more, the "Contents:" lines at the top of each of their
index pages look a bit odd; remove them.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20210705095547.15790-4-peter.maydell@linaro.org
2021-07-18 10:59:46 +01:00
Peter Maydell
b4634487c4 docs: Stop calling the top level subsections of our manual 'manuals'
We merged our previous multiple-manual setup into a single Sphinx
manual, but we left some text in the various index.rst lines that
still calls the top level subsections separate 'manuals'.  Update
them to talk about "this section of the manual" instead, and remove
now-obsolete comments about how the index.rst files are the "top
level page for the 'foo' manual".

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20210705095547.15790-3-peter.maydell@linaro.org
2021-07-18 10:59:46 +01:00
Eric Blake
a275b452c6 qemu-img: Reword 'qemu-img map --output=json' docs
Reword the paragraphs to list the JSON key first, rather than in the
middle of prose.

Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210707184125.2551140-1-eblake@redhat.com>
Reviewed-by: Nir Soffer <nsoffer@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2021-07-12 11:11:48 -05:00
Eric Blake
8417e1378c qemu-img: Make unallocated part of backing chain obvious in map
The recently-added NBD context qemu:allocation-depth is able to
distinguish between locally-present data (even when that data is
sparse) [shown as depth 1 over NBD], and data that could not be found
anywhere in the backing chain [shown as depth 0]; and the libnbd
project was recently patched to give the human-readable name "absent"
to an allocation-depth of 0.  But qemu-img map --output=json predates
that addition, and has the unfortunate behavior that all portions of
the backing chain that resolve without finding a hit in any backing
layer report the same depth as the final backing layer.  This makes it
harder to reconstruct a qcow2 backing chain using just 'qemu-img map'
output, especially when using "backing":null to artificially limit a
backing chain, because it is impossible to distinguish between a
QCOW2_CLUSTER_UNALLOCATED (which defers to a [missing] backing file)
and a QCOW2_CLUSTER_ZERO_PLAIN cluster (which would override any
backing file), since both types of clusters otherwise show as
"data":false,"zero":true" (but note that we can distinguish a
QCOW2_CLUSTER_ZERO_ALLOCATED, which would also have an "offset":
listing).

The task of reconstructing a qcow2 chain was made harder in commit
0da9856851 (nbd: server: Report holes for raw images), because prior
to that point, it was possible to abuse NBD's block status command to
see which portions of a qcow2 file resulted in BDRV_BLOCK_ALLOCATED
(showing up as NBD_STATE_ZERO in isolation) vs. missing from the chain
(showing up as NBD_STATE_ZERO|NBD_STATE_HOLE); but now qemu reports
more accurate sparseness information over NBD.

An obvious solution is to make 'qemu-img map --output=json' add an
additional "present":false designation to any cluster lacking an
allocation anywhere in the chain, without any change to the "depth"
parameter to avoid breaking existing clients.  The iotests have
several examples where this distinction demonstrates the additional
accuracy.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210701190655.2131223-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: fix more iotest fallout]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-07-12 11:10:53 -05:00
Vivek Goyal
65a820d292 virtiofsd: Add an option to enable/disable posix acls
fuse has an option FUSE_POSIX_ACL which needs to be opted in by fuse
server to enable posix acls. As of now we are not opting in for this,
so posix acls are disabled on virtiofs by default.

Add virtiofsd option "-o posix_acl/no_posix_acl" to let users enable/disable
posix acl support. By default it is disabled as of now due to performance
concerns with cache=none.

Currently even if file server has not opted in for FUSE_POSIX_ACL, user can
still query acl and set acl, and system.posix_acl_access and
system.posix_acl_default xattrs show up listxattr response.

Miklos said this is confusing. So he said lets block and filter
system.posix_acl_access and system.posix_acl_default xattrs in
getxattr/setxattr/listxattr if user has explicitly disabled
posix acls using -o no_posix_acl.

As of now continuing to keeping the existing behavior if user did not
specify any option to disable acl support due to concerns about backward
compatibility.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Message-Id: <20210622150852.1507204-8-vgoyal@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Daniel P. Berrangé
3399bca451 docs: describe the security considerations with virtiofsd xattr mapping
Different guest xattr prefixes have distinct access control rules applied
by the guest. When remapping a guest xattr care must be taken that the
remapping does not allow the a guest user to bypass guest kernel access
control rules.

For example if 'trusted.*' which requires CAP_SYS_ADMIN is remapped
to 'user.virtiofs.trusted.*', an unprivileged guest user which can
write to 'user.*' can bypass the CAP_SYS_ADMIN control. Thus the
target of any remapping must be explicitly blocked from read/writes
by the guest, to prevent access control bypass.

The examples shown in the virtiofsd man page already do the right
thing and ensure safety, but the security implications of getting
this wrong were not made explicit. This could lead to host admins
and apps unwittingly creating insecure configurations.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210611120427.49736-1-berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-05 10:51:26 +01:00
Thomas Huth
af94f14046 docs/tools/virtiofsd: Fix bad rst syntax
For literal blocks, there has to be an empty line after the two colons,
and the block itself should be indented.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210607180015.924571-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-06-21 05:43:11 +02:00
Thomas Huth
771f3be1b5 docs/tools/virtiofsd.rst: Do not hard-code the QEMU binary name
In downstream, we want to use a different name for the QEMU binary,
and some people might also use the docs for non-x86 binaries, that's
why we already created the |qemu_system| placeholder in the past.
Use it now in the virtiofsd doc, too.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210607174250.920226-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-06-21 05:43:11 +02:00
Peter Maydell
32de74a1ac Block patches:
- drop block/io write notifiers
 - qemu-iotests enhancements to make debugging easier
 - rbd parsing fix
 - HMP qemu-io fix (for iothreads)
 - mirror job cancel relaxation (do not cancel in-flight requests when a
   READY mirror job is canceled with force=false)
 - document qcow2's data_file and data_file_raw features
 - fix iotest 297 for pylint 2.8
 - block/copy-on-read refactoring
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEkb62CjDbPohX0Rgp9AfbAGHVz0AFAmCeqLwSHG1yZWl0ekBy
 ZWRoYXQuY29tAAoJEPQH2wBh1c9A9pMIAKYIlLQfSSMdy0fZ+6AHiAjaTZAaDr4G
 d6NDz/RONZEoxcl01LkUWJfvqH/IdCLx5q4cl9SU4+JzMdKW9K1xBLdAGousuhk/
 geYqymbORj/VntJDYwp30KUlC0pLUBbuuzYN+QXrLp5qJvS9nPBcxEPjfSc6GX9z
 Bt+GCRW08+C4WKJ3lGu9zNGe47gTFUE/VodUYG4tKg5xZFzsAWd/PZZaVOdW0fCz
 /0tdxN4N82XT+cE/lA0Tm6B6L3ZueMAt8byu4BPz21M7kULNn2roVMiFKJELZlZQ
 0RyDXH2jb/aH/ha6gJ4S+JhMvq45rH9GuQeAYl6IPngbta+NbZW+U4w=
 =+Kha
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2021-05-14' into staging

Block patches:
- drop block/io write notifiers
- qemu-iotests enhancements to make debugging easier
- rbd parsing fix
- HMP qemu-io fix (for iothreads)
- mirror job cancel relaxation (do not cancel in-flight requests when a
  READY mirror job is canceled with force=false)
- document qcow2's data_file and data_file_raw features
- fix iotest 297 for pylint 2.8
- block/copy-on-read refactoring

# gpg: Signature made Fri 14 May 2021 17:43:40 BST
# gpg:                using RSA key 91BEB60A30DB3E8857D11829F407DB0061D5CF40
# gpg:                issuer "mreitz@redhat.com"
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>" [full]
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* remotes/maxreitz/tags/pull-block-2021-05-14:
  write-threshold: deal with includes
  test-write-threshold: drop extra TestStruct structure
  test-write-threshold: drop extra tests
  block/write-threshold: drop extra APIs
  test-write-threshold: rewrite test_threshold_(not_)trigger tests
  block: drop write notifiers
  block/write-threshold: don't use write notifiers
  qemu-iotests: fix pylint 2.8 consider-using-with error
  block/copy-on-read: use bdrv_drop_filter() and drop s->active
  Document qemu-img options data_file and data_file_raw
  qemu-iotests: fix case of SOCK_DIR already in the environment
  qemu-iotests: let "check" spawn an arbitrary test command
  qemu-iotests: move command line and environment handling from TestRunner to TestEnv
  qemu-iotests: allow passing unittest.main arguments to the test scripts
  qemu-iotests: do not buffer the test output
  mirror: stop cancelling in-flight requests on non-force cancel in READY
  monitor: hmp_qemu_io: acquire aio contex, fix crash
  block/rbd: Add an escape-aware strchr helper
  iotests/231: Update expected deprecation message

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-17 11:29:59 +01:00
Connor Kuehl
d65173f924 Document qemu-img options data_file and data_file_raw
The contents of this patch were initially developed and posted by Han
Han[1], however, it appears the original patch was not applied. Since
then, the relevant documentation has been moved and adapted to a new
format.

I've taken most of the original wording and tweaked it according to
some of the feedback from the original patch submission. I've also
adapted it to restructured text, which is the format the documentation
currently uses.

[1] https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01253.html

Fixes: https://bugzilla.redhat.com/1763105
Signed-off-by: Han Han <hhan@redhat.com>
Suggested-by: Max Reitz <mreitz@redhat.com>
[ Max: provided description of data_file_raw behavior ]
Signed-off-by: Connor Kuehl <ckuehl@redhat.com>
Message-Id: <20210505195512.391128-1-ckuehl@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2021-05-14 16:14:10 +02:00
Marc-André Lureau
73e6aec652 sphinx: adopt kernel readthedoc theme
The default "alabaster" sphinx theme has a couple shortcomings:
- the navbar moves along the page
- the search bar is not always at the same place
- it lacks some contrast and colours

The "rtd" theme from readthedocs.org is a popular third party theme used
notably by the kernel, with a custom style sheet. I like it better,
perhaps others do too. It also simplifies the "Edit on Gitlab" links.

Tweak a bit the custom theme to match qemu.org style, use the
QEMU logo, and favicon etc.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210323115328.4146052-1-marcandre.lureau@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
2021-05-14 15:05:03 +04:00
Marc-André Lureau
816f93b200 docs: simplify each section title
Now that we merged into one doc, it makes the nav looks nicer.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210323074704.4078381-1-marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
2021-04-01 15:27:44 +04:00
Max Reitz
220222a0fe qsd: Document FUSE exports
Implementing FUSE exports required no changes to the storage daemon, so
we forgot to document them there.  Considering that both NBD and
vhost-user-blk exports are documented in its man page (and NBD exports
in its --help text), we should probably do the same for FUSE.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210217115844.62661-1-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2021-03-29 18:28:33 +02:00
Kevin Wolf
99b1e64688 qemu-img: Use user_creatable_process_cmdline() for --object
This switches qemu-img from a QemuOpts-based parser for --object to
user_creatable_process_cmdline() which uses a keyval parser and enforces
the QAPI schema.

Apart from being a cleanup, this makes non-scalar properties accessible.

As a side effect, fix wrong exit codes in the object parsing error path
of 'qemu-img compare'. This was broken in commit 334c43e2c3 because
&error_fatal exits with an exit code of 1, while it should have been 2.

Document that exit code 0 is also returned when just requested help was
printed instead of comparing images. This is preexisting behaviour that
isn't changed by this patch, though another instance of it is added with
'--object help'.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2021-03-19 10:17:14 +01:00
Kevin Wolf
ef809f709d docs: qsd: Explain --export nbd,name=... default
The 'name' option for NBD exports is optional. Add a note that the
default for the option is the node name (people could otherwise expect
that it's the empty string like for qemu-nbd).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210305094856.18964-1-kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-08 14:56:55 +01:00
Stefan Hajnoczi
e246bf3ddc docs: replace insecure /tmp examples in qsd docs
World-writeable directories have security issues. Avoid showing them in
the documentation since someone might accidentally use them in
situations where they are insecure.

There tend to be 3 security problems:
1. Denial of service. An adversary may be able to create the file
   beforehand, consume all space/inodes, etc to sabotage us.
2. Impersonation. An adversary may be able to create a listen socket and
   accept incoming connections that were meant for us.
3. Unauthenticated client access. An adversary may be able to connect to
   us if we did not set the uid/gid and permissions correctly.

These can be prevented or mitigated with private /tmp, carefully setting
the umask, etc but that requires special action and does not apply to
all situations. Just avoid using /tmp in examples.

Reported-by: Richard W.M. Jones <rjones@redhat.com>
Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210301172728.135331-3-stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-08 14:55:19 +01:00
Stefan Hajnoczi
3f14b909eb docs: show how to spawn qemu-storage-daemon with fd passing
The QMP monitor, NBD server, and vhost-user-blk export all support file
descriptor passing. This is a useful technique because it allows the
parent process to spawn and wait for qemu-storage-daemon without busy
waiting, which may delay startup due to arbitrary sleep() calls.

This Python example is inspired by the test case written for libnbd by
Richard W.M. Jones <rjones@redhat.com>:
89113f484e

Thanks to Daniel P. Berrangé <berrange@redhat.com> for suggestions on
how to get this working. Now let's document it!

Reported-by: Richard W.M. Jones <rjones@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210301172728.135331-2-stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-08 14:55:19 +01:00
Stefan Hajnoczi
03d2b412aa qemu-storage-daemon: add --pidfile option
Daemons often have a --pidfile option where the pid is written to a file
so that scripts can stop the daemon by sending a signal.

The pid file also acts as a lock to prevent multiple instances of the
daemon from launching for a given pid file.

QEMU, qemu-nbd, qemu-ga, virtiofsd, and qemu-pr-helper all support the
--pidfile option. Add it to qemu-storage-daemon too.

Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210302142746.170535-1-stefanha@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-08 14:55:18 +01:00
Dr. David Alan Gilbert
e586edcb41 virtiofs: drop remapped security.capability xattr as needed
On Linux, the 'security.capability' xattr holds a set of
capabilities that can change when an executable is run, giving
a limited form of privilege escalation to those programs that
the writer of the file deemed worthy.

Any write causes the 'security.capability' xattr to be dropped,
stopping anyone from gaining privilege by modifying a blessed
file.

Fuse relies on the daemon to do this dropping, and in turn the
daemon relies on the host kernel to drop the xattr for it.  However,
with the addition of -o xattrmap, the xattr that the guest
stores its capabilities in is now not the same as the one that
the host kernel automatically clears.

Where the mapping changes 'security.capability', explicitly clear
the remapped name to preserve the same behaviour.

This bug is assigned CVE-2021-20263.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
2021-03-04 10:26:16 +00:00
Daniel P. Berrangé
c23874132b docs: update to show preferred boolean syntax for -chardev
The preferred syntax is to use "foo=on|off", rather than a bare
"foo" or "nofoo".

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210216191027.595031-8-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-25 14:14:33 +01:00
Eric Blake
3dcf56e625 qemu-nbd: Permit --shared=0 for unlimited clients
This gives us better feature parity with QMP nbd-server-start, where
max-connections defaults to 0 for unlimited.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210209152759.209074-3-eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-12 07:42:08 -06:00
Peter Maydell
b93f4fbdc4 docs: Build and install all the docs in a single manual
When we first converted our documentation to Sphinx, we split it into
multiple manuals (system, interop, tools, etc), which are all built
separately.  The primary driver for this was wanting to be able to
avoid shipping the 'devel' manual to end-users.  However, this is
working against the grain of the way Sphinx wants to be used and
causes some annoyances:
 * Cross-references between documents become much harder or
   possibly impossible
 * There is no single index to the whole documentation
 * Within one manual there's no links or table-of-contents info
   that lets you easily navigate to the others
 * The devel manual doesn't get published on the QEMU website
   (it would be nice to able to refer to it there)

Merely hiding our developer documentation from end users seems like
it's not enough benefit for these costs.  Combine all the
documentation into a single manual (the same way that the readthedocs
site builds it) and install the whole thing.  The previous manual
divisions remain as the new top level sections in the manual.

 * The per-manual conf.py files are no longer needed
 * The man_pages[] specifications previously in each per-manual
   conf.py move to the top level conf.py
 * docs/meson.build logic is simplified as we now only need to run
   Sphinx once for the HTML and then once for the manpages5B
 * The old index.html.in that produced the top-level page with
   links to each manual is no longer needed

Unfortunately this means that we now have to build the HTML
documentation into docs/manual in the build tree rather than directly
into docs/; otherwise it is too awkward to ensure we install only the
built manual and not also the dependency info, stamp file, etc.  The
manual still ends up in the same place in the final installed
directory, but anybody who was consulting documentation from within
the build tree will have to adjust where they're looking.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20210115154449.4801-1-peter.maydell@linaro.org
2021-01-19 15:45:14 +00:00
Stefan Hajnoczi
1982e1602d docs: add qemu-storage-daemon(1) man page
Document the qemu-storage-daemon tool. Most of the command-line options
are identical to their QEMU counterparts. Perhaps Sphinx hxtool
integration could be extended to extract documentation for individual
command-line options so they can be shared. For now the
qemu-storage-daemon simply refers to the qemu(1) man page where the
command-line options are identical.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20201209103802.350848-3-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-12-18 11:48:39 +01:00
Peter Maydell
773ee3f1ea docs: Split qemu-pr-helper documentation into tools manual
Split the documentation of the qemu-pr-helper binary into the tools
manual, and give it a manpage like our other standalone executables.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2020-11-23 11:10:04 +00:00
Stefan Weil
ac9574bc87 docs: Fix some typos (found by codespell)
Fix also a similar typo in a code comment.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20201117193448.393472-1-sw@weilnetz.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-18 09:29:41 +01:00
Daniel P. Berrangé
704a256da8 docs: add "page source" link to sphinx documentation
Add a link to the top of the sidebar in every docs page that takes the
user back to the source code in gitlab.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20201102130926.161183-5-berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2020-11-10 08:51:30 +01:00
Eric Blake
dbc7b01492 nbd: Add 'qemu-nbd -A' to expose allocation depth
Allow the server to expose an additional metacontext to be requested
by savvy clients.  qemu-nbd adds a new option -A to expose the
qemu:allocation-depth metacontext through NBD_CMD_BLOCK_STATUS; this
can also be set via QMP when using block-export-add.

qemu as client is hacked into viewing the key aspects of this new
context by abusing the already-experimental x-dirty-bitmap option to
collapse all depths greater than 2, which results in a tri-state value
visible in the output of 'qemu-img map --output=json' (yes, that means
x-dirty-bitmap is now a bit of a misnomer, but I didn't feel like
renaming it as it would introduce a needless break of back-compat,
even though we make no compat guarantees with x- members):

unallocated (depth 0) => "zero":false, "data":true
local (depth 1)       => "zero":false, "data":false
backing (depth 2+)    => "zero":true,  "data":true

libnbd as client is probably a nicer way to get at the information
without having to decipher such hacks in qemu as client. ;)

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201027050556.269064-11-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2020-10-30 15:22:00 -05:00
Peter Maydell
c99fa56b95 Block layer patches:
- qcow2: Skip copy-on-write when allocating a zero cluster
 - qemu-img: add support for rate limit in qemu-img convert/commit
 - Fix deadlock when deleting a block node during drain_all
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAl+YOT8RHGt3b2xmQHJl
 ZGhhdC5jb20ACgkQfwmycsiPL9YBPw//eAhI/entLtuhhGNgOq4ULmRgL5qrA/jU
 Cuel9/xoMsNSmh7hCa5ZSR39Wy4ZbqrTwnJZBSVGetg+v/6R1bvsDKpCKvAgFi14
 bsZikgCHILB8oMWue80pTD+FF3SOoyv2C7uPjGM84EZNHRK5herW5qI2XXQWlV+G
 PMW9ZgWyYLNoD+D23DZfTusmLVC1PkuVlDahhYj1mNHpLbEqYd9RQBkt+ONhbm1X
 Ra6xcbRdL5jzLsb6DNRWUglO+mteNET58VVGdsB2Z4Qu0n+QQtnRl4SaatN0WsSY
 5IEbHSOWC40NJQIhluxfcYHnSdPoNfTEyOC6wIb+BaFkZXV99sCYzZh1Gt+jRVM3
 WEO47C1HCQ65c3aJcoVw0XC3Bm8MpfNU0DYa7O7hVEcmIA8de3N3yHDusHy552Pm
 FMFThGjkuKaRnA59BLCkNBXBgsqXbAzP6TxroBFVcKCEmM89On8ejAJOP0uig6Fq
 REqVnCny+0ESRPgw936IGwFHiL2MeRsbqam1TaHcvtgmUX3eMAOIW3xo85F1vziw
 d8XxC7vVcDwL2xn6sCD/E2UVhFHQQnZAEs4OTsW0R9FkM2u0PAR4jiiPsKpm+VQb
 /wJatDmAAQfVD8SJbsNDJEercWsUdtSEYrPq/rD0e2KaiEUQg6teyJxeVr2r1rq2
 d5n4cwXpkYQ=
 =ON5g
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches:

- qcow2: Skip copy-on-write when allocating a zero cluster
- qemu-img: add support for rate limit in qemu-img convert/commit
- Fix deadlock when deleting a block node during drain_all

# gpg: Signature made Tue 27 Oct 2020 15:14:07 GMT
# gpg:                using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg:                issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  block: End quiescent sections when a BDS is deleted
  qcow2: Skip copy-on-write when allocating a zero cluster
  qcow2: Report BDRV_BLOCK_ZERO more accurately in bdrv_co_block_status()
  qemu-img: add support for rate limit in qemu-img convert
  qemu-img: add support for rate limit in qemu-img commit

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-10-30 14:36:52 +00:00
Zhengui
0c8c4895a6 qemu-img: add support for rate limit in qemu-img convert
add support for rate limit in qemu-img convert.

Signed-off-by: Zhengui <lizhengui@huawei.com>
Message-Id: <1603205264-17424-3-git-send-email-lizhengui@huawei.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-10-27 15:26:20 +01:00
Zhengui
a0441b66e8 qemu-img: add support for rate limit in qemu-img commit
add support for rate limit in qemu-img commit.

Signed-off-by: Zhengui <lizhengui@huawei.com>
Message-Id: <1603205264-17424-2-git-send-email-lizhengui@huawei.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-10-27 15:26:20 +01:00
Dr. David Alan Gilbert
1d84a0213a tools/virtiofsd: xattr name mappings: Simple 'map'
The mapping rule system implemented in the last few patches is
extremely flexible, but not easy to use.  Add a simple
'map' type as a sprinkling of sugar to make it easy.

e.g.

  -o xattrmap=":map::user.virtiofs.:"

would be sufficient to prefix all xattr's
or

  -o xattrmap=":map:trusted.:user.virtiofs.:"

would just prefix 'trusted.' xattr's and leave
everything else alone.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20201023165812.36028-6-dgilbert@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-10-26 18:35:32 +00:00
Dr. David Alan Gilbert
491bfaea3b tools/virtiofsd: xattr name mapping examples
Add a few examples of xattrmaps to the documentation.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20201023165812.36028-5-dgilbert@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-10-26 18:35:32 +00:00
Dr. David Alan Gilbert
6084633dff tools/virtiofsd: xattr name mappings: Add option
Add an option to define mappings of xattr names so that
the client and server filesystems see different views.
This can be used to have different SELinux mappings as
seen by the guest, to run the virtiofsd with less privileges
(e.g. in a case where it can't set trusted/system/security
xattrs but you want the guest to be able to), or to isolate
multiple users of the same name; e.g. trusted attributes
used by stacking overlayfs.

A mapping engine is used with 3 simple rules; the rules can
be combined to allow most useful mapping scenarios.
The ruleset is defined by -o xattrmap='rules...'.

This patch doesn't use the rule maps yet.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20201023165812.36028-2-dgilbert@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-10-26 18:35:32 +00:00
Stefan Hajnoczi
06844584b6 virtiofsd: add container-friendly -o sandbox=chroot option
virtiofsd cannot run in a container because CAP_SYS_ADMIN is required to
create namespaces.

Introduce a weaker sandbox mode that is sufficient in container
environments because the container runtime already sets up namespaces.
Use chroot to restrict path traversal to the shared directory.

virtiofsd loses the following:

1. Mount namespace. The process chroots to the shared directory but
   leaves the mounts in place. Seccomp rejects mount(2)/umount(2)
   syscalls.

2. Pid namespace. This should be fine because virtiofsd is the only
   process running in the container.

3. Network namespace. This should be fine because seccomp already
   rejects the connect(2) syscall, but an additional layer of security
   is lost. Container runtime-specific network security policies can be
   used drop network traffic (except for the vhost-user UNIX domain
   socket).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20201008085534.16070-1-stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-10-26 18:35:32 +00:00
Alex Bennée
f6698f2b03 tools/virtiofsd: add support for --socket-group
If you like running QEMU as a normal user (very common for TCG runs)
but you have to run virtiofsd as a root user you run into connection
problems. Adding support for an optional --socket-group allows the
users to keep using the command line.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

Message-Id: <20200925125147.26943-2-alex.bennee@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  dgilbert: Split long line
2020-10-12 12:39:38 +01:00
Harry G. Coin
f1303afe22 virtiofsd: document cache=auto default
The virtiofsd --help output documents the cache=auto default value but
the man page does not. Fix this.

Signed-off-by: Harry G. Coin <hgcoin@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200916112250.760245-1-stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-09-25 12:45:58 +01:00
zhaolichang
76ca4b58c2 docs/: fix some comment spelling errors
I found that there are many spelling errors in the comments of qemu,
so I used the spellcheck tool to check the spelling errors
and finally found some spelling errors in the docs folder.

Signed-off-by: zhaolichang <zhaolichang@huawei.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20200917075029.313-4-zhaolichang@huawei.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-09-17 20:37:13 +02:00
Sergio Lopez
e9a78564a1 virtiofsd: Remove "norace" from cmdline help and docs
Commit 93bb3d8d4c ("virtiofsd: remove symlink fallbacks") removed
the implementation of the "norace" option, so remove it from the
cmdline help and the documentation too.

Signed-off-by: Sergio Lopez <slp@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20200717121110.50580-1-slp@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-08-28 13:34:52 +01:00
Vivek Goyal
88fc107956 virtiofsd: Disable remote posix locks by default
Right now we enable remote posix locks by default. That means when guest
does a posix lock it sends request to server (virtiofsd). But currently
we only support non-blocking posix lock and return -EOPNOTSUPP for
blocking version.

This means that existing applications which are doing blocking posix
locks get -EOPNOTSUPP and fail. To avoid this, people have been
running virtiosd with option "-o no_posix_lock". For new users it
is still a surprise and trial and error takes them to this option.

Given posix lock implementation is not complete in virtiofsd, disable
it by default. This means that posix locks will work with-in applications
in a guest but not across guests. Anyway we don't support sharing
filesystem among different guests yet in virtiofs so this should
not lead to any kind of surprise or regression and will make life
little easier for virtiofs users.

Reported-by: Aa Aa <jimbothom@yandex.com>
Suggested-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-08-28 13:34:52 +01:00
Eric Blake
bc5ee6da71 qcow2: Deprecate use of qemu-img amend to change backing file
The use of 'qemu-img amend' to change qcow2 backing files is not
tested very well.  In particular, our implementation has a bug where
if a new backing file is provided without a format, then the prior
format is blindly reused, even if this results in data corruption, but
this is not caught by iotests.

There are also situations where amending other options needs access to
the original backing file (for example, on a downgrade to a v2 image,
knowing whether a v3 zero cluster must be allocated or may be left
unallocated depends on knowing whether the backing file already reads
as zero), but the command line does not have a nice way to tell us
both the backing file to use for opening the image as well as the
backing file to install after the operation is complete.

Even if we do allow changing the backing file, it is redundant with
the existing ability to change backing files via 'qemu-img rebase -u'.
It is time to deprecate this support (leaving the existing behavior
intact, even if it is buggy), and at a point in the future, require
the use of only 'qemu-img rebase' for adjusting backing chain
relations, saving 'qemu-img amend' for changes unrelated to the
backing chain.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200706203954.341758-8-eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2020-07-14 15:18:59 +02:00
Maxim Levitsky
a3579bfa0a block/amend: add 'force' option
'force' option will be used for some unsafe amend operations.

This includes things like erasing last keyslot in luks based formats
(which destroys the data, unless the master key is backed up
by external means), but that _might_ be desired result.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200608094030.670121-4-mlevitsk@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2020-07-06 08:49:28 +02:00
Dr. David Alan Gilbert
3005c099ef virtiofsd: Allow addition or removal of capabilities
Allow capabilities to be added or removed from the allowed set for the
daemon; e.g.

default:
CapPrm: 00000000880000df
CapEff: 00000000880000df

-o modcaps=+sys_admin

CapPrm: 00000000882000df
CapEff: 00000000882000df

-o modcaps=+sys_admin:-chown

CapPrm: 00000000882000de
CapEff: 00000000882000de

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200629115420.98443-4-dgilbert@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2020-07-03 16:23:05 +01:00
Eric Blake
1d74594065 qemu-img: Fix doc typo for 'bitmap' subcommand
Prefer a consistent naming for the --merge argument.

Fixes: 3b51ab4bf
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200529144527.1943527-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2020-06-09 15:47:09 -05:00
Eric Blake
15e39ad950 qemu-img: Add convert --bitmaps option
Make it easier to copy all the persistent bitmaps of (the top layer
of) a source image along with its guest-visible contents, by adding a
boolean flag for use with qemu-img convert.  This is basically
shorthand, as the same effect could be accomplished with a series of
'qemu-img bitmap --add' and 'qemu-img bitmap --merge -b source'
commands, or by their corresponding QMP commands.

Note that this command will fail in the same scenarios where 'qemu-img
measure' omits a 'bitmaps size:' line, namely, when either the source
or the destination lacks persistent bitmap support altogether.

See also https://bugzilla.redhat.com/show_bug.cgi?id=1779893

While touching this, clean up a couple coding issues spotted in the
same function: an extra blank line, and merging back-to-back 'if
(!skip_create)' blocks.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200521192137.1120211-5-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2020-05-28 13:16:30 -05:00