mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Vladimir Sementsov-Ogievskiy	6c5f7b3a10	block: add forgotten bdrv_abort_perm_update() to bdrv_co_invalidate_cache() Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201106124241.16950-2-vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-11-09 18:43:31 +01:00
Eric Blake	122860bae7	block: Fix integer promotion error in bdrv_getlength() Back in 2015, we attempted to fix error reporting for images that claimed to have more than INT64_MAX/512 sectors, but due to the type promotions caused by BDRV_SECTOR_SIZE being unsigned, this inadvertently forces all negative ret values to be slammed into -EFBIG rather than the original error. While we're at it, we can avoid the confusing ?: by spelling the logic more directly. Fixes: `4a9c9ea0d3` Reported-by: Guoyi Tu <tu.guoyi@h3c.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20201105155122.60943-1-eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-11-09 15:44:21 +01:00
Eric Blake	9812e7125b	qapi: Add QAPI_LIST_PREPEND() macro block.c has a useful macro QAPI_LIST_ADD() for inserting at the front of any QAPI-generated list; move it from block.c to qapi/util.h so more places can use it, including one earlier place in block.c, and rename it to something more obvious (since we also have a lot of places that append, rather than prepend, to a list). There are many more places in the codebase that can benefit from using the macro, but converting them will be left to later patches. In theory, all QAPI list types are child classes of GenericList; but in practice, that relationship is not explicitly spelled out in the C type declarations (rather, it is something that happens implicitly due to C compatible layouts), and the macro does not actually depend on the GenericList type. We considered moving GenericList from visitor.h into util.h to group related code; however, such a move would be awkward if we do not also move GenericAlternate. Unfortunately, moving GenericAlternate would introduce its own problems of declaration circularity (qapi-builtin-types.h needs a complete definition of QEnumLookup from util.h, but GenericAlternate needs a complete definition of QType from qapi-builtin-types.h). Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20201027050556.269064-3-eblake@redhat.com> [eblake: s/ADD/PREPEND/ per suggestion by Markus]	2020-10-30 15:10:14 -05:00
Eric Blake	159f844238	block: Simplify QAPI_LIST_ADD There is no need to rely on the verbosity of the gcc/clang compiler extension of g_new(typeof(X), 1) when we can instead use the standard g_malloc(sizeof(X)). In general, we like g_new over g_malloc for returning type X rather than void* to let the compiler catch more potential typing mistakes, but in this particular macro, our other use of typeof on the same line already ensures we are getting correct results. Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20201027050556.269064-2-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Markus Armbruster <armbru@redhat.com>	2020-10-30 15:10:14 -05:00
Greg Kurz	1a6d3bd229	block: End quiescent sections when a BDS is deleted If a BDS gets deleted during blk_drain_all(), it might miss a call to bdrv_do_drained_end(). This means missing a call to aio_enable_external() and the AIO context remains disabled for ever. This can cause a device to become irresponsive and to disrupt the guest execution, ie. hang, loop forever or worse. This scenario is quite easy to encounter with virtio-scsi on POWER when punching multiple blockdev-create QMP commands while the guest is booting and it is still running the SLOF firmware. This happens because SLOF disables/re-enables PCI devices multiple times via IO/MEM/MASTER bits of PCI_COMMAND register after the initial probe/feature negotiation, as it tends to work with a single device at a time at various stages like probing and running block/network bootloaders without doing a full reset in-between. This naturally generates many dataplane stops and starts, and thus many drain sections that can race with blockdev_create_run(). In the end, SLOF bails out. It is somehow reproducible on x86 but it requires to generate articial dataplane start/stop activity with stop/cont QMP commands. In this case, seabios ends up looping for ever, waiting for the virtio-scsi device to send a response to a command it never received. Add a helper that pairs all previously called bdrv_do_drained_begin() with a bdrv_do_drained_end() and call it from bdrv_close(). While at it, update the "/bdrv-drain/graph-change/drain_all" test in test-bdrv-drain so that it can catch the issue. BugId: https://bugzilla.redhat.com/show_bug.cgi?id=1874441 Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <160346526998.272601.9045392804399803158.stgit@bahia.lan> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-27 15:26:20 +01:00
Kevin Wolf	18c6ac1c6e	block: Add bdrv_lock()/unlock() Inside of coroutine context, we can't directly use aio_context_acquire() for the AioContext of a block node because we already own the lock of the current AioContext and we need to avoid double locking to prevent deadlocks. This provides helper functions to lock the AioContext of a node only if it's not the same as the current AioContext. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20201005155855.256490-14-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2020-10-09 07:08:20 +02:00
Kevin Wolf	e336fd4c4b	block: Add bdrv_co_enter()/leave() Add a pair of functions to temporarily move the current coroutine to the AioContext of a given BlockDriverState. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20201005155855.256490-13-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2020-10-09 07:08:20 +02:00
Vladimir Sementsov-Ogievskiy	9bb4b066cc	block: generate coroutine-wrapper code Use code generation implemented in previous commit to generated coroutine wrappers in block.c and block/io.c Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20200924185414.28642-6-vsementsov@virtuozzo.com>	2020-10-05 10:59:42 +01:00
Vladimir Sementsov-Ogievskiy	21c2283ebc	block: declare some coroutine functions in block/coroutines.h We are going to keep coroutine-wrappers code (structure-packing parameters, BDRV_POLL wrapper functions) in separate auto-generated files. So, we'll need a header with declaration of original _co_ functions, for those which are static now. As well, we'll need declarations for wrapper functions. Do these declarations now, as a preparation step. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20200924185414.28642-4-vsementsov@virtuozzo.com>	2020-10-05 09:35:52 +01:00
Vladimir Sementsov-Ogievskiy	5416645fcf	block: return error-code from bdrv_invalidate_cache This is the only coroutine wrapper from block.c and block/io.c which doesn't return a value, so let's convert it to the common behavior, to simplify moving to generated coroutine wrappers in a further commit. Also, bdrv_invalidate_cache is a void function, returning error only through **errp parameter, which is considered to be bad practice, as it forces callers to define and propagate local_err variable, so conversion is good anyway. This patch leaves the conversion of .bdrv_co_invalidate_cache() driver callbacks and bdrv_invalidate_cache_all() for another day. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20200924185414.28642-2-vsementsov@virtuozzo.com>	2020-10-05 09:35:52 +01:00
Kevin Wolf	bc4ee65b8c	block/export: Add blk_exp_close_all(_type) This adds a function to shut down all block exports, and another one to shut down the block exports of a single type. The latter is used for now when stopping the NBD server. As soon as we implement support for multiple NBD servers, we'll need a per-server list of exports and it will be replaced by a function using that. As a side effect, the BlockExport layer has a list tracking all existing exports now. closed_exports loses its only user and can go away. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-18-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-02 15:46:40 +02:00
Stefan Hajnoczi	d73415a315	qemu/atomic.h: rename atomic_ to qatomic_ clang's C11 atomic_fetch_() functions only take a C11 atomic type pointer argument. QEMU uses direct types (int, etc) and this causes a compiler error when a QEMU code calls these functions in a source file that also included <stdatomic.h> via a system header file: $ CC=clang CXX=clang++ ./configure ... && make ../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int ' invalid) Avoid using atomic_*() names in QEMU's atomic.h since that namespace is used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that atomic.h and <stdatomic.h> can co-exist. I checked /usr/include on my machine and searched GitHub for existing "qatomic_" users but there seem to be none. This patch was generated using: $ git grep -h -o '\<atomic$64$\?_[a-z0-9_]\+' include/qemu/atomic.h \| \ sort -u >/tmp/changed_identifiers $ for identifier in $(</tmp/changed_identifiers); do sed -i "s%\<$identifier\>%q$identifier%g" \ $(git grep -I -l "\<$identifier\>") done I manually fixed line-wrap issues and misaligned rST tables. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200923105646.47864-1-stefanha@redhat.com>	2020-09-23 16:07:44 +01:00
zhaolichang	e3a6e0daf4	qemu/: fix some comment spelling errors I found that there are many spelling errors in the comments of qemu, so I used the spellcheck tool to check the spelling errors and finally found some spelling errors in the folder. Signed-off-by: zhaolichang <zhaolichang@huawei.com> Reviewed-by: Alex Bennee <alex.bennee@linaro.org> Message-Id: <20200917075029.313-2-zhaolichang@huawei.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2020-09-17 20:35:43 +02:00
Max Reitz	0b877d09df	block: Leave BDS.backing_{file,format} constant Parts of the block layer treat BDS.backing_file as if it were whatever the image header says (i.e., if it is a relative path, it is relative to the overlay), other parts treat it like a cache for bs->backing->bs->filename (relative paths are relative to the CWD). Considering bs->backing->bs->filename exists, let us make it mean the former. Among other things, this now allows the user to specify a base when using qemu-img to commit an image file in a directory that is not the CWD (assuming, everything uses relative filenames). Before this patch: $ ./qemu-img create -f qcow2 foo/bot.qcow2 1M $ ./qemu-img create -f qcow2 -b bot.qcow2 foo/mid.qcow2 $ ./qemu-img create -f qcow2 -b mid.qcow2 foo/top.qcow2 $ ./qemu-img commit -b mid.qcow2 foo/top.qcow2 qemu-img: Did not find 'mid.qcow2' in the backing chain of 'foo/top.qcow2' $ ./qemu-img commit -b foo/mid.qcow2 foo/top.qcow2 qemu-img: Did not find 'foo/mid.qcow2' in the backing chain of 'foo/top.qcow2' $ ./qemu-img commit -b $PWD/foo/mid.qcow2 foo/top.qcow2 qemu-img: Did not find '[...]/foo/mid.qcow2' in the backing chain of 'foo/top.qcow2' After this patch: $ ./qemu-img commit -b mid.qcow2 foo/top.qcow2 Image committed. $ ./qemu-img commit -b foo/mid.qcow2 foo/top.qcow2 qemu-img: Did not find 'foo/mid.qcow2' in the backing chain of 'foo/top.qcow2' $ ./qemu-img commit -b $PWD/foo/mid.qcow2 foo/top.qcow2 Image committed. With this change, bdrv_find_backing_image() must look at whether the user has overridden a BDS's backing file. If so, it can no longer use bs->backing_file, but must instead compare the given filename against the backing node's filename directly. Note that this changes the QAPI output for a node's backing_file. We had very inconsistent output there (sometimes what the image header said, sometimes the actual filename of the backing image). This inconsistent output was effectively useless, so we have to decide one way or the other. Considering that bs->backing_file usually at runtime contained the path to the image relative to qemu's CWD (or absolute), this patch changes QAPI's backing_file to always report the bs->backing->bs->filename from now on. If you want to receive the image header information, you have to refer to full-backing-filename. This necessitates a change to iotest 228. The interesting information it really wanted is the image header, and it can get that now, but it has to use full-backing-filename instead of backing_file. Because of this patch's changes to bs->backing_file's behavior, we also need some reference output changes. Along with the changes to bs->backing_file, stop updating BDS.backing_format in bdrv_backing_attach() as well. This way, ImageInfo's backing-filename and backing-filename-format fields will represent what the image header says and nothing else. iotest 245 changes in behavior: With the backing node no longer overriding the parent node's backing_file string, you can now omit the @backing option when reopening a node with neither a default nor a current backing file even if it used to have a backing node at some point. 273 also changes: The base image is opened without a format layer, so ImageInfo.backing-filename-format used to report "file" for the base image's overlay after blockdev-snapshot. However, the image header never says "file" anywhere, so it now reports $IMGFMT. Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	081e465026	block: Improve get_allocated_file_size's default There are two practical problems with bdrv_get_allocated_file_size()'s default right now: (1) For drivers with children, we should generally sum all their sizes instead of just passing the request through to bs->file. The latter is good for filters, but not so much for format drivers. (2) Filters need not have bs->file, so we should actually go to the filtered child instead of hard-coding bs->file. Fix this by splitting the default implementation into three branches: (1) For filter drivers: Return the size of the filtered child (2) For protocol drivers: Return -ENOTSUP, because the default implementation cannot make a guess (3) For other drivers: Sum all data-bearing children's sizes Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	f706a92f24	block: Use CAFs for debug breakpoints When looking for a blkdebug node (which implements debug breakpoints), use bdrv_primary_bs() to iterate through the graph, because that is where a blkdebug node would be. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	52f72d6fb6	block: Use CAFs in bdrv_refresh_filename() bdrv_refresh_filename() and the kind of related bdrv_dirname() should look to the primary child when they wish to copy the underlying file's filename. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	1d42f48c3a	block: Re-evaluate backing file handling in reopen Reopening a node's backing child needs a bit of special handling because the "backing" child has different defaults than all other children (among other things). Adding filter support here is a bit more difficult than just using the child access functions. In fact, we often have to directly use bs->backing because these functions are about the "backing" child (which may or may not be the COW backing file). Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	dcf3f9b268	block: Use CAFs when working with backing chains Use child access functions when iterating through backing chains so filters do not break the chain. In addition, bdrv_find_overlay() will now always return the actual overlay; that is, it will never return a filter node but only one with a COW backing file (there may be filter nodes between that node and @bs). Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	93393e698c	block: Use bdrv_filter_(bs\|child) where obvious Places that use patterns like if (bs->drv->is_filter && bs->file) { ... something about bs->file->bs ... } should be BlockDriverState *filtered = bdrv_filter_bs(bs); if (filtered) { ... something about @filtered ... } instead. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	ae23f78646	block: Add bdrv_supports_compressed_writes() Filters cannot compress data themselves but they have to implement .bdrv_co_pwritev_compressed() still (or they cannot forward compressed writes). Therefore, checking whether bs->drv->bdrv_co_pwritev_compressed is non-NULL is not sufficient to know whether the node can actually handle compressed writes. This function looks down the filter chain to see whether there is a non-filter that can actually convert the compressed writes into compressed data (and thus normal writes). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:31 +02:00
Max Reitz	8b8277cdb0	block: Drop bdrv_is_encrypted() The original purpose of bdrv_is_encrypted() was to inquire whether a BDS can be used without the user entering a password or not. It has not been used for that purpose for quite some time. Actually, it is not even fit for that purpose, because to answer that question, it would have recursively query all of the given node's children. So now we have to decide in which direction we want to fix bdrv_is_encrypted(): Recursively query all children, or drop it and just use bs->encrypted to get the current node's status? Nowadays, its only purpose is to report through bdrv_query_image_info() whether the given image is encrypted or not. For this purpose, it is probably more interesting to see whether a given node itself is encrypted or not (otherwise, a management application cannot discern for certain which nodes are really encrypted and which just have encrypted children). Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:30 +02:00
Max Reitz	7b99a26600	block: Include filters when freezing backing chain In order to make filters work in backing chains, the associated functions must be able to deal with them and freeze both COW and filter child links. While at it, add some comments that note which functions require their caller to ensure that a given child link is not frozen, and how the callers do so. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:30 +02:00
Max Reitz	9ee413cb56	block: bdrv_set_backing_hd() is about bs->backing bdrv_set_backing_hd() is a function that explicitly cares about the bs->backing child. Highlight that in its description and use child_bs(bs->backing) instead of backing_bs(bs) to make it more obvious. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:30 +02:00
Max Reitz	34778172f1	block: bdrv_cow_child() for bdrv_has_zero_init() bdrv_has_zero_init() should use bdrv_cow_child() if it wants to check whether the given BDS has a COW backing file. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:31:30 +02:00
Max Reitz	d38d7eb8a5	block: Add chain helper functions Add some helper functions for skipping filters in a chain of block nodes. Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-07 12:31:30 +02:00
Max Reitz	9a6fc88799	block: Add child access functions There are BDS children that the general block layer code can access, namely bs->file and bs->backing. Since the introduction of filters and external data files, their meaning is not quite clear. bs->backing can be a COW source, or it can be a filtered child; bs->file can be a filtered child, it can be data and metadata storage, or it can be just metadata storage. This overloading really is not helpful. This patch adds functions that retrieve the correct child for each exact purpose. Later patches in this series will make use of them. Doing so will allow us to handle filter nodes in a meaningful way. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2020-09-07 12:31:30 +02:00
Connor Kuehl	975a7bd228	block: Raise an error when backing file parameter is an empty string Providing an empty string for the backing file parameter like so: qemu-img create -f qcow2 -b '' /tmp/foo allows the flow of control to reach and subsequently fail an assert statement because passing an empty string to bdrv_get_full_backing_filename_from_filename() simply results in NULL being returned without an error being raised. To fix this, let's check for an empty string when getting the value from the opts list. Reported-by: Attila Fazekas <afazekas@redhat.com> Fixes: https://bugzilla.redhat.com/1809553 Signed-off-by: Connor Kuehl <ckuehl@redhat.com> Message-Id: <20200813134722.802180-1-ckuehl@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-09-07 12:23:55 +02:00
Paolo Bonzini	859aef026e	meson: replace create-config with meson configure_file Move the create-config logic to meson.build; create a configuration_data object and let meson handle the quoting and output. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-21 06:30:43 -04:00
Marc-André Lureau	5e5733e599	meson: convert block Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-21 06:30:18 -04:00
Kevin Wolf	9c60a5d197	block: Require aligned image size to avoid assertion failure Unaligned requests will automatically be aligned to bl.request_alignment and we can't extend write requests to access space beyond the end of the image without resizing the image, so if we have the WRITE permission, but not the RESIZE one, it's required that the image size is aligned. Failing to meet this requirement could cause assertion failures like this if RESIZE permissions weren't requested: qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors \|\| child->perm & BLK_PERM_RESIZE' failed. This was e.g. triggered by qemu-img converting to a target image with 4k request alignment when the image was only aligned to 512 bytes, but not to 4k. Turn this into a graceful error in bdrv_check_perm() so that WRITE without RESIZE can only be taken if the image size is aligned. If a user holds both permissions and drops only RESIZE, the function will return an error, but bdrv_child_try_set_perm() will ignore the failure silently if permissions are only requested to be relaxed and just keep both permissions while returning success. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20200716142601.111237-2-kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-17 14:20:57 +02:00
Eric Blake	d9f059aa6c	qemu-img: Deprecate use of -b without -F Creating an image that requires format probing of the backing image is potentially unsafe (we've had several CVEs over the years based on probes leaking information to the guest on a subsequent boot, although these days tools like libvirt are aware of the issue enough to prevent the worst effects). For example, if our probing algorithm ever changes, or if other tools like libvirt determine a different probe result than we do, then subsequent use of that backing file under a different format will present corrupted data to the guest. Fortunately, the worst effects occur only when the backing image is originally raw, and we at least prevent commit into a probed raw backing file that would change its probed type. Still, it is worth starting a deprecation clock so that future qemu-img can refuse to create backing chains that would rely on probing, to encourage clients to avoid unsafe practices. Most warnings are intentionally emitted from bdrv_img_create() in the block layer, but qemu-img convert uses bdrv_create() which cannot emit its own warning without causing spurious warnings on other code paths. In the end, all command-line image creation or backing file rewriting now performs a check. Furthermore, if we probe a backing file as non-raw, then it is safe to explicitly record that result (rather than relying on future probes); only where we probe a raw image do we care about further warnings to the user when using such an image (for example, commits into a probed-raw backing file are prevented), to help them improve their tooling. But whether or not we make the probe results explicit, we still warn the user to remind them to upgrade their workflow to supply -F always. iotest 114 specifically wants to create an unsafe image for later amendment rather than defaulting to our new default of recording a probed format, so it needs an update. While touching it, expand it to cover all of the various warnings enabled by this patch. iotest 301 also shows a change to qcow messages. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200706203954.341758-11-eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-14 15:24:05 +02:00
Eric Blake	e54ee1b385	block: Add support to warn on backing file change without format For now, this is a mechanical addition; all callers pass false. But the next patch will use it to improve 'qemu-img rebase -u' when selecting a backing file with no format. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Message-Id: <20200706203954.341758-10-eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-14 15:18:59 +02:00
Eric Blake	add8200dd1	block: Error if backing file fails during creation without -u Back in commit `6e6e55f5` (Jul 2017, v2.10), we tweaked the code to warn if the backing file could not be opened but the user gave a size, unless the user also passes the -u option to bypass the open of the backing file. As one common reason for failure to open the backing file is when there is mismatch in the requested backing format in relation to what the backing file actually contains, we actually want to open the backing file and ensure that it has the right format in as many cases as possible. iotest 301 for qcow demonstrates how detecting explicit format mismatch is useful to prevent the creation of an image that would probe differently than the user requested. Now is the time to finally turn the warning an error, as promised. Note that the original warning was added prior to our documentation of an official deprecation policy (`eb22aeca`, also Jul 2017), and because the warning didn't mention the word "deprecated", we never actually remembered to document it as such. But the warning has been around long enough that I don't see prolonging it another two releases. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200706203954.341758-7-eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-14 15:18:59 +02:00
Eric Blake	4e2f441878	qemu-img: Flush stdout before before potential stderr messages During 'qemu-img create ... 2>&1', if --quiet is not in force, we can end up with buffered I/O in stdout that was produced before failure, but which appears in output after failure. This is confusing; the fix is to flush stdout prior to attempting anything that might produce an error message. Several iotests demonstrate the resulting ordering change now that the merged outputs now reflect chronology. (An even better fix would be to avoid printf from within block.c altogether, but that's much more invasive...) Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200706203954.341758-2-eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-14 15:18:59 +02:00
Markus Armbruster	a5f9b9df25	error: Reduce unnecessary error propagation When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away, even when we need to keep error_propagate() for other error paths. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-38-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	af175e85f9	error: Eliminate error_propagate() with Coccinelle, part 2 When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. The previous commit did that with a Coccinelle script I consider fairly trustworthy. This commit uses the same script with the matching of return taken out, i.e. we convert if (!foo(..., &err)) { ... error_propagate(errp, err); ... } to if (!foo(..., errp)) { ... ... } This is unsound: @err could still be read between afterwards. I don't know how to express "no read of @err without an intervening write" in Coccinelle. Instead, I manually double-checked for uses of @err. Suboptimal line breaks tweaked manually. qdev_realize() simplified further to placate scripts/checkpatch.pl. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-36-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	668f62ec62	error: Eliminate error_propagate() with Coccinelle, part 1 When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. Convert if (!foo(..., &err)) { ... error_propagate(errp, err); ... return ... } to if (!foo(..., errp)) { ... ... return ... } where nothing else needs @err. Coccinelle script: @rule1 forall@ identifier fun, err, errp, lbl; expression list args, args2; binary operator op; constant c1, c2; symbol false; @@ if ( ( - fun(args, &err, args2) + fun(args, errp, args2) \| - !fun(args, &err, args2) + !fun(args, errp, args2) \| - fun(args, &err, args2) op c1 + fun(args, errp, args2) op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; \| return c2; \| return false; ) } @rule2 forall@ identifier fun, err, errp, lbl; expression list args, args2; expression var; binary operator op; constant c1, c2; symbol false; @@ - var = fun(args, &err, args2); + var = fun(args, errp, args2); ... when != err if ( ( var \| !var \| var op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; \| return c2; \| return false; \| return var; ) } @depends on rule1 \|\| rule2@ identifier err; @@ - Error *err = NULL; ... when != err Not exactly elegant, I'm afraid. The "when != lbl:" is necessary to avoid transforming if (fun(args, &err)) { goto out } ... out: error_propagate(errp, err); even though other paths to label out still need the error_propagate(). For an actual example, see sclp_realize(). Without the "when strict", Coccinelle transforms vfio_msix_setup(), incorrectly. I don't know what exactly "when strict" does, only that it helps here. The match of return is narrower than what I want, but I can't figure out how to express "return where the operand doesn't use @err". For an example where it's too narrow, see vfio_intx_enable(). Silently fails to convert hw/arm/armsse.c, because Coccinelle gets confused by ARMSSE being used both as typedef and function-like macro there. Converted manually. Line breaks tidied up manually. One nested declaration of @local_err deleted manually. Preexisting unwanted blank line dropped in hw/riscv/sifive_e.c. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-35-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	3882578bb5	block: Avoid error accumulation in bdrv_img_create() When creating an image fails because the format doesn't support option "backing_file" or "backing_fmt", bdrv_img_create() first has qemu_opt_set() put a generic error into @local_err, then puts the real error into @errp with error_setg(), and then propagates the former to the latter, which throws away the generic error. A bit complicated, but works. Now that qemu_opt_set() returns a useful value, we can simply ignore the generic error instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-16-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	235e59cf03	qemu-option: Use returned bool to check for failure The previous commit enables conversion of foo(..., &err); if (err) { ... } to if (!foo(..., &err)) { ... } for QemuOpts functions that now return true / false on success / error. Coccinelle script: @@ identifier fun = { opts_do_parse, parse_option_bool, parse_option_number, parse_option_size, qemu_opt_parse, qemu_opt_rename, qemu_opt_set, qemu_opt_set_bool, qemu_opt_set_number, qemu_opts_absorb_qdict, qemu_opts_do_parse, qemu_opts_from_qdict_entry, qemu_opts_set, qemu_opts_validate }; expression list args, args2; typedef Error; Error *err; @@ - fun(args, &err, args2); - if (err) + if (!fun(args, &err, args2)) { ... } A few line breaks tidied up manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-15-armbru@redhat.com> [Conflict with commit `0b6786a9c1` "block/amend: refactor qcow2 amend options" resolved by rerunning Coccinelle on master's version]	2020-07-10 15:17:35 +02:00
Vladimir Sementsov-Ogievskiy	7b1efe996c	block: inline bdrv_unallocated_blocks_are_zero() The function has only one user: bdrv_co_block_status(). Inline it to simplify reviewing of the following patches, which will finally drop unallocated_blocks_are_zero field too. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200528094405.145708-3-vsementsov@virtuozzo.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-07-06 08:49:28 +02:00
Maxim Levitsky	a3579bfa0a	block/amend: add 'force' option 'force' option will be used for some unsafe amend operations. This includes things like erasing last keyslot in luks based formats (which destroys the data, unless the master key is backed up by external means), but that _might_ be desired result. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200608094030.670121-4-mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-07-06 08:49:28 +02:00
Eric Blake	5c86bdf120	block: Call attention to truncation of long NBD exports Commit `93676c88` relaxed our NBD client code to request export names up to the NBD protocol maximum of 4096 bytes without NUL terminator, even though the block layer can't store anything longer than 4096 bytes including NUL terminator for display to the user. Since this means there are some export names where we have to truncate things, we can at least try to make the truncation a bit more obvious for the user. Note that in spite of the truncated display name, we can still communicate with an NBD server using such a long export name; this was deemed nicer than refusing to even connect to such a server (since the server may not be under our control, and since determining our actual length limits gets tricky when nbd://host:port/export and nbd+unix:///export?socket=/path are themselves variable-length expansions beyond the export name but count towards the block layer name length). Reported-by: Xueqiang Wei <xuwei@redhat.com> Fixes: https://bugzilla.redhat.com/1843684 Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200610163741.3745251-3-eblake@redhat.com>	2020-06-10 12:58:59 -05:00
Max Reitz	e5d8a40685	block: Drop @child_class from bdrv_child_perm() Implementations should decide the necessary permissions based on @role. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200513110544.176672-35-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	f6de853fa3	block: Drop child_file Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200513110544.176672-33-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	9aab945e9c	block: Drop bdrv_format_default_perms() Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-32-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	87278af1d9	block: Make bdrv_filter_default_perms() static Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-31-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	58944401d6	block: Use child_of_bds in remaining places Replace child_file by child_of_bds in all remaining places (excluding tests). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200513110544.176672-28-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	ff3541c4e2	block: Drop child_backing Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200513110544.176672-25-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	25191e5ff0	block: Make backing files child_of_bds children Make all parents of backing files pass the appropriate BdrvChildRole. By doing so, we can switch their BdrvChildClass over to the generic child_of_bds, which will do the right thing when given a correct BdrvChildRole. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200513110544.176672-24-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	f34ade1148	block: Drop child_format Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-23-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	2519f54919	block: Add bdrv_default_perms() This callback can be used by BDSs that use child_of_bds with the appropriate BdrvChildRole for their children. Also, make bdrv_format_default_perms() use it for child_of_bds children (just a temporary solution until we can drop bdrv_format_default_perms() altogether). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200513110544.176672-20-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	f889054f03	block: Relax *perms_for_storage for data children We can be less restrictive about pure data children than those with metadata on them, so let bdrv_default_perms_for_storage() handle metadata children differently from pure data children. As explained in the code, the restrictions on metadata children are strictly stricter than those for pure data children, so in theory we just have to distinguish between pure-data and all other storage children (pure metadata or data+metadata). In practice, that is not obvious, though, so we have two independent code paths for metadata and for data children, and data+metadata children will go through both (without the path for data children doing anything meaningful). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200513110544.176672-19-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	6f838a4b73	block: Pull out bdrv_default_perms_for_storage() Right now, bdrv_format_default_perms() is used by format parents (generally). We want to switch to a model where most parents use a single BdrvChildClass, which then decides the permissions based on the child role. To do so, we have to split bdrv_format_default_perms() into separate functions for each such role. Note that bdrv_default_perms_for_storage() currently handles all DATA \| METADATA children. A follow-up patch is going to split it further into one function for each case. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-18-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	70082db4ef	block: Pull out bdrv_default_perms_for_cow() Right now, bdrv_format_default_perms() is used by format parents (generally). We want to switch to a model where most parents use a single BdrvChildClass, which then decides the permissions based on the child role. To do so, we have to split bdrv_format_default_perms() into separate functions for each such role. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200513110544.176672-17-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	33f2663bd5	block: Distinguish paths in *_format_default_perms bdrv_format_default_perms() has one code path for backing files, and one for storage files. We want to pull them out into their own functions, so make sure they are completely distinct before so the next patches will be a bit cleaner. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-16-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	4348355032	block: Add child_of_bds Any current user of child_file, child_format, and child_backing can and should use this generic BdrvChildClass instead, as it can handle all of these cases. However, to be able to do so, the users must pass the appropriate BdrvChildRole when the child is created/attached. (The following commits will take care of that.) Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200513110544.176672-15-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	48e0828861	block: Unify bdrv_child_cb_detach() Make bdrv_child_cb_detach() call bdrv_backing_detach() for children with a COW role (and drop the reverse call from bdrv_backing_detach()), so it can be used for any child (with a proper role set). Because so far no child has a proper role set, we need a temporary new callback for child_backing.detach that ensures bdrv_backing_detach() is called for all COW children that do not have their role set yet. (Also, move bdrv_child_cb_detach() down to group it with bdrv_inherited_options() and bdrv_child_cb_attach().) Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200513110544.176672-14-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	ca2f1234c3	block: Unify bdrv_child_cb_attach() Make bdrv_child_cb_attach() call bdrv_backing_attach() for children with a COW role (and drop the reverse call from bdrv_backing_attach()), so it can be used for any child (with a proper role set). Because so far no child has a proper role set, we need a temporary new callback for child_backing.attach that ensures bdrv_backing_attach() is called for all COW children that do not have their role set yet. (Also, move bdrv_child_cb_attach() down to group it with bdrv_inherited_options().) Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200513110544.176672-13-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	00ff7ffd67	block: Use bdrv_inherited_options() Let child_file's, child_format's, and child_backing's .inherit_options() implementations fall back to bdrv_inherited_options() to show that it would really work for all of these cases, if only the parents passed the appropriate BdrvChildRole and parent_is_format values. (Also, make bdrv_open_inherit(), the only place to explicitly call bdrv_backing_options(), call bdrv_inherited_options() instead.) This patch should incur only two visible changes, both for child_format children, both of which are effectively bug fixes: First, they no longer have discard=unmap set by default. This reason it was set is because bdrv_inherited_fmt_options() fell through to bdrv_protocol_options(), and that set it because "format drivers take care to send flushes and respect unmap policy". None of the drivers that use child_format for their children (quorum and blkverify) are format drivers, though, so this reasoning does not apply here. Second, they no longer have BDRV_O_NO_IO force-cleared. child_format was used solely for children that do not store any metadata and as such will not be accessed by their parents as long as those parents do not receive I/O themselves. Thus, such children should inherit BDRV_O_NO_IO. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200513110544.176672-12-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	fae8bd3904	block: Add generic bdrv_inherited_options() After the series this patch belongs to, we want to have a common BdrvChildClass that encompasses all of child_file, child_format, and child_backing. Such a single class needs a single .inherit_options() implementation, and this patch introduces it. The next patch will show how the existing implementations can fall back to it just by passing appropriate BdrvChildRole and parent_is_format values. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200513110544.176672-11-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	b054ff7354	block: Rename bdrv_inherited_options() The other two .inherit_options implementations specify exactly for what case they are used in their name, so do it for this one as well. (The actual intention behind this patch is to follow it up with a generic bdrv_inherited_options() that works for all three cases.) Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-10-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	3cdc69d31b	block: Pass parent_is_format to .inherit_options() We plan to unify the generic .inherit_options() functions. The resulting common function will need to decide whether to force-enable format probing, force-disable it, or leave it as-is. To make this decision, it will need to know whether the parent node is a format node or not (because we never want format probing if the parent is a format node already (except for the backing chain)). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-9-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	272c02eaef	block: Pass BdrvChildRole to .inherit_options() For now, all callers (effectively) pass 0 and no callee evaluates thie value. Later patches will change both. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-8-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	bf8e925eb5	block: Pass BdrvChildRole to bdrv_child_perm() For now, all callers pass 0 and no callee evaluates this value. Later patches will change both. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-7-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	258b776515	block: Add BdrvChildRole to BdrvChild For now, it is always set to 0. Later patches in this series will ensure that all callers pass an appropriate combination of flags. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-6-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	bd86fb990c	block: Rename BdrvChildRole to BdrvChildClass This structure nearly only contains parent callbacks for child state changes. It cannot really reflect a child's role, because different roles may overlap (as we will see when real roles are introduced), and because parents can have custom callbacks even when the child fulfills a standard role. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-Id: <20200513110544.176672-4-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	6f7a3b535f	block: Add bdrv_make_empty() Right now, all users of bdrv_make_empty() call the BlockDriver method directly. That is not only bad style, it is also wrong, unless the caller has a BdrvChild with a WRITE or WRITE_UNCHANGED permission. (WRITE_UNCHANGED suffices, because callers generally use this function to clear a node with a backing file after a commit operation.) Introduce bdrv_make_empty() that verifies that it does. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200429141126.85159-2-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Eric Blake	47e0b38a13	block: Drop unused .bdrv_has_zero_init_truncate Now that there are no clients of bdrv_has_zero_init_truncate, none of the drivers need to worry about providing it. What's more, this eliminates a source of some confusion: a literal reading of the documentation as written in `ceaca56f` and implemented in commit `1dcaf527` claims that a driver which returns 0 for bdrv_has_zero_init_truncate() must not return 1 for bdrv_has_zero_init(); this condition was violated for parallels, qcow, and sometimes for vdi, although in practice it did not matter since those drivers also lacked .bdrv_co_truncate. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200428202905.770727-10-eblake@redhat.com> Acked-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-08 13:26:35 +02:00
Philippe Mathieu-Daudé	dfde483ea3	block: Avoid dead assignment Fix warning reported by Clang static code analyzer: block.c:3167:5: warning: Value stored to 'ret' is never read ret = bdrv_fill_options(&options, filename, &flags, &local_err); ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fixes: `462f5bcf6` Reported-by: Clang Static Analyzer Suggested-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200422133152.16770-2-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2020-05-04 14:43:24 +02:00
Kevin Wolf	8c6242b6f3	block-backend: Add flags to blk_truncate() Now that node level interface bdrv_truncate() supports passing request flags to the block driver, expose this on the BlockBackend level, too. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200424125448.63318-4-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-04-30 17:51:07 +02:00
Markus Armbruster	1f5842487a	qapi: Only input visitors can actually fail The previous few commits have made this more obvious, and removed the one exception. Time to clarify the documentation, and drop dead error checking. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200424084338.26803-13-armbru@redhat.com>	2020-04-30 07:26:40 +02:00
Vladimir Sementsov-Ogievskiy	7a26df203c	block: fix bdrv_root_attach_child forget to unref child_bs bdrv_root_attach_child promises to drop child_bs reference on failure. It does it on first handled failure path, but not on the second. Fix that. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200324155921.23822-1-vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-03-27 14:47:23 +01:00
Maxim Levitsky	5a5e7f8cd8	block: trickle down the fallback image creation function use to the block drivers Instead of checking the .bdrv_co_create_opts to see if we need the fallback, just implement the .bdrv_co_create_opts in the drivers that need it. This way we don't break various places that need to know if the underlying protocol/format really supports image creation, and this way we still allow some drivers to not support image creation. Fixes: `fd17146cd9` Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1816007 Note that technically this driver reverts the image creation fallback for the vxhs driver since I don't have a means to test it, and IMHO it is better to leave it not supported as it was prior to generic image creation patches. Also drop iscsi_create_opts which was left accidentally. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20200326011218.29230-3-mlevitsk@redhat.com> Reviewed-by: Denis V. Lunev <den@openvz.org> [mreitz: Fixed alignment, and moved bdrv_co_create_opts_simple() and bdrv_create_opts_simple from block.h into block_int.h] Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-03-26 14:44:33 +01:00
Maxim Levitsky	b92902dfea	block: pass BlockDriver reference to the .bdrv_co_create This will allow the reuse of a single generic .bdrv_co_create implementation for several drivers. No functional changes. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20200326011218.29230-2-mlevitsk@redhat.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-03-26 14:44:33 +01:00
Vladimir Sementsov-Ogievskiy	6e57963a77	block: bdrv_set_backing_bs: fix use-after-free There is a use-after-free possible: bdrv_unref_child() leaves bs->backing freed but not NULL. bdrv_attach_child may produce nested polling loop due to drain, than access of freed pointer is possible. I've produced the following crash on 30 iotest with modified code. It does not reproduce on master, but still seems possible: #0 __strcmp_avx2 () at /lib64/libc.so.6 #1 bdrv_backing_overridden (bs=0x55c9d3cc2060) at block.c:6350 #2 bdrv_refresh_filename (bs=0x55c9d3cc2060) at block.c:6404 #3 bdrv_backing_attach (c=0x55c9d48e5520) at block.c:1063 #4 bdrv_replace_child_noperm (child=child@entry=0x55c9d48e5520, new_bs=new_bs@entry=0x55c9d3cc2060) at block.c:2290 #5 bdrv_replace_child (child=child@entry=0x55c9d48e5520, new_bs=new_bs@entry=0x55c9d3cc2060) at block.c:2320 #6 bdrv_root_attach_child (child_bs=child_bs@entry=0x55c9d3cc2060, child_name=child_name@entry=0x55c9d241d478 "backing", child_role=child_role@entry=0x55c9d26ecee0 <child_backing>, ctx=<optimized out>, perm=<optimized out>, shared_perm=21, opaque=0x55c9d3c5a3d0, errp=0x7ffd117108e0) at block.c:2424 #7 bdrv_attach_child (parent_bs=parent_bs@entry=0x55c9d3c5a3d0, child_bs=child_bs@entry=0x55c9d3cc2060, child_name=child_name@entry=0x55c9d241d478 "backing", child_role=child_role@entry=0x55c9d26ecee0 <child_backing>, errp=errp@entry=0x7ffd117108e0) at block.c:5876 #8 in bdrv_set_backing_hd (bs=bs@entry=0x55c9d3c5a3d0, backing_hd=backing_hd@entry=0x55c9d3cc2060, errp=errp@entry=0x7ffd117108e0) at block.c:2576 #9 stream_prepare (job=0x55c9d49d84a0) at block/stream.c:150 #10 job_prepare (job=0x55c9d49d84a0) at job.c:761 #11 job_txn_apply (txn=<optimized out>, fn=<optimized out>) at job.c:145 #12 job_do_finalize (job=0x55c9d49d84a0) at job.c:778 #13 job_completed_txn_success (job=0x55c9d49d84a0) at job.c:832 #14 job_completed (job=0x55c9d49d84a0) at job.c:845 #15 job_completed (job=0x55c9d49d84a0) at job.c:836 #16 job_exit (opaque=0x55c9d49d84a0) at job.c:864 #17 aio_bh_call (bh=0x55c9d471a160) at util/async.c:117 #18 aio_bh_poll (ctx=ctx@entry=0x55c9d3c46720) at util/async.c:117 #19 aio_poll (ctx=ctx@entry=0x55c9d3c46720, blocking=blocking@entry=true) at util/aio-posix.c:728 #20 bdrv_parent_drained_begin_single (poll=true, c=0x55c9d3d558f0) at block/io.c:121 #21 bdrv_parent_drained_begin_single (c=c@entry=0x55c9d3d558f0, poll=poll@entry=true) at block/io.c:114 #22 bdrv_replace_child_noperm (child=child@entry=0x55c9d3d558f0, new_bs=new_bs@entry=0x55c9d3d27300) at block.c:2258 #23 bdrv_replace_child (child=child@entry=0x55c9d3d558f0, new_bs=new_bs@entry=0x55c9d3d27300) at block.c:2320 #24 bdrv_root_attach_child (child_bs=child_bs@entry=0x55c9d3d27300, child_name=child_name@entry=0x55c9d241d478 "backing", child_role=child_role@entry=0x55c9d26ecee0 <child_backing>, ctx=<optimized out>, perm=<optimized out>, shared_perm=21, opaque=0x55c9d3cc2060, errp=0x7ffd11710c60) at block.c:2424 #25 bdrv_attach_child (parent_bs=parent_bs@entry=0x55c9d3cc2060, child_bs=child_bs@entry=0x55c9d3d27300, child_name=child_name@entry=0x55c9d241d478 "backing", child_role=child_role@entry=0x55c9d26ecee0 <child_backing>, errp=errp@entry=0x7ffd11710c60) at block.c:5876 #26 bdrv_set_backing_hd (bs=bs@entry=0x55c9d3cc2060, backing_hd=backing_hd@entry=0x55c9d3d27300, errp=errp@entry=0x7ffd11710c60) at block.c:2576 #27 stream_prepare (job=0x55c9d495ead0) at block/stream.c:150 ... Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200316060631.30052-2-vsementsov@virtuozzo.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-03-24 11:41:46 +01:00
Philippe Mathieu-Daudé	a15f08dcee	block: Assert BlockDriver::format_name is not NULL bdrv_do_find_format() calls strcmp() using BlockDriver::format_name as argument, which must not be NULL. Assert this field is not null when we register a block driver in bdrv_register(). Reported-by: Mansour Ahmadi <ManSoSec@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200318222235.23856-1-philmd@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-03-24 11:41:46 +01:00
Daniel Henrique Barboza	e1d7f8bb1e	block.c: adding bdrv_co_delete_file Using the new 'bdrv_co_delete_file' interface, a pure co_routine function 'bdrv_co_delete_file' inside block.c can can be used in a way similar of the existing bdrv_create_file to to clean up a created file. We're creating a pure co_routine because the only caller of 'bdrv_co_delete_file' will be already in co_routine context, thus there is no need to add all the machinery to check for qemu_in_coroutine() and create a separated co_routine to do the job. Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20200130213907.2830642-3-danielhb413@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-03-11 15:54:38 +01:00
Kevin Wolf	30dd65f307	block: Fix cross-AioContext blockdev-snapshot external_snapshot_prepare() tries to move the overlay to the AioContext of the backing file (the snapshotted node). However, it's possible that this doesn't work, but the backing file can instead be moved to the overlay's AioContext (e.g. opening the backing chain for a mirror target). bdrv_append() already indirectly uses bdrv_attach_node(), which takes care to move nodes to make sure they use the same AioContext and which tries both directions. So the problem has a simple fix: Just delete the unnecessary extra bdrv_try_set_aio_context() call in external_snapshot_prepare() and instead assert in bdrv_append() that both nodes were indeed moved to the same AioContext. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20200310113831.27293-6-kwolf@redhat.com> Tested-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-03-11 15:54:38 +01:00
Kevin Wolf	c7a0f2be8f	block: Make bdrv_get_cumulative_perm() public Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20200310113831.27293-2-kwolf@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-03-11 15:54:38 +01:00
Kevin Wolf	1de6b45fb5	block: bdrv_reopen() with backing file in different AioContext This patch allows bdrv_reopen() (and therefore the x-blockdev-reopen QMP command) to attach a node as the new backing file even if the node is in a different AioContext than the parent if one of both nodes can be moved to the AioContext of the other node. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Peter Krempa <pkrempa@redhat.com> Message-Id: <20200306141413.30705-3-kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-03-06 17:34:09 +01:00
Peter Krempa	17e1e2be5f	block: Introduce 'bdrv_reopen_commit_post' step Add another step in the reopen process where driver can execute code after permission changes are comitted. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Message-Id: <adc02cf591c3cb34e98e33518eb1c540a0f27db1.1582893284.git.pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-03-06 17:15:37 +01:00
Max Reitz	eeea1faa09	block: Fix leak in bdrv_create_file_fallback() @options is leaked by the first two return statements in this function. Note that blk_new_open() takes the reference to @options even on failure, so all we need to do to fix the leak is to move the QDict allocation down to where we actually need it. Reported-by: Coverity (CID 1419884) Fixes: `fd17146cd9` ("block: Generic file creation fallback") Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200225155618.133412-1-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-03-06 17:15:37 +01:00
Stefan Hajnoczi	195ed8cb36	qemu/queue.h: add QLIST_SAFE_REMOVE() QLIST_REMOVE() assumes the element is in a list. It also leaves the element's linked list pointers dangling. Introduce a safe version of QLIST_REMOVE() and convert open-coded instances of this pattern. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Sergio Lopez <slp@redhat.com> Message-id: 20200214171712.541358-4-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-02-22 08:26:47 +00:00
Max Reitz	fd17146cd9	block: Generic file creation fallback If a protocol driver does not support image creation, we can see whether maybe the file exists already. If so, just truncating it will be sufficient. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200122164532.178040-3-mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-02-20 16:43:42 +01:00
Peter Krempa	facda5443f	qapi: Allow getting flat output from 'query-named-block-nodes' When a management application manages node names there's no reason to recurse into backing images in the output of query-named-block-nodes. Add a parameter to the command which will return just the top level structs. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Message-Id: <4470f8c779abc404dcf65e375db195cd91a80651.1579509782.git.pkrempa@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [mreitz: Fixed coding style] Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-02-20 16:43:42 +01:00
Max Reitz	6b4907cf42	block: Remove bdrv_recurse_is_first_non_filter() It no longer has any users. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200218103454.296704-11-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 11:55:40 +01:00
Max Reitz	810803a87c	block: Use bdrv_recurse_can_replace() Let check_to_replace_node() use the more specialized bdrv_recurse_can_replace() instead of bdrv_recurse_is_first_non_filter(), which is too restrictive (or, in the case of quorum, sometimes not restrictive enough). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200218103454.296704-10-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 11:55:40 +01:00
Max Reitz	5d69b5ab85	block: Add bdrv_recurse_can_replace() After a couple of follow-up patches, this function will replace bdrv_recurse_is_first_non_filter() in check_to_replace_node(). bdrv_recurse_is_first_non_filter() is both not sufficiently specific for check_to_replace_node() (it allows cases that should not be allowed, like replacing child nodes of quorum with dissenting data that have more parents than just quorum), and it is too restrictive (it is perfectly fine to replace filters). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200218103454.296704-7-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 11:55:39 +01:00
Max Reitz	a851ad4cac	block: Drop bdrv_is_first_non_filter() It is unused now. (And it was ugly because it needed to explore all BDS chains from the top.) Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200218103454.296704-4-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 11:55:39 +01:00
Philippe Mathieu-Daudé	0beab8119f	block: Remove superfluous semicolons Fixes: `132ada80c4` Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200218094402.26625-4-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 10:53:56 +01:00
Pan Nengyuan	cb8956144c	block: fix memleaks in bdrv_refresh_filename If we call the qmp 'query-block' while qemu is working on 'block-commit', it will cause memleaks, the memory leak stack is as follow: Indirect leak of 12360 byte(s) in 3 object(s) allocated from: #0 0x7f80f0b6d970 in __interceptor_calloc (/lib64/libasan.so.5+0xef970) #1 0x7f80ee86049d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5249d) #2 0x55ea95b5bb67 in qdict_new /mnt/sdb/qemu-4.2.0-rc0/qobject/qdict.c:29 #3 0x55ea956cd043 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6427 #4 0x55ea956cc950 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6399 #5 0x55ea956cc950 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6399 #6 0x55ea956cc950 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6399 #7 0x55ea958818ea in bdrv_block_device_info /mnt/sdb/qemu-4.2.0-rc0/block/qapi.c:56 #8 0x55ea958879de in bdrv_query_info /mnt/sdb/qemu-4.2.0-rc0/block/qapi.c:392 #9 0x55ea9588b58f in qmp_query_block /mnt/sdb/qemu-4.2.0-rc0/block/qapi.c:578 #10 0x55ea95567392 in qmp_marshal_query_block qapi/qapi-commands-block-core.c:95 Indirect leak of 4120 byte(s) in 1 object(s) allocated from: #0 0x7f80f0b6d970 in __interceptor_calloc (/lib64/libasan.so.5+0xef970) #1 0x7f80ee86049d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5249d) #2 0x55ea95b5bb67 in qdict_new /mnt/sdb/qemu-4.2.0-rc0/qobject/qdict.c:29 #3 0x55ea956cd043 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6427 #4 0x55ea956cc950 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6399 #5 0x55ea956cc950 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6399 #6 0x55ea9569f301 in bdrv_backing_attach /mnt/sdb/qemu-4.2.0-rc0/block.c:1064 #7 0x55ea956a99dd in bdrv_replace_child_noperm /mnt/sdb/qemu-4.2.0-rc0/block.c:2283 #8 0x55ea956b9b53 in bdrv_replace_node /mnt/sdb/qemu-4.2.0-rc0/block.c:4196 #9 0x55ea956b9e49 in bdrv_append /mnt/sdb/qemu-4.2.0-rc0/block.c:4236 #10 0x55ea958c3472 in commit_start /mnt/sdb/qemu-4.2.0-rc0/block/commit.c:306 #11 0x55ea94b68ab0 in qmp_block_commit /mnt/sdb/qemu-4.2.0-rc0/blockdev.c:3459 #12 0x55ea9556a7a7 in qmp_marshal_block_commit qapi/qapi-commands-block-core.c:407 Fixes: `bb808d5f5c` Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Message-id: 20200116085600.24056-1-pannengyuan@huawei.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-02-06 13:47:45 +01:00
Alberto Garcia	e2a7423a11	block: Use a GString in bdrv_perm_names() This is a bit more efficient than having to allocate and free memory for each new permission. The default size (30) is enough for "consistent read, write, resize". Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 20200110171518.22168-1-berto@igalia.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-02-06 13:47:45 +01:00
Aarushi Mehta	f80f267373	blockdev: adds bdrv_parse_aio to use io_uring Signed-off-by: Aarushi Mehta <mehta.aaru20@gmail.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200120141858.587874-8-stefanha@redhat.com Message-Id: <20200120141858.587874-8-stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-01-30 20:59:41 +00:00
Max Reitz	cdb1cec89e	block: Use bdrv_qapi_perm_to_blk_perm() We can save some LoC in xdbg_graph_add_edge() by using bdrv_qapi_perm_to_blk_perm(). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20191108123455.39445-3-mreitz@redhat.com Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-01-06 13:43:06 +01:00
Max Reitz	7b1d9c4df0	block: Add bdrv_qapi_perm_to_blk_perm() We need some way to correlate QAPI BlockPermission values with BLK_PERM_* flags. We could: (1) have the same order in the QAPI definition as the the BLK_PERM_* flags are in LSb-first order. However, then there is no guarantee that they actually match (e.g. when someone modifies the QAPI schema without thinking of the BLK_PERM_* definitions). We could add static assertions, but these would break what’s good about this solution, namely its simplicity. (2) define the BLK_PERM_* flags based on the BlockPermission values. But this way whenever someone were to modify the QAPI order (perfectly sensible in theory), the BLK_PERM_* values would change. Because these values are used for file locking, this might break file locking between different qemu versions. Therefore, go the slightly more cumbersome way: Add a function to translate from the QAPI constants to the BLK_PERM_* flags. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20191108123455.39445-2-mreitz@redhat.com Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-01-06 13:43:06 +01:00
Kevin Wolf	7bb4941ace	block: Activate recursively even for already active nodes bdrv_invalidate_cache_all() assumes that all nodes in a given subtree are either active or inactive when it starts. Therefore, as soon as it arrives at an already active node, it stops. However, this assumption is wrong. For example, it's possible to take a snapshot of an inactive node, which results in an active overlay over an inactive backing file. The active overlay is probably also the root node of an inactive BlockBackend (blk->disable_perm == true). In this case, bdrv_invalidate_cache_all() does not need to do anything to activate the overlay node, but it still needs to recurse into the children and the parents to make sure that after returning success, really everything is activated. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2019-12-19 18:04:25 +01:00
Kevin Wolf	f6dc1c31d3	block: Error out on image creation with conflicting size options If both the create options (qemu-img create -o ...) and the size parameter were given, the size parameter was silently ignored. Instead, make specifying two sizes an error. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2019-12-18 11:20:57 +01:00
Kevin Wolf	ae0f57f0aa	block: Remove 'backing': null from bs->{explicit_,}options bs->options and bs->explicit_options shouldn't contain any options for child nodes. bdrv_open_inherited() takes care to remove any options that match a child name after opening the image and the same is done when reopening. However, we miss the case of 'backing': null, which is a child option, but results in no child being created. This means that a 'backing': null remains in bs->options and bs->explicit_options. A typical use for 'backing': null is in live snapshots: blockdev-add for the qcow2 overlay makes sure not to open the backing file (because it is already opened and blockdev-snapshot will attach it). After doing a blockdev-snapshot, bs->options and bs->explicit_options become inconsistent with the actual state (bs has a backing file now, but the options still say null). On the next occasion that the image is reopened, e.g. switching it from read-write to read-only when another snapshot is taken, the option will take effect again and the node incorrectly loses its backing file. Fix bdrv_open_inherited() to remove the 'backing' option from bs->options and bs->explicit_options even for the case where it specifies that no backing file is wanted. Reported-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Tested-by: Peter Krempa <pkrempa@redhat.com>	2019-11-18 17:50:08 +01:00
Wei Yang	038adc2f58	core: replace getpagesize() with qemu_real_host_page_size There are three page size in qemu: real host page size host page size target page size All of them have dedicate variable to represent. For the last two, we use the same form in the whole qemu project, while for the first one we use two forms: qemu_real_host_page_size and getpagesize(). qemu_real_host_page_size is defined to be a replacement of getpagesize(), so let it serve the role. [Note] Not fully tested for some arch or device. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191013021145.16011-3-richardw.yang@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-26 15:38:06 +02:00
Vladimir Sementsov-Ogievskiy	4dd09f6223	qcow2-bitmap: move bitmap reopen-rw code to qcow2_reopen_commit The only reason I can imagine for this strange code at the very-end of bdrv_reopen_commit is the fact that bs->read_only updated after calling drv->bdrv_reopen_commit in bdrv_reopen_commit. And in the same time, prior to previous commit, qcow2_reopen_bitmaps_rw did a wrong check for being writable, when actually it only need writable file child not self. So, as it's fixed, let's move things to correct place. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Acked-by: Max Reitz <mreitz@redhat.com> Message-id: 20190927122355.7344-10-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:53:28 -04:00
Vladimir Sementsov-Ogievskiy	fcd6a4f42d	block: reverse order for reopen commits It's needed to fix reopening qcow2 with bitmaps to RW. Currently it can't work, as qcow2 needs write access to file child, to mark bitmaps in-image with IN_USE flag. But usually children goes after parents in reopen queue and file child is still RO on qcow2 reopen commit. Reverse reopen order to fix it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Acked-by: Max Reitz <mreitz@redhat.com> Acked-by: John Snow <jsnow@redhat.com> Message-id: 20190927122355.7344-3-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:02:32 -04:00
Vladimir Sementsov-Ogievskiy	859443b0fb	block: switch reopen queue from QSIMPLEQ to QTAILQ We'll need reverse-foreach in the following commit, QTAILQ support it, so move to QTAILQ. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20190927122355.7344-2-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:02:32 -04:00
Vladimir Sementsov-Ogievskiy	ef9041a7b8	block/dirty-bitmap: refactor bdrv_dirty_bitmap_next bdrv_dirty_bitmap_next is always used in same pattern. So, split it into _next and _first, instead of combining two functions into one and add FOR_EACH_DIRTY_BITMAP macro. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20190916141911.5255-5-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:02:32 -04:00
Vladimir Sementsov-Ogievskiy	85cc8a4f6b	block: move bdrv_can_store_new_dirty_bitmap to block/dirty-bitmap.c block/dirty-bitmap.c seems to be more appropriate for it and bdrv_remove_persistent_dirty_bitmap already in it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20190920082543.23444-2-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:02:32 -04:00
Anton Nefedov	d924559953	qapi: query-blockstat: add driver specific file-posix stats A block driver can provide a callback to report driver-specific statistics. file-posix driver now reports discard statistics Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-id: 20190923121737.83281-10-anton.nefedov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Vladimir Sementsov-Ogievskiy	d10529a2b7	block: teach bdrv_debug_breakpoint skip filters with backing Teach bdrv_debug_breakpoint and bdrv_debug_remove_breakpoint skip filters with backing. This is needed to implement and use in backup job it's own backup_top filter driver (like mirror already has one), and without this improvement, breakpoint removal will fail at least in 55 iotest. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20190920142056.12778-9-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:17 +02:00
Vladimir Sementsov-Ogievskiy	b503de619e	block: fix permission update in bdrv_replace_node It's wrong to OR shared permissions. It may lead to crash on further permission updates. Also, no needs to consider previously calculated permissions, as at this point we already bind all new parents and bdrv_get_cumulative_perm result is enough. So fix the bug by just set permissions by bdrv_get_cumulative_perm result. Bug was introduced in long ago `234ac1a902`, in 2.9. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20190824100740.61635-1-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-09-03 14:55:35 +02:00
Max Reitz	ceaca56fee	block: Add bdrv_has_zero_init_truncate() No .bdrv_has_zero_init() implementation returns 1 if growing the file would add non-zero areas (at least with PREALLOC_MODE_OFF), so using it in lieu of this new function was always safe. But on the other hand, it is possible that growing an image that is not zero-initialized would still add a zero-initialized area, like when using nonpreallocating truncation on a preallocated image. For callers that care only about truncation, not about creation with potential preallocation, this new function is useful. Alternatively, we could have added a PreallocMode parameter to bdrv_has_zero_init(). But the only user would have been qemu-img convert, which does not have a plain PreallocMode value right now -- it would have to parse the creation option to obtain it. Therefore, the simpler solution is to let bdrv_has_zero_init() inquire the preallocation status and add the new bdrv_has_zero_init_truncate() that presupposes PREALLOC_MODE_OFF. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190724171239.8764-4-mreitz@redhat.com Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-08-19 17:13:26 +02:00
John Snow	c4e4b0fa59	qapi: implement block-dirty-bitmap-remove transaction action It is used to do transactional movement of the bitmap (which is possible in conjunction with merge command). Transactional bitmap movement is needed in scenarios with external snapshot, when we don't want to leave copy of the bitmap in the base image. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20190708220502.12977-3-jsnow@redhat.com [Edited "since" version to 4.2 --js] Signed-off-by: John Snow <jsnow@redhat.com>	2019-08-16 16:28:03 -04:00
Peter Maydell	e018ccb3fb	Block layer patches: - file-posix: Fix O_DIRECT alignment detection - Fixes for concurrent block jobs - block-backend: Queue requests while drained (fix IDE vs. job crashes) - qemu-img convert: Deprecate using -n and -o together - iotests: Migration tests with filter nodes - iotests: More media change tests -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJdVnduAAoJEH8JsnLIjy/W0IgQAKft/M3aDgt0sbTzQh8vdy6A yAfTnnSL4Z56+8qAsqhEnplC3rZxvTkg9AGOoNYHOZKl3FgRH9r8g9/Enemh4fWu MH52hiRf2ytlFVurIQal3aj9O+i0YTnzuvYbysvkH4ID5zbv2QnwdagtEcBxbbYL NZTMZBynDzp4rKIZ7p6T/kkaklLHh4vZrjW+Mzm3LQx9JJr8TwVNqqetSfc4VKIJ ByaNbbihDUVjQyIaJ24DXXJdzonGrrtSbSZycturc5FzXymzSRgrXZCeSKCs8X+i fjwMXH5v4/UfK511ILsXiumeuxBfD2Ck4sAblFxVo06oMPRNmsAKdRLeDByE7IC1 lWep/pB3y/au9CW2/pkWJOiaz5s5iuv2fFYidKUJ0KQ1dD7G8M9rzkQlV3FUmTZO jBKSxHEffXsYl0ojn0vGmZEd7FAPi3fsZibGGws1dVgxlWI93aUJsjCq0E+lHIRD hEmQcjqZZa4taKpj0Y3Me05GkL7tH6RYA153jDNb8rPdzriGRCLZSObEISrOJf8H Mh0gTLi8KJNh6bULd12Ake1tKn7ZeTXpHH+gadz9OU7eIModh1qYTSHPlhy5oAv0 Hm9BikNlS1Hzw+a+EbLcOW7TrsteNeGr7r8T6QKPMq1sfsYcp3svbC2c+zVlQ6Ll mLoTssksXOkgBevVqSiS =T7L5 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches: - file-posix: Fix O_DIRECT alignment detection - Fixes for concurrent block jobs - block-backend: Queue requests while drained (fix IDE vs. job crashes) - qemu-img convert: Deprecate using -n and -o together - iotests: Migration tests with filter nodes - iotests: More media change tests # gpg: Signature made Fri 16 Aug 2019 10:29:18 BST # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: file-posix: Handle undetectable alignment qemu-img convert: Deprecate using -n and -o together block-backend: Queue requests while drained mirror: Keep mirror_top_bs drained after dropping permissions block: Remove blk_pread_unthrottled() iotests: Add test for concurrent stream/commit tests: Test mid-drain bdrv_replace_child_noperm() tests: Test polling in bdrv_drop_intermediate() block: Reduce (un)drains when replacing a child block: Keep subtree drained in drop_intermediate block: Simplify bdrv_filter_default_perms() iotests: Test migration with all kinds of filter nodes iotests: Move migration helpers to iotests.py iotests/118: Add -blockdev based tests iotests/118: Create test classes dynamically iotests/118: Test media change for scsi-cd Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-08-16 16:43:46 +01:00
Markus Armbruster	db72581598	Include qemu/main-loop.h less In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-08-16 13:31:52 +02:00
Max Reitz	debc292767	block: Reduce (un)drains when replacing a child Currently, bdrv_replace_child_noperm() undrains the parent until it is completely undrained, then re-drains it after attaching the new child node. This is a problem with bdrv_drop_intermediate(): We want to keep the whole subtree drained, including parents, while the operation is under way. bdrv_replace_child_noperm() breaks this by allowing every parent to become unquiesced briefly, and then redraining it. In fact, there is no reason why the parent should become unquiesced and be allowed to submit requests to the new child node if that new node is supposed to be kept drained. So if anything, we have to drain the parent before detaching the old child node. Conversely, we have to undrain it only after attaching the new child node. Thus, change the whole drain algorithm here: Calculate the number of times we have to drain/undrain the parent before replacing the child node then drain it (if necessary), replace the child node, and then undrain it. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-08-16 10:25:16 +02:00
Max Reitz	637d54a5f3	block: Keep subtree drained in drop_intermediate bdrv_drop_intermediate() calls BdrvChildRole.update_filename(). That may poll, thus changing the graph, which potentially breaks the QLIST_FOREACH_SAFE() loop. Just keep the whole subtree drained. This is probably the right thing to do anyway (dropping nodes while the subtree is not drained seems wrong). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-08-16 10:25:16 +02:00
Kevin Wolf	e444fa8312	block: Simplify bdrv_filter_default_perms() The same change as commit `2b23f28639` ('block/copy-on-read: Fix permissions for inactive node') made for the copy-on-read driver can be made for bdrv_filter_default_perms(): Retaining the old permissions from the BdrvChild if it is given complicates things unnecessarily when in the end this only means that the options set in the c == NULL case (i.e. during child creation) are retained. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2019-08-16 10:25:16 +02:00
Max Reitz	43eaaaef0e	block: Only the main loop can change AioContexts bdrv_set_aio_context_ignore() can only work in the main loop: bdrv_drained_begin() only works in the main loop and the node's (old) AioContext; and bdrv_drained_end() really only works in the main loop and the node's (new) AioContext (contrary to its current comment, which is just wrong). Consequentially, bdrv_set_aio_context_ignore() must be called from the main loop. Luckily, assuming that we can make block graph changes only from the main loop as well, all its callers do that already. Note that changing a node's context in a sense is an operation that changes the block graph, so it actually makes sense to require this function to be called from the main loop. Also, fix bdrv_drained_end()'s description. You can only use it from the main loop or the node's AioContext, and in the latter case, the whole subtree must be in the same context. Fixes: `e037c09c78` Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190722133054.21781-3-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-07-22 18:41:43 +02:00
Max Reitz	e037c09c78	block: Do not poll in bdrv_do_drained_end() We should never poll anywhere in bdrv_do_drained_end() (including its recursive callees like bdrv_drain_invoke()), because it does not cope well with graph changes. In fact, it has been written based on the postulation that no graph changes will happen in it. Instead, the callers that want to poll must poll, i.e. all currently globally available wrappers: bdrv_drained_end(), bdrv_subtree_drained_end(), bdrv_unapply_subtree_drain(), and bdrv_drain_all_end(). Graph changes there do not matter. They can poll simply by passing a pointer to a drained_end_counter and wait until it reaches 0. This patch also adds a non-polling global wrapper for bdrv_do_drained_end() that takes a drained_end_counter pointer. We need such a variant because now no function called anywhere from bdrv_do_drained_end() must poll. This includes BdrvChildRole.drained_end(), which already must not poll according to its interface documentation, but bdrv_child_cb_drained_end() just violates that by invoking bdrv_drained_end() (which does poll). Therefore, BdrvChildRole.drained_end() must take a drained_end_counter parameter, which bdrv_child_cb_drained_end() can pass on to the new bdrv_drained_end_no_poll() function. Note that we now have a pattern of all drained_end-related functions either polling or receiving a drained_end_counter to let the caller poll based on that. A problem with a single poll loop is that when the drained section in bdrv_set_aio_context_ignore() ends, some nodes in the subgraph may be in the old contexts, while others are in the new context already. To let the collective poll in bdrv_drained_end() work correctly, we must not hold a lock to the old context, so that the old context can make progress in case it is different from the current context. (In the process, remove the comment saying that the current context is always the old context, because it is wrong.) In all other places, all nodes in a subtree must be in the same context, so we can just poll that. The exception of course is bdrv_drain_all_end(), but that always runs in the main context, so we can just poll NULL (like bdrv_drain_all_begin() does). Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-07-19 13:19:16 +02:00
Max Reitz	804db8ea00	block: Introduce BdrvChild.parent_quiesce_counter Commit `5cb2737e92` laid out why bdrv_do_drained_end() must decrement the quiesce_counter after bdrv_drain_invoke(). It did not give a very good reason why it has to happen after bdrv_parent_drained_end(), instead only claiming symmetry to bdrv_do_drained_begin(). It turns out that delaying it for so long is wrong. Situation: We have an active commit job (i.e. a mirror job) from top to base for the following graph: filter \| [file] \| v top --[backing]--> base Now the VM is closed, which results in the job being cancelled and a bdrv_drain_all() happening pretty much simultaneously. Beginning the drain means the job is paused once whenever one of its nodes is quiesced. This is reversed when the drain ends. With how the code currently is, after base's drain ends (which means that it will have unpaused the job once), its quiesce_counter remains at 1 while it goes to undrain its parents (bdrv_parent_drained_end()). For some reason or another, undraining filter causes the job to be kicked and enter mirror_exit_common(), where it proceeds to invoke block_job_remove_all_bdrv(). Now base will be detached from the job. Because its quiesce_counter is still 1, it will unpause the job once more. So in total, undraining base will unpause the job twice. Eventually, this will lead to the job's pause_count going negative -- well, it would, were there not an assertion against this, which crashes qemu. The general problem is that if in bdrv_parent_drained_end() we undrain parent A, and then undrain parent B, which then leads to A detaching the child, bdrv_replace_child_noperm() will undrain A as if we had not done so yet; that is, one time too many. It follows that we cannot decrement the quiesce_counter after invoking bdrv_parent_drained_end(). Unfortunately, decrementing it before bdrv_parent_drained_end() would be wrong, too. Imagine the above situation in reverse: Undraining A leads to B detaching the child. If we had already decremented the quiesce_counter by that point, bdrv_replace_child_noperm() would undrain B one time too little; because it expects bdrv_parent_drained_end() to issue this undrain. But bdrv_parent_drained_end() won't do that, because B is no longer a parent. Therefore, we have to do something else. This patch opts for introducing a second quiesce_counter that counts how many times a child's parent has been quiesced (though c->role->drained_*). With that, bdrv_replace_child_noperm() just has to undrain the parent exactly that many times when removing a child, and it will always be right. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-07-19 13:19:16 +02:00
Max Reitz	3cf746b3f1	block: Deep-clear inherits_from BDS.inherits_from does not always point to an immediate parent node. When launching a block job with a filter node, for example, the node directly below the filter will not point to the filter, but keep its old pointee (above the filter). If that pointee goes away while the job is still running, the node's inherits_from will not be updated and thus point to garbage. To fix this, bdrv_unref_child() has to check not only the parent node's immediate children for nodes whose inherits_from needs to be cleared, but its whole subtree. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190703172813.6868-7-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-07-15 15:48:40 +02:00
Max Reitz	e5182c1c57	block: Add BDS.never_freeze The commit and the mirror block job must be able to drop their filter node at any point. However, this will not be possible if any of the BdrvChild links to them is frozen. Therefore, we need to prevent them from ever becoming frozen. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20190703172813.6868-2-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-07-15 15:48:40 +02:00
Max Reitz	1046779e64	block: Ignore loosening perm restrictions failures We generally assume that loosening permission restrictions can never fail. We have seen in the past that this assumption is wrong. This has led to crashes because we generally pass &error_abort when loosening permissions. However, a failure in such a case should actually be handled in quite the opposite way: It is very much not fatal, so qemu may report it, but still consider the operation successful. The only realistic problem is that qemu may then retain permissions and thus locks on images it actually does not require. But again, that is not fatal. To implement this behavior, we make all functions that change permissions and that pass &error_abort to the initiating function (bdrv_check_perm() or bdrv_child_check_perm()) evaluate the @loosen_restrictions value introduced in the previous patch. If it is true and an error did occur, we abort the permission update, discard the error, and instead report success to the caller. bdrv_child_try_set_perm() itself does not pass &error_abort, but it is the only public function to change permissions. As such, callers may pass &error_abort to it, expecting dropping permission restrictions to never fail. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-18 16:41:10 +02:00
Max Reitz	9eab154415	block: Add tighten_restrictions to check*_perm() This patch makes three functions report whether the necessary permission change tightens restrictions or not. These functions are: - bdrv_check_perm() - bdrv_check_update_perm() - bdrv_child_check_perm() Callers can use this result to decide whether a failure is fatal or not (see the next patch). Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-18 16:41:10 +02:00
Max Reitz	87ace5f8b6	block: Fix order in bdrv_replace_child() We have to start by applying the permission restrictions to new_bs before we can loosen them on old_bs. See the comment for the explanation. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-18 16:41:10 +02:00
Max Reitz	c1087f1206	block: Add bdrv_child_refresh_perms() If a block node uses bdrv_child_try_set_perm() to change the permission it takes on its child, the result may be very short-lived. If anything makes the block layer recalculate the permissions internally, it will invoke the node driver's .bdrv_child_perm() implementation. The permission/shared permissions masks that returns will then override the values previously passed to bdrv_child_try_set_perm(). If drivers want a child edge to have specific values for the permissions/shared permissions mask, it must return them in .bdrv_child_perm(). Consequentially, there is no need for them to pass the same values to bdrv_child_try_set_perm() then: It is better to have a function that invokes .bdrv_child_perm() and calls bdrv_child_try_set_perm() with the result. This patch adds such a function under the name of bdrv_child_refresh_perms(). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-18 16:41:10 +02:00
Vladimir Sementsov-Ogievskiy	b23c580c94	block: drop bs->job Drop remaining users of bs->job: 1. assertions actually duplicated by assert(!bs->refcnt) 2. trace-point seems not enough reason to change stream_start to return BlockJob pointer 3. Restricting creation of two jobs based on same bs is bad idea, as 3.1 Some jobs creates filters to be their main node, so, this check don't actually prevent creating second job on same real node (which will create another filter node) (but I hope it is restricted by other mechanisms) 3.2 Even without bs->job we have two systems of permissions: op-blockers and BLK_PERM 3.3 We may want to run several jobs on one node one day And finally, drop bs->job pointer itself. Hurrah! Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-18 16:41:10 +02:00
Kevin Wolf	42a65f02f9	block: Remove bdrv_set_aio_context() All callers of bdrv_set_aio_context() are eliminated now, they have moved to bdrv_try_set_aio_context() and related safe functions. Remove bdrv_set_aio_context(). With this, we can now know that the .set_aio_ctx callback must be present in bdrv_set_aio_context_ignore() because bdrv_can_set_aio_context() would have returned false previously, so instead of checking the condition, we can assert it. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-04 16:55:58 +02:00
Kevin Wolf	d0ee0204f4	block: Remove wrong bdrv_set_aio_context() calls The mirror and commit block jobs use bdrv_set_aio_context() to move their filter node into the right AioContext before hooking it up in the graph. Similarly, bdrv_open_backing_file() explicitly moves the backing file node into the right AioContext first. This isn't necessary any more, they get automatically moved into the right context now when attaching them. However, in the case of bdrv_open_backing_file() with a node reference, it's actually not only unnecessary, but even wrong: The unchecked bdrv_set_aio_context() changes the AioContext of the child node even if other parents require it to retain the old context. So this is not only a simplification, but a bug fix, too. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1684342 Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-04 15:22:22 +02:00
Kevin Wolf	ad943dcb22	block: Move node without parents to main AioContext A node should only be in a non-default AioContext if a user is attached to it that requires this. When the last parent of a node is gone, it can move back to the main AioContext. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-04 15:22:22 +02:00
Kevin Wolf	132ada80c4	block: Adjust AioContexts when attaching nodes So far, we only made sure that updating the AioContext of a node affected the whole subtree. However, if a node is newly attached to a new parent, we also need to make sure that both the subtree of the node and the parent are in the same AioContext. This tries to move the new child node to the parent AioContext and returns an error if this isn't possible. BlockBackends now actually apply their AioContext to their root node. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-04 15:22:22 +02:00
Kevin Wolf	d861ab3acf	block: Add BlockBackend.ctx This adds a new parameter to blk_new() which requires its callers to declare from which AioContext this BlockBackend is going to be used (or the locks of which AioContext need to be taken anyway). The given context is only stored and kept up to date when changing AioContexts. Actually applying the stored AioContext to the root node is saved for another commit. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-04 15:22:22 +02:00
Kevin Wolf	f871abd60f	block: Drain source node in bdrv_replace_node() Instead of just asserting that no requests are in flight in bdrv_replace_node(), which is a requirement that most callers ignore, we can just drain the source node right there. This fixes at least starting a commit job while I/O is active on the backing chain, but probably other callers, too. Having requests in flight on the target node isn't a problem because the target just gets new parents, but the call path of running requests isn't modified. So we can just drop this assertion without a replacement. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1711643 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2019-06-04 15:20:41 +02:00
Alberto Garcia	b441dc71c0	block: Make bdrv_root_attach_child() unref child_bs on failure A consequence of the previous patch is that bdrv_attach_child() transfers the reference to child_bs from the caller to parent_bs, which will drop it on bdrv_close() or when someone calls bdrv_unref_child(). But this only happens when bdrv_attach_child() succeeds. If it fails then the caller is responsible for dropping the reference to child_bs. This patch makes bdrv_attach_child() take the reference also when there is an error, freeing the caller for having to do it. A similar situation happens with bdrv_root_attach_child(), so the changes on this patch affect both functions. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 20dfb3d9ccec559cdd1a9690146abad5d204a186.1557754872.git.berto@igalia.com [mreitz: Removed now superfluous BdrvChild * variable in bdrv_open_child()] Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-05-28 20:30:55 +02:00
Alberto Garcia	dd4118c792	block: Use bdrv_unref_child() for all children in bdrv_close() bdrv_unref_child() does the following things: - Updates the child->bs->inherits_from pointer. - Calls bdrv_detach_child() to remove the BdrvChild from bs->children. - Calls bdrv_unref() to unref the child BlockDriverState. When bdrv_unref_child() was introduced in commit `33a604075c` it was not used in bdrv_close() because the drivers that had additional children (like quorum or blkverify) had already called bdrv_unref() on their children during their own close functions. This was changed later (in `0bd6e91a7e` for quorum, in `3e586be0b2` for blkverify) so there's no reason not to use bdrv_unref_child() in bdrv_close() anymore. After this there's also no need to remove bs->backing and bs->file separately from the rest of the children, so bdrv_close() can be simplified. Now bdrv_close() unrefs all children (before this patch it was only bs->file and bs->backing). As a result, none of the callers of brvd_attach_child() should remove their reference to child_bs (because this function effectively steals that reference). This patch updates a couple of tests that were doing their own bdrv_unref(). Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 6d1d5feaa53aa1ab127adb73d605dc4503e3abd5.1557754872.git.berto@igalia.com [mreitz: s/where/were/] Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-05-28 20:30:55 +02:00
Max Reitz	481e0eeef4	block: Improve "Block node is read-only" message This message does not make any sense when it appears as the response to making an R/W node read-only. We should detect that case and emit a different message, then. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-05-20 17:08:57 +02:00
Kevin Wolf	53a7d04185	block: Propagate AioContext change to parents All block nodes and users in any connected component of the block graph must be in the same AioContext, so changing the AioContext of one node must not only change all of its children, but all of its parents (and in turn their children etc.) as well. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-05-20 17:08:56 +02:00
Kevin Wolf	0d83708a1d	block: Move recursion to bdrv_set_aio_context() Instead of having two recursions, in bdrv_attach_aio_context() and in bdrv_detach_aio_context(), just having one recursion is enough. Said functions are only about a single node now. It is important that the recursion doesn't happen between detaching and attaching a context to the current node because the nested call will drain the node, and draining with a NULL context would segfault. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-05-20 17:08:56 +02:00
Kevin Wolf	a3a683c33d	block: Make bdrv_attach/detach_aio_context() static Since commit `b97511c7bc`, there is no reason for block drivers any more to call these functions (see the function comment in block_int.h). They are now just internal helper functions for bdrv_set_aio_context() and can be made static. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-05-20 17:08:56 +02:00
Kevin Wolf	5d2318499f	block: Add bdrv_try_set_aio_context() Eventually, we want to make sure that all parents and all children of a node are in the same AioContext as the node itself. This means that changing the AioContext may fail because one of the other involved parties (e.g. a guest device that was configured with an iothread) cannot allow switching to a different AioContext. Introduce a set of functions that allow to first check whether all involved nodes can switch to a new context and only then do the actual switch. The check recursively covers children and parents. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-05-20 17:08:56 +02:00
Anton Kuchin	30c321f97f	block: remove bs from lists before closing Close involves flush that can be performed asynchronously and bs must be protected from being referenced before it is deleted. Signed-off-by: Anton Kuchin <antonkuchin@yandex-team.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-05-10 16:45:40 +02:00
Nikita Alekseev	66a5bdf309	block: Add coroutine_fn to bdrv_check_co_entry bdrv_check_co_entry calls bdrv_co_check, which is a coroutine function. Thus, it also needs to be marked as a coroutine. Signed-off-by: Nikita Alekseev <n.alekseev2104@gmail.com> Message-id: 20190401093051.16488-1-n.alekseev2104@gmail.com Message-Id: <20190401093051.16488-1-n.alekseev2104@gmail.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-05-10 10:53:21 +01:00
Alberto Garcia	0b3ca76e52	block: Assert that drv->bdrv_child_perm is set in bdrv_child_perm() There is no need to check for this because all block drivers that have children implement bdrv_child_perm and all callers already ensure that bs->drv is set. Furthermore, if this check would fail then the callers would end up with uninitialized values for nperm and nshared. This patch replaces the check with an assertion. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 20190404112953.4058-1-berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-05-07 17:14:21 +02:00
Kevin Wolf	1bffe1ae7a	block: Fix AioContext switch for bs->drv == NULL Even for block nodes with bs->drv == NULL, we can't just ignore a bdrv_set_aio_context() call. Leaving the node in its old context can mean that it's still in an iothread context in bdrv_close_all() during shutdown, resulting in an attempted unlock of the AioContext lock which we don't hold. This is an example stack trace of a related crash: #0 0x00007ffff59da57f in raise () at /lib64/libc.so.6 #1 0x00007ffff59c4895 in abort () at /lib64/libc.so.6 #2 0x0000555555b97b1e in error_exit (err=<optimized out>, msg=msg@entry=0x555555d386d0 <__func__.19059> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36 #3 0x0000555555b97f7f in qemu_mutex_unlock_impl (mutex=mutex@entry=0x5555568002f0, file=file@entry=0x555555d378df "util/async.c", line=line@entry=507) at util/qemu-thread-posix.c:97 #4 0x0000555555b92f55 in aio_context_release (ctx=ctx@entry=0x555556800290) at util/async.c:507 #5 0x0000555555b05cf8 in bdrv_prwv_co (child=child@entry=0x7fffc80012f0, offset=offset@entry=131072, qiov=qiov@entry=0x7fffffffd4f0, is_write=is_write@entry=true, flags=flags@entry=0) at block/io.c:833 #6 0x0000555555b060a9 in bdrv_pwritev (qiov=0x7fffffffd4f0, offset=131072, child=0x7fffc80012f0) at block/io.c:990 #7 0x0000555555b060a9 in bdrv_pwrite (child=0x7fffc80012f0, offset=131072, buf=<optimized out>, bytes=<optimized out>) at block/io.c:990 #8 0x0000555555ae172b in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568cc740, i=i@entry=0) at block/qcow2-cache.c:51 #9 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568cc740) at block/qcow2-cache.c:248 #10 0x0000555555ae15de in qcow2_cache_flush (bs=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259 #11 0x0000555555ae16b1 in qcow2_cache_flush_dependency (c=0x5555568a1700, c=0x5555568a1700, bs=0x555556810680) at block/qcow2-cache.c:194 #12 0x0000555555ae16b1 in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568a1700, i=i@entry=0) at block/qcow2-cache.c:194 #13 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568a1700) at block/qcow2-cache.c:248 #14 0x0000555555ae15de in qcow2_cache_flush (bs=bs@entry=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259 #15 0x0000555555ad242c in qcow2_inactivate (bs=bs@entry=0x555556810680) at block/qcow2.c:2124 #16 0x0000555555ad2590 in qcow2_close (bs=0x555556810680) at block/qcow2.c:2153 #17 0x0000555555ab0c62 in bdrv_close (bs=0x555556810680) at block.c:3358 #18 0x0000555555ab0c62 in bdrv_delete (bs=0x555556810680) at block.c:3542 #19 0x0000555555ab0c62 in bdrv_unref (bs=0x555556810680) at block.c:4598 #20 0x0000555555af4d72 in blk_remove_bs (blk=blk@entry=0x5555568103d0) at block/block-backend.c:785 #21 0x0000555555af4dbb in blk_remove_all_bs () at block/block-backend.c:483 #22 0x0000555555aae02f in bdrv_close_all () at block.c:3412 #23 0x00005555557f9796 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4776 The reproducer I used is a qcow2 image on gluster volume, where the virtual disk size (4 GB) is larger than the gluster volume size (64M), so we can easily trigger an ENOSPC. This backend is assigned to a virtio-blk device using an iothread, and then from the guest a 'dd if=/dev/zero of=/dev/vda bs=1G count=1' causes the VM to stop because of an I/O error. qemu_gluster_co_flush_to_disk() sets bs->drv = NULL on error, so when virtio-blk stops the dataplane, the block nodes stay in the iothread AioContext. A 'quit' monitor command issued from this paused state crashes the process. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1631227 Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>	2019-04-30 15:29:00 +02:00
Kevin Wolf	3f48686fac	block: Forward 'discard' to temporary overlay When bdrv_temp_snapshot_options() is called for snapshot=on, the 'discard' option in the options QDict hasn't been parsed and merged into the flags yet. So copy the dict entry to make sure that the temporary overlay enables discard when it was requested for the drive. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>	2019-04-08 16:48:46 +02:00
Alberto Garcia	0f0998f621	block: continue until base is found in bdrv_freeze_backing_chain() et al All three functions that handle the BdrvChild.frozen attribute walk the backing chain from 'bs' to 'base' and stop either when 'base' is found or at the end of the chain if 'base' is NULL. However if 'base' is not found then the functions return without errors as if it was NULL. This is wrong: if the caller passed an incorrect parameter that means that there is a bug in the code. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-04-02 12:04:44 +02:00
Kevin Wolf	2345bde647	block: Silence Coverity in bdrv_drop_intermediate() Coverity doesn't like that the return value of bdrv_check_update_perm() stays unused only in this place (CID 1399710). Even if checking local_err should be equivalent to checking ret < 0, let's switch to using the return value to be more consistent (and in case of a bug somewhere down the call chain, forgetting to assign errp is more likely than returning 0 for an error case). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Markus Armbruster <armbru@redhat.com>	2019-03-19 15:49:29 +01:00
Alberto Garcia	5019aece2a	block: Remove the AioContext parameter from bdrv_reopen_multiple() This parameter has been unused since `1a63a90750` Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-03-12 20:30:14 +01:00
Alberto Garcia	faf116b438	block: Add bdrv_reset_options_allowed() bdrv_reopen_prepare() receives a BDRVReopenState with (among other things) a new set of options to be applied to that BlockDriverState. If an option is missing then it means that we want to reset it to its default value rather than keep the previous one. This way the state of the block device after being reopened is comparable to that of a device added with "blockdev-add" using the same set of options. Not all options from all drivers can be changed this way, however. If the user attempts to reset an immutable option to its default value using this method then we must forbid it. This new function takes a BlockDriverState and a new set of options and checks if there's any option that was previously set but is missing from the new set of options. If the option is present in both sets we don't need to check that they have the same value. The loop at the end of bdrv_reopen_prepare() already takes care of that. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-03-12 20:30:14 +01:00
Alberto Garcia	cb828c31de	block: Allow changing the backing file on reopen This patch allows the user to change the backing file of an image that is being reopened. Here's what it does: - In bdrv_reopen_prepare(): check that the value of 'backing' points to an existing node or is null. If it points to an existing node it also needs to make sure that replacing the backing file will not create a cycle in the node graph (i.e. you cannot reach the parent from the new backing file). - In bdrv_reopen_commit(): perform the actual node replacement by calling bdrv_set_backing_hd(). There may be temporary implicit nodes between a BDS and its backing file (e.g. a commit filter node). In these cases bdrv_reopen_prepare() looks for the real (non-implicit) backing file and requires that the 'backing' option points to it. Replacing or detaching a backing file is forbidden if there are implicit nodes in the middle. Although x-blockdev-reopen is meant to be used like blockdev-add, there's an important thing that must be taken into account: the only way to set a new backing file is by using a reference to an existing node (previously added with e.g. blockdev-add). If 'backing' contains a dictionary with a new set of options ({"driver": "qcow2", "file": { ... }}) then it is interpreted that the _existing_ backing file must be reopened with those options. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-03-12 20:30:14 +01:00
Alberto Garcia	bacd9b87c4	block: Allow omitting the 'backing' option in certain cases Of all options of type BlockdevRef used to specify children in BlockdevOptions, 'backing' is the only one that is optional. For "x-blockdev-reopen" we want that if an option is omitted then it must be reset to its default value. The default value of 'backing' means that QEMU opens the backing file specified in the image metadata, but this is not something that we want to support for the reopen operation. Because of this the 'backing' option has to be specified during reopen, pointing to the existing backing file if we want to keep it, or pointing to a different one (or NULL) if we want to replace it (to be implemented in a subsequent patch). In order to simplify things a bit and not to require that the user passes the 'backing' option to every single block device even when it's clearly not necessary, this patch allows omitting this option if the block device being reopened doesn't have a backing file attached _and_ no default backing file is specified in the image metadata. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-03-12 20:30:14 +01:00
Alberto Garcia	8546632e61	block: Handle child references in bdrv_reopen_queue() Children in QMP are specified with BlockdevRef / BlockdevRefOrNull, which can contain a set of child options, a child reference, or NULL. In optional attributes like "backing" it can also be missing. Only the first case (set of child options) is being handled properly by bdrv_reopen_queue(). This patch deals with all the others. Here's how these cases should be handled when bdrv_reopen_queue() is deciding what to do with each child of a BlockDriverState: 1) Set of child options: if the child was implicitly created (i.e inherits_from points to the parent) then the options are removed from the parent's options QDict and are passed to the child with a recursive bdrv_reopen_queue() call. This case was already working fine. 2) Child reference: there's two possibilites here. 2a) Reference to the current child: if the child was implicitly created then it is put in the reopen queue, keeping its current set of options (since this was a child reference there was no way to specify a different set of options). If the child is not implicit then it keeps its current set of options but it is not reopened (and therefore does not inherit any new option from the parent). 2b) Reference to a different BDS: the current child is not put in the reopen queue at all. Passing a reference to a different BDS can be used to replace a child, although at the moment no driver implements this, so it results in an error. In any case, the current child is not going to be reopened (and might in fact disappear if it's replaced) 3) NULL: This is similar to (2b). Although no driver allows this yet it can be used to detach the current child so it should not be put in the reopen queue. 4) Missing option: at the moment "backing" is the only case where this can happen. With "blockdev-add", leaving "backing" out means that the default backing file is opened. We don't want to open a new image during reopen, so we require that "backing" is always present. We'll relax this requirement a bit in the next patch. If keep_old_opts is true and "backing" is missing then this behaves like 2a (the current child is reopened). Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-03-12 20:30:14 +01:00

1 2 3 4 5 ...

1492 Commits