mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Alberto Garcia	6bd858b311	block: Update BlockDriverState.inherits_from on bdrv_drop_intermediate() The previous patch fixed the inherits_from pointer after block-stream, and this one does the same for block-commit. When block-commit finishes and the 'top' node is not the topmost one from the backing chain then all nodes above 'base' up to and including 'top' are removed from the chain. The bdrv_drop_intermediate() call converts a chain like this one: base <- intermediate <- top <- active into this one: base <- active In a simple scenario each backing file from the first chain has the inherits_from attribute pointing to its parent. This means that reopening 'active' will recursively reopen all its children, whose options can be changed in the process. However after the 'block-commit' call base.inherits_from is NULL and the chain is broken, so 'base' does not inherit from 'active' and will not be reopened automatically: $ qemu-img create -f qcow2 hd0.qcow2 1M $ qemu-img create -f qcow2 -b hd0.qcow2 hd1.qcow2 $ qemu-img create -f qcow2 -b hd1.qcow2 hd2.qcow2 $ $QEMU -drive if=none,file=hd2.qcow2 { 'execute': 'block-commit', 'arguments': { 'device': 'none0', 'top': 'hd1.qcow2' } } { 'execute': 'human-monitor-command', 'arguments': { 'command-line': 'qemu-io none0 "reopen -o backing.l2-cache-size=2M"' } } { "return": "Cannot change the option 'backing.l2-cache-size'\r\n"} This patch updates base.inherits_from in this scenario, and adds a test case. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-11-22 19:37:31 +01:00
Alberto Garcia	0065c455f9	block: Update BlockDriverState.inherits_from on bdrv_set_backing_hd() When a BlockDriverState's child is opened (be it a backing file, the protocol layer, or any other) inherits_from is set to point to the parent node. Children opened separately and then attached to a parent don't have this pointer set. bdrv_reopen_queue_child() uses this to determine whether a node's children must also be reopened inheriting the options from the parent or not. If inherits_from points to the parent then the child is reopened and its options can be changed, like in this example: $ qemu-img create -f qcow2 hd0.qcow2 1M $ qemu-img create -f qcow2 hd1.qcow2 1M $ $QEMU -drive if=none,node-name=hd0,file=hd0.qcow2,\ backing.driver=qcow2,backing.file.filename=hd1.qcow2 (qemu) qemu-io hd0 "reopen -o backing.l2-cache-size=2M" If the child does not inherit from the parent then it does not get reopened and its options cannot be changed: $ $QEMU -drive if=none,node-name=hd1,file=hd1.qcow2 -drive if=none,node-name=hd0,file=hd0.qcow2,backing=hd1 (qemu) qemu-io hd0 "reopen -o backing.l2-cache-size=2M" Cannot change the option 'backing.l2-cache-size' If a disk image has a chain of backing files then all of them are also connected through their inherits_from pointers (i.e. it's possible to walk the chain in reverse order from base to top). However this is broken if the intermediate nodes are removed using e.g. block-stream because the inherits_from pointer from the base node becomes NULL: $ qemu-img create -f qcow2 hd0.qcow2 1M $ qemu-img create -f qcow2 -b hd0.qcow2 hd1.qcow2 $ qemu-img create -f qcow2 -b hd1.qcow2 hd2.qcow2 $ $QEMU -drive if=none,file=hd2.qcow2 (qemu) qemu-io none0 "reopen -o backing.l2-cache-size=2M" (qemu) block_stream none0 0 hd0.qcow2 (qemu) qemu-io none0 "reopen -o backing.l2-cache-size=2M" Cannot change the option 'backing.l2-cache-size' This patch updates the inherits_from pointer if the intermediate nodes of a backing chain are removed using bdrv_set_backing_hd(), and adds a test case for this scenario. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-11-22 19:37:31 +01:00
Alberto Garcia	2a3d4331fa	block: Fix update of BDRV_O_AUTO_RDONLY in update_flags_from_options() Commit `e35bdc123a` added the auto-read-only option and the code to update its corresponding flag in update_flags_from_options(), but forgot to clear the flag if auto-read-only is false. Signed-off-by: Alberto Garcia <berto@igalia.com> Reported-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-11-22 16:43:52 +01:00
Max Reitz	9ad08c4456	block: Always abort reopen after prepare succeeded bdrv_reopen_multiple() does not invoke bdrv_reopen_abort() for the element of the reopen queue for which bdrv_reopen_prepare() failed, because it assumes that the prepare function will have rolled back all changes already. However, bdrv_reopen_prepare() does not do this in every case: It may notice an error after BlockDriver.bdrv_reopen_prepare() succeeded, and it will not invoke BlockDriver.bdrv_reopen_abort() then; and neither will bdrv_reopen_multiple(), as explained above. This is wrong because we must always call .bdrv_reopen_commit() or .bdrv_reopen_abort() after .bdrv_reopen_prepare() has succeeded. Otherwise, the block driver has no chance to undo what it has done in its implementation of .bdrv_reopen_prepare(). To fix this, bdrv_reopen_prepare() has to call .bdrv_reopen_abort() if it wants to return an error after .bdrv_reopen_prepare() has succeeded. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-11-19 14:31:48 +01:00
Kevin Wolf	eaa2410f1e	block: Require auto-read-only for existing fallbacks Some block drivers have traditionally changed their node to read-only mode without asking the user. This behaviour has been marked deprecated since 2.11, expecting users to provide an explicit read-only=on option. Now that we have auto-read-only=on, enable these drivers to make use of the option. This is the only use of bdrv_set_read_only(), so we can make it a bit more specific and turn it into a bdrv_apply_auto_read_only() that is more convenient for drivers to use. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2018-11-05 15:09:55 +01:00
Kevin Wolf	e35bdc123a	block: Add auto-read-only option If a management application builds the block graph node by node, the protocol layer doesn't inherit its read-only option from the format layer any more, so it must be set explicitly. Backing files should work on read-only storage, but at the same time, a block job like commit should be able to reopen them read-write if they are on read-write storage. However, without option inheritance, reopen only changes the read-only option for the root node (typically the format layer), but not the protocol layer, so reopening fails (the format layer wants to get write permissions, but the protocol layer is still read-only). A simple workaround for the problem in the management tool would be to open the protocol layer always read-write and to make only the format layer read-only for backing files. However, sometimes the file is actually stored on read-only storage and we don't know whether the image can be opened read-write (for example, for NBD it depends on the server we're trying to connect to). This adds an option that makes QEMU try to open the image read-write, but allows it to degrade to a read-only mode without returning an error. The documentation for this option is consciously phrased in a way that allows QEMU to switch to a better model eventually: Instead of trying when the image is first opened, making the read-only flag dynamic and changing it automatically whenever the first BLK_PERM_WRITE user is attached or the last one is detached would be much more useful behaviour. Unfortunately, this more useful behaviour is also a lot harder to implement, and libvirt needs a solution now before it can switch to -blockdev, so let's start with this easier approach for now. Instead of adding a new auto-read-only option, turning the existing read-only into an enum (with a bool alternate for compatibility) was considered, but it complicated the implementation to the point that it didn't seem to be worth it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2018-11-05 15:09:55 +01:00
Kevin Wolf	eeae6a596b	block: Update flags in bdrv_set_read_only() To fully change the read-only state of a node, we must not only change bs->read_only, but also update bs->open_flags. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>	2018-11-05 15:09:55 +01:00
Alberto Garcia	415bbca86d	block: replace "discard" literal with BDRV_OPT_DISCARD macro Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-11-05 15:09:54 +01:00
Vladimir Sementsov-Ogievskiy	9c98f145df	dirty-bitmaps: clean-up bitmaps loading and migration logic This patch aims to bring the following behavior: 1. We don't load bitmaps, when started in inactive mode. It's the case of incoming migration. In this case we wait for bitmaps migration through migration channel (if 'dirty-bitmaps' capability is enabled) or for invalidation (to load bitmaps from the image). 2. We don't remove persistent bitmaps on inactivation. Instead, we only remove bitmaps after storing. This is the only way to restore bitmaps, if we decided to resume source after [failed] migration with 'dirty-bitmaps' capability enabled (which means, that bitmaps were not stored). 3. We load bitmaps on open and any invalidation, it's ok for all cases: - normal open - migration target invalidation with dirty-bitmaps capability (bitmaps are migrating through migration channel, they are not stored, so they should have IN_USE flag set and will be skipped when loading. However, it would fail if bitmaps are read-only[1]) - migration target invalidation without dirty-bitmaps capability (normal load of the bitmaps, if migrated with shared storage) - source invalidation with dirty-bitmaps capability (skip because IN_USE) - source invalidation without dirty-bitmaps capability (bitmaps were dropped, reload them) [1]: to accurately handle this, migration of read-only bitmaps is explicitly forbidden in this patch. New mechanism for not storing bitmaps when migrate with dirty-bitmaps capability is introduced: migration field in BdrvDirtyBitmap. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: John Snow <jsnow@redhat.com>	2018-10-29 16:23:17 -04:00
Markus Armbruster	da7e92cac9	block: Clean up bdrv_img_create()'s error reporting bdrv_img_create() takes an Error ** argument and uses it in the conventional way, except for one place: when qemu_opts_do_parse() fails, it first reports its error to stderr or the HMP monitor with error_report_err(), then error_setg()'s a generic error. When the caller reports that second error similarly, this produces two consecutive error messages on stderr or the HMP monitor. When the caller does something else with it, such as send it via QMP, the first error still goes to stderr or the HMP monitor. Fortunately, no such caller exists. Simply use the first error as is. Update expected output of qemu-iotest 049 accordingly. Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20181017082702.5581-37-armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2018-10-19 14:51:34 +02:00
Markus Armbruster	4b5766488f	error: Fix use of error_prepend() with &error_fatal, &error_abort From include/qapi/error.h: * Pass an existing error to the caller with the message modified: * error_propagate(errp, err); * error_prepend(errp, "Could not frobnicate '%s': ", name); Fei Li pointed out that doing error_propagate() first doesn't work well when @errp is &error_fatal or &error_abort: the error_prepend() is never reached. Since I doubt fixing the documentation will stop people from getting it wrong, introduce error_propagate_prepend(), in the hope that it lures people away from using its constituents in the wrong order. Update the instructions in error.h accordingly. Convert existing error_prepend() next to error_propagate to error_propagate_prepend(). If any of these get reached with &error_fatal or &error_abort, the error messages improve. I didn't check whether that's the case anywhere. Cc: Fei Li <fli@suse.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20181017082702.5581-2-armbru@redhat.com>	2018-10-19 14:51:34 +02:00
Alberto Garcia	543770bd2e	block: Allow changing 'detect-zeroes' on reopen 'detect-zeroes' is one of the basic BlockdevOptions available for all drivers, but it's not handled by bdrv_reopen_prepare(), so any attempt to change it results in an error: (qemu) qemu-io virtio0 "reopen -o detect-zeroes=on" Cannot change the option 'detect-zeroes' Since there's no reason why we shouldn't allow changing it and the implementation is simple let's just do it. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-10-01 12:51:12 +02:00
Alberto Garcia	593b307197	block: Allow changing 'discard' on reopen 'discard' is one of the basic BlockdevOptions available for all drivers, but it's not handled by bdrv_reopen_prepare() so any attempt to change it results in an error: (qemu) qemu-io virtio0 "reopen -o discard=on" Cannot change the option 'discard' Since there's no reason why we shouldn't allow changing it and the implementation is simple let's just do it. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-10-01 12:51:12 +02:00
Alberto Garcia	57f9db9a94	block: Forbid trying to change unsupported options during reopen The bdrv_reopen_prepare() function checks all options passed to each BlockDriverState (in the reopen_state->options QDict) and makes all necessary preparations to apply the option changes requested by the user. Options are removed from the QDict as they are processed, so at the end of bdrv_reopen_prepare() only the options that can't be changed are left. Then a loop goes over all remaining options and verifies that the old and new values are identical, returning an error if they're not. The problem is that at the moment there are options that are removed from the QDict although they can't be changed. The consequence of this is any modification to any of those options is silently ignored: (qemu) qemu-io virtio0 "reopen -o discard=on" This happens when all options from bdrv_runtime_opts are removed from the QDict but then only a few of them are processed. Since it's especially important that "node-name" and "driver" are not changed, the code puts them back into the QDict so they are checked at the end of the function. Instead of putting only those two options back into the QDict, this patch puts all unprocessed options using qemu_opts_to_qdict(). update_flags_from_options() also needs to be modified to prevent BDRV_OPT_CACHE_NO_FLUSH, BDRV_OPT_CACHE_DIRECT and BDRV_OPT_READ_ONLY from going back to the QDict. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-10-01 12:51:12 +02:00
Alberto Garcia	db905283b8	block: Allow child references on reopen In the previous patches we removed all child references from bs->{options,explicit_options} because keeping them is useless and wrong. Because of this, any attempt to reopen a BlockDriverState using a child reference as one of its options would result in a failure, because bdrv_reopen_prepare() would detect that there's a new option (the child reference) that wasn't present in bs->options. But passing child references on reopen can be useful. It's a way to specify a BDS's child without having to pass recursively all of the child's options, and if the reference points to a different BDS then this can allow us to replace the child. However, replacing the child is something that needs to be implemented case by case and only when it makes sense. For now, this patch allows passing a child reference as long as it points to the current child of the BlockDriverState. It's also important to remember that, as a consequence of the previous patches, this child reference will be removed from bs->{options,explicit_options} after the reopening has been completed. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-10-01 12:51:12 +02:00
Alberto Garcia	a600aaddc3	block: Don't look for child references in append_open_options() In the previous patch we removed child references from bs->options, so there's no need to look for them here anymore. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-10-01 12:51:12 +02:00
Alberto Garcia	50196d7a7c	block: Remove child references from bs->{options,explicit_options} Block drivers allow opening their children using a reference to an existing BlockDriverState. These references remain stored in the 'options' and 'explicit_options' QDicts, but we don't need to keep them once everything is open. What is more important, these values can become wrong if the children change: $ qemu-img create -f qcow2 hd0.qcow2 10M $ qemu-img create -f qcow2 hd1.qcow2 10M $ qemu-img create -f qcow2 hd2.qcow2 10M $ $QEMU -drive if=none,file=hd0.qcow2,node-name=hd0 \ -drive if=none,file=hd1.qcow2,node-name=hd1,backing=hd0 \ -drive file=hd2.qcow2,node-name=hd2,backing=hd1 After this hd2 has hd1 as its backing file. Now let's remove it using block_stream: (qemu) block_stream hd2 0 hd0.qcow2 Now hd0 is the backing file of hd2, but hd2's options QDicts still contain backing=hd1. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-10-01 12:51:12 +02:00
Kevin Wolf	cfe29d8294	block: Use a single global AioWait When draining a block node, we recurse to its parent and for subtree drains also to its children. A single AIO_WAIT_WHILE() is then used to wait for bdrv_drain_poll() to become true, which depends on all of the nodes we recursed to. However, if the respective child or parent becomes quiescent and calls bdrv_wakeup(), only the AioWait of the child/parent is checked, while AIO_WAIT_WHILE() depends on the AioWait of the original node. Fix this by using a single AioWait for all callers of AIO_WAIT_WHILE(). This may mean that the draining thread gets a few more unnecessary wakeups because an unrelated operation got completed, but we already wake it up when something _could_ have changed rather than only if it has certainly changed. Apart from that, drain is a slow path anyway. In theory it would be possible to use wakeups more selectively and still correctly, but the gains are likely not worth the additional complexity. In fact, this patch is a nice simplification for some places in the code. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-09-25 15:50:15 +02:00
Alberto Garcia	8961be33e8	block: Fix use after free error in bdrv_open_inherit() When a block device is opened with BDRV_O_SNAPSHOT and the bdrv_append_temp_snapshot() call fails then the error code path tries to unref the already destroyed 'options' QDict. This can be reproduced easily by setting TMPDIR to a location where the QEMU process can't write: $ TMPDIR=/nonexistent $QEMU -drive driver=null-co,snapshot=on Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-09-25 15:50:15 +02:00
Markus Armbruster	dd98e84819	qjson: Have qobject_from_json() & friends reject empty and blank The last case where qobject_from_json() & friends return null without setting an error is empty or blank input. Callers: * block.c's parse_json_protocol() reports "Could not parse the JSON options". It's marked as a work-around, because it also covered actual bugs, but they got fixed in the previous few commits. * qobject_input_visitor_new_str() reports "JSON parse error". Also marked as work-around. The recent fixes have made this unreachable, because it currently gets called only for input starting with '{'. * check-qjson.c's empty_input() and blank_input() demonstrate the behavior. * The other callers are not affected since they only pass input with exactly one JSON value or, in the case of negative tests, one error. Fail with "Expecting a JSON value" instead of returning null, and simplify callers. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180823164025.12553-48-armbru@redhat.com>	2018-08-24 20:26:37 +02:00
Alberto Garcia	261dbcb18f	block: Simplify append_open_options() This function returns a BDS's driver-specific options, excluding also those from its children. Since we have just removed all children options from bs->options there's no need to do this last step. We allow references to children, though ("backing": "node0"), so those we still have to remove. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-08-15 12:50:39 +02:00
Alberto Garcia	4c8350fe17	block: Update bs->options if bdrv_reopen() succeeds If bdrv_reopen() succeeds then bs->explicit_options is updated with the new values, but bs->options never changes. Here's an example: { "execute": "blockdev-add", "arguments": { "driver": "qcow2", "node-name": "hd0", "overlap-check": "all", "file": { "driver": "file", "filename": "hd0.qcow2" } } } After this, both bs->options and bs->explicit_options contain "overlap-check": "all". Now let's change that using qemu-io's reopen command: (qemu) qemu-io hd0 "reopen -o overlap-check=none" After this, bs->explicit_options contains the new value but bs->options still keeps the old one. This patch updates bs->options after a BDS has been successfully reopened. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-08-15 12:50:39 +02:00
Alberto Garcia	1bab38e7bd	block: Simplify bdrv_reopen_abort() If a bdrv_reopen_multiple() call fails, then the explicit_options QDict has to be deleted for every entry in the reopen queue. This must happen regardless of whether that entry's bdrv_reopen_prepare() call succeeded or not. This patch simplifies the cleanup code a bit. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-08-15 12:50:39 +02:00
Alberto Garcia	2f624b80ba	block: Remove children options from bs->{options,explicit_options} When bdrv_open_inherit() opens a BlockDriverState the options QDict can contain options for some of its children, passed in the form of child-name.option=value So while each child is opened with that subset of options, those same options remain stored in the parent BDS, leaving (at least) two copies of each one of them ("child-name.option=value" in the parent and "option=value" in the child). Having the children options stored in the parent is unnecessary and it can easily lead to an inconsistent state: $ qemu-img create -f qcow2 hd0.qcow2 10M $ qemu-img create -f qcow2 -b hd0.qcow2 hd1.qcow2 $ qemu-img create -f qcow2 -b hd1.qcow2 hd2.qcow2 $ $QEMU -drive file=hd2.qcow2,node-name=hd2,backing.node-name=hd1 This opens a chain of images hd0 <- hd1 <- hd2. Now let's remove hd1 using block_stream: (qemu) block_stream hd2 0 hd0.qcow2 After this hd2 contains backing.node-name=hd1, which is no longer correct because hd1 doesn't exist anymore. This patch removes all children options from the parent dictionaries at the end of bdrv_open_inherit() and bdrv_reopen_queue_child(). Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-08-15 12:50:39 +02:00
Vladimir Sementsov-Ogievskiy	3c005293c2	block: make .bdrv_close optional Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-08-15 12:50:39 +02:00
Stefan Weil	50d6a8a352	block: Fix typos in comments (found by codespell) Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-23 16:50:43 +02:00
Kevin Wolf	4be6a6d118	block: Poll after drain on attaching a node Commit `dcf94a23b1` ('block: Don't poll in parent drain callbacks') removed polling in bdrv_child_cb_drained_begin() on the grounds that the original bdrv_drain() already will poll and BdrvChildRole.drained_begin calls must not cause graph changes (and therefore must not call aio_poll() or the recursion through the graph will break). This reasoning is correct for calls through bdrv_do_drained_begin(). However, BdrvChildRole.drained_begin is also called when a node that is already in a drained section (i.e. bdrv_do_drained_begin() has already returned and therefore can't poll any more) is attached to a new parent. In this case, we must explicitly poll to have all requests completed before the drained new child can be attached to the parent. In bdrv_replace_child_noperm(), we know that we're not inside the recursion of bdrv_do_drained_begin() because graph changes are not allowed there, and bdrv_replace_child_noperm() is a graph change. The call of BdrvChildRole.drained_begin() must therefore be followed by a BDRV_POLL_WHILE() that waits for the completion of requests. Reported-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-10 10:36:15 +02:00
Kevin Wolf	824808dd77	block: Don't silently truncate node names If the user passes a too long node name string, we silently truncate it to fit into BlockDriverState.node_name, i.e. to 31 characters. Apart from surprising the user when the node has a different name than requested, this also bypasses the check for duplicate names, so that the same name can be assigned to multiple nodes. Fix this by just making too long node names an error. Reported-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-05 10:29:19 +02:00
Ari Sundholm	7ae9f3f61b	block: Move two block permission constants to the relevant enum This allows using the two constants outside of block.c, which will happen in a subsequent patch. Signed-off-by: Ari Sundholm <ari@tuxera.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-05 10:29:19 +02:00
Kevin Wolf	3d9f2d2af6	block: Move bdrv_truncate() implementation to io.c This moves the bdrv_truncate() implementation from block.c to block/io.c so it can have access to the tracked requests infrastructure. This involves making refresh_total_sectors() public (in block_int.h). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-29 14:20:56 +02:00
Kevin Wolf	061ca8a368	block: Convert .bdrv_truncate callback to coroutine_fn bdrv_truncate() is an operation that can block (even for a quite long time, depending on the PreallocMode) in I/O paths that shouldn't block. Convert it to a coroutine_fn so that we have the infrastructure for drivers to make their .bdrv_co_truncate implementation asynchronous. This change could potentially introduce new race conditions because bdrv_truncate() isn't necessarily executed atomically any more. Whether this is a problem needs to be evaluated for each block driver that supports truncate: * file-posix/win32, gluster, iscsi, nfs, rbd, ssh, sheepdog: The protocol drivers are trivially safe because they don't actually yield yet, so there is no change in behaviour. * copy-on-read, crypto, raw-format: Essentially just filter drivers that pass the request to a child node, no problem. * qcow2: The implementation modifies metadata, so it needs to hold s->lock to be safe with concurrent I/O requests. In order to avoid double locking, this requires pulling the locking out into preallocate_co() and using qcow2_write_caches() instead of bdrv_flush(). * qed: Does a single header update, this is fine without locking. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-29 14:20:56 +02:00
Max Reitz	ec9f10fe06	block: Generalize should_update_child() rule Currently, bdrv_replace_node() refuses to create loops from one BDS to itself if the BDS to be replaced is the backing node of the BDS to replace it: Say there is a node A and a node B. Replacing B by A means making all references to B point to A. If B is a child of A (i.e. A has a reference to B), that would mean we would have to make this reference point to A itself -- so we'd create a loop. bdrv_replace_node() (through should_update_child()) refuses to do so if B is the backing node of A. There is no reason why we should create loops if B is not the backing node of A, though. The BDS graph should never contain loops, so we should always refuse to create them. If B is a child of A and B is to be replaced by A, we should simply leave B in place there because it is the most sensible choice. A more specific argument would be: Putting filter drivers into the BDS graph is basically the same as appending an overlay to a backing chain. But the main child BDS of a filter driver is not "backing" but "file", so restricting the no-loop rule to backing nodes would fail here. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20180613181823.13618-7-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-06-18 17:04:54 +02:00
Kevin Wolf	0f12264e7a	block: Allow graph changes in bdrv_drain_all_begin/end sections bdrv_drain_all_*() used bdrv_next() to iterate over all root nodes and did a subtree drain for each of them. This works fine as long as the graph is static, but sadly, reality looks different. If the graph changes so that root nodes are added or removed, we would have to compensate for this. bdrv_next() returns each root node only once even if it's the root node for multiple BlockBackends or for a monitor-owned block driver tree, which would only complicate things. The much easier and more obviously correct way is to fundamentally change the way the functions work: Iterate over all BlockDriverStates, no matter who owns them, and drain them individually. Compensation is only necessary when a new BDS is created inside a drain_all section. Removal of a BDS doesn't require any action because it's gone afterwards anyway. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	6cd5c9d7b2	block: ignore_bds_parents parameter for drain functions In the future, bdrv_drained_all_begin/end() will drain all individual nodes separately rather than whole subtrees. This means that we don't want to propagate the drain to all parents any more: If the parent is a BDS, it will already be drained separately. Recursing to all parents is unnecessary work and would make it an O(n²) operation. Prepare the drain function for the changed drain_all by adding an ignore_bds_parents parameter to the internal implementation that prevents the propagation of the drain to BDS parents. We still (have to) propagate it to non-BDS parents like BlockBackends or Jobs because those are not drained separately. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	dcf94a23b1	block: Don't poll in parent drain callbacks bdrv_do_drained_begin() is only safe if we have a single BDRV_POLL_WHILE() after quiescing all affected nodes. We cannot allow that parent callbacks introduce a nested polling loop that could cause graph changes while we're traversing the graph. Split off bdrv_do_drained_begin_quiesce(), which only quiesces a single node without waiting for its requests to complete. These requests will be waited for in the BDRV_POLL_WHILE() call down the call chain. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	fe4f0614ef	block: Drain recursively with a single BDRV_POLL_WHILE() Anything can happen inside BDRV_POLL_WHILE(), including graph changes that may interfere with its callers (e.g. child list iteration in recursive callers of bdrv_do_drained_begin). Switch to a single BDRV_POLL_WHILE() call for the whole subtree at the end of bdrv_do_drained_begin() to avoid such effects. The recursion happens now inside the loop condition. As the graph can only change between bdrv_drain_poll() calls, but not inside of it, doing the recursion here is safe. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	89bd030533	block: Really pause block jobs on drain We already requested that block jobs be paused in .bdrv_drained_begin, but no guarantee was made that the job was actually inactive at the point where bdrv_drained_begin() returned. This introduces a new callback BdrvChildRole.bdrv_drained_poll() and uses it to make bdrv_drain_poll() consider block jobs using the node to be drained. For the test case to work as expected, we have to switch from block_job_sleep_ns() to qemu_co_sleep_ns() so that the test job is even considered active and must be waited for when draining the node. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-18 15:03:25 +02:00
Max Reitz	609f45ea95	block: Add block-specific QDict header There are numerous QDict functions that have been introduced for and are used only by the block layer. Move their declarations into an own header file to reflect that. While qdict_extract_subqdict() is in fact used outside of the block layer (in util/qemu-config.c), it is still a function related very closely to how the block layer works with nested QDicts, namely by sometimes flattening them. Therefore, its declaration is put into this header as well and util/qemu-config.c includes it with a comment stating exactly which function it needs. Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20180509165530.29561-7-mreitz@redhat.com> [Copyright note tweaked, superfluous includes dropped] Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-15 14:49:44 +02:00
Max Reitz	cc02214097	block: Make bdrv_is_writable() public This is a useful function for the whole block layer, so make it public. At the same time, users outside of block.c probably do not need to make use of the reopen functionality, so rename the current function to bdrv_is_writable_after_reopen() and create a new bdrv_is_writable() function that just passes NULL to it for the reopen queue. Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20180606193702.7113-2-mreitz@redhat.com Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-06-11 16:18:45 +02:00
Max Reitz	d1402b5026	block: Add Error parameter to bdrv_amend_options Looking at the qcow2 code that is riddled with error_report() calls, this is really how it should have been from the start. Along the way, turn the target_version/current_version comparisons at the beginning of qcow2_downgrade() into assertions (the caller has to make sure these conditions are met), and rephrase the error message on using compat=1.1 to get refcount widths other than 16 bits. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20180509210023.20283-3-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-06-11 16:18:45 +02:00
Kevin Wolf	b3b5299d58	block: Cancel job in bdrv_close_all() callers Now that we cancel all jobs and not only block jobs on shutdown, doing that in bdrv_close_all() isn't really appropriate any more. Move the job_cancel_sync_all() call to the callers, and only assert that there are no jobs running in bdrv_close_all(). Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-05-23 14:30:51 +02:00
Kevin Wolf	3d70ff53b6	job: Move completion and cancellation to Job This moves the top-level job completion and cancellation functions from BlockJob to Job. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-05-23 14:30:51 +02:00
Marc-André Lureau	f5a74a5a50	qobject: Modify qobject_ref() to return obj For convenience and clarity, make it possible to call qobject_ref() at the time when the reference is associated with a variable, or argument, by making qobject_ref() return the same pointer as given. Use that to simplify the callers. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180419150145.24795-5-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Useless change to qobject_ref_impl() dropped, commit message improved slightly] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2018-05-04 08:27:53 +02:00
Marc-André Lureau	cb3e7f08ae	qobject: Replace qobject_incref/QINCREF qobject_decref/QDECREF Now that we can safely call QOBJECT() on QObject * as well as its subtypes, we can have macros qobject_ref() / qobject_unref() that work everywhere instead of having to use QINCREF() / QDECREF() for QObject and qobject_incref() / qobject_decref() for its subtypes. The replacement is mechanical, except I broke a long line, and added a cast in monitor_qmp_cleanup_req_queue_locked(). Unlike qobject_decref(), qobject_unref() doesn't accept void *. Note that the new macros evaluate their argument exactly once, thus no need to shout them. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180419150145.24795-4-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Rebased, semantic conflict resolved, commit message improved] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2018-05-04 08:27:53 +02:00
Max Reitz	4f7be2806e	block: Deprecate "backing": "" We have a clear replacement, so let's deprecate it. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-Id: <20180224154033.29559-8-mreitz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-19 14:58:36 -05:00
Max Reitz	e59a0cf17b	block: Handle null backing link Instead of converting all "backing": null instances into "backing": "", handle a null value directly in bdrv_open_inherit(). This enables explicitly null backing links for json:{} filenames. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-Id: <20180224154033.29559-7-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: rebase to qobject_to() parameter order and qapi headers split] Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-19 14:58:36 -05:00
Max Reitz	7dc847ebba	qapi: Replace qobject_to_X(o) by qobject_to(X, o) This patch was generated using the following Coccinelle script: @@ expression Obj; @@ ( - qobject_to_qnum(Obj) + qobject_to(QNum, Obj) \| - qobject_to_qstring(Obj) + qobject_to(QString, Obj) \| - qobject_to_qdict(Obj) + qobject_to(QDict, Obj) \| - qobject_to_qlist(Obj) + qobject_to(QList, Obj) \| - qobject_to_qbool(Obj) + qobject_to(QBool, Obj) ) and a bit of manual fix-up for overly long lines and three places in tests/check-qjson.c that Coccinelle did not find. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-Id: <20180224154033.29559-4-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: swap order from qobject_to(o, X), rebase to master, also a fix to latent false-positive compiler complaint about hw/i386/acpi-build.c] Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-19 14:58:36 -05:00
Fam Zheng	2c860e797a	block: Fix leak of ignore_children in error path Reported-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-19 12:01:39 +01:00
Fam Zheng	1a5297366f	block: Fix flags in reopen queue Reopen flags are not synchronized according to the bdrv_reopen_queue_child precedence until bdrv_reopen_prepare. It is a bit too late: we already check the consistency in bdrv_check_perm before that. This fixes the bug that when bdrv_reopen a RO node as RW, the flags for backing child are wrong. Before, we could recurse with flags.rw=1; now, role->inherit_options + update_flags_from_options will make sure to clear the bit when necessary. Note that this will not clear an explicitly set bit, as in the case of parallel block jobs (e.g. test_stream_parallel in 030), because the explicit options include 'read-only=false' (for an intermediate node used by a different job). Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-19 12:01:39 +01:00
Kevin Wolf	cd8b7aaa07	block: Fail bdrv_truncate() with negative size Most callers have their own checks, but something like this should also be checked centrally. As it happens, x-blockdev-create can pass negative image sizes to format drivers (because there is no QAPI type that would reject negative numbers) and triggers the check added by this patch. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2018-03-09 15:17:48 +01:00
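
The last entry above (cd8b7aaa07) describes checking a negative size centrally instead of relying on each caller. A minimal sketch of that pattern in C follows. It is not QEMU's code: `truncate_fn`, `demo_driver_truncate`, `demo_image_size`, and `image_truncate` are hypothetical names invented for illustration, and the choice of `-EINVAL` as the error value is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver callback type; not QEMU's real BlockDriver interface. */
typedef int (*truncate_fn)(int64_t new_size);

/* Hypothetical stand-in for a format driver that trusts its caller. */
int64_t demo_image_size;

int demo_driver_truncate(int64_t new_size)
{
    demo_image_size = new_size;
    return 0;
}

/*
 * Central entry point: reject a negative size before any driver sees it,
 * so that individual callers and drivers do not each need their own check.
 * Returning -EINVAL here is an assumption made for this sketch.
 */
int image_truncate(truncate_fn drv, int64_t new_size)
{
    assert(drv != NULL);
    if (new_size < 0) {
        return -EINVAL;
    }
    return drv(new_size);
}
```

With this shape, a call such as `image_truncate(demo_driver_truncate, -1)` fails before the driver callback runs, while a non-negative size is passed through unchanged.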

1201 Commits