mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Peter Krempa	facda5443f	qapi: Allow getting flat output from 'query-named-block-nodes' When a management application manages node names there's no reason to recurse into backing images in the output of query-named-block-nodes. Add a parameter to the command which will return just the top level structs. Signed-off-by: Peter Krempa <pkrempa@redhat.com> Message-Id: <4470f8c779abc404dcf65e375db195cd91a80651.1579509782.git.pkrempa@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [mreitz: Fixed coding style] Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-02-20 16:43:42 +01:00
Max Reitz	6b4907cf42	block: Remove bdrv_recurse_is_first_non_filter() It no longer has any users. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200218103454.296704-11-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 11:55:40 +01:00
Max Reitz	810803a87c	block: Use bdrv_recurse_can_replace() Let check_to_replace_node() use the more specialized bdrv_recurse_can_replace() instead of bdrv_recurse_is_first_non_filter(), which is too restrictive (or, in the case of quorum, sometimes not restrictive enough). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200218103454.296704-10-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 11:55:40 +01:00
Max Reitz	5d69b5ab85	block: Add bdrv_recurse_can_replace() After a couple of follow-up patches, this function will replace bdrv_recurse_is_first_non_filter() in check_to_replace_node(). bdrv_recurse_is_first_non_filter() is both not sufficiently specific for check_to_replace_node() (it allows cases that should not be allowed, like replacing child nodes of quorum with dissenting data that have more parents than just quorum), and it is too restrictive (it is perfectly fine to replace filters). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200218103454.296704-7-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 11:55:39 +01:00
Max Reitz	a851ad4cac	block: Drop bdrv_is_first_non_filter() It is unused now. (And it was ugly because it needed to explore all BDS chains from the top.) Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200218103454.296704-4-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 11:55:39 +01:00
Philippe Mathieu-Daudé	0beab8119f	block: Remove superfluous semicolons Fixes: `132ada80c4` Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200218094402.26625-4-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 10:53:56 +01:00
Pan Nengyuan	cb8956144c	block: fix memleaks in bdrv_refresh_filename If we call the qmp 'query-block' while qemu is working on 'block-commit', it will cause memleaks, the memory leak stack is as follow: Indirect leak of 12360 byte(s) in 3 object(s) allocated from: #0 0x7f80f0b6d970 in __interceptor_calloc (/lib64/libasan.so.5+0xef970) #1 0x7f80ee86049d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5249d) #2 0x55ea95b5bb67 in qdict_new /mnt/sdb/qemu-4.2.0-rc0/qobject/qdict.c:29 #3 0x55ea956cd043 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6427 #4 0x55ea956cc950 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6399 #5 0x55ea956cc950 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6399 #6 0x55ea956cc950 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6399 #7 0x55ea958818ea in bdrv_block_device_info /mnt/sdb/qemu-4.2.0-rc0/block/qapi.c:56 #8 0x55ea958879de in bdrv_query_info /mnt/sdb/qemu-4.2.0-rc0/block/qapi.c:392 #9 0x55ea9588b58f in qmp_query_block /mnt/sdb/qemu-4.2.0-rc0/block/qapi.c:578 #10 0x55ea95567392 in qmp_marshal_query_block qapi/qapi-commands-block-core.c:95 Indirect leak of 4120 byte(s) in 1 object(s) allocated from: #0 0x7f80f0b6d970 in __interceptor_calloc (/lib64/libasan.so.5+0xef970) #1 0x7f80ee86049d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5249d) #2 0x55ea95b5bb67 in qdict_new /mnt/sdb/qemu-4.2.0-rc0/qobject/qdict.c:29 #3 0x55ea956cd043 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6427 #4 0x55ea956cc950 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6399 #5 0x55ea956cc950 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6399 #6 0x55ea9569f301 in bdrv_backing_attach /mnt/sdb/qemu-4.2.0-rc0/block.c:1064 #7 0x55ea956a99dd in bdrv_replace_child_noperm /mnt/sdb/qemu-4.2.0-rc0/block.c:2283 #8 0x55ea956b9b53 in bdrv_replace_node /mnt/sdb/qemu-4.2.0-rc0/block.c:4196 #9 0x55ea956b9e49 in bdrv_append /mnt/sdb/qemu-4.2.0-rc0/block.c:4236 #10 0x55ea958c3472 in commit_start /mnt/sdb/qemu-4.2.0-rc0/block/commit.c:306 #11 0x55ea94b68ab0 in qmp_block_commit /mnt/sdb/qemu-4.2.0-rc0/blockdev.c:3459 #12 0x55ea9556a7a7 in qmp_marshal_block_commit qapi/qapi-commands-block-core.c:407 Fixes: `bb808d5f5c` Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Message-id: 20200116085600.24056-1-pannengyuan@huawei.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-02-06 13:47:45 +01:00
Alberto Garcia	e2a7423a11	block: Use a GString in bdrv_perm_names() This is a bit more efficient than having to allocate and free memory for each new permission. The default size (30) is enough for "consistent read, write, resize". Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 20200110171518.22168-1-berto@igalia.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-02-06 13:47:45 +01:00
Aarushi Mehta	f80f267373	blockdev: adds bdrv_parse_aio to use io_uring Signed-off-by: Aarushi Mehta <mehta.aaru20@gmail.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200120141858.587874-8-stefanha@redhat.com Message-Id: <20200120141858.587874-8-stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-01-30 20:59:41 +00:00
Max Reitz	cdb1cec89e	block: Use bdrv_qapi_perm_to_blk_perm() We can save some LoC in xdbg_graph_add_edge() by using bdrv_qapi_perm_to_blk_perm(). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20191108123455.39445-3-mreitz@redhat.com Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-01-06 13:43:06 +01:00
Max Reitz	7b1d9c4df0	block: Add bdrv_qapi_perm_to_blk_perm() We need some way to correlate QAPI BlockPermission values with BLK_PERM_* flags. We could: (1) have the same order in the QAPI definition as the the BLK_PERM_* flags are in LSb-first order. However, then there is no guarantee that they actually match (e.g. when someone modifies the QAPI schema without thinking of the BLK_PERM_* definitions). We could add static assertions, but these would break what’s good about this solution, namely its simplicity. (2) define the BLK_PERM_* flags based on the BlockPermission values. But this way whenever someone were to modify the QAPI order (perfectly sensible in theory), the BLK_PERM_* values would change. Because these values are used for file locking, this might break file locking between different qemu versions. Therefore, go the slightly more cumbersome way: Add a function to translate from the QAPI constants to the BLK_PERM_* flags. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20191108123455.39445-2-mreitz@redhat.com Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-01-06 13:43:06 +01:00
Kevin Wolf	7bb4941ace	block: Activate recursively even for already active nodes bdrv_invalidate_cache_all() assumes that all nodes in a given subtree are either active or inactive when it starts. Therefore, as soon as it arrives at an already active node, it stops. However, this assumption is wrong. For example, it's possible to take a snapshot of an inactive node, which results in an active overlay over an inactive backing file. The active overlay is probably also the root node of an inactive BlockBackend (blk->disable_perm == true). In this case, bdrv_invalidate_cache_all() does not need to do anything to activate the overlay node, but it still needs to recurse into the children and the parents to make sure that after returning success, really everything is activated. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2019-12-19 18:04:25 +01:00
Kevin Wolf	f6dc1c31d3	block: Error out on image creation with conflicting size options If both the create options (qemu-img create -o ...) and the size parameter were given, the size parameter was silently ignored. Instead, make specifying two sizes an error. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2019-12-18 11:20:57 +01:00
Kevin Wolf	ae0f57f0aa	block: Remove 'backing': null from bs->{explicit_,}options bs->options and bs->explicit_options shouldn't contain any options for child nodes. bdrv_open_inherited() takes care to remove any options that match a child name after opening the image and the same is done when reopening. However, we miss the case of 'backing': null, which is a child option, but results in no child being created. This means that a 'backing': null remains in bs->options and bs->explicit_options. A typical use for 'backing': null is in live snapshots: blockdev-add for the qcow2 overlay makes sure not to open the backing file (because it is already opened and blockdev-snapshot will attach it). After doing a blockdev-snapshot, bs->options and bs->explicit_options become inconsistent with the actual state (bs has a backing file now, but the options still say null). On the next occasion that the image is reopened, e.g. switching it from read-write to read-only when another snapshot is taken, the option will take effect again and the node incorrectly loses its backing file. Fix bdrv_open_inherited() to remove the 'backing' option from bs->options and bs->explicit_options even for the case where it specifies that no backing file is wanted. Reported-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Tested-by: Peter Krempa <pkrempa@redhat.com>	2019-11-18 17:50:08 +01:00
Wei Yang	038adc2f58	core: replace getpagesize() with qemu_real_host_page_size There are three page size in qemu: real host page size host page size target page size All of them have dedicate variable to represent. For the last two, we use the same form in the whole qemu project, while for the first one we use two forms: qemu_real_host_page_size and getpagesize(). qemu_real_host_page_size is defined to be a replacement of getpagesize(), so let it serve the role. [Note] Not fully tested for some arch or device. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191013021145.16011-3-richardw.yang@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-26 15:38:06 +02:00
Vladimir Sementsov-Ogievskiy	4dd09f6223	qcow2-bitmap: move bitmap reopen-rw code to qcow2_reopen_commit The only reason I can imagine for this strange code at the very-end of bdrv_reopen_commit is the fact that bs->read_only updated after calling drv->bdrv_reopen_commit in bdrv_reopen_commit. And in the same time, prior to previous commit, qcow2_reopen_bitmaps_rw did a wrong check for being writable, when actually it only need writable file child not self. So, as it's fixed, let's move things to correct place. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Acked-by: Max Reitz <mreitz@redhat.com> Message-id: 20190927122355.7344-10-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:53:28 -04:00
Vladimir Sementsov-Ogievskiy	fcd6a4f42d	block: reverse order for reopen commits It's needed to fix reopening qcow2 with bitmaps to RW. Currently it can't work, as qcow2 needs write access to file child, to mark bitmaps in-image with IN_USE flag. But usually children goes after parents in reopen queue and file child is still RO on qcow2 reopen commit. Reverse reopen order to fix it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Acked-by: Max Reitz <mreitz@redhat.com> Acked-by: John Snow <jsnow@redhat.com> Message-id: 20190927122355.7344-3-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:02:32 -04:00
Vladimir Sementsov-Ogievskiy	859443b0fb	block: switch reopen queue from QSIMPLEQ to QTAILQ We'll need reverse-foreach in the following commit, QTAILQ support it, so move to QTAILQ. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20190927122355.7344-2-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:02:32 -04:00
Vladimir Sementsov-Ogievskiy	ef9041a7b8	block/dirty-bitmap: refactor bdrv_dirty_bitmap_next bdrv_dirty_bitmap_next is always used in same pattern. So, split it into _next and _first, instead of combining two functions into one and add FOR_EACH_DIRTY_BITMAP macro. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20190916141911.5255-5-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:02:32 -04:00
Vladimir Sementsov-Ogievskiy	85cc8a4f6b	block: move bdrv_can_store_new_dirty_bitmap to block/dirty-bitmap.c block/dirty-bitmap.c seems to be more appropriate for it and bdrv_remove_persistent_dirty_bitmap already in it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20190920082543.23444-2-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:02:32 -04:00
Anton Nefedov	d924559953	qapi: query-blockstat: add driver specific file-posix stats A block driver can provide a callback to report driver-specific statistics. file-posix driver now reports discard statistics Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-id: 20190923121737.83281-10-anton.nefedov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Vladimir Sementsov-Ogievskiy	d10529a2b7	block: teach bdrv_debug_breakpoint skip filters with backing Teach bdrv_debug_breakpoint and bdrv_debug_remove_breakpoint skip filters with backing. This is needed to implement and use in backup job it's own backup_top filter driver (like mirror already has one), and without this improvement, breakpoint removal will fail at least in 55 iotest. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20190920142056.12778-9-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:17 +02:00
Vladimir Sementsov-Ogievskiy	b503de619e	block: fix permission update in bdrv_replace_node It's wrong to OR shared permissions. It may lead to crash on further permission updates. Also, no needs to consider previously calculated permissions, as at this point we already bind all new parents and bdrv_get_cumulative_perm result is enough. So fix the bug by just set permissions by bdrv_get_cumulative_perm result. Bug was introduced in long ago `234ac1a902`, in 2.9. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20190824100740.61635-1-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-09-03 14:55:35 +02:00
Max Reitz	ceaca56fee	block: Add bdrv_has_zero_init_truncate() No .bdrv_has_zero_init() implementation returns 1 if growing the file would add non-zero areas (at least with PREALLOC_MODE_OFF), so using it in lieu of this new function was always safe. But on the other hand, it is possible that growing an image that is not zero-initialized would still add a zero-initialized area, like when using nonpreallocating truncation on a preallocated image. For callers that care only about truncation, not about creation with potential preallocation, this new function is useful. Alternatively, we could have added a PreallocMode parameter to bdrv_has_zero_init(). But the only user would have been qemu-img convert, which does not have a plain PreallocMode value right now -- it would have to parse the creation option to obtain it. Therefore, the simpler solution is to let bdrv_has_zero_init() inquire the preallocation status and add the new bdrv_has_zero_init_truncate() that presupposes PREALLOC_MODE_OFF. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190724171239.8764-4-mreitz@redhat.com Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-08-19 17:13:26 +02:00
John Snow	c4e4b0fa59	qapi: implement block-dirty-bitmap-remove transaction action It is used to do transactional movement of the bitmap (which is possible in conjunction with merge command). Transactional bitmap movement is needed in scenarios with external snapshot, when we don't want to leave copy of the bitmap in the base image. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20190708220502.12977-3-jsnow@redhat.com [Edited "since" version to 4.2 --js] Signed-off-by: John Snow <jsnow@redhat.com>	2019-08-16 16:28:03 -04:00
Peter Maydell	e018ccb3fb	Block layer patches: - file-posix: Fix O_DIRECT alignment detection - Fixes for concurrent block jobs - block-backend: Queue requests while drained (fix IDE vs. job crashes) - qemu-img convert: Deprecate using -n and -o together - iotests: Migration tests with filter nodes - iotests: More media change tests -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJdVnduAAoJEH8JsnLIjy/W0IgQAKft/M3aDgt0sbTzQh8vdy6A yAfTnnSL4Z56+8qAsqhEnplC3rZxvTkg9AGOoNYHOZKl3FgRH9r8g9/Enemh4fWu MH52hiRf2ytlFVurIQal3aj9O+i0YTnzuvYbysvkH4ID5zbv2QnwdagtEcBxbbYL NZTMZBynDzp4rKIZ7p6T/kkaklLHh4vZrjW+Mzm3LQx9JJr8TwVNqqetSfc4VKIJ ByaNbbihDUVjQyIaJ24DXXJdzonGrrtSbSZycturc5FzXymzSRgrXZCeSKCs8X+i fjwMXH5v4/UfK511ILsXiumeuxBfD2Ck4sAblFxVo06oMPRNmsAKdRLeDByE7IC1 lWep/pB3y/au9CW2/pkWJOiaz5s5iuv2fFYidKUJ0KQ1dD7G8M9rzkQlV3FUmTZO jBKSxHEffXsYl0ojn0vGmZEd7FAPi3fsZibGGws1dVgxlWI93aUJsjCq0E+lHIRD hEmQcjqZZa4taKpj0Y3Me05GkL7tH6RYA153jDNb8rPdzriGRCLZSObEISrOJf8H Mh0gTLi8KJNh6bULd12Ake1tKn7ZeTXpHH+gadz9OU7eIModh1qYTSHPlhy5oAv0 Hm9BikNlS1Hzw+a+EbLcOW7TrsteNeGr7r8T6QKPMq1sfsYcp3svbC2c+zVlQ6Ll mLoTssksXOkgBevVqSiS =T7L5 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches: - file-posix: Fix O_DIRECT alignment detection - Fixes for concurrent block jobs - block-backend: Queue requests while drained (fix IDE vs. job crashes) - qemu-img convert: Deprecate using -n and -o together - iotests: Migration tests with filter nodes - iotests: More media change tests # gpg: Signature made Fri 16 Aug 2019 10:29:18 BST # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: file-posix: Handle undetectable alignment qemu-img convert: Deprecate using -n and -o together block-backend: Queue requests while drained mirror: Keep mirror_top_bs drained after dropping permissions block: Remove blk_pread_unthrottled() iotests: Add test for concurrent stream/commit tests: Test mid-drain bdrv_replace_child_noperm() tests: Test polling in bdrv_drop_intermediate() block: Reduce (un)drains when replacing a child block: Keep subtree drained in drop_intermediate block: Simplify bdrv_filter_default_perms() iotests: Test migration with all kinds of filter nodes iotests: Move migration helpers to iotests.py iotests/118: Add -blockdev based tests iotests/118: Create test classes dynamically iotests/118: Test media change for scsi-cd Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-08-16 16:43:46 +01:00
Markus Armbruster	db72581598	Include qemu/main-loop.h less In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-08-16 13:31:52 +02:00
Max Reitz	debc292767	block: Reduce (un)drains when replacing a child Currently, bdrv_replace_child_noperm() undrains the parent until it is completely undrained, then re-drains it after attaching the new child node. This is a problem with bdrv_drop_intermediate(): We want to keep the whole subtree drained, including parents, while the operation is under way. bdrv_replace_child_noperm() breaks this by allowing every parent to become unquiesced briefly, and then redraining it. In fact, there is no reason why the parent should become unquiesced and be allowed to submit requests to the new child node if that new node is supposed to be kept drained. So if anything, we have to drain the parent before detaching the old child node. Conversely, we have to undrain it only after attaching the new child node. Thus, change the whole drain algorithm here: Calculate the number of times we have to drain/undrain the parent before replacing the child node then drain it (if necessary), replace the child node, and then undrain it. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-08-16 10:25:16 +02:00
Max Reitz	637d54a5f3	block: Keep subtree drained in drop_intermediate bdrv_drop_intermediate() calls BdrvChildRole.update_filename(). That may poll, thus changing the graph, which potentially breaks the QLIST_FOREACH_SAFE() loop. Just keep the whole subtree drained. This is probably the right thing to do anyway (dropping nodes while the subtree is not drained seems wrong). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-08-16 10:25:16 +02:00
Kevin Wolf	e444fa8312	block: Simplify bdrv_filter_default_perms() The same change as commit `2b23f28639` ('block/copy-on-read: Fix permissions for inactive node') made for the copy-on-read driver can be made for bdrv_filter_default_perms(): Retaining the old permissions from the BdrvChild if it is given complicates things unnecessarily when in the end this only means that the options set in the c == NULL case (i.e. during child creation) are retained. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2019-08-16 10:25:16 +02:00
Max Reitz	43eaaaef0e	block: Only the main loop can change AioContexts bdrv_set_aio_context_ignore() can only work in the main loop: bdrv_drained_begin() only works in the main loop and the node's (old) AioContext; and bdrv_drained_end() really only works in the main loop and the node's (new) AioContext (contrary to its current comment, which is just wrong). Consequentially, bdrv_set_aio_context_ignore() must be called from the main loop. Luckily, assuming that we can make block graph changes only from the main loop as well, all its callers do that already. Note that changing a node's context in a sense is an operation that changes the block graph, so it actually makes sense to require this function to be called from the main loop. Also, fix bdrv_drained_end()'s description. You can only use it from the main loop or the node's AioContext, and in the latter case, the whole subtree must be in the same context. Fixes: `e037c09c78` Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190722133054.21781-3-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-07-22 18:41:43 +02:00
Max Reitz	e037c09c78	block: Do not poll in bdrv_do_drained_end() We should never poll anywhere in bdrv_do_drained_end() (including its recursive callees like bdrv_drain_invoke()), because it does not cope well with graph changes. In fact, it has been written based on the postulation that no graph changes will happen in it. Instead, the callers that want to poll must poll, i.e. all currently globally available wrappers: bdrv_drained_end(), bdrv_subtree_drained_end(), bdrv_unapply_subtree_drain(), and bdrv_drain_all_end(). Graph changes there do not matter. They can poll simply by passing a pointer to a drained_end_counter and wait until it reaches 0. This patch also adds a non-polling global wrapper for bdrv_do_drained_end() that takes a drained_end_counter pointer. We need such a variant because now no function called anywhere from bdrv_do_drained_end() must poll. This includes BdrvChildRole.drained_end(), which already must not poll according to its interface documentation, but bdrv_child_cb_drained_end() just violates that by invoking bdrv_drained_end() (which does poll). Therefore, BdrvChildRole.drained_end() must take a drained_end_counter parameter, which bdrv_child_cb_drained_end() can pass on to the new bdrv_drained_end_no_poll() function. Note that we now have a pattern of all drained_end-related functions either polling or receiving a drained_end_counter to let the caller poll based on that. A problem with a single poll loop is that when the drained section in bdrv_set_aio_context_ignore() ends, some nodes in the subgraph may be in the old contexts, while others are in the new context already. To let the collective poll in bdrv_drained_end() work correctly, we must not hold a lock to the old context, so that the old context can make progress in case it is different from the current context. (In the process, remove the comment saying that the current context is always the old context, because it is wrong.) In all other places, all nodes in a subtree must be in the same context, so we can just poll that. The exception of course is bdrv_drain_all_end(), but that always runs in the main context, so we can just poll NULL (like bdrv_drain_all_begin() does). Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-07-19 13:19:16 +02:00
Max Reitz	804db8ea00	block: Introduce BdrvChild.parent_quiesce_counter Commit `5cb2737e92` laid out why bdrv_do_drained_end() must decrement the quiesce_counter after bdrv_drain_invoke(). It did not give a very good reason why it has to happen after bdrv_parent_drained_end(), instead only claiming symmetry to bdrv_do_drained_begin(). It turns out that delaying it for so long is wrong. Situation: We have an active commit job (i.e. a mirror job) from top to base for the following graph: filter \| [file] \| v top --[backing]--> base Now the VM is closed, which results in the job being cancelled and a bdrv_drain_all() happening pretty much simultaneously. Beginning the drain means the job is paused once whenever one of its nodes is quiesced. This is reversed when the drain ends. With how the code currently is, after base's drain ends (which means that it will have unpaused the job once), its quiesce_counter remains at 1 while it goes to undrain its parents (bdrv_parent_drained_end()). For some reason or another, undraining filter causes the job to be kicked and enter mirror_exit_common(), where it proceeds to invoke block_job_remove_all_bdrv(). Now base will be detached from the job. Because its quiesce_counter is still 1, it will unpause the job once more. So in total, undraining base will unpause the job twice. Eventually, this will lead to the job's pause_count going negative -- well, it would, were there not an assertion against this, which crashes qemu. The general problem is that if in bdrv_parent_drained_end() we undrain parent A, and then undrain parent B, which then leads to A detaching the child, bdrv_replace_child_noperm() will undrain A as if we had not done so yet; that is, one time too many. It follows that we cannot decrement the quiesce_counter after invoking bdrv_parent_drained_end(). Unfortunately, decrementing it before bdrv_parent_drained_end() would be wrong, too. Imagine the above situation in reverse: Undraining A leads to B detaching the child. If we had already decremented the quiesce_counter by that point, bdrv_replace_child_noperm() would undrain B one time too little; because it expects bdrv_parent_drained_end() to issue this undrain. But bdrv_parent_drained_end() won't do that, because B is no longer a parent. Therefore, we have to do something else. This patch opts for introducing a second quiesce_counter that counts how many times a child's parent has been quiesced (though c->role->drained_*). With that, bdrv_replace_child_noperm() just has to undrain the parent exactly that many times when removing a child, and it will always be right. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-07-19 13:19:16 +02:00
Max Reitz	3cf746b3f1	block: Deep-clear inherits_from BDS.inherits_from does not always point to an immediate parent node. When launching a block job with a filter node, for example, the node directly below the filter will not point to the filter, but keep its old pointee (above the filter). If that pointee goes away while the job is still running, the node's inherits_from will not be updated and thus point to garbage. To fix this, bdrv_unref_child() has to check not only the parent node's immediate children for nodes whose inherits_from needs to be cleared, but its whole subtree. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190703172813.6868-7-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-07-15 15:48:40 +02:00
Max Reitz	e5182c1c57	block: Add BDS.never_freeze The commit and the mirror block job must be able to drop their filter node at any point. However, this will not be possible if any of the BdrvChild links to them is frozen. Therefore, we need to prevent them from ever becoming frozen. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20190703172813.6868-2-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-07-15 15:48:40 +02:00
Max Reitz	1046779e64	block: Ignore loosening perm restrictions failures We generally assume that loosening permission restrictions can never fail. We have seen in the past that this assumption is wrong. This has led to crashes because we generally pass &error_abort when loosening permissions. However, a failure in such a case should actually be handled in quite the opposite way: It is very much not fatal, so qemu may report it, but still consider the operation successful. The only realistic problem is that qemu may then retain permissions and thus locks on images it actually does not require. But again, that is not fatal. To implement this behavior, we make all functions that change permissions and that pass &error_abort to the initiating function (bdrv_check_perm() or bdrv_child_check_perm()) evaluate the @loosen_restrictions value introduced in the previous patch. If it is true and an error did occur, we abort the permission update, discard the error, and instead report success to the caller. bdrv_child_try_set_perm() itself does not pass &error_abort, but it is the only public function to change permissions. As such, callers may pass &error_abort to it, expecting dropping permission restrictions to never fail. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-18 16:41:10 +02:00
Max Reitz	9eab154415	block: Add tighten_restrictions to check*_perm() This patch makes three functions report whether the necessary permission change tightens restrictions or not. These functions are: - bdrv_check_perm() - bdrv_check_update_perm() - bdrv_child_check_perm() Callers can use this result to decide whether a failure is fatal or not (see the next patch). Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-18 16:41:10 +02:00
Max Reitz	87ace5f8b6	block: Fix order in bdrv_replace_child() We have to start by applying the permission restrictions to new_bs before we can loosen them on old_bs. See the comment for the explanation. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-18 16:41:10 +02:00
Max Reitz	c1087f1206	block: Add bdrv_child_refresh_perms() If a block node uses bdrv_child_try_set_perm() to change the permission it takes on its child, the result may be very short-lived. If anything makes the block layer recalculate the permissions internally, it will invoke the node driver's .bdrv_child_perm() implementation. The permission/shared permissions masks that returns will then override the values previously passed to bdrv_child_try_set_perm(). If drivers want a child edge to have specific values for the permissions/shared permissions mask, it must return them in .bdrv_child_perm(). Consequentially, there is no need for them to pass the same values to bdrv_child_try_set_perm() then: It is better to have a function that invokes .bdrv_child_perm() and calls bdrv_child_try_set_perm() with the result. This patch adds such a function under the name of bdrv_child_refresh_perms(). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-18 16:41:10 +02:00
Vladimir Sementsov-Ogievskiy	b23c580c94	block: drop bs->job Drop remaining users of bs->job: 1. assertions actually duplicated by assert(!bs->refcnt) 2. trace-point seems not enough reason to change stream_start to return BlockJob pointer 3. Restricting creation of two jobs based on same bs is bad idea, as 3.1 Some jobs creates filters to be their main node, so, this check don't actually prevent creating second job on same real node (which will create another filter node) (but I hope it is restricted by other mechanisms) 3.2 Even without bs->job we have two systems of permissions: op-blockers and BLK_PERM 3.3 We may want to run several jobs on one node one day And finally, drop bs->job pointer itself. Hurrah! Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-18 16:41:10 +02:00
Kevin Wolf	42a65f02f9	block: Remove bdrv_set_aio_context() All callers of bdrv_set_aio_context() are eliminated now, they have moved to bdrv_try_set_aio_context() and related safe functions. Remove bdrv_set_aio_context(). With this, we can now know that the .set_aio_ctx callback must be present in bdrv_set_aio_context_ignore() because bdrv_can_set_aio_context() would have returned false previously, so instead of checking the condition, we can assert it. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-04 16:55:58 +02:00
Kevin Wolf	d0ee0204f4	block: Remove wrong bdrv_set_aio_context() calls The mirror and commit block jobs use bdrv_set_aio_context() to move their filter node into the right AioContext before hooking it up in the graph. Similarly, bdrv_open_backing_file() explicitly moves the backing file node into the right AioContext first. This isn't necessary any more, they get automatically moved into the right context now when attaching them. However, in the case of bdrv_open_backing_file() with a node reference, it's actually not only unnecessary, but even wrong: The unchecked bdrv_set_aio_context() changes the AioContext of the child node even if other parents require it to retain the old context. So this is not only a simplification, but a bug fix, too. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1684342 Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-04 15:22:22 +02:00
Kevin Wolf	ad943dcb22	block: Move node without parents to main AioContext A node should only be in a non-default AioContext if a user is attached to it that requires this. When the last parent of a node is gone, it can move back to the main AioContext. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-04 15:22:22 +02:00
Kevin Wolf	132ada80c4	block: Adjust AioContexts when attaching nodes So far, we only made sure that updating the AioContext of a node affected the whole subtree. However, if a node is newly attached to a new parent, we also need to make sure that both the subtree of the node and the parent are in the same AioContext. This tries to move the new child node to the parent AioContext and returns an error if this isn't possible. BlockBackends now actually apply their AioContext to their root node. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-04 15:22:22 +02:00
Kevin Wolf	d861ab3acf	block: Add BlockBackend.ctx This adds a new parameter to blk_new() which requires its callers to declare from which AioContext this BlockBackend is going to be used (or the locks of which AioContext need to be taken anyway). The given context is only stored and kept up to date when changing AioContexts. Actually applying the stored AioContext to the root node is saved for another commit. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-04 15:22:22 +02:00
Kevin Wolf	f871abd60f	block: Drain source node in bdrv_replace_node() Instead of just asserting that no requests are in flight in bdrv_replace_node(), which is a requirement that most callers ignore, we can just drain the source node right there. This fixes at least starting a commit job while I/O is active on the backing chain, but probably other callers, too. Having requests in flight on the target node isn't a problem because the target just gets new parents, but the call path of running requests isn't modified. So we can just drop this assertion without a replacement. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1711643 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2019-06-04 15:20:41 +02:00
Alberto Garcia	b441dc71c0	block: Make bdrv_root_attach_child() unref child_bs on failure A consequence of the previous patch is that bdrv_attach_child() transfers the reference to child_bs from the caller to parent_bs, which will drop it on bdrv_close() or when someone calls bdrv_unref_child(). But this only happens when bdrv_attach_child() succeeds. If it fails then the caller is responsible for dropping the reference to child_bs. This patch makes bdrv_attach_child() take the reference also when there is an error, freeing the caller for having to do it. A similar situation happens with bdrv_root_attach_child(), so the changes on this patch affect both functions. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 20dfb3d9ccec559cdd1a9690146abad5d204a186.1557754872.git.berto@igalia.com [mreitz: Removed now superfluous BdrvChild * variable in bdrv_open_child()] Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-05-28 20:30:55 +02:00
Alberto Garcia	dd4118c792	block: Use bdrv_unref_child() for all children in bdrv_close() bdrv_unref_child() does the following things: - Updates the child->bs->inherits_from pointer. - Calls bdrv_detach_child() to remove the BdrvChild from bs->children. - Calls bdrv_unref() to unref the child BlockDriverState. When bdrv_unref_child() was introduced in commit `33a604075c` it was not used in bdrv_close() because the drivers that had additional children (like quorum or blkverify) had already called bdrv_unref() on their children during their own close functions. This was changed later (in `0bd6e91a7e` for quorum, in `3e586be0b2` for blkverify) so there's no reason not to use bdrv_unref_child() in bdrv_close() anymore. After this there's also no need to remove bs->backing and bs->file separately from the rest of the children, so bdrv_close() can be simplified. Now bdrv_close() unrefs all children (before this patch it was only bs->file and bs->backing). As a result, none of the callers of brvd_attach_child() should remove their reference to child_bs (because this function effectively steals that reference). This patch updates a couple of tests that were doing their own bdrv_unref(). Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 6d1d5feaa53aa1ab127adb73d605dc4503e3abd5.1557754872.git.berto@igalia.com [mreitz: s/where/were/] Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-05-28 20:30:55 +02:00
Max Reitz	481e0eeef4	block: Improve "Block node is read-only" message This message does not make any sense when it appears as the response to making an R/W node read-only. We should detect that case and emit a different message, then. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-05-20 17:08:57 +02:00
Kevin Wolf	53a7d04185	block: Propagate AioContext change to parents All block nodes and users in any connected component of the block graph must be in the same AioContext, so changing the AioContext of one node must not only change all of its children, but all of its parents (and in turn their children etc.) as well. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-05-20 17:08:56 +02:00
Kevin Wolf	0d83708a1d	block: Move recursion to bdrv_set_aio_context() Instead of having two recursions, in bdrv_attach_aio_context() and in bdrv_detach_aio_context(), just having one recursion is enough. Said functions are only about a single node now. It is important that the recursion doesn't happen between detaching and attaching a context to the current node because the nested call will drain the node, and draining with a NULL context would segfault. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-05-20 17:08:56 +02:00
Kevin Wolf	a3a683c33d	block: Make bdrv_attach/detach_aio_context() static Since commit `b97511c7bc`, there is no reason for block drivers any more to call these functions (see the function comment in block_int.h). They are now just internal helper functions for bdrv_set_aio_context() and can be made static. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-05-20 17:08:56 +02:00
Kevin Wolf	5d2318499f	block: Add bdrv_try_set_aio_context() Eventually, we want to make sure that all parents and all children of a node are in the same AioContext as the node itself. This means that changing the AioContext may fail because one of the other involved parties (e.g. a guest device that was configured with an iothread) cannot allow switching to a different AioContext. Introduce a set of functions that allow to first check whether all involved nodes can switch to a new context and only then do the actual switch. The check recursively covers children and parents. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-05-20 17:08:56 +02:00
Anton Kuchin	30c321f97f	block: remove bs from lists before closing Close involves flush that can be performed asynchronously and bs must be protected from being referenced before it is deleted. Signed-off-by: Anton Kuchin <antonkuchin@yandex-team.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-05-10 16:45:40 +02:00
Nikita Alekseev	66a5bdf309	block: Add coroutine_fn to bdrv_check_co_entry bdrv_check_co_entry calls bdrv_co_check, which is a coroutine function. Thus, it also needs to be marked as a coroutine. Signed-off-by: Nikita Alekseev <n.alekseev2104@gmail.com> Message-id: 20190401093051.16488-1-n.alekseev2104@gmail.com Message-Id: <20190401093051.16488-1-n.alekseev2104@gmail.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-05-10 10:53:21 +01:00
Alberto Garcia	0b3ca76e52	block: Assert that drv->bdrv_child_perm is set in bdrv_child_perm() There is no need to check for this because all block drivers that have children implement bdrv_child_perm and all callers already ensure that bs->drv is set. Furthermore, if this check would fail then the callers would end up with uninitialized values for nperm and nshared. This patch replaces the check with an assertion. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 20190404112953.4058-1-berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-05-07 17:14:21 +02:00
Kevin Wolf	1bffe1ae7a	block: Fix AioContext switch for bs->drv == NULL Even for block nodes with bs->drv == NULL, we can't just ignore a bdrv_set_aio_context() call. Leaving the node in its old context can mean that it's still in an iothread context in bdrv_close_all() during shutdown, resulting in an attempted unlock of the AioContext lock which we don't hold. This is an example stack trace of a related crash: #0 0x00007ffff59da57f in raise () at /lib64/libc.so.6 #1 0x00007ffff59c4895 in abort () at /lib64/libc.so.6 #2 0x0000555555b97b1e in error_exit (err=<optimized out>, msg=msg@entry=0x555555d386d0 <__func__.19059> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36 #3 0x0000555555b97f7f in qemu_mutex_unlock_impl (mutex=mutex@entry=0x5555568002f0, file=file@entry=0x555555d378df "util/async.c", line=line@entry=507) at util/qemu-thread-posix.c:97 #4 0x0000555555b92f55 in aio_context_release (ctx=ctx@entry=0x555556800290) at util/async.c:507 #5 0x0000555555b05cf8 in bdrv_prwv_co (child=child@entry=0x7fffc80012f0, offset=offset@entry=131072, qiov=qiov@entry=0x7fffffffd4f0, is_write=is_write@entry=true, flags=flags@entry=0) at block/io.c:833 #6 0x0000555555b060a9 in bdrv_pwritev (qiov=0x7fffffffd4f0, offset=131072, child=0x7fffc80012f0) at block/io.c:990 #7 0x0000555555b060a9 in bdrv_pwrite (child=0x7fffc80012f0, offset=131072, buf=<optimized out>, bytes=<optimized out>) at block/io.c:990 #8 0x0000555555ae172b in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568cc740, i=i@entry=0) at block/qcow2-cache.c:51 #9 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568cc740) at block/qcow2-cache.c:248 #10 0x0000555555ae15de in qcow2_cache_flush (bs=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259 #11 0x0000555555ae16b1 in qcow2_cache_flush_dependency (c=0x5555568a1700, c=0x5555568a1700, bs=0x555556810680) at block/qcow2-cache.c:194 #12 0x0000555555ae16b1 in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568a1700, i=i@entry=0) at block/qcow2-cache.c:194 #13 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568a1700) at block/qcow2-cache.c:248 #14 0x0000555555ae15de in qcow2_cache_flush (bs=bs@entry=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259 #15 0x0000555555ad242c in qcow2_inactivate (bs=bs@entry=0x555556810680) at block/qcow2.c:2124 #16 0x0000555555ad2590 in qcow2_close (bs=0x555556810680) at block/qcow2.c:2153 #17 0x0000555555ab0c62 in bdrv_close (bs=0x555556810680) at block.c:3358 #18 0x0000555555ab0c62 in bdrv_delete (bs=0x555556810680) at block.c:3542 #19 0x0000555555ab0c62 in bdrv_unref (bs=0x555556810680) at block.c:4598 #20 0x0000555555af4d72 in blk_remove_bs (blk=blk@entry=0x5555568103d0) at block/block-backend.c:785 #21 0x0000555555af4dbb in blk_remove_all_bs () at block/block-backend.c:483 #22 0x0000555555aae02f in bdrv_close_all () at block.c:3412 #23 0x00005555557f9796 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4776 The reproducer I used is a qcow2 image on gluster volume, where the virtual disk size (4 GB) is larger than the gluster volume size (64M), so we can easily trigger an ENOSPC. This backend is assigned to a virtio-blk device using an iothread, and then from the guest a 'dd if=/dev/zero of=/dev/vda bs=1G count=1' causes the VM to stop because of an I/O error. qemu_gluster_co_flush_to_disk() sets bs->drv = NULL on error, so when virtio-blk stops the dataplane, the block nodes stay in the iothread AioContext. A 'quit' monitor command issued from this paused state crashes the process. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1631227 Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>	2019-04-30 15:29:00 +02:00
Kevin Wolf	3f48686fac	block: Forward 'discard' to temporary overlay When bdrv_temp_snapshot_options() is called for snapshot=on, the 'discard' option in the options QDict hasn't been parsed and merged into the flags yet. So copy the dict entry to make sure that the temporary overlay enables discard when it was requested for the drive. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>	2019-04-08 16:48:46 +02:00
Alberto Garcia	0f0998f621	block: continue until base is found in bdrv_freeze_backing_chain() et al All three functions that handle the BdrvChild.frozen attribute walk the backing chain from 'bs' to 'base' and stop either when 'base' is found or at the end of the chain if 'base' is NULL. However if 'base' is not found then the functions return without errors as if it was NULL. This is wrong: if the caller passed an incorrect parameter that means that there is a bug in the code. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-04-02 12:04:44 +02:00
Kevin Wolf	2345bde647	block: Silence Coverity in bdrv_drop_intermediate() Coverity doesn't like that the return value of bdrv_check_update_perm() stays unused only in this place (CID 1399710). Even if checking local_err should be equivalent to checking ret < 0, let's switch to using the return value to be more consistent (and in case of a bug somewhere down the call chain, forgetting to assign errp is more likely than returning 0 for an error case). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Markus Armbruster <armbru@redhat.com>	2019-03-19 15:49:29 +01:00
Alberto Garcia	5019aece2a	block: Remove the AioContext parameter from bdrv_reopen_multiple() This parameter has been unused since `1a63a90750` Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-03-12 20:30:14 +01:00
Alberto Garcia	faf116b438	block: Add bdrv_reset_options_allowed() bdrv_reopen_prepare() receives a BDRVReopenState with (among other things) a new set of options to be applied to that BlockDriverState. If an option is missing then it means that we want to reset it to its default value rather than keep the previous one. This way the state of the block device after being reopened is comparable to that of a device added with "blockdev-add" using the same set of options. Not all options from all drivers can be changed this way, however. If the user attempts to reset an immutable option to its default value using this method then we must forbid it. This new function takes a BlockDriverState and a new set of options and checks if there's any option that was previously set but is missing from the new set of options. If the option is present in both sets we don't need to check that they have the same value. The loop at the end of bdrv_reopen_prepare() already takes care of that. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-03-12 20:30:14 +01:00
Alberto Garcia	cb828c31de	block: Allow changing the backing file on reopen This patch allows the user to change the backing file of an image that is being reopened. Here's what it does: - In bdrv_reopen_prepare(): check that the value of 'backing' points to an existing node or is null. If it points to an existing node it also needs to make sure that replacing the backing file will not create a cycle in the node graph (i.e. you cannot reach the parent from the new backing file). - In bdrv_reopen_commit(): perform the actual node replacement by calling bdrv_set_backing_hd(). There may be temporary implicit nodes between a BDS and its backing file (e.g. a commit filter node). In these cases bdrv_reopen_prepare() looks for the real (non-implicit) backing file and requires that the 'backing' option points to it. Replacing or detaching a backing file is forbidden if there are implicit nodes in the middle. Although x-blockdev-reopen is meant to be used like blockdev-add, there's an important thing that must be taken into account: the only way to set a new backing file is by using a reference to an existing node (previously added with e.g. blockdev-add). If 'backing' contains a dictionary with a new set of options ({"driver": "qcow2", "file": { ... }}) then it is interpreted that the _existing_ backing file must be reopened with those options. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-03-12 20:30:14 +01:00
Alberto Garcia	bacd9b87c4	block: Allow omitting the 'backing' option in certain cases Of all options of type BlockdevRef used to specify children in BlockdevOptions, 'backing' is the only one that is optional. For "x-blockdev-reopen" we want that if an option is omitted then it must be reset to its default value. The default value of 'backing' means that QEMU opens the backing file specified in the image metadata, but this is not something that we want to support for the reopen operation. Because of this the 'backing' option has to be specified during reopen, pointing to the existing backing file if we want to keep it, or pointing to a different one (or NULL) if we want to replace it (to be implemented in a subsequent patch). In order to simplify things a bit and not to require that the user passes the 'backing' option to every single block device even when it's clearly not necessary, this patch allows omitting this option if the block device being reopened doesn't have a backing file attached _and_ no default backing file is specified in the image metadata. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-03-12 20:30:14 +01:00
Alberto Garcia	8546632e61	block: Handle child references in bdrv_reopen_queue() Children in QMP are specified with BlockdevRef / BlockdevRefOrNull, which can contain a set of child options, a child reference, or NULL. In optional attributes like "backing" it can also be missing. Only the first case (set of child options) is being handled properly by bdrv_reopen_queue(). This patch deals with all the others. Here's how these cases should be handled when bdrv_reopen_queue() is deciding what to do with each child of a BlockDriverState: 1) Set of child options: if the child was implicitly created (i.e inherits_from points to the parent) then the options are removed from the parent's options QDict and are passed to the child with a recursive bdrv_reopen_queue() call. This case was already working fine. 2) Child reference: there's two possibilites here. 2a) Reference to the current child: if the child was implicitly created then it is put in the reopen queue, keeping its current set of options (since this was a child reference there was no way to specify a different set of options). If the child is not implicit then it keeps its current set of options but it is not reopened (and therefore does not inherit any new option from the parent). 2b) Reference to a different BDS: the current child is not put in the reopen queue at all. Passing a reference to a different BDS can be used to replace a child, although at the moment no driver implements this, so it results in an error. In any case, the current child is not going to be reopened (and might in fact disappear if it's replaced) 3) NULL: This is similar to (2b). Although no driver allows this yet it can be used to detach the current child so it should not be put in the reopen queue. 4) Missing option: at the moment "backing" is the only case where this can happen. With "blockdev-add", leaving "backing" out means that the default backing file is opened. We don't want to open a new image during reopen, so we require that "backing" is always present. We'll relax this requirement a bit in the next patch. If keep_old_opts is true and "backing" is missing then this behaves like 2a (the current child is reopened). Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-03-12 20:30:14 +01:00
Alberto Garcia	077e8e2018	block: Add 'keep_old_opts' parameter to bdrv_reopen_queue() The bdrv_reopen_queue() function is used to create a queue with the BDSs that are going to be reopened and their new options. Once the queue is ready bdrv_reopen_multiple() is called to perform the operation. The original options from each one of the BDSs are kept, with the new options passed to bdrv_reopen_queue() applied on top of them. For "x-blockdev-reopen" we want a function that behaves much like "blockdev-add". We want to ignore the previous set of options so that only the ones actually specified by the user are applied, with the rest having their default values. One of the things that we need is a way to tell bdrv_reopen_queue() whether we want to keep the old set of options or not, and that's what this patch does. All current callers are setting this new parameter to true and x-blockdev-reopen will set it to false. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-03-12 20:30:14 +01:00
Alberto Garcia	2cad1ebe70	block: Allow freezing BdrvChild links Our permission system is useful to define what operations are allowed on a certain block node and includes things like BLK_PERM_WRITE or BLK_PERM_RESIZE among others. One of the permissions is BLK_PERM_GRAPH_MOD which allows "changing the node that this BdrvChild points to". The exact meaning of this has never been very clear, but it can be understood as "change any of the links connected to the node". This can be used to prevent changing a backing link, but it's too coarse. This patch adds a new 'frozen' attribute to BdrvChild, which forbids detaching the link from the node it points to, and new API to freeze and unfreeze a backing chain. After this change a few functions can fail, so they need additional checks. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-03-12 20:30:14 +01:00
Kevin Wolf	3085513778	file-posix: Fix bdrv_open_flags() for snapshot=on Using a different read-only setting for bs->open_flags than for the flags to the driver's open function is just inconsistent and a bad idea. After this patch, the temporary snapshot keeps being opened read-only if read-only=on,snapshot=on is passed. If we wanted to change this behaviour to make only the orginal image file read-only, but the temporary overlay read-write (as the comment in the removed code suggests), that change would have to be made in bdrv_temp_snapshot_options() (where the comment suggests otherwise). Addressing this inconsistency before introducing dynamic auto-read-only is important because otherwise we would immediately try to reopen the temporary overlay even though the file is already unlinked. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-03-12 20:30:14 +01:00
Kevin Wolf	69b736e765	block: Make permission changes in reopen less wrong The way that reopen interacts with permission changes has one big problem: Both operations are recursive, and the permissions are changes for each node in the reopen queue. For a simple graph that consists just of parent and child, .bdrv_check_perm will be called twice for the child, once recursively when adjusting the permissions of parent, and once again when the child itself is reopened. Even worse, the first .bdrv_check_perm call happens before .bdrv_reopen_prepare was called for the child and the second one is called afterwards. Making sure that .bdrv_check_perm (and the other permission callbacks) are called only once is hard. We can cope with multiple calls right now, but as soon as file-posix gets a dynamic auto-read-only that may need to open a new file descriptor, we get the additional requirement that all of them are after the .bdrv_reopen_prepare call. So reorder things in bdrv_reopen_multiple() to first call .bdrv_reopen_prepare for all involved nodes and only then adjust permissions. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-03-12 20:30:14 +01:00
Kevin Wolf	a4615ab31c	block: Avoid useless local_err Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2019-03-12 20:30:14 +01:00
Andrey Shinkevich	9ac404c523	block: iterate_format with account of whitelisting bdrv_iterate_format (which is currently only used for printing out the formats supported by the block layer) doesn't take format whitelisting into account. This creates a problem for tests: they enumerate supported formats to decide which tests to enable, but then discover that QEMU doesn't let them actually use some of those formats. To avoid that, exclude formats that are not whitelisted from enumeration, if whitelisting is in use. Since we have separate whitelists for r/w and r/o, take this a parameter to bdrv_iterate_format, and print two lists of supported formats (r/w and r/o) in main qemu. Signed-off-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-03-08 12:26:45 +01:00
Max Reitz	62a01a27f7	block: BDS options may lack the "driver" option When BDSs are created by qemu itself (e.g. as filters in block jobs), they may not have a "driver" option in their options QDict. When generating a json:{} filename, however, it must always be present. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20190201192935.18394-31-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:27 +01:00
Max Reitz	fb695c74aa	block: Do not copy exact_filename from format file If a format BDS's file BDS is in turn a format BDS, we cannot simply use the same filename, because when opening a BDS tree based on a filename alone, qemu will create only one format node on top of one protocol node (disregarding a potential backing file). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20190201192935.18394-26-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:27 +01:00
Max Reitz	998b3a1e5a	block: Purify .bdrv_refresh_filename() Currently, BlockDriver.bdrv_refresh_filename() is supposed to both refresh the filename (BDS.exact_filename) and set BDS.full_open_options. Now that we have generic code in the central bdrv_refresh_filename() for creating BDS.full_open_options, we can drop the latter part from all BlockDriver.bdrv_refresh_filename() implementations. This also means that we can drop all of the existing default code for this from the global bdrv_refresh_filename() itself. Furthermore, we now have to call BlockDriver.bdrv_refresh_filename() after having set BDS.full_open_options, because the block driver's implementation should now be allowed to depend on BDS.full_open_options being set correctly. Finally, with this patch we can drop the @options parameter from BlockDriver.bdrv_refresh_filename(); also, add a comment on this function's purpose in block/block_int.h while touching its interface. This completely obsoletes blklogwrite's implementation of .bdrv_refresh_filename(). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190201192935.18394-25-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:27 +01:00
Max Reitz	97e2f021f8	block: Generically refresh runtime options Instead of having every block driver which implements bdrv_refresh_filename() copy all of the strong runtime options over to bs->full_open_options, implement this process generically in bdrv_refresh_filename(). This patch only adds this new generic implementation, it does not remove the old functionality. This is done in a follow-up patch. With this patch, some superfluous information (that should never have been there) may be removed from some JSON filenames, as can be seen in the change to iotests 110's and 228's reference outputs. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190201192935.18394-24-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:27 +01:00
Max Reitz	8df686165b	block: Use bdrv_dirname() for relative filenames bdrv_get_full_backing_filename_from_filename() breaks down when it comes to JSON filenames. Using bdrv_dirname() as the basis is better because since we have BDS, we can descend through the BDS tree to the protocol layer, which gives us a greater probability of finding a non-JSON name; also, bdrv_dirname() is more correct as it allows block drivers to override the generation of that directory name in a protocol-specific way. We still need to keep bdrv_get_full_backing_filename_from_filename(), though, because it has valid callers which need it during image creation when no BDS is available yet. This makes a test case in qemu-iotest 110, which was supposed to fail, work. That is actually good, but we need to change the reference output (and the comment in 110) accordingly. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20190201192935.18394-20-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:27 +01:00
Max Reitz	1e89d0f9be	block: Add bdrv_dirname() This function may be implemented by block drivers to derive a directory name from a BDS. Concatenating this g_free()-able string with a relative filename must result in a valid (not necessarily existing) filename, so this is a function that should generally be not implemented by format drivers, because this is protocol-specific. If a BDS's driver does not implement this function, bdrv_dirname() will fall through to the BDS's file if it exists. If it does not, the exact_filename field will be used to generate a directory name. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20190201192935.18394-15-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:26 +01:00
Max Reitz	2d9158ce79	block: Fix bdrv_find_backing_image() bdrv_find_backing_image() should use bdrv_get_full_backing_filename() or bdrv_make_absolute_filename() instead of trying to do what those functions do by itself. path_combine_deprecated() can now be dropped, so let's do that. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20190201192935.18394-14-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:26 +01:00
Max Reitz	9f4793d8f2	block: Add bdrv_make_absolute_filename() This is a general function for making a filename that is relative to a certain BDS absolute. It calls bdrv_get_full_backing_filename_from_filename() for now, but that will be changed in a follow-up patch. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20190201192935.18394-13-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:26 +01:00
Max Reitz	6b6833c1b4	block: bdrv_get_full_backing_filename's ret. val. Make bdrv_get_full_backing_filename() return an allocated string instead of placing the result in a caller-provided buffer. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20190201192935.18394-12-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:26 +01:00
Max Reitz	645ae7d88e	block: bdrv_get_full_backing_filename_from_...'s ret. val. Make bdrv_get_full_backing_filename_from_filename() return an allocated string instead of placing the result in a caller-provided buffer. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190201192935.18394-11-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:26 +01:00
Max Reitz	009b03aaa2	block: Make path_combine() return the path Besides being safe for arbitrary path lengths, after some follow-up patches all callers will want a freshly allocated buffer anyway. In the meantime, path_combine_deprecated() is added which has the same interface as path_combine() had before this patch. All callers to that function will be converted in follow-up patches. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20190201192935.18394-10-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:26 +01:00
Max Reitz	909936234c	block: Respect backing bs in bdrv_refresh_filename Basically, bdrv_refresh_filename() should respect all children of a BlockDriverState. However, generally those children are driver-specific, so this function cannot handle the general case. On the other hand, there are only few drivers which use other children than @file and @backing (that being vmdk, quorum, and blkverify). Most block drivers only use @file and/or @backing (if they use any children at all). Both can be implemented directly in bdrv_refresh_filename. The user overriding the file's filename is already handled, however, the user overriding the backing file is not. If this is done, opening the BDS with the plain filename of its file will not be correct, so we may not set bs->exact_filename in that case. iotest 051 contains test cases for overriding the backing file, and so its output changes with this patch applied. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20190201192935.18394-6-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:25 +01:00
Max Reitz	998c201923	block: Add BDS.auto_backing_file If the backing file is overridden, this most probably does change the guest-visible data of a BDS. Therefore, we will need to consider this in bdrv_refresh_filename(). To see whether it has been overridden, we might want to compare bs->backing_file and bs->backing->bs->filename. However, bs->backing_file is changed by bdrv_set_backing_hd() (which is just used to change the backing child at runtime, without modifying the image header), so bs->backing_file most of the time simply contains a copy of bs->backing->bs->filename anyway, so it is useless for such a comparison. This patch adds an auto_backing_file BDS field which contains the backing file path as indicated by the image header, which is not changed by bdrv_set_backing_hd(). Because of bdrv_refresh_filename() magic, however, a BDS's filename may differ from what has been specified during bdrv_open(). Then, the comparison between bs->auto_backing_file and bs->backing->bs->filename may fail even though bs->backing was opened from bs->auto_backing_file. To mitigate this, we can copy the real BDS's filename (after the whole bdrv_open() and bdrv_refresh_filename() process) into bs->auto_backing_file, if we know the former has been opened based on the latter. This is only possible if no options modifying the backing file's behavior have been specified, though. To simplify things, this patch only copies the filename from the backing file if no options have been specified for it at all. Furthermore, there are cases where an overlay is created by qemu which already contains a BDS's filename (e.g. in blockdev-snapshot-sync). We do not need to worry about updating the overlay's bs->auto_backing_file there, because we actually wrote a post-bdrv_refresh_filename() filename into the image header. So all in all, there will be false negatives where (as of a future patch) bdrv_refresh_filename() will assume that the backing file differs from what was specified in the image header, even though it really does not. However, these cases should be limited to where (1) the user actually did override something in the backing chain (e.g. by specifying options for the backing file), or (2) the user executed a QMP command to change some node's backing file (e.g. change-backing-file or block-commit with @backing-file given) where the given filename does not happen to coincide with qemu's idea of the backing BDS's filename. Then again, (1) really is limited to -drive. With -blockdev or blockdev-add, you have to adhere to the schema, so a user cannot give partial "unimportant" options (e.g. by just setting backing.node-name and leaving the rest to the image header). Therefore, trying to fix this would mean trying to fix something for -drive only. To improve on (2), we would need a full infrastructure to "canonicalize" an arbitrary filename (+ options), so it can be compared against another. That seems a bit over the top, considering that filenames nowadays are there mostly for the user's entertainment. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20190201192935.18394-5-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:25 +01:00
Max Reitz	bb808d5f5c	block: Skip implicit nodes for filename info bdrv_refresh_filename() should simply skip all implicit nodes. They are supposed to be invisible to the user, so they should not appear in filename information. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20190201192935.18394-4-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:25 +01:00
Max Reitz	e24518e303	block: Use children list in bdrv_refresh_filename bdrv_refresh_filename() should invoke itself recursively on all children, not just on file. With that change, we can remove the manual invocations in blkverify, quorum, commit, mirror, and blklogwrites. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20190201192935.18394-3-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:25 +01:00
Max Reitz	f30c66ba6e	block: Use bdrv_refresh_filename() to pull Before this patch, bdrv_refresh_filename() is used in a pushing manner: Whenever the BDS graph is modified, the parents of the modified edges are supposed to be updated (recursively upwards). However, that is nonviable, considering that we want child changes not to concern parents. Also, in the long run we want a pull model anyway: Here, we would have a bdrv_filename() function which returns a BDS's filename, freshly constructed. This patch is an intermediate step. It adds bdrv_refresh_filename() calls before every place a BDS.filename value is used. The only exceptions are protocol drivers that use their own filename, which clearly would not profit from refreshing that filename before. Also, bdrv_get_encrypted_filename() is removed along the way (as a user of BDS.filename), since it is completely unused. In turn, all of the calls to bdrv_refresh_filename() before this patch are removed, because we no longer have to call this function on graph changes. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190201192935.18394-2-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-02-25 15:11:25 +01:00
Vladimir Sementsov-Ogievskiy	f962e96150	block: fix bdrv_check_perm for non-tree subgraph bdrv_check_perm in it's recursion checks each node in context of new permissions for one parent, because of nature of DFS. It works well, while children subgraph of top-most updated node is a tree, i.e. it doesn't have any kind of loops. But if we have a loop (not oriented, of course), i.e. we have two different ways from top-node to some child-node, then bdrv_check_perm will do wrong thing: top \| \ \| \| v v A B \| \| v v node It will once check new permissions of node in context of new A permissions and old B permissions and once visa-versa. It's a wrong way and may lead to corruption of permission system. We may start with no-permissions and all-shared for both A->node and B->node relations and finish up with non shared write permission for both ways. The following commit will add a test, which shows this bug. To fix this situation, let's really set BdrvChild permissions during bdrv_check_perm procedure. And we are happy here, as check-perm is already written in transaction manner, so we just need to restore backed-up permissions in _abort. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-02-25 15:03:19 +01:00
Vladimir Sementsov-Ogievskiy	2f30b7c377	block: improve should_update_child As it already said in the comment, we don't want to create loops in parent->child relations. So, when we try to append @to to @c, we should check that @c is not in @to children subtree, and we should check it recursively, not only the first level. The patch provides BFS-based search, to check the relations. This is needed for further fleecing-hook filter usage: we need to append it to source, when the hook is already a parent of target, and source may be in a backing chain of target (fleecing-scheme). So, on appending, the hook should not became a child (direct or through children subtree) of the target. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-02-25 15:03:19 +01:00
Kevin Wolf	d70d595429	block: Use normal drain for bdrv_set_aio_context() Now that bdrv_set_aio_context() works inside drained sections, it can also use the real drain function instead of open coding something similar. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-02-25 15:03:19 +01:00
Kevin Wolf	e64f25f30b	block: Fix AioContext switch for drained node When a drained node changes its AioContext, we need to move its aio_disable_external() to the new context, too. Without this fix, drain_end will try to reenable the new context, which has never been disabled, so an assertion failure is triggered. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2019-02-25 15:03:19 +01:00
Kevin Wolf	6c75d761d0	block: Don't poll in bdrv_set_aio_context() The explicit aio_poll() call in bdrv_set_aio_context() was added in commit `c2b6428d38` as a workaround for bdrv_drain() failing to achieve to actually quiesce everything (specifically the NBD client code to switch AioContext). Now that the NBD client has been fixed to complete this operation during bdrv_drain(), we don't need the workaround any more. It was wrong anyway: aio_poll() must always be run in the home thread of the AioContext. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2019-02-25 15:03:19 +01:00
Denis Plotnikov	57830a499f	block: don't set the same context Adds a fast path on aio context setting preventing unnecessary context setting routine. Also, it prevents issues with cyclic walk of child bds-es appeared because of registering aio walking notifiers: Call stack: 0 __GI_raise 1 __GI_abort 2 __assert_fail_base 3 __GI___assert_fail 4 bdrv_detach_aio_context (bs=0x55f54d65c000) <<< 5 bdrv_detach_aio_context (bs=0x55f54fc8a800) 6 bdrv_set_aio_context (bs=0x55f54fc8a800, ...) 7 block_job_attached_aio_context 8 bdrv_attach_aio_context (bs=0x55f54d65c000, ...) <<< 9 bdrv_set_aio_context (bs=0x55f54d65c000) 10 blk_set_aio_context 11 virtio_blk_data_plane_stop 12 virtio_bus_stop_ioeventfd 13 virtio_vmstate_change 14 vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN) 15 do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=true) 16 vm_stop (state=RUN_STATE_SHUTDOWN) 17 main_loop_should_exit 18 main_loop 19 main This can happen because of "new" context attachment to VM disk bds. When attaching a new context the corresponding aio context handler is called for each of aio_notifiers registered on the VM disk bds context. Among those handlers, there is the block_job_attached_aio_context handler which sets a new aio context for the block job bds. When doing so, the old context is detached from all the block job bds children and one of them is the VM disk bds, serving as backing store for the blockjob bds, although the VM disk bds is actually the initializer of that process. Since the VM disk bds is protected with walking_aio_notifiers flag from double processing in recursive calls, the assert fires. Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-02-25 15:03:19 +01:00
Andrey Shinkevich	1bf6e9ca92	bdrv_query_image_info Error parameter added Inform a user in case qcow2_get_specific_info fails to obtain QCOW2 image specific information. This patch is preliminary to the one "qcow2: Add list of bitmaps to ImageInfoSpecificQCow2". Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <1549638368-530182-2-git-send-email-andrey.shinkevich@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2019-02-11 14:35:43 -06:00
Kevin Wolf	78fc3b3a26	block: Fix invalidate_cache error path for parent activation bdrv_co_invalidate_cache() clears the BDRV_O_INACTIVE flag before actually activating a node so that the correct permissions etc. are taken. In case of errors, the flag must be restored so that the next call to bdrv_co_invalidate_cache() retries activation. Restoring the flag was missing in the error path for a failed parent->role->activate() call. The consequence is that this attempt to activate all images correctly fails because we still set errp, however on the next attempt BDRV_O_INACTIVE is already clear, so we return success without actually retrying the failed action. An example where this is observable in practice is migration to a QEMU instance that has a raw format block node attached to a guest device with share-rw=off (the default) while another process holds BLK_PERM_WRITE for the same image. In this case, all activation steps before parent->role->activate() succeed because raw can tolerate other writers to the image. Only the parent callback (in particular blk_root_activate()) tries to implement the share-rw=on property and requests exclusive write permissions. This fails when the migration completes and correctly displays an error. However, a manual 'cont' will incorrectly resume the VM without calling blk_root_activate() again. This case is described in more detail in the following bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1531888 Fix this by correctly restoring the BDRV_O_INACTIVE flag in the error path. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-02-01 13:46:45 +01:00
Kevin Wolf	8be25de643	block: Apply auto-read-only for ro-whitelist drivers If QEMU was configured with a driver in --block-drv-ro-whitelist, trying to use that driver read-write resulted in an error message even if auto-read-only=on was set. Consider auto-read-only=on for the whitelist checking and use it to automatically degrade to read-only for block drivers on the read-only whitelist. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-02-01 13:46:45 +01:00
Kevin Wolf	4720cbeea1	block: Fix hangs in synchronous APIs with iothreads In the block layer, synchronous APIs are often implemented by creating a coroutine that calls the asynchronous coroutine-based implementation and then waiting for completion with BDRV_POLL_WHILE(). For this to work with iothreads (more specifically, when the synchronous API is called in a thread that is not the home thread of the block device, so that the coroutine will run in a different thread), we must make sure to call aio_wait_kick() at the end of the operation. Many places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if the condition has long become false. Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is generally not enough for most other operations because they haven't set the return value in the coroutine entry stub yet. To avoid race conditions there, we need to kick after setting the return value. The race window is small enough that the problem doesn't usually surface in the common path. However, it does surface and causes easily reproducible hangs if the operation can return early before even calling bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op success paths). The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is slightly different: These functions even neglected to schedule the coroutine in the home thread of the node. This avoids the hang, but is obviously wrong, too. Fix those to schedule the coroutine in the right AioContext in addition to adding aio_wait_kick() calls. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-02-01 13:46:44 +01:00
Vladimir Sementsov-Ogievskiy	5d3b4e9946	qapi: add x-debug-query-block-graph Add a new command, returning block nodes (and their users) graph. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20181221170909.25584-2-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-01-31 00:38:19 +01:00
Alberto Garcia	e6d79c41c9	block: Assert that flags are up-to-date in bdrv_reopen_prepare() Towards the end of bdrv_reopen_queue_child(), before starting to process the children, the update_flags_from_options() function is called in order to have BDRVReopenState.flags in sync with the options from the QDict. This is necessary because during the reopen process flags must be updated for all nodes in the queue so bdrv_is_writable_after_reopen() and the permission checks work correctly. Because of that, calling update_flags_from_options() again in bdrv_reopen_prepare() doesn't really change the flags (they are already up-to-date). But we need to call it in order to remove those options from QemuOpts and that way indicate that they have been processed. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:55:02 +01:00
Alberto Garcia	8eb4b07b6f	block: Remove assertions from update_flags_from_options() This function takes four options (cache.direct, cache.no-flush, read-only and auto-read-only) from a QemuOpts object and updates the flags accordingly. If any of those options is not set (because it was missing from the original QDict or because it had an invalid value) then the function aborts with a failed assertion: $ qemu-io -c 'reopen -o read-only=foo' hd.qcow2 block.c:1126: update_flags_from_options: Assertion `qemu_opt_find(opts, BDRV_OPT_CACHE_DIRECT)' failed. Aborted This assertion is unnecessary, and it forces any caller of bdrv_reopen() to pass all the aforementioned four options. This may have made sense in order to remove ambiguity when bdrv_reopen() was taking both flags and options, but that's not the case anymore. It's also unnecessary if we want to validate the option values, because bdrv_reopen_prepare() already takes care of that, as we can see if we remove the assertions: $ qemu-io -c 'reopen -o read-only=foo' hd.qcow2 Parameter 'read-only' expects 'on' or 'off' Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:55:02 +01:00
Alberto Garcia	9aa09ddd1e	block: Stop passing flags to bdrv_reopen_queue_child() Now that all callers are passing the new options using the QDict we no longer need the 'flags' parameter. This patch makes the following changes: 1) The update_options_from_flags() call is no longer necessary so it can be removed. 2) The update_flags_from_options() call is now used in all cases, and is moved down a few lines so it happens after the options QDict contains the final set of values. 3) The flags parameter is removed. Now the flags are initialized using the current value (for the top-level node) or the parent flags (after inherit_options()). In both cases the initial values are updated to reflect the new options in the QDict. This happens in bdrv_reopen_queue_child() (as explained above) and in bdrv_reopen_prepare(). Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:55:02 +01:00
Alberto Garcia	2e891722c5	block: Remove flags parameter from bdrv_reopen_queue() Now that all callers are passing all flag changes as QDict options, the flags parameter is no longer necessary, so we can get rid of it. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:55:02 +01:00
Alberto Garcia	295cf237c2	block: Drop bdrv_reopen() No one is using this function anymore, so we can safely remove it. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:55:02 +01:00
Alberto Garcia	e94d3dba6a	block: Use bdrv_reopen_set_read_only() in bdrv_backing_update_filename() This patch replaces the bdrv_reopen() calls that set and remove the BDRV_O_RDWR flag with the new bdrv_reopen_set_read_only() function. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:55:01 +01:00
Alberto Garcia	6e1000a863	block: Add bdrv_reopen_set_read_only() Most callers of bdrv_reopen() only use it to switch a BlockDriverState between read-only and read-write, so this patch adds a new function that does just that. We also want to get rid of the flags parameter in the bdrv_reopen() API, so this function sets the "read-only" option and passes the original flags (which will then be updated in bdrv_reopen_prepare()). Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:55:01 +01:00
Kevin Wolf	9e37271f50	block: Don't inactivate children before parents bdrv_child_cb_inactivate() asserts that parents are already inactive when children get inactivated. This precondition is necessary because parents could still issue requests in their inactivation code. When block nodes are created individually with -blockdev, all of them are monitor owned and will be returned by bdrv_next() in an undefined order (in practice, in the order of their creation, which is usually children before parents), which obviously fails the assertion: qemu: block.c:899: bdrv_child_cb_inactivate: Assertion `bs->open_flags & BDRV_O_INACTIVE' failed. This patch fixes the ordering by skipping nodes with still active parents in bdrv_inactivate_recurse() because we know that they will be covered by recursion when the last active parent becomes inactive. With the correct parents-before-children ordering, we also got rid of the reason why commit `aad0b7a0bf` introduced two passes, so we can go back to a single-pass recursion. This is necessary so we can rely on the BDRV_O_INACTIVE flag to skip nodes with active parents (the flag used to be set only in pass 2, so we would always skip non-root nodes in pass 1 because all parents would still be considered active; setting the flag in pass 1 would mean, that we never skip anything in pass 2 because all parents are already considered inactive). Because of the change to single pass, this patch is best reviewed with whitespace changes ignored. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-11-27 12:59:00 +01:00
Alberto Garcia	6bd858b311	block: Update BlockDriverState.inherits_from on bdrv_drop_intermediate() The previous patch fixed the inherits_from pointer after block-stream, and this one does the same for block-commit. When block-commit finishes and the 'top' node is not the topmost one from the backing chain then all nodes above 'base' up to and including 'top' are removed from the chain. The bdrv_drop_intermediate() call converts a chain like this one: base <- intermediate <- top <- active into this one: base <- active In a simple scenario each backing file from the first chain has the inherits_from attribute pointing to its parent. This means that reopening 'active' will recursively reopen all its children, whose options can be changed in the process. However after the 'block-commit' call base.inherits_from is NULL and the chain is broken, so 'base' does not inherit from 'active' and will not be reopened automatically: $ qemu-img create -f qcow2 hd0.qcow2 1M $ qemu-img create -f qcow2 -b hd0.qcow2 hd1.qcow2 $ qemu-img create -f qcow2 -b hd1.qcow2 hd2.qcow2 $ $QEMU -drive if=none,file=hd2.qcow2 { 'execute': 'block-commit', 'arguments': { 'device': 'none0', 'top': 'hd1.qcow2' } } { 'execute': 'human-monitor-command', 'arguments': { 'command-line': 'qemu-io none0 "reopen -o backing.l2-cache-size=2M"' } } { "return": "Cannot change the option 'backing.l2-cache-size'\r\n"} This patch updates base.inherits_from in this scenario, and adds a test case. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-11-22 19:37:31 +01:00
Alberto Garcia	0065c455f9	block: Update BlockDriverState.inherits_from on bdrv_set_backing_hd() When a BlockDriverState's child is opened (be it a backing file, the protocol layer, or any other) inherits_from is set to point to the parent node. Children opened separately and then attached to a parent don't have this pointer set. bdrv_reopen_queue_child() uses this to determine whether a node's children must also be reopened inheriting the options from the parent or not. If inherits_from points to the parent then the child is reopened and its options can be changed, like in this example: $ qemu-img create -f qcow2 hd0.qcow2 1M $ qemu-img create -f qcow2 hd1.qcow2 1M $ $QEMU -drive if=none,node-name=hd0,file=hd0.qcow2,\ backing.driver=qcow2,backing.file.filename=hd1.qcow2 (qemu) qemu-io hd0 "reopen -o backing.l2-cache-size=2M" If the child does not inherit from the parent then it does not get reopened and its options cannot be changed: $ $QEMU -drive if=none,node-name=hd1,file=hd1.qcow2 -drive if=none,node-name=hd0,file=hd0.qcow2,backing=hd1 (qemu) qemu-io hd0 "reopen -o backing.l2-cache-size=2M" Cannot change the option 'backing.l2-cache-size' If a disk image has a chain of backing files then all of them are also connected through their inherits_from pointers (i.e. it's possible to walk the chain in reverse order from base to top). However this is broken if the intermediate nodes are removed using e.g. block-stream because the inherits_from pointer from the base node becomes NULL: $ qemu-img create -f qcow2 hd0.qcow2 1M $ qemu-img create -f qcow2 -b hd0.qcow2 hd1.qcow2 $ qemu-img create -f qcow2 -b hd1.qcow2 hd2.qcow2 $ $QEMU -drive if=none,file=hd2.qcow2 (qemu) qemu-io none0 "reopen -o backing.l2-cache-size=2M" (qemu) block_stream none0 0 hd0.qcow2 (qemu) qemu-io none0 "reopen -o backing.l2-cache-size=2M" Cannot change the option 'backing.l2-cache-size' This patch updates the inherits_from pointer if the intermediate nodes of a backing chain are removed using bdrv_set_backing_hd(), and adds a test case for this scenario. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-11-22 19:37:31 +01:00
Alberto Garcia	2a3d4331fa	block: Fix update of BDRV_O_AUTO_RDONLY in update_flags_from_options() Commit `e35bdc123a` added the auto-read-only option and the code to update its corresponding flag in update_flags_from_options(), but forgot to clear the flag if auto-read-only is false. Signed-off-by: Alberto Garcia <berto@igalia.com> Reported-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-11-22 16:43:52 +01:00
Max Reitz	9ad08c4456	block: Always abort reopen after prepare succeeded bdrv_reopen_multiple() does not invoke bdrv_reopen_abort() for the element of the reopen queue for which bdrv_reopen_prepare() failed, because it assumes that the prepare function will have rolled back all changes already. However, bdrv_reopen_prepare() does not do this in every case: It may notice an error after BlockDriver.bdrv_reopen_prepare() succeeded, and it will not invoke BlockDriver.bdrv_reopen_abort() then; and neither will bdrv_reopen_multiple(), as explained above. This is wrong because we must always call .bdrv_reopen_commit() or .bdrv_reopen_abort() after .bdrv_reopen_prepare() has succeeded. Otherwise, the block driver has no chance to undo what it has done in its implementation of .bdrv_reopen_prepare(). To fix this, bdrv_reopen_prepare() has to call .bdrv_reopen_abort() if it wants to return an error after .bdrv_reopen_prepare() has succeeded. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-11-19 14:31:48 +01:00
Kevin Wolf	eaa2410f1e	block: Require auto-read-only for existing fallbacks Some block drivers have traditionally changed their node to read-only mode without asking the user. This behaviour has been marked deprecated since 2.11, expecting users to provide an explicit read-only=on option. Now that we have auto-read-only=on, enable these drivers to make use of the option. This is the only use of bdrv_set_read_only(), so we can make it a bit more specific and turn it into a bdrv_apply_auto_read_only() that is more convenient for drivers to use. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2018-11-05 15:09:55 +01:00
Kevin Wolf	e35bdc123a	block: Add auto-read-only option If a management application builds the block graph node by node, the protocol layer doesn't inherit its read-only option from the format layer any more, so it must be set explicitly. Backing files should work on read-only storage, but at the same time, a block job like commit should be able to reopen them read-write if they are on read-write storage. However, without option inheritance, reopen only changes the read-only option for the root node (typically the format layer), but not the protocol layer, so reopening fails (the format layer wants to get write permissions, but the protocol layer is still read-only). A simple workaround for the problem in the management tool would be to open the protocol layer always read-write and to make only the format layer read-only for backing files. However, sometimes the file is actually stored on read-only storage and we don't know whether the image can be opened read-write (for example, for NBD it depends on the server we're trying to connect to). This adds an option that makes QEMU try to open the image read-write, but allows it to degrade to a read-only mode without returning an error. The documentation for this option is consciously phrased in a way that allows QEMU to switch to a better model eventually: Instead of trying when the image is first opened, making the read-only flag dynamic and changing it automatically whenever the first BLK_PERM_WRITE user is attached or the last one is detached would be much more useful behaviour. Unfortunately, this more useful behaviour is also a lot harder to implement, and libvirt needs a solution now before it can switch to -blockdev, so let's start with this easier approach for now. Instead of adding a new auto-read-only option, turning the existing read-only into an enum (with a bool alternate for compatibility) was considered, but it complicated the implementation to the point that it didn't seem to be worth it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2018-11-05 15:09:55 +01:00
Kevin Wolf	eeae6a596b	block: Update flags in bdrv_set_read_only() To fully change the read-only state of a node, we must not only change bs->read_only, but also update bs->open_flags. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>	2018-11-05 15:09:55 +01:00
Alberto Garcia	415bbca86d	block: replace "discard" literal with BDRV_OPT_DISCARD macro Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-11-05 15:09:54 +01:00
Vladimir Sementsov-Ogievskiy	9c98f145df	dirty-bitmaps: clean-up bitmaps loading and migration logic This patch aims to bring the following behavior: 1. We don't load bitmaps, when started in inactive mode. It's the case of incoming migration. In this case we wait for bitmaps migration through migration channel (if 'dirty-bitmaps' capability is enabled) or for invalidation (to load bitmaps from the image). 2. We don't remove persistent bitmaps on inactivation. Instead, we only remove bitmaps after storing. This is the only way to restore bitmaps, if we decided to resume source after [failed] migration with 'dirty-bitmaps' capability enabled (which means, that bitmaps were not stored). 3. We load bitmaps on open and any invalidation, it's ok for all cases: - normal open - migration target invalidation with dirty-bitmaps capability (bitmaps are migrating through migration channel, the are not stored, so they should have IN_USE flag set and will be skipped when loading. However, it would fail if bitmaps are read-only[1]) - migration target invalidation without dirty-bitmaps capability (normal load of the bitmaps, if migrated with shared storage) - source invalidation with dirty-bitmaps capability (skip because IN_USE) - source invalidation without dirty-bitmaps capability (bitmaps were dropped, reload them) [1]: to accurately handle this, migration of read-only bitmaps is explicitly forbidden in this patch. New mechanism for not storing bitmaps when migrate with dirty-bitmaps capability is introduced: migration filed in BdrvDirtyBitmap. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: John Snow <jsnow@redhat.com>	2018-10-29 16:23:17 -04:00
Markus Armbruster	da7e92cac9	block: Clean up bdrv_img_create()'s error reporting bdrv_img_create() takes an Error ** argument and uses it in the conventional way, except for one place: when qemu_opts_do_parse() fails, it first reports its error to stderr or the HMP monitor with error_report_err(), then error_setg()'s a generic error. When the caller reports that second error similarly, this produces two consecutive error messages on stderr or the HMP monitor. When the caller does something else with it, such as send it via QMP, the first error still goes to stderr or the HMP monitor. Fortunately, no such caller exists. Simply use the first error as is. Update expected output of qemu-iotest 049 accordingly. Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20181017082702.5581-37-armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2018-10-19 14:51:34 +02:00
Markus Armbruster	4b5766488f	error: Fix use of error_prepend() with &error_fatal, &error_abort From include/qapi/error.h: * Pass an existing error to the caller with the message modified: * error_propagate(errp, err); * error_prepend(errp, "Could not frobnicate '%s': ", name); Fei Li pointed out that doing error_propagate() first doesn't work well when @errp is &error_fatal or &error_abort: the error_prepend() is never reached. Since I doubt fixing the documentation will stop people from getting it wrong, introduce error_propagate_prepend(), in the hope that it lures people away from using its constituents in the wrong order. Update the instructions in error.h accordingly. Convert existing error_prepend() next to error_propagate to error_propagate_prepend(). If any of these get reached with &error_fatal or &error_abort, the error messages improve. I didn't check whether that's the case anywhere. Cc: Fei Li <fli@suse.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20181017082702.5581-2-armbru@redhat.com>	2018-10-19 14:51:34 +02:00
Alberto Garcia	543770bd2e	block: Allow changing 'detect-zeroes' on reopen 'detect-zeroes' is one of the basic BlockdevOptions available for all drivers, but it's not handled by bdrv_reopen_prepare(), so any attempt to change it results in an error: (qemu) qemu-io virtio0 "reopen -o detect-zeroes=on" Cannot change the option 'detect-zeroes' Since there's no reason why we shouldn't allow changing it and the implementation is simple let's just do it. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-10-01 12:51:12 +02:00
Alberto Garcia	593b307197	block: Allow changing 'discard' on reopen 'discard' is one of the basic BlockdevOptions available for all drivers, but it's not handled by bdrv_reopen_prepare() so any attempt to change it results in an error: (qemu) qemu-io virtio0 "reopen -o discard=on" Cannot change the option 'discard' Since there's no reason why we shouldn't allow changing it and the implementation is simple let's just do it. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-10-01 12:51:12 +02:00
Alberto Garcia	57f9db9a94	block: Forbid trying to change unsupported options during reopen The bdrv_reopen_prepare() function checks all options passed to each BlockDriverState (in the reopen_state->options QDict) and makes all necessary preparations to apply the option changes requested by the user. Options are removed from the QDict as they are processed, so at the end of bdrv_reopen_prepare() only the options that can't be changed are left. Then a loop goes over all remaining options and verifies that the old and new values are identical, returning an error if they're not. The problem is that at the moment there are options that are removed from the QDict although they can't be changed. The consequence of this is any modification to any of those options is silently ignored: (qemu) qemu-io virtio0 "reopen -o discard=on" This happens when all options from bdrv_runtime_opts are removed from the QDict but then only a few of them are processed. Since it's especially important that "node-name" and "driver" are not changed, the code puts them back into the QDict so they are checked at the end of the function. Instead of putting only those two options back into the QDict, this patch puts all unprocessed options using qemu_opts_to_qdict(). update_flags_from_options() also needs to be modified to prevent BDRV_OPT_CACHE_NO_FLUSH, BDRV_OPT_CACHE_DIRECT and BDRV_OPT_READ_ONLY from going back to the QDict. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-10-01 12:51:12 +02:00
Alberto Garcia	db905283b8	block: Allow child references on reopen In the previous patches we removed all child references from bs->{options,explicit_options} because keeping them is useless and wrong. Because of this, any attempt to reopen a BlockDriverState using a child reference as one of its options would result in a failure, because bdrv_reopen_prepare() would detect that there's a new option (the child reference) that wasn't present in bs->options. But passing child references on reopen can be useful. It's a way to specify a BDS's child without having to pass recursively all of the child's options, and if the reference points to a different BDS then this can allow us to replace the child. However, replacing the child is something that needs to be implemented case by case and only when it makes sense. For now, this patch allows passing a child reference as long as it points to the current child of the BlockDriverState. It's also important to remember that, as a consequence of the previous patches, this child reference will be removed from bs->{options,explicit_options} after the reopening has been completed. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-10-01 12:51:12 +02:00
Alberto Garcia	a600aaddc3	block: Don't look for child references in append_open_options() In the previous patch we removed child references from bs->options, so there's no need to look for them here anymore. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-10-01 12:51:12 +02:00
Alberto Garcia	50196d7a7c	block: Remove child references from bs->{options,explicit_options} Block drivers allow opening their children using a reference to an existing BlockDriverState. These references remain stored in the 'options' and 'explicit_options' QDicts, but we don't need to keep them once everything is open. What is more important, these values can become wrong if the children change: $ qemu-img create -f qcow2 hd0.qcow2 10M $ qemu-img create -f qcow2 hd1.qcow2 10M $ qemu-img create -f qcow2 hd2.qcow2 10M $ $QEMU -drive if=none,file=hd0.qcow2,node-name=hd0 \ -drive if=none,file=hd1.qcow2,node-name=hd1,backing=hd0 \ -drive file=hd2.qcow2,node-name=hd2,backing=hd1 After this hd2 has hd1 as its backing file. Now let's remove it using block_stream: (qemu) block_stream hd2 0 hd0.qcow2 Now hd0 is the backing file of hd2, but hd2's options QDicts still contain backing=hd1. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-10-01 12:51:12 +02:00
Kevin Wolf	cfe29d8294	block: Use a single global AioWait When draining a block node, we recurse to its parent and for subtree drains also to its children. A single AIO_WAIT_WHILE() is then used to wait for bdrv_drain_poll() to become true, which depends on all of the nodes we recursed to. However, if the respective child or parent becomes quiescent and calls bdrv_wakeup(), only the AioWait of the child/parent is checked, while AIO_WAIT_WHILE() depends on the AioWait of the original node. Fix this by using a single AioWait for all callers of AIO_WAIT_WHILE(). This may mean that the draining thread gets a few more unnecessary wakeups because an unrelated operation got completed, but we already wake it up when something _could_ have changed rather than only if it has certainly changed. Apart from that, drain is a slow path anyway. In theory it would be possible to use wakeups more selectively and still correctly, but the gains are likely not worth the additional complexity. In fact, this patch is a nice simplification for some places in the code. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-09-25 15:50:15 +02:00
Alberto Garcia	8961be33e8	block: Fix use after free error in bdrv_open_inherit() When a block device is opened with BDRV_O_SNAPSHOT and the bdrv_append_temp_snapshot() call fails then the error code path tries to unref the already destroyed 'options' QDict. This can be reproduced easily by setting TMPDIR to a location where the QEMU process can't write: $ TMPDIR=/nonexistent $QEMU -drive driver=null-co,snapshot=on Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-09-25 15:50:15 +02:00
Markus Armbruster	dd98e84819	qjson: Have qobject_from_json() & friends reject empty and blank The last case where qobject_from_json() & friends return null without setting an error is empty or blank input. Callers: * block.c's parse_json_protocol() reports "Could not parse the JSON options". It's marked as a work-around, because it also covered actual bugs, but they got fixed in the previous few commits. * qobject_input_visitor_new_str() reports "JSON parse error". Also marked as work-around. The recent fixes have made this unreachable, because it currently gets called only for input starting with '{'. * check-qjson.c's empty_input() and blank_input() demonstrate the behavior. * The other callers are not affected since they only pass input with exactly one JSON value or, in the case of negative tests, one error. Fail with "Expecting a JSON value" instead of returning null, and simplify callers. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180823164025.12553-48-armbru@redhat.com>	2018-08-24 20:26:37 +02:00
Alberto Garcia	261dbcb18f	block: Simplify append_open_options() This function returns a BDS's driver-specific options, excluding also those from its children. Since we have just removed all children options from bs->options there's no need to do this last step. We allow references to children, though ("backing": "node0"), so those we still have to remove. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-08-15 12:50:39 +02:00
Alberto Garcia	4c8350fe17	block: Update bs->options if bdrv_reopen() succeeds If bdrv_reopen() succeeds then bs->explicit_options is updated with the new values, but bs->options never changes. Here's an example: { "execute": "blockdev-add", "arguments": { "driver": "qcow2", "node-name": "hd0", "overlap-check": "all", "file": { "driver": "file", "filename": "hd0.qcow2" } } } After this, both bs->options and bs->explicit_options contain "overlap-check": "all". Now let's change that using qemu-io's reopen command: (qemu) qemu-io hd0 "reopen -o overlap-check=none" After this, bs->explicit_options contains the new value but bs->options still keeps the old one. This patch updates bs->options after a BDS has been successfully reopened. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-08-15 12:50:39 +02:00
Alberto Garcia	1bab38e7bd	block: Simplify bdrv_reopen_abort() If a bdrv_reopen_multiple() call fails, then the explicit_options QDict has to be deleted for every entry in the reopen queue. This must happen regardless of whether that entry's bdrv_reopen_prepare() call succeeded or not. This patch simplifies the cleanup code a bit. Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-08-15 12:50:39 +02:00
Alberto Garcia	2f624b80ba	block: Remove children options from bs->{options,explicit_options} When bdrv_open_inherit() opens a BlockDriverState the options QDict can contain options for some of its children, passed in the form of child-name.option=value So while each child is opened with that subset of options, those same options remain stored in the parent BDS, leaving (at least) two copies of each one of them ("child-name.option=value" in the parent and "option=value" in the child). Having the children options stored in the parent is unnecessary and it can easily lead to an inconsistent state: $ qemu-img create -f qcow2 hd0.qcow2 10M $ qemu-img create -f qcow2 -b hd0.qcow2 hd1.qcow2 $ qemu-img create -f qcow2 -b hd1.qcow2 hd2.qcow2 $ $QEMU -drive file=hd2.qcow2,node-name=hd2,backing.node-name=hd1 This opens a chain of images hd0 <- hd1 <- hd2. Now let's remove hd1 using block_stream: (qemu) block_stream hd2 0 hd0.qcow2 After this hd2 contains backing.node-name=hd1, which is no longer correct because hd1 doesn't exist anymore. This patch removes all children options from the parent dictionaries at the end of bdrv_open_inherit() and bdrv_reopen_queue_child(). Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-08-15 12:50:39 +02:00
Vladimir Sementsov-Ogievskiy	3c005293c2	block: make .bdrv_close optional Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-08-15 12:50:39 +02:00
Stefan Weil	50d6a8a352	block: Fix typos in comments (found by codespell) Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-23 16:50:43 +02:00
Kevin Wolf	4be6a6d118	block: Poll after drain on attaching a node Commit `dcf94a23b1` ('block: Don't poll in parent drain callbacks') removed polling in bdrv_child_cb_drained_begin() on the grounds that the original bdrv_drain() already will poll and BdrvChildRole.drained_begin calls must not cause graph changes (and therefore must not call aio_poll() or the recursion through the graph will break. This reasoning is correct for calls through bdrv_do_drained_begin(). However, BdrvChildRole.drained_begin is also called when a node that is already in a drained section (i.e. bdrv_do_drained_begin() has already returned and therefore can't poll any more) is attached to a new parent. In this case, we must explicitly poll to have all requests completed before the drained new child can be attached to the parent. In bdrv_replace_child_noperm(), we know that we're not inside the recursion of bdrv_do_drained_begin() because graph changes are not allowed there, and bdrv_replace_child_noperm() is a graph change. The call of BdrvChildRole.drained_begin() must therefore be followed by a BDRV_POLL_WHILE() that waits for the completion of requests. Reported-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-10 10:36:15 +02:00
Kevin Wolf	824808dd77	block: Don't silently truncate node names If the user passes a too long node name string, we silently truncate it to fit into BlockDriverState.node_name, i.e. to 31 characters. Apart from surprising the user when the node has a different name than requested, this also bypasses the check for duplicate names, so that the same name can be assigned to multiple nodes. Fix this by just making too long node names an error. Reported-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-05 10:29:19 +02:00
Ari Sundholm	7ae9f3f61b	block: Move two block permission constants to the relevant enum This allows using the two constants outside of block.c, which will happen in a subsequent patch. Signed-off-by: Ari Sundholm <ari@tuxera.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-07-05 10:29:19 +02:00
Kevin Wolf	3d9f2d2af6	block: Move bdrv_truncate() implementation to io.c This moves the bdrv_truncate() implementation from block.c to block/io.c so it can have access to the tracked requests infrastructure. This involves making refresh_total_sectors() public (in block_int.h). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-29 14:20:56 +02:00
Kevin Wolf	061ca8a368	block: Convert .bdrv_truncate callback to coroutine_fn bdrv_truncate() is an operation that can block (even for a quite long time, depending on the PreallocMode) in I/O paths that shouldn't block. Convert it to a coroutine_fn so that we have the infrastructure for drivers to make their .bdrv_co_truncate implementation asynchronous. This change could potentially introduce new race conditions because bdrv_truncate() isn't necessarily executed atomically any more. Whether this is a problem needs to be evaluated for each block driver that supports truncate: * file-posix/win32, gluster, iscsi, nfs, rbd, ssh, sheepdog: The protocol drivers are trivially safe because they don't actually yield yet, so there is no change in behaviour. * copy-on-read, crypto, raw-format: Essentially just filter drivers that pass the request to a child node, no problem. * qcow2: The implementation modifies metadata, so it needs to hold s->lock to be safe with concurrent I/O requests. In order to avoid double locking, this requires pulling the locking out into preallocate_co() and using qcow2_write_caches() instead of bdrv_flush(). * qed: Does a single header update, this is fine without locking. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-29 14:20:56 +02:00
Max Reitz	ec9f10fe06	block: Generalize should_update_child() rule Currently, bdrv_replace_node() refuses to create loops from one BDS to itself if the BDS to be replaced is the backing node of the BDS to replace it: Say there is a node A and a node B. Replacing B by A means making all references to B point to A. If B is a child of A (i.e. A has a reference to B), that would mean we would have to make this reference point to A itself -- so we'd create a loop. bdrv_replace_node() (through should_update_child()) refuses to do so if B is the backing node of A. There is no reason why we should create loops if B is not the backing node of A, though. The BDS graph should never contain loops, so we should always refuse to create them. If B is a child of A and B is to be replaced by A, we should simply leave B in place there because it is the most sensible choice. A more specific argument would be: Putting filter drivers into the BDS graph is basically the same as appending an overlay to a backing chain. But the main child BDS of a filter driver is not "backing" but "file", so restricting the no-loop rule to backing nodes would fail here. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20180613181823.13618-7-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-06-18 17:04:54 +02:00
Kevin Wolf	0f12264e7a	block: Allow graph changes in bdrv_drain_all_begin/end sections bdrv_drain_all_*() used bdrv_next() to iterate over all root nodes and did a subtree drain for each of them. This works fine as long as the graph is static, but sadly, reality looks different. If the graph changes so that root nodes are added or removed, we would have to compensate for this. bdrv_next() returns each root node only once even if it's the root node for multiple BlockBackends or for a monitor-owned block driver tree, which would only complicate things. The much easier and more obviously correct way is to fundamentally change the way the functions work: Iterate over all BlockDriverStates, no matter who owns them, and drain them individually. Compensation is only necessary when a new BDS is created inside a drain_all section. Removal of a BDS doesn't require any action because it's gone afterwards anyway. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	6cd5c9d7b2	block: ignore_bds_parents parameter for drain functions In the future, bdrv_drained_all_begin/end() will drain all invidiual nodes separately rather than whole subtrees. This means that we don't want to propagate the drain to all parents any more: If the parent is a BDS, it will already be drained separately. Recursing to all parents is unnecessary work and would make it an O(n²) operation. Prepare the drain function for the changed drain_all by adding an ignore_bds_parents parameter to the internal implementation that prevents the propagation of the drain to BDS parents. We still (have to) propagate it to non-BDS parents like BlockBackends or Jobs because those are not drained separately. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	dcf94a23b1	block: Don't poll in parent drain callbacks bdrv_do_drained_begin() is only safe if we have a single BDRV_POLL_WHILE() after quiescing all affected nodes. We cannot allow that parent callbacks introduce a nested polling loop that could cause graph changes while we're traversing the graph. Split off bdrv_do_drained_begin_quiesce(), which only quiesces a single node without waiting for its requests to complete. These requests will be waited for in the BDRV_POLL_WHILE() call down the call chain. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	fe4f0614ef	block: Drain recursively with a single BDRV_POLL_WHILE() Anything can happen inside BDRV_POLL_WHILE(), including graph changes that may interfere with its callers (e.g. child list iteration in recursive callers of bdrv_do_drained_begin). Switch to a single BDRV_POLL_WHILE() call for the whole subtree at the end of bdrv_do_drained_begin() to avoid such effects. The recursion happens now inside the loop condition. As the graph can only change between bdrv_drain_poll() calls, but not inside of it, doing the recursion here is safe. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	89bd030533	block: Really pause block jobs on drain We already requested that block jobs be paused in .bdrv_drained_begin, but no guarantee was made that the job was actually inactive at the point where bdrv_drained_begin() returned. This introduces a new callback BdrvChildRole.bdrv_drained_poll() and uses it to make bdrv_drain_poll() consider block jobs using the node to be drained. For the test case to work as expected, we have to switch from block_job_sleep_ns() to qemu_co_sleep_ns() so that the test job is even considered active and must be waited for when draining the node. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-18 15:03:25 +02:00
Max Reitz	609f45ea95	block: Add block-specific QDict header There are numerous QDict functions that have been introduced for and are used only by the block layer. Move their declarations into an own header file to reflect that. While qdict_extract_subqdict() is in fact used outside of the block layer (in util/qemu-config.c), it is still a function related very closely to how the block layer works with nested QDicts, namely by sometimes flattening them. Therefore, its declaration is put into this header as well and util/qemu-config.c includes it with a comment stating exactly which function it needs. Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20180509165530.29561-7-mreitz@redhat.com> [Copyright note tweaked, superfluous includes dropped] Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-15 14:49:44 +02:00
Max Reitz	cc02214097	block: Make bdrv_is_writable() public This is a useful function for the whole block layer, so make it public. At the same time, users outside of block.c probably do not need to make use of the reopen functionality, so rename the current function to bdrv_is_writable_after_reopen() create a new bdrv_is_writable() function that just passes NULL to it for the reopen queue. Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20180606193702.7113-2-mreitz@redhat.com Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-06-11 16:18:45 +02:00
Max Reitz	d1402b5026	block: Add Error parameter to bdrv_amend_options Looking at the qcow2 code that is riddled with error_report() calls, this is really how it should have been from the start. Along the way, turn the target_version/current_version comparisons at the beginning of qcow2_downgrade() into assertions (the caller has to make sure these conditions are met), and rephrase the error message on using compat=1.1 to get refcount widths other than 16 bits. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20180509210023.20283-3-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-06-11 16:18:45 +02:00
Kevin Wolf	b3b5299d58	block: Cancel job in bdrv_close_all() callers Now that we cancel all jobs and not only block jobs on shutdown, doing that in bdrv_close_all() isn't really appropriate any more. Move the job_cancel_sync_all() call to the callers, and only assert that there are no job running in bdrv_close_all(). Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-05-23 14:30:51 +02:00
Kevin Wolf	3d70ff53b6	job: Move completion and cancellation to Job This moves the top-level job completion and cancellation functions from BlockJob to Job. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-05-23 14:30:51 +02:00
Marc-André Lureau	f5a74a5a50	qobject: Modify qobject_ref() to return obj For convenience and clarity, make it possible to call qobject_ref() at the time when the reference is associated with a variable, or argument, by making qobject_ref() return the same pointer as given. Use that to simplify the callers. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180419150145.24795-5-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Useless change to qobject_ref_impl() dropped, commit message improved slightly] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2018-05-04 08:27:53 +02:00
Marc-André Lureau	cb3e7f08ae	qobject: Replace qobject_incref/QINCREF qobject_decref/QDECREF Now that we can safely call QOBJECT() on QObject * as well as its subtypes, we can have macros qobject_ref() / qobject_unref() that work everywhere instead of having to use QINCREF() / QDECREF() for QObject and qobject_incref() / qobject_decref() for its subtypes. The replacement is mechanical, except I broke a long line, and added a cast in monitor_qmp_cleanup_req_queue_locked(). Unlike qobject_decref(), qobject_unref() doesn't accept void *. Note that the new macros evaluate their argument exactly once, thus no need to shout them. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180419150145.24795-4-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Rebased, semantic conflict resolved, commit message improved] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2018-05-04 08:27:53 +02:00
Max Reitz	4f7be2806e	block: Deprecate "backing": "" We have a clear replacement, so let's deprecate it. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-Id: <20180224154033.29559-8-mreitz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-19 14:58:36 -05:00
Max Reitz	e59a0cf17b	block: Handle null backing link Instead of converting all "backing": null instances into "backing": "", handle a null value directly in bdrv_open_inherit(). This enables explicitly null backing links for json:{} filenames. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-Id: <20180224154033.29559-7-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: rebase to qobject_to() parameter order and qapi headers split] Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-19 14:58:36 -05:00
Max Reitz	7dc847ebba	qapi: Replace qobject_to_X(o) by qobject_to(X, o) This patch was generated using the following Coccinelle script: @@ expression Obj; @@ ( - qobject_to_qnum(Obj) + qobject_to(QNum, Obj) \| - qobject_to_qstring(Obj) + qobject_to(QString, Obj) \| - qobject_to_qdict(Obj) + qobject_to(QDict, Obj) \| - qobject_to_qlist(Obj) + qobject_to(QList, Obj) \| - qobject_to_qbool(Obj) + qobject_to(QBool, Obj) ) and a bit of manual fix-up for overly long lines and three places in tests/check-qjson.c that Coccinelle did not find. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-Id: <20180224154033.29559-4-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: swap order from qobject_to(o, X), rebase to master, also a fix to latent false-positive compiler complaint about hw/i386/acpi-build.c] Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-19 14:58:36 -05:00
Fam Zheng	2c860e797a	block: Fix leak of ignore_children in error path Reported-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-19 12:01:39 +01:00
Fam Zheng	1a5297366f	block: Fix flags in reopen queue Reopen flags are not synchronized according to the bdrv_reopen_queue_child precedence until bdrv_reopen_prepare. It is a bit too late: we already check the consistency in bdrv_check_perm before that. This fixes the bug that when bdrv_reopen a RO node as RW, the flags for backing child are wrong. Before, we could recurse with flags.rw=1; now, role->inherit_options + update_flags_from_options will make sure to clear the bit when necessary. Note that this will not clear an explicitly set bit, as in the case of parallel block jobs (e.g. test_stream_parallel in 030), because the explicit options include 'read-only=false' (for an intermediate node used by a different job). Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-19 12:01:39 +01:00
Kevin Wolf	cd8b7aaa07	block: Fail bdrv_truncate() with negative size Most callers have their own checks, but something like this should also be checked centrally. As it happens, x-blockdev-create can pass negative image sizes to format drivers (because there is no QAPI type that would reject negative numbers) and triggers the check added by this patch. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2018-03-09 15:17:48 +01:00
Kevin Wolf	e8eb863778	block: Make bdrv_is_whitelisted() public We'll use a separate source file for image creation, and we need to check there whether the requested driver is whitelisted. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2018-03-09 15:17:47 +01:00
Kevin Wolf	e1d74bc6c6	qcow2: Use BlockdevRef in qcow2_co_create() Instead of passing a separate BlockDriverState* into qcow2_co_create(), make use of the BlockdevRef that is included in BlockdevCreateOptions. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2018-03-09 15:17:47 +01:00
Paolo Bonzini	2fd6163884	block: convert bdrv_check callback to coroutine_fn Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1516279431-30424-8-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-09 15:17:47 +01:00
Paolo Bonzini	2b148f392b	block: convert bdrv_invalidate_cache callback to coroutine_fn QED's bdrv_invalidate_cache implementation would like to reuse functions that acquire/release the metadata locks. Call it from coroutine context to simplify the logic. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1516279431-30424-6-git-send-email-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-09 15:17:47 +01:00
Peter Maydell	58e2e17dba	Block layer patches -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJanYJPAAoJEH8JsnLIjy/WxjUQAJA+DTOmGXvaNpMs65BrU79K /r/iGVrzHv/RMLmrWMnqj96W9SnpMuiAP9hVLNsekqClY9q4ME4DpGcXhWfhSvF5 FC51ehvFJdfo8cPorsevcqNj60iWebjcx3lFfUq2606UOyYih3oijYxr6gSwWbRc GAgdGMqsvGYpzgqAQVEWHUhaX0La49/OzY42aR+E+LCBNfTYvlydvyoc+tUTdIpW 1eM/ASGndGsN0Cf2vxlbKgJ0/P6v+cRZuuIDhKZqre+YG+yM+pq7yZb+o7nf/P36 TPR93BsT7FSVAizRK7VFRuPIynHpiaxYygrJERCXF0sxsV4OlKjpmt/uUPamWFh+ 46Jx2NK1AuAx87BdErgmA119ObO3oAPxK0+2p981obb6SphTbbPxDj6SOlYCt4mJ mhff4JtIiwCmDSckAwd2mkBI1Tvl9qqcELrpyd2t2eU4ec2vf7fPd85EsK/Mq6Kr dbfqFvjNaaMxChoqFgkHAveYJ7zYqRFI2IY5o9c1QyZehCGPWjScxHXZZYdpDl59 YF9DkYQDOyvEX2jmMECaO1r/0nnO+BqQHu5ItJuTte9rjP9Q0do3iBISiIefewtf yji6/QNn2hFrnr1HPAwLFFC3kPgc8Mq8mIUb53j8vG/01KhVRCcnJm2K6D4IUwLZ S6ZnQJB97eE4y7YR5dNt =2axz -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches # gpg: Signature made Mon 05 Mar 2018 17:45:51 GMT # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (38 commits) block: Fix NULL dereference on empty drive error qcow2: Replace align_offset() with ROUND_UP() block/ssh: Add basic .bdrv_truncate() block/ssh: Make ssh_grow_file() blocking block/ssh: Pull ssh_grow_file() from ssh_create() qemu-img: Make resize error message more general qcow2: make qcow2_co_create2() a coroutine_fn block: rename .bdrv_create() to .bdrv_co_create_opts() Revert "IDE: Do not flush empty CDROM drives" block: test blk_aio_flush() with blk->root == NULL block: add BlockBackend->in_flight counter block: extract AIO_WAIT_WHILE() from BlockDriverState aio: rename aio_context_in_iothread() to in_aio_context_home_thread() docs: document how to use the l2-cache-entry-size parameter specs/qcow2: Fix documentation of the compressed cluster descriptor iotest 033: add misaligned write-zeroes test via truncate block: fix write with zero flag set and iovector provided block: Drop unused .bdrv_co_get_block_status() vvfat: Switch to .bdrv_co_block_status() vpc: Switch to .bdrv_co_block_status() ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # include/block/block.h	2018-03-06 11:20:44 +00:00
Markus Armbruster	9af2398977	Include less of the generated modular QAPI headers In my "build everything" tree, a change to the types in qapi-schema.json triggers a recompile of about 4800 out of 5100 objects. The previous commit split up qmp-commands.h, qmp-event.h, qmp-visit.h, qapi-types.h. Each of these headers still includes all its shards. Reduce compile time by including just the shards we actually need. To illustrate the benefits: adding a type to qapi/migration.json now recompiles some 2300 instead of 4800 objects. The next commit will improve it further. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180211093607.27351-24-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> [eblake: rebase to master] Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-02 13:45:50 -06:00
Markus Armbruster	0dd13589b0	Include qapi/qmp/qerror.h exactly where needed Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180211093607.27351-2-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-02 13:14:08 -06:00
Stefan Hajnoczi	efc75e2a4c	block: rename .bdrv_create() to .bdrv_co_create_opts() BlockDriver->bdrv_create() has been called from coroutine context since commit `5b7e1542cf` ("block: make bdrv_create adopt coroutine"). Make this explicit by renaming to .bdrv_co_create_opts() and add the coroutine_fn annotation. This makes it obvious to block driver authors that they may yield, use CoMutex, or other coroutine_fn APIs. bdrv_co_create is reserved for the QAPI-based version that Kevin is working on. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20170705102231.20711-2-stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Stefan Hajnoczi	33f2a75777	block: add BlockBackend->in_flight counter BlockBackend currently relies on BlockDriverState->in_flight to track requests for blk_drain(). There is a corner case where BlockDriverState->in_flight cannot be used though: blk->root can be NULL when there is no medium. This results in a segfault when the NULL pointer is dereferenced. Introduce a BlockBackend->in_flight counter for aio requests so it works even when blk->root == NULL. Based on a patch by Kevin Wolf <kwolf@redhat.com>. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Stefan Hajnoczi	7719f3c968	block: extract AIO_WAIT_WHILE() from BlockDriverState BlockDriverState has the BDRV_POLL_WHILE() macro to wait on event loop activity while a condition evaluates to true. This is used to implement synchronous operations where it acts as a condvar between the IOThread running the operation and the main loop waiting for the operation. It can also be called from the thread that owns the AioContext and in that case it's just a nested event loop. BlockBackend needs this behavior but doesn't always have a BlockDriverState it can use. This patch extracts BDRV_POLL_WHILE() into the AioWait abstraction, which can be used with AioContext and isn't tied to BlockDriverState anymore. This feature could be built directly into AioContext but then all users would kick the event loop even if they signal different conditions. Imagine an AioContext with many BlockDriverStates, each time a request completes any waiter would wake up and re-check their condition. It's nicer to keep a separate AioWait object for each condition instead. Please see "block/aio-wait.h" for details on the API. The name AIO_WAIT_WHILE() avoids the confusion between AIO_POLL_WHILE() and AioContext polling. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-03-02 18:39:07 +01:00
Eric Blake	e24d813b29	block: Simplify bdrv_can_write_zeroes_with_unmap() We don't need the can_write_zeroes_with_unmap field in BlockDriverInfo, because it is redundant information with supported_zero_flags & BDRV_REQ_MAY_UNMAP. Note that BlockDriverInfo and supported_zero_flags are both per-device settings, rather than global state about the driver as a whole, which means one or both of these bits of information can already be conditional. Let's audit how they were set: crypto: always setting can_write_ to false is pointless (the struct starts life zero-initialized), no use of supported_ nbd: just recently fixed to set can_write_ if supported_ includes MAY_UNMAP (thus this commit effectively reverts bca80059e and solves the problem mentioned there in a more global way) file-posix, iscsi, qcow2: can_write_ is conditional, while supported_ was unconditional; but passing MAY_UNMAP would fail with ENOTSUP if the condition wasn't met qed: can_write_ is unconditional, but pwrite_zeroes lacks support for MAY_UNMAP and supported_ is not set. Perhaps support can be added later (since it would be similar to qcow2), but for now claiming false is no real loss all other drivers: can_write_ is not set, and supported_ is either unset or a passthrough Simplify the code by moving the conditional into supported_zero_flags for all drivers, then dropping the now-unused BDI field. For callers that relied on bdrv_can_write_zeroes_with_unmap(), we return the same per-device settings for drivers that had conditions (no observable change in behavior there); and can now return true (instead of false) for drivers that support passthrough (for example, the commit driver) which gives those drivers the same fix as nbd just got in bca80059e. For callers that relied on supported_zero_flags, we now have a few more places that can avoid a wasted call to pwrite_zeroes() that will just fail with ENOTSUP. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20180126193439.20219-1-eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-02-09 12:32:44 -06:00
Markus Armbruster	922a01a013	Move include qemu/option.h from qemu-common.h to actual users qemu-common.h includes qemu/option.h, but most places that include the former don't actually need the latter. Drop the include, and add it to the places that actually need it. While there, drop superfluous includes of both headers, and separate #include from file comment with a blank line. This cleanup makes the number of objects depending on qemu/option.h drop from 4545 (out of 4743) to 284 in my "build everything" tree. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-20-armbru@redhat.com> [Semantic conflict with commit `bdd6a90a9e` in block/nvme.c resolved]	2018-02-09 13:52:16 +01:00
Markus Armbruster	bd006b9818	Include qapi/qmp/qbool.h exactly where needed Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-15-armbru@redhat.com>	2018-02-09 13:52:15 +01:00
Markus Armbruster	fc81fa1eb0	Include qapi/qmp/qstring.h exactly where needed Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-14-armbru@redhat.com>	2018-02-09 13:52:15 +01:00
Markus Armbruster	452fcdbc49	Include qapi/qmp/qdict.h exactly where needed This cleanup makes the number of objects depending on qapi/qmp/qdict.h drop from 4550 (out of 4743) to 368 in my "build everything" tree. For qapi/qmp/qobject.h, the number drops from 4552 to 390. While there, separate #include from file comment with a blank line. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-13-armbru@redhat.com>	2018-02-09 13:52:15 +01:00
Markus Armbruster	abb297ed44	Include qmp-commands.h exactly where needed Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-7-armbru@redhat.com> [OSX breakage fixed]	2018-02-09 13:52:10 +01:00
Markus Armbruster	e688df6bc4	Include qapi/error.h exactly where needed This cleanup makes the number of objects depending on qapi/error.h drop from 1910 (out of 4743) to 1612 in my "build everything" tree. While there, separate #include from file comment with a blank line, and drop a useless comment on why qemu/osdep.h is included first. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-5-armbru@redhat.com> [Semantic conflict with commit `34e304e975` resolved, OSX breakage fixed]	2018-02-09 13:50:17 +01:00
Kevin Wolf	1a63a90750	block: Keep nodes drained between reopen_queue/multiple The bdrv_reopen*() implementation doesn't like it if the graph is changed between queuing nodes for reopen and actually reopening them (one of the reasons is that queuing can be recursive). So instead of draining the device only in bdrv_reopen_multiple(), require that callers already drained all affected nodes, and assert this in bdrv_reopen_queue(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>	2017-12-22 15:05:32 +01:00
Kevin Wolf	d736f119da	block: Allow graph changes in subtree drained section We need to remember how many of the drain sections in which a node is were recursive (i.e. subtree drain rather than node drain), so that they can be correctly applied when children are added or removed during the drained section. With this change, it is safe to modify the graph even inside a bdrv_subtree_drained_begin/end() section. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-12-22 15:05:32 +01:00
Kevin Wolf	0152bf400f	block: Don't notify parents in drain call chain This is in preparation for subtree drains, i.e. drained sections that affect not only a single node, but recursively all child nodes, too. Calling the parent callbacks for drain is pointless when we just came from that parent node recursively and leads to multiple increases of bs->quiesce_counter in a single drain call. Don't do it. In order for this to work correctly, the parent callback must be called for every bdrv_drain_begin/end() call, not only for the outermost one: If we have a node N with two parents A and B, recursive draining of A should cause the quiesce_counter of B to increase because its child N is drained independently of B. If now B is recursively drained, too, A must increase its quiesce_counter because N is drained independently of A only now, even if N is going from quiesce_counter 1 to 2. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-12-22 15:05:32 +01:00
Fam Zheng	cc954f01e3	block: Open backing image in force share mode for size probe Management tools create overlays of running guests with qemu-img: $ qemu-img create -b /image/in/use.qcow2 -f qcow2 /overlay/image.qcow2 but this doesn't work anymore due to image locking: qemu-img: /overlay/image.qcow2: Failed to get shared "write" lock Is another process using the image? Could not open backing image to determine size. Use the force share option to allow this use case again. Cc: qemu-stable@nongnu.org Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-12-22 15:03:41 +01:00
Kevin Wolf	5fbfabd313	block: Formats don't need CONSISTENT_READ with NO_IO Commit `1f4ad7d` fixed 'qemu-img info' for raw images that are currently in use as a mirror target. It is not enough for image formats, though, as these still unconditionally request BLK_PERM_CONSISTENT_READ. As this permission is geared towards whether the guest-visible data is consistent, and has no impact on whether the metadata is sane, and 'qemu-img info' does not read guest-visible data (except for the raw format), it makes sense to not require BLK_PERM_CONSISTENT_READ if there is not going to be any guest I/O performed, regardless of image format. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-12-22 15:03:41 +01:00
Paolo Bonzini	bd6458e410	block: avoid recursive AioContext acquire in bdrv_inactivate_all() BDRV_POLL_WHILE() does not support recursive AioContext locking. It only releases the AioContext lock once regardless of how many times the caller has acquired it. This results in a hang since the IOThread does not make progress while the AioContext is still locked. The following steps trigger the hang: $ qemu-system-x86_64 -M accel=kvm -m 1G -cpu host \ -object iothread,id=iothread0 \ -device virtio-scsi-pci,iothread=iothread0 \ -drive if=none,id=drive0,file=test.img,format=raw \ -device scsi-hd,drive=drive0 \ -drive if=none,id=drive1,file=test.img,format=raw \ -device scsi-hd,drive=drive1 $ qemu-system-x86_64 ...same options... \ -incoming tcp::1234 (qemu) migrate tcp:127.0.0.1:1234 ...hang... Tested-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20171207201320.19284-2-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-12-19 10:25:09 +00:00
Alberto Garcia	50a3efb0f0	block: Close a BlockDriverState completely even when bs->drv is NULL bdrv_close() skips much of its logic when bs->drv is NULL. This is fine when we're closing a BlockDriverState that has just been created (because e.g the initialization process failed), but it's not enough in other cases. For example, when a valid qcow2 image is found to be corrupted then QEMU marks it as such in the file header and then sets bs->drv to NULL in order to make the BlockDriverState unusable. When that BDS is later closed then many of its data structures are not freed (leaking their memory) and none of its children are detached. This results in bdrv_close_all() failing to close all BDSs and making this assertion fail when QEMU is being shut down: bdrv_close_all: Assertion `QTAILQ_EMPTY(&all_bdrv_states)' failed. This patch makes bdrv_close() do the full uninitialization process in all cases. This fixes the problem with corrupted images and still works fine with freshly created BDSs. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 20171106145345.12038-1-berto@igalia.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-21 14:54:02 +01:00
Kevin Wolf	dacaa16238	block: Don't use BLK_PERM_CONSISTENT_READ for format probing For format probing, we don't really care whether all of the image content is consistent. The only thing we're looking at is the image header, and specifically the magic numbers that are expected to never change, no matter how inconsistent the guest visible disk content is. Therefore, don't request BLK_PERM_CONSISTENT_READ. This allows to use format probing, e.g. in the context of 'qemu-img info', even while the guest visible data in the image is inconsistent during a running block job. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>	2017-11-21 14:48:22 +01:00
Max Reitz	5e003f17ec	block: Make bdrv_next() keep strong references On one hand, it is a good idea for bdrv_next() to return a strong reference because ideally nearly every pointer should be refcounted. This fixes intermittent failure of iotest 194. On the other, it is absolutely necessary for bdrv_next() itself to keep a strong reference to both the BB (in its first phase) and the BDS (at least in the second phase) because when called the next time, it will dereference those objects to get a link to the next one. Therefore, it needs these objects to stay around until then. Just storing the pointer to the next in the iterator is not really viable because that pointer might become invalid as well. Both arguments taken together means we should probably just invoke bdrv_ref() and blk_ref() in bdrv_next(). This means we have to assert that bdrv_next() is always called from the main loop, but that was probably necessary already before this patch and judging from the callers, it also looks to actually be the case. Keeping these strong references means however that callers need to give them up if they decide to abort the iteration early. They can do so through the new bdrv_next_cleanup() function. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171110172545.32609-1-mreitz@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-17 18:21:31 +01:00
Max Reitz	d470ad42ac	block: Guard against NULL bs->drv We currently do not guard everywhere against a NULL bs->drv where we should be doing so. Most of the places fixed here just do not care about that case at all. Some care implicitly, e.g. through a prior function call to bdrv_getlength() which would always fail for an ejected BDS. Add an assert there to make it more obvious. Other places seem to care, but do so insufficiently: Freeing clusters in a qcow2 image is an error-free operation, but it may leave the image in an unusable state anyway. Giving qcow2_free_clusters() an error code is not really viable, it is much easier to note that bs->drv may be NULL even after a successful driver call. This concerns bdrv_co_flush(), and the way the check is added to bdrv_co_pdiscard() (in every iteration instead of only once). Finally, some places employ at least an assert(bs->drv); somewhere, that may be reasonable (such as in the reopen code), but in bdrv_has_zero_init(), it is definitely not. Returning 0 there in case of an ejected BDS saves us much headache instead. Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1728660 Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171110203111.7666-4-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-17 18:21:31 +01:00
Max Reitz	54fd1b0d26	block: qobject_is_equal() in bdrv_reopen_prepare() Currently, bdrv_reopen_prepare() assumes that all BDS options are strings. However, this is not the case if the BDS has been created through the json: pseudo-protocol or blockdev-add. Note that the user-invokable reopen command is an HMP command, so you can only specify strings there. Therefore, specifying a non-string option with the "same" value as it was when originally created will now return an error because the values are supposedly similar (and there is no way for the user to circumvent this but to just not specify the option again -- however, this is still strictly better than just crashing). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20171114180128.17076-5-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-17 18:21:30 +01:00
Kevin Wolf	dafe096057	block: Fix permissions in image activation Inactive images generally request less permissions for their image files than they would if they were active (in particular, write permissions). Activating the image involves extending the permissions, therefore. drv->bdrv_invalidate_cache() can already require write access to the image file, so we have to update the permissions earlier than that. The current code does it only later, so we have to move up this part. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2017-11-17 18:06:12 +01:00
Kevin Wolf	398e6ad014	block: Deprecate bdrv_set_read_only() and users bdrv_set_read_only() is used by some block drivers to override the read-only option given by the user. This is not how read-only images generally work in QEMU: Instead of second guessing what the user really meant (which currently includes making an image read-only even if the user didn't only use the default, but explicitly said read-only=off), we should error out if we can't provide what the user requested. This adds deprecation warnings to all callers of bdrv_set_read_only() so that the behaviour can be corrected after the usual deprecation period. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-11-17 13:35:59 +01:00
Kevin Wolf	6473069416	block: Fix error path in bdrv_backing_update_filename() error_setg_errno() takes a positive errno code. Spotted by Coverity (CID 1381628). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-11-17 13:35:59 +01:00
Peter Krempa	6bff597bf6	block: don't add 'driver' to options when referring to backing via node name When referring to a backing file of an image via node name bdrv_open_backing_file would add the 'driver' option to the option list filling it with the backing format driver. This breaks construction of the backing chain via -blockdev, as bdrv_open_inherit reports an error if both 'reference' and 'options' are provided. $ qemu-img create -f raw /tmp/backing.raw 64M $ qemu-img create -f qcow2 -F raw -b /tmp/backing.raw /tmp/test.qcow2 $ qemu-system-x86_64 \ -blockdev driver=file,filename=/tmp/backing.raw,node-name=backing \ -blockdev driver=qcow2,file.driver=file,file.filename=/tmp/test.qcow2,node-name=root,backing=backing qemu-system-x86_64: -blockdev driver=qcow2,file.driver=file,file.filename=/tmp/test.qcow2,node-name=root,backing=backing: Could not open backing file: Cannot reference an existing block device with additional options or a new filename Signed-off-by: Peter Krempa <pkrempa@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00
Kevin Wolf	bde70715b6	commit: Remove overlay_bs We don't need to make any assumptions about the graph layout above the top node of the commit operation any more. Remove the use of bdrv_find_overlay() and related variables from the commit job code. bdrv_drop_intermediate() doesn't use the 'active' parameter any more, so we can just drop it. The overlay node was previously added to the block job to get a BLK_PERM_GRAPH_MOD. We really need to respect those permissions in bdrv_drop_intermediate() now, but as long as we haven't figured out yet how BLK_PERM_GRAPH_MOD is actually supposed to work, just leave a TODO comment there. With this change, it is now possible to perform another block job on an overlay node without conflicts. qemu-iotests 030 is changed accordingly. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-10-06 16:28:58 +02:00
Kevin Wolf	61f09cea01	commit: Support multiple roots above top node This changes the commit block job to support operation in a graph where there is more than a single active layer that references the top node. This involves inserting the commit filter node not only on the path between the given active node and the top node, but between the top node and all of its parents. On completion, bdrv_drop_intermediate() must consider all parents for updating the backing file link. These parents may be backing files themselves and as such read-only; reopen them temporarily if necessary. Previously this was achieved by the bdrv_reopen() calls in the commit block job that made overlay_bs read-write for the whole duration of the block job, even though write access is only needed on completion. Now that we consider all parents, overlay_bs is meaningless. It is left in place in this commit, but we'll remove it soon. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-06 16:28:58 +02:00
Kevin Wolf	6858eba09e	block: Introduce BdrvChildRole.update_filename There is no good reason for bdrv_drop_intermediate() to know the active layer above the subchain it is operating on - even more so, because the assumption that there is a single active layer above it is not generally true. In order to prepare removal of the active parameter, use a BdrvChildRole callback to update the backing file string in the overlay image instead of directly calling bdrv_change_backing_file(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-10-06 16:28:58 +02:00
Eric Blake	1b6cc579de	dirty-bitmap: Avoid size query failure during truncate We've previously fixed several places where we failed to account for possible errors from bdrv_nb_sectors(). Fix another one by making bdrv_dirty_bitmap_truncate() take the new size from the caller instead of querying itself; then adjust the sole caller bdrv_truncate() to pass the size just determined by a successful resize, or to reuse the size given to the original truncate operation when refresh_total_sectors() was not able to confirm the actual size (the two sizes can potentially differ according to rounding constraints), thus avoiding sizing the bitmaps to -1. This also fixes a bug where not all failure paths in bdrv_truncate() would set errp. Note that bdrv_truncate() is still a bit awkward. We may want to revisit it later and clean up things to better guarantee that a resize attempt either fails cleanly up front, or cannot fail after guest-visible changes have been made (if temporary changes are made, then they need to be cleanly rolled back). But that is a task for another day; for now, the goal is the bare minimum fix to ensure that just bdrv_dirty_bitmap_truncate() cannot fail. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-06 16:28:58 +02:00
Eric Blake	a8b42a1c09	block: Make bdrv_img_create() size selection easier to read All callers of bdrv_img_create() pass in a size, or -1 to read the size from the backing file. We then set that size as the QemuOpt default, which means we will reuse that default rather than the final parameter to qemu_opt_get_size() several lines later. But it is rather confusing to read subsequent checks of 'size == -1' when it looks (without seeing the full context) like size defaults to 0; it also doesn't help that a size of 0 is valid (for some formats). Rework the logic to make things more legible. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-06 16:28:58 +02:00
Kevin Wolf	3045025991	block: Fix permissions after bdrv_reopen() If we switch between read-only and read-write, the permissions that image format drivers need on bs->file change, too. Make sure to update the permissions during bdrv_reopen(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-09-26 14:46:23 +02:00
Kevin Wolf	1857c97b76	block: reopen: Queue children after their parents We will calculate the required new permissions in the prepare stage of a reopen. Required permissions of children can be influenced by the changes made to their parents, but parents are independent from their children. This means that permissions need to be calculated top-down. In order to achieve this, queue parents before their children rather than queuing the children first. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-09-26 14:46:23 +02:00
Kevin Wolf	148eb13c84	block: Base permissions on rw state after reopen When new permissions are calculated during bdrv_reopen(), they need to be based on the state of the graph as it will be after the reopen has completed, not on the current state of the involved nodes. This patch makes bdrv_is_writable() optionally accept a BlockReopenQueue from which the new flags are taken. This is then used for determining the new bs->file permissions of format drivers as soon as we add the code to actually pass a non-NULL reopen queue to the .bdrv_child_perm callbacks. While moving bdrv_is_writable(), make it static. It isn't used outside block.c. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-09-26 14:46:23 +02:00
Kevin Wolf	3121fb45b0	block: Add reopen queue to bdrv_check_perm() In the context of bdrv_reopen(), we'll have to look at the state of the graph as it will be after the reopen. This interface addition is in preparation for the change. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-09-26 14:46:23 +02:00
Kevin Wolf	e0995dc3da	block: Add reopen_queue to bdrv_child_perm() In the context of bdrv_reopen(), we'll have to look at the state of the graph as it will be after the reopen. This interface addition is in preparation for the change. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-09-26 14:46:23 +02:00
Manos Pitsidianakis	f024aee867	block: remove unused bdrv_media_changed This function is not used anywhere, so remove it. Markus Armbruster adds: The i82078 floppy device model used to call bdrv_media_changed() to implement its media change bit when backed by a host floppy. This went away in `21fcf36` "fdc: simplify media change handling". Probably broke host floppy media change. Host floppy pass-through was dropped in commit `f709623`. bdrv_media_changed() has never been used for anything else. Remove it. (Source is Message-ID: <87y3ruaypm.fsf@dusky.pond.sub.org>) Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-09-04 18:31:13 +02:00
Manos Pitsidianakis	5a612c009e	block: pass bdrv_* methods to bs->file by default in block filters The following functions fail if bs->drv is a filter and does not implement them: bdrv_probe_blocksizes bdrv_probe_geometry bdrv_truncate bdrv_has_zero_init bdrv_get_info Instead, the call should be passed to bs->file if it exists, to allow filter drivers to support those methods without implementing them. This commit makes `drv->is_filter = true` imply that these callbacks will be forwarded to bs->file by default, so disabling support for these functions must be done explicitly. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-09-04 18:31:13 +02:00
Marc-André Lureau	f7abe0ecd4	qapi: Change data type of the FOO_lookup generated for enum FOO Currently, a FOO_lookup is an array of strings terminated by a NULL sentinel. A future patch will generate enums with "holes". NULL-termination will cease to work then. To prepare for that, store the length in the FOO_lookup by wrapping it in a struct and adding a member for the length. The sentinel will be dropped next. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20170822132255.23945-13-marcandre.lureau@redhat.com> [Basically redone] Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-16-git-send-email-armbru@redhat.com> [Rebased]	2017-09-04 13:09:13 +02:00
Markus Armbruster	5b5f825d44	qapi: Generate FOO_str() macro for QAPI enum FOO The next commit will put it to use. May look pointless now, but we're going to change the FOO_lookup's type, and then it'll help. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-13-git-send-email-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>	2017-09-04 13:09:13 +02:00
Markus Armbruster	06c60b6c46	qapi: Drop superfluous qapi_enum_parse() parameter max The lookup tables have a sentinel, no need to make callers pass their size. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-3-git-send-email-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [Rebased, commit message corrected]	2017-09-04 13:09:13 +02:00
Stefan Hajnoczi	7d5b526110	block: Update open_flags after ->inactivate() callback In the ->inactivate() callbacks, permissions are updated, which typically involves a recursive check of the whole graph. Setting BDRV_O_INACTIVE right before doing that creates a state that bdrv_is_writable() returns false, which causes permission update failure. Reorder them so the flag is updated after calling the function. Note that this doesn't break the assert in bdrv_child_cb_inactivate() because for any specific BDS, we still update its flags first before calling ->inactivate() on it one level deeper in the recursion. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170823134242.12080-5-famz@redhat.com> Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-08-23 10:21:55 -05:00
Kevin Wolf	fd4520212b	block: Set BDRV_O_ALLOW_RDWR during rw reopen Reopening an image should be consistent with opening it, so we should set BDRV_O_ALLOW_RDWR for any image that is reopened read-write like in bdrv_open_inherit(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>	2017-08-08 15:19:16 +02:00
Kevin Wolf	54a32bfec1	block: Allow reopen rw without BDRV_O_ALLOW_RDWR BDRV_O_ALLOW_RDWR is a flag that tells whether qemu can internally reopen a node read-write temporarily because the user requested read-write for the top-level image, but qemu decided that read-only is enough for this node (a backing file). bdrv_reopen() is different, it is also used for cases where the user changed their mind and wants to update the options. There is no reason to forbid making a node read-write in that case. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>	2017-08-08 15:19:16 +02:00
Kevin Wolf	8aecf1d1bd	block: Fix order in bdrv_replace_child() Commit `8ee03995` refactored the code incorrectly and broke the release of permissions on the old BDS. Instead of changing the permissions to the new required values after removing the old BDS from the list of children, it only re-obtains the permissions it already had. Change the order of operations so that the old BDS is removed again before calculating the new required permissions. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>	2017-08-08 15:19:16 +02:00
Manos Pitsidianakis	180ca19ae0	block: fix leaks in bdrv_open_driver() bdrv_open_driver() is called in two places, bdrv_new_open_driver() and bdrv_open_common(). In the latter, failure cleanup in is in its caller, bdrv_open_inherit(), which unrefs the bs->file of the failed driver open if it exists. Let's move the bs->file cleanup to bdrv_open_driver() to take care of all callers and do not set bs->drv to NULL unless the driver's open function failed. When bs is destroyed by removing its last reference, it calls bdrv_close() which checks bs->drv to perform the needed cleanups and also call the driver's close function. Since it cleans up options and opaque we must take care not leave dangling pointers. The error paths in bdrv_open_driver() are now two: If open fails, drv->bdrv_close() should not be called. Unref the child if it exists, free what we allocated and set bs->drv to NULL. Return the error and let callers free their stuff. If open succeeds but we fail after, return the error and let callers unref and delete their bs, while cleaning up their allocations. Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-01 18:09:33 +02:00
Manos Pitsidianakis	998cbd6a44	block: fix dangling bs->explicit_options in block.c In some error paths it is possible to QDECREF a freed dangling explicit_options, resulting in a heap overflow crash. For example bdrv_open_inherit()'s fail unrefs it, then calls bdrv_unref which calls bdrv_close which also unrefs it. Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-08-01 18:09:33 +02:00
Kevin Wolf	d3c8c67469	block: Skip implicit nodes in query-block/blockstats Commits `0db832f` and `6cdbceb` introduced the automatic insertion of filter nodes above the top layer of mirror and commit block jobs. The assumption made there was that since libvirt doesn't do node-level management of the block layer yet, it shouldn't be affected by added nodes. This is true as far as commands issued by libvirt are concerned. It only uses BlockBackend names to address nodes, so any operations it performs still operate on the root of the tree as intended. However, the assumption breaks down when you consider query commands, which return data for the wrong node now. These commands also return information on some child nodes (bs->file and/or bs->backing), which libvirt does make use of, and which refer to the wrong nodes, too. One of the consequences is that oVirt gets wrong information about the image size and stops the VM in response as long as a mirror or commit job is running: https://bugzilla.redhat.com/show_bug.cgi?id=1470634 This patch fixes the problem by hiding the implicit nodes created automatically by the mirror and commit block jobs in the output of query-block and BlockBackend-based query-blockstats as long as the user doesn't indicate that they are aware of those nodes by providing a node name for them in the QMP command to start the block job. The node-based commands query-named-block-nodes and query-blockstats with query-nodes=true still show all nodes, including implicit ones. This ensures that users that are capable of node-level management can still access the full information; users that only know BlockBackends won't use these commands. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Tested-by: Eric Blake <eblake@redhat.com>	2017-07-24 15:06:04 +02:00
John Snow	6e6e55f5c2	qemu-img: Check for backing image if specified during create Or, rather, force the open of a backing image if one was specified for creation. Using a similar -unsafe option as rebase, allow qemu-img to ignore the backing file validation if possible. It may not always be possible, as in the existing case when a filesize for the new image was not specified. This is accomplished by shifting around the conditionals in bdrv_img_create, such that a backing file is always opened unless we provide BDRV_O_NO_BACKING. qemu-img is adjusted to pass this new flag when -u is provided to create. Sorry for the heinous looking diffstat, but it's mostly whitespace. Inspired by: https://bugzilla.redhat.com/show_bug.cgi?id=1213786 Signed-off-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-07-18 15:27:37 +02:00
Peter Maydell	a309b290aa	Error reporting patches for 2017-07-13 -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJZZ1/BAAoJEDhwtADrkYZTo7oP+gLj4B4kkp/DJnkzfuMMD1Ce ZPddZ8Z9RyXE4fS66sq1ODBQo5U+aQQZO7K234+jf8V4cKWW98lpVzLc3YdAHm2U ZF6Z9Rji5K4414ZsUcg92Zlovvdaji+mY0ooINav+4mqlONYrz29ntApWc0e0tGc e3tj4XDLhJrOM+mIx8vzixFlgSYj+6HgEiybYwolEK5svQbIQao3Y2omyb+zy0w0 RDT3XQnAAaZSOQAXcJGkhekkyMe0jMHOF0tULLx1uDQYctg9mUGlAGTZ5oTLgSve TCpSJwWCAx8XAJMkXyDRrdRFDLeUh6yGY7NTqAL3OuPSoAw9ygKrHyhTavxBJL+W rX7Qit3dmVrlZLviwNFQplAKYb10d08vBoKXmrnW5oVCmPEDvJIQfncbucpA/CNS ucdJ3RMLuDbbWdl+5tsL7jfiZAG7oSgAePTjN1rm0bDe5JN7NAU8WzHnKfE83iZq R+I3hofqGoiXSByYRLamZb+6nsURAxWPhcqcw7hdMsk7UI6dyZwWl9Fnm72w0BZK M5LHLkX0LYc+kZjiLKXlNK7Z50bXY0zKQpPCLH3nHA69iMiwVoozrjwa9iCKIxE+ 7ZlOfsu4ztExuicEyTr8b27CBrHjJjYDuFP0hroEOzqCKXUzegoq3oYMGP0doXxe o3xcwXVKT/1PudddyR4z =tByN -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2017-07-13' into staging Error reporting patches for 2017-07-13 # gpg: Signature made Thu 13 Jul 2017 12:55:45 BST # gpg: using RSA key 0x3870B400EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-error-2017-07-13: Convert error_report_err() to warn_report_err() error: Implement the warn and free Error functions char-socket: Report TCP socket waiting as information Convert error_report() to warn_report() error: Functions to report warnings and informational messages util/qemu-error: Rename error_print_loc() to be more generic websock: Don't try to set errp directly block: Don't try to set errp directly xilinx: Fix latent error handling bug Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-07-14 09:36:40 +01:00
Eduardo Habkost	57ef3f1278	block: Don't try to set errp directly Assigning directly to errp is not valid, as errp may be NULL, &error_fatal, or &error_abort. Use error_propagate() instead. With this, there's no need to check if errp is NULL anymore, as error_propagate() and error_prepend() are able to handle that. Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Cc: qemu-block@nongnu.org Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20170608133906.12737-3-ehabkost@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-07-13 13:45:53 +02:00
Max Reitz	7ea37c3066	block: Add PreallocMode to bdrv_truncate() For block drivers that just pass a truncate request to the underlying protocol, we can now pass the preallocation mode instead of aborting if it is not PREALLOC_MODE_OFF. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-3-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:01 +02:00
Max Reitz	8243ccb743	block: Add PreallocMode to BD.bdrv_truncate() Add a PreallocMode parameter to the bdrv_truncate() function implemented by each block driver. Currently, we always pass PREALLOC_MODE_OFF and no driver accepts anything else. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170613202107.10125-2-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:01 +02:00
Stefan Hajnoczi	90880ff107	block: add bdrv_measure() API bdrv_measure() provides a conservative maximum for the size of a new image. This information is handy if storage needs to be allocated (e.g. a SAN or an LVM volume) ahead of time. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20170705125738.8777-2-stefanha@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:45:00 +02:00
Vladimir Sementsov-Ogievskiy	615b5dcf2d	block: release persistent bitmaps on inactivate We should release them here to reload on invalidate cache. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170628120530.31251-31-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy	67b792f5ed	block: add bdrv_can_store_new_dirty_bitmap This will be needed to check some restrictions before making bitmap persistent in qmp-block-dirty-bitmap-add (this functionality will be added by future patch) Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170628120530.31251-22-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy	cca43ae1e1	block: bdrv_close: release bitmaps after drv->bdrv_close Release bitmaps after 'if (bs->drv) { ... }' block. This will allow format driver to save persistent bitmaps, which will appear in following commits. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170628120530.31251-17-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	cb9ff6c25a	block: new bdrv_reopen_bitmaps_rw interface Add format driver handler, which should mark loaded read-only bitmaps as 'IN_USE' in the image and unset read_only field in corresponding BdrvDirtyBitmap's. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170628120530.31251-14-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy	50bf65bab6	block: refactor bdrv_reopen_commit Add bs local variable to simplify code. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170628120530.31251-13-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:58 +02:00
sochin.jiang	5ce6bfe255	mirror: Fix inconsistent backing AioContext for after mirroring mirror_complete opens the backing chain, which should have the same AioContext as the top when using iothreads. Make the code guarantee this, which fixes a failed assertion in bdrv_attach_child. Signed-off-by: sochin.jiang <sochin.jiang@huawei.com> Message-id: 1498475064-39816-1-git-send-email-sochin.jiang@huawei.com [mreitz: Reworded commit message] Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:57 +02:00
Daniel P. Berrange	c01c214b69	block: remove all encryption handling APIs Now that all encryption keys must be provided upfront via the QCryptoSecret API and associated block driver properties there is no need for any explicit encryption handling APIs in the block layer. Encryption can be handled transparently within the block driver. We only retain an API for querying whether an image is encrypted or not, since that is a potentially useful piece of metadata to report to the user. Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170623162419.26068-18-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-07-11 17:44:56 +02:00
Paolo Bonzini	2119882c7e	block: introduce dirty_bitmap_mutex It protects only the list of dirty bitmaps; in the next patch we will also protect their content. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170605123908.18777-15-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-06-16 07:55:00 +08:00
Paolo Bonzini	3783fa3dd3	block: protect tracked_requests and flush_queue with reqs_lock Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170605123908.18777-14-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-06-16 07:55:00 +08:00
Paolo Bonzini	47fec59941	block: access write_gen with atomics Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170605123908.18777-13-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-06-16 07:55:00 +08:00
Paolo Bonzini	d3faa13e5f	block: access copy_on_read with atomic ops Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170605123908.18777-2-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-06-16 07:55:00 +08:00
Max Reitz	03c320d803	block/file-: _parse_filename() and colons The file drivers' *_parse_filename() implementations just strip the optional protocol prefix off the filename. However, for e.g. "file:foo:bar", this would lead to "foo:bar" being stored as the BDS's filename which looks like it should be managed using the "foo" protocol. This is especially troublesome if you then try to resolve a backing filename based on "foo:bar". This issue can only occur if the stripped part is a relative filename ("file:/foo:bar" will be shortened to "/foo:bar" and having a slash before the first colon means that "/foo" is not recognized as a protocol part). Therefore, we can easily fix it by prepending "./" to such filenames. Before this patch: $ ./qemu-img create -f qcow2 backing.qcow2 64M Formatting 'backing.qcow2', fmt=qcow2 size=67108864 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16 $ ./qemu-img create -f qcow2 -b backing.qcow2 file🔝image.qcow2 Formatting 'file🔝image.qcow2', fmt=qcow2 size=67108864 backing_file=backing.qcow2 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16 $ ./qemu-io file🔝image.qcow2 can't open device file🔝image.qcow2: Could not open backing file: Unknown protocol 'top' After this patch: $ ./qemu-io file🔝image.qcow2 [no error] Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170522195217.12991-3-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-05-29 15:39:54 +02:00
Max Reitz	0d54a6fed3	block: Fix backing paths for filenames with colons path_combine() naturally tries to preserve a protocol prefix. However, it recognizes such a prefix by scanning for the first colon; which is different from what path_has_protocol() does: There only is a protocol prefix if there is a colon before the first slash. A protocol prefix that is not recognized by path_has_protocol() is none, and should thus not be taken as one. Case in point, before this patch: $ ./qemu-img create -f qcow2 -b backing.qcow2 ./top:image.qcow2 qemu-img: ./top:image.qcow2: Could not open './top:backing.qcow2': No such file or directory Afterwards: $ ./qemu-img create -f qcow2 -b backing.qcow2 ./top:image.qcow2 qemu-img: ./top:image.qcow2: Could not open './backing.qcow2': No such file or directory Reported-by: yangyang <yangyang@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20170522195217.12991-2-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-05-29 15:39:54 +02:00
Kevin Wolf	9c5e6594f1	block: Fix write/resize permissions for inactive images Format drivers for inactive nodes don't need write/resize permissions on their bs->file and can share write/resize with another VM (in fact, this is the whole point of keeping images inactive). Represent this fact in the op blocker system, so that image locking does the right thing without special-casing inactive images. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-05-11 12:08:24 +02:00
Kevin Wolf	38701b6aef	block: Inactivate parents before children The proper order for inactivating block nodes is that first the parents get inactivated and then the children. If we do things in this order, we can assert that we didn't accidentally leave a parent activated when one of its child nodes is inactive. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-05-11 12:08:24 +02:00
Kevin Wolf	cfa1a5723f	block: Drop permissions when migration completes With image locking, permissions affect other qemu processes as well. We want to be sure that the destination can run, so let's drop permissions on the source when migration completes. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-05-11 12:08:24 +02:00
Kevin Wolf	4417ab7adf	block: New BdrvChildRole.activate() for blk_resume_after_migration() Instead of manually calling blk_resume_after_migration() in migration code after doing bdrv_invalidate_cache_all(), integrate the BlockBackend activation with cache invalidation into a single function. This is achieved with a new callback in BdrvChildRole that is called by bdrv_invalidate_cache_all(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2017-05-11 12:08:24 +02:00
Fam Zheng	ffd1a5a25c	block: Respect "force-share" in perm propagating Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-05-11 11:02:38 +02:00
Fam Zheng	5a9347c673	block: Add, parse and store "force-share" option Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-05-11 11:02:38 +02:00
Fam Zheng	5176196c32	block: Make bdrv_perm_names public It can be used outside of block.c for making user friendly messages. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-05-11 11:02:38 +02:00
Eric Blake	ff6ed7141d	block: Simplify bdrv_append_temp_snapshot() logic Noticed while checking Coccinelle results. Naming a label 'out:' when it is only used on error paths is weird. Also, we had some dead stores to 'ret'. Meanwhile we know that snapshot_options is NULL on success and that QDECREF(NULL) is safe. So merge the two exit paths into one by careful control over bs_snapshot. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20170427215821.19397-8-eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-05-09 09:14:39 +02:00
Eric Blake	46f5ac205a	qobject: Use simpler QDict/QList scalar insertion macros We now have macros in place to make it less verbose to add a scalar to QDict and QList, so use them. Patch created mechanically via: spatch --sp-file scripts/coccinelle/qobject.cocci \ --macro-file scripts/cocci-macro-file.h --dir . --in-place then touched up manually to fix a couple of '?:' back to original spacing, as well as avoiding a long line in monitor.c. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20170427215821.19397-7-eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-05-09 09:13:51 +02:00
Eric Blake	de6e7951fe	qobject: Drop useless QObject casts We have macros in place to make it less verbose to add a subtype of QObject to both QDict and QList. While we have made cleanups like this in the past (see commit `fcfcd8ffc`, for example), having it be automated by Coccinelle makes it easier to maintain. Patch created mechanically via: spatch --sp-file scripts/coccinelle/qobject.cocci \ --macro-file scripts/cocci-macro-file.h --dir . --in-place then I verified that no manual touchups were required. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20170427215821.19397-5-eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2017-05-08 20:32:14 +02:00
Denis V. Lunev	504c205a0d	block: assert no image modification under BDRV_O_INACTIVE As long as BDRV_O_INACTIVE is set, the image file is only opened so we have a file descriptor for it. We're definitely not supposed to modify the image, it's still owned by the migration source. This commit is an addition to `09e0c771` but the assert() is added to bdrv_truncate(). Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> Message-id: 1491405505-31620-3-git-send-email-den@openvz.org Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-04-28 16:02:03 +02:00
Max Reitz	f59adb3256	block: Add .bdrv_truncate() error messages Add missing error messages for the block driver implementations of .bdrv_truncate(); drop the generic one from block.c's bdrv_truncate(). Since one of these changes touches a mis-indented block in block/file-posix.c, this patch fixes that coding style issue along the way. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170328205129.15138-5-mreitz@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-04-28 16:02:03 +02:00
Max Reitz	4bff28b81a	block: Add errp to BD.bdrv_truncate() Add an Error parameter to the block drivers' bdrv_truncate() interface. If a block driver does not set this in case of an error, the generic bdrv_truncate() implementation will do so. Where it is obvious, this patch also makes some block drivers set this value. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170328205129.15138-4-mreitz@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-04-28 16:02:03 +02:00
Max Reitz	ed3d2ec98a	block: Add errp to b{lk,drv}_truncate() For one thing, this allows us to drop the error message generation from qemu-img.c and blockdev.c and instead have it unified in bdrv_truncate(). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170328205129.15138-3-mreitz@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-04-28 16:02:02 +02:00
Max Reitz	4a0082401a	block: An empty filename counts as no filename Reproducer: $ ./qemu-img info '' qemu-img: ./block.c:1008: bdrv_open_driver: Assertion `!drv->bdrv_needs_filename \|\| bs->filename[0]' failed. [1] 26105 abort (core dumped) ./qemu-img info '' This patch fixes this to be: $ ./qemu-img info '' qemu-img: Could not open '': The 'file' block driver requires a file name Cc: qemu-stable <qemu-stable@nongnu.org> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-04-27 15:39:49 +02:00
Max Reitz	362b3786eb	Revert "block/io: Comment out permission assertions" This reverts commit `e3e0003a8f`. This commit was necessary for the 2.9 release because we were unable to fix the underlying issue(s) in time. However, we will be for 2.10. Signed-off-by: Max Reitz <mreitz@redhat.com> Acked-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-04-27 15:39:49 +02:00
Jeff Cody	3d8ce171cb	block: use bdrv_can_set_read_only() during reopen Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 00aed7ffdd7be4b9ed9ce1007d50028a72b34ebe.1491597120.git.jcody@redhat.com	2017-04-24 15:09:33 -04:00
Jeff Cody	45803a0396	block: introduce bdrv_can_set_read_only() Introduce check function for setting read_only flags. Will return < 0 on error, with appropriate Error value set. Does not alter any flags. Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: e2bba34ac3bc76a0c42adc390413f358ae0566e8.1491597120.git.jcody@redhat.com	2017-04-24 15:09:33 -04:00
Jeff Cody	93ed524e3d	block: code movement Move bdrv_is_read_only() up with its friends. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Message-id: 73b2399459760c32506f9407efb9dddb3a2789de.1491597120.git.jcody@redhat.com	2017-04-24 15:09:33 -04:00
Jeff Cody	d6fcdf06d9	block: honor BDRV_O_ALLOW_RDWR when clearing bs->read_only The BDRV_O_ALLOW_RDWR flag allows / prohibits the changing of the BDS 'read_only' state, but there are a few places where it is ignored. In the bdrv_set_read_only() helper, make sure to honor the flag. Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: be2e5fb2d285cbece2b6d06bed54a6f56520d251.1491597120.git.jcody@redhat.com	2017-04-24 15:09:33 -04:00
Jeff Cody	e2b8247a32	block: do not set BDS read_only if copy_on_read enabled A few block drivers will set the BDS read_only flag from their .bdrv_open() function. This means the bs->read_only flag could be set after we enable copy_on_read, as the BDRV_O_COPY_ON_READ flag check occurs prior to the call to bdrv->bdrv_open(). This adds an error return to bdrv_set_read_only(), and an error will be return if we try to set the BDS to read_only while copy_on_read is enabled. This patch also changes the behavior of vvfat. Before, vvfat could override the drive 'readonly' flag with its own, internal 'rw' flag. For instance, this -drive parameter would result in a writable image: "-drive format=vvfat,dir=/tmp/vvfat,rw,if=virtio,readonly=on" This is not correct. Now, attempting to use the above -drive parameter will result in an error (i.e., 'rw' is incompatible with 'readonly=on'). Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 0c5b4c1cc2c651471b131f21376dfd5ea24d2196.1491597120.git.jcody@redhat.com	2017-04-24 15:09:33 -04:00

... 3 4 5 6 7 ...

1507 Commits