laio_init() can fail for a couple of reasons, which will lead to a NULL
pointer dereference in laio_attach_aio_context().
To solve this, add an aio_setup_linux_aio() function which is called
early in raw_open_common(). If this fails, propagate the error up. The
signature of aio_get_linux_aio() was not modified, because it seems
preferable to return the actual errno from the possibly failing
initialization calls.
Additionally, when the AioContext changes, we need to associate a
LinuxAioState with the new AioContext. Use the bdrv_attach_aio_context
callback and call the new aio_setup_linux_aio(), which will allocate a
new LinuxAioState if needed and return errors on failure. If it fails for
any reason, fall back to threaded AIO with an error message, as the
device is already in use by the guest.
Add an assert that aio_get_linux_aio() cannot return NULL.
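For illustration, the early setup in raw_open_common() could look roughly
like the sketch below. This is not the literal block/file-posix.c code;
it assumes a signature along the lines of
int aio_setup_linux_aio(AioContext *ctx, Error **errp).
    /* Hedged sketch: probe Linux AIO while the open can still fail,
     * instead of crashing later in laio_attach_aio_context(). */
    if (s->use_linux_aio) {
        ret = aio_setup_linux_aio(bdrv_get_aio_context(bs), errp);
        if (ret < 0) {
            goto fail; /* propagate the errno from the failed laio_init() */
        }
    }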
Signed-off-by: Nishanth Aravamudan <naravamudan@digitalocean.com>
Message-id: 20180622193700.6523-1-naravamudan@digitalocean.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Flat unions may now have uncovered branches, so it is possible to get rid
of empty types defined for that purpose only.
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1529311206-76847-3-git-send-email-anton.nefedov@virtuozzo.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
This patch allows the user to specify whether to use active or only
background mode for mirror block jobs. Currently, this setting will
remain constant for the duration of the entire block job.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180613181823.13618-14-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This patch implements active synchronous mirroring. In active mode, the
passive mechanism will still be in place and is used to copy all
initially dirty clusters off the source disk; but every write request
will write data both to the source and the target disk, so the source
cannot be dirtied faster than data is mirrored to the target. Also,
once the block job has converged (BLOCK_JOB_READY sent), source and
target are guaranteed to stay in sync (unless an error occurs).
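Conceptually (this is a sketch only, not the block/mirror.c code, and
blk_co_pwritev() is used purely to illustrate the ordering), an active
write looks like this:
    /* Conceptual sketch: a guest write completes only after the same
     * data has reached the target, so the source cannot be dirtied
     * faster than it is mirrored. */
    static int coroutine_fn active_write(BlockBackend *source,
                                         BlockBackend *target,
                                         int64_t offset, unsigned int bytes,
                                         QEMUIOVector *qiov)
    {
        int ret = blk_co_pwritev(source, offset, bytes, qiov, 0);
        if (ret < 0) {
            return ret;
        }
        /* mirror the write before signalling completion to the guest */
        return blk_co_pwritev(target, offset, bytes, qiov, 0);
    }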
Active mode is completely optional and currently disabled at runtime. A
later patch will add a way for users to enable it.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20180613181823.13618-13-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This will allow us to access the block job data when the mirror block
driver becomes more complex.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20180613181823.13618-11-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This new function allows the caller to look for a contiguous dirty area
in a dirty bitmap.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20180613181823.13618-10-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This new parameter allows the caller to just query the next dirty
position without moving the iterator.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20180613181823.13618-8-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
With this, the mirror_top_bs is no longer just a technically required
node in the BDS graph but actually represents the block job operation.
Also, drop MirrorBlockJob.source, as we can reach it through
mirror_top_bs->backing.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180613181823.13618-6-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This patch makes the mirror code differentiate between simply waiting
for any operation to complete (mirror_wait_for_free_in_flight_slot())
and specifically waiting for all operations touching a certain range of
the virtual disk to complete (mirror_wait_on_conflicts()).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20180613181823.13618-5-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Attach a CoQueue to each in-flight operation so that, if we need to wait
for any of them, we can use it to wait instead of just blindly yielding
and hoping for some operation to wake us.
A later patch will use this infrastructure to allow requests accessing
the same area of the virtual disk to specifically wait for each other.
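A rough sketch of the pattern (struct and field names here are
illustrative, not necessarily those used by block/mirror.c):
    /* Each in-flight operation carries a CoQueue (initialized with
     * qemu_co_queue_init()); coroutines that must wait for it park
     * themselves there instead of yielding blindly. */
    typedef struct InFlightOp {
        CoQueue waiting_requests;   /* woken when this op completes */
        /* offset, bytes, ... */
    } InFlightOp;

    static void coroutine_fn wait_for_op(InFlightOp *op)
    {
        qemu_co_queue_wait(&op->waiting_requests, NULL);
    }

    static void op_complete(InFlightOp *op)
    {
        qemu_co_queue_restart_all(&op->waiting_requests);
    }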
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20180613181823.13618-4-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
In order to talk to the source BDS (and maybe in the future to the
target BDS as well) directly, we need to convert our existing AIO
requests into coroutine I/O requests.
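Illustrative only (simplified, not the actual mirror code): the
callback-based AIO call becomes a direct coroutine call whose return
value is the request's result.
    /* Before: asynchronous request plus completion callback. */
    blk_aio_preadv(blk, offset, &qiov, 0, mirror_read_complete, op);

    /* After: issued from coroutine context; the result is returned
     * directly, and later patches can talk to the source BDS with
     * bdrv_co_* calls. */
    ret = blk_co_preadv(blk, offset, qiov.size, &qiov, 0);
    if (ret < 0) {
        /* handle the error inline instead of in a callback */
    }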
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20180613181823.13618-3-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
When converting mirror's I/O to coroutines, we are going to need a point
where these coroutines are created. mirror_perform() is going to be
that point.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180613181823.13618-2-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Removing a drive with drive_del while it is being used to run an I/O
intensive workload can cause QEMU to crash.
An AIO flush can yield at some point:
    blk_aio_flush_entry()
     blk_co_flush(blk)
      bdrv_co_flush(blk->root->bs)
       ...
        qemu_coroutine_yield()
and let the HMP command run, free blk->root and give control
back to the AIO flush:
    hmp_drive_del()
     blk_remove_bs()
      bdrv_root_unref_child(blk->root)
       child_bs = blk->root->bs
       bdrv_detach_child(blk->root)
        bdrv_replace_child(blk->root, NULL)
         blk->root->bs = NULL
        g_free(blk->root) <============== blk->root becomes stale
       bdrv_unref(child_bs)
        bdrv_delete(child_bs)
         bdrv_close()
          bdrv_drained_begin()
           bdrv_do_drained_begin()
            bdrv_drain_recurse()
             aio_poll()
              ...
              qemu_coroutine_switch()
and the AIO flush completion ends up dereferencing blk->root:
    blk_aio_complete()
     scsi_aio_complete()
      blk_get_aio_context(blk)
       bs = blk_bs(blk)
       ie, bs = blk->root ? blk->root->bs : NULL
                ^^^^^^^^^
                 stale
The problem is that we should avoid making block driver graph
changes while we have in-flight requests. Let's drain all I/O
for this BB before calling bdrv_root_unref_child().
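The fix boils down to something like the following in blk_remove_bs()
(a sketch; the rest of the function and its error handling are omitted):
    /* Make sure no in-flight request can still reference blk->root
     * after it is freed: drain the BlockBackend first, then detach the
     * root child. */
    blk_drain(blk);
    bdrv_root_unref_child(blk->root);
    blk->root = NULL;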
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_drain_all_*() used bdrv_next() to iterate over all root nodes and
did a subtree drain for each of them. This works fine as long as the
graph is static, but sadly, reality looks different.
If the graph changes so that root nodes are added or removed, we would
have to compensate for this. bdrv_next() returns each root node only
once even if it's the root node for multiple BlockBackends or for a
monitor-owned block driver tree, which would only complicate things.
The much easier and more obviously correct way is to fundamentally
change the way the functions work: Iterate over all BlockDriverStates,
no matter who owns them, and drain them individually. Compensation is
only necessary when a new BDS is created inside a drain_all section.
Removal of a BDS doesn't require any action because it's gone afterwards
anyway.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In the future, bdrv_drain_all_begin/end() will drain all individual
nodes separately rather than whole subtrees. This means that we don't
want to propagate the drain to all parents any more: If the parent is a
BDS, it will already be drained separately. Recursing to all parents is
unnecessary work and would make it an O(n²) operation.
Prepare the drain function for the changed drain_all by adding an
ignore_bds_parents parameter to the internal implementation that
prevents the propagation of the drain to BDS parents. We still (have to)
propagate it to non-BDS parents like BlockBackends or Jobs because those
are not drained separately.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Before we can introduce a single polling loop for all nodes in
bdrv_drain_all_begin(), we must make sure to run it outside of coroutine
context like we already do for bdrv_do_drained_begin().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We cannot allow aio_poll() in bdrv_drain_invoke(begin=true) until we're
done with propagating the drain through the graph and are doing the
single final BDRV_POLL_WHILE().
Just schedule the coroutine with the callback and increase bs->in_flight
to make sure that the polling phase will wait for it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_do_drained_begin() is only safe if we have a single
BDRV_POLL_WHILE() after quiescing all affected nodes. We cannot allow
that parent callbacks introduce a nested polling loop that could cause
graph changes while we're traversing the graph.
Split off bdrv_do_drained_begin_quiesce(), which only quiesces a single
node without waiting for its requests to complete. These requests will
be waited for in the BDRV_POLL_WHILE() call down the call chain.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Anything can happen inside BDRV_POLL_WHILE(), including graph
changes that may interfere with its callers (e.g. child list iteration
in recursive callers of bdrv_do_drained_begin).
Switch to a single BDRV_POLL_WHILE() call for the whole subtree at the
end of bdrv_do_drained_begin() to avoid such effects. The recursion
happens now inside the loop condition. As the graph can only change
between bdrv_drain_poll() calls, but not inside of it, doing the
recursion here is safe.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
For bdrv_drain(), recursively waiting for child node requests is
pointless because we didn't quiesce their parents, so new requests could
come in anyway. Letting the function work only on a single node makes it
more consistent.
For subtree drains and drain_all, we already have the recursion in
bdrv_do_drained_begin(), so the extra recursion doesn't add anything
either.
Remove the useless code.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
We already requested that block jobs be paused in .bdrv_drained_begin,
but no guarantee was made that the job was actually inactive at the
point where bdrv_drained_begin() returned.
This introduces a new callback BdrvChildRole.bdrv_drained_poll() and
uses it to make bdrv_drain_poll() consider block jobs using the node to
be drained.
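A hedged sketch of what such a callback can look like for block jobs
(the real implementation in blockjob.c is more careful and also consults
the job driver; the field layout is assumed here):
    /* Report "still busy" so that bdrv_drain_poll() keeps polling until
     * the job has really quiesced. */
    static bool child_job_drained_poll(BdrvChild *c)
    {
        BlockJob *bjob = c->opaque;

        return bjob->job.busy;
    }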
For the test case to work as expected, we have to switch from
block_job_sleep_ns() to qemu_co_sleep_ns() so that the test job is even
considered active and must be waited for when draining the node.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 91af091f92 added an additional aio_poll() to BDRV_POLL_WHILE()
in order to make sure that all pending BHs are executed on drain. This
was the wrong place to make the fix, as it is useless overhead for all
other users of the macro and unnecessarily complicates the mechanism.
This patch effectively reverts said commit (the context has changed a
bit and the code has moved to AIO_WAIT_WHILE()) and instead polls in the
loop condition for drain.
The effect is probably hard to measure in any real-world use case
because actual I/O will dominate, but if I run only the initialisation
part of 'qemu-img convert' where it calls bdrv_block_status() for the
whole image to find out how much data there is to copy, this phase actually
needs only roughly half the time after this patch.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
All involved nodes are already idle; we called bdrv_do_drain_begin() on
them.
The comment in the code suggested that this was not correct because the
completion of a request on one node could spawn a new request on a
different node (which might have been drained before, so we wouldn't
drain the new request). In reality, new requests to different nodes
aren't spawned out of nothing, but only in the context of a parent
request, and they aren't submitted to random nodes, but only to child
nodes. As long as we still poll for the completion of the parent request
(which we do), draining each root node separately is good enough.
Remove the additional polling code from bdrv_drain_all_begin() and
replace it with an assertion that all nodes are already idle after we
drained them separately.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
All callers pass false for the 'recursive' parameter now. Remove it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
bdrv_do_drain_begin/end() already implement everything that
bdrv_drain_all_begin/end() need and currently still do manually: Disable
external events, call parent drain callbacks, call block driver
callbacks.
They also do two more things:
The first is incrementing bs->quiesce_counter. bdrv_drain_all() already
stood out in the test case by behaving differently from the other drain
variants. Adding this is not only safe, but in fact a bug fix.
The second is calling bdrv_drain_recurse(). We already do that later in
the same function in a loop, so basically doing an early first iteration
doesn't hurt.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
As long as nobody keeps the other I/O thread from working, there is no
reason why bdrv_drain() wouldn't work with cross-AioContext events. The
key is that the root request we're waiting for is in the AioContext
we're polling (which it always is for bdrv_drain()) so that aio_poll()
is woken up in the end.
Add a test case that shows that it works. Remove the comment in
bdrv_drain() that claims otherwise.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The -drive option serial was deprecated in QEMU 2.10. It's time to
remove it.
Tests need to be updated to set the serial number with -global instead
of using the -drive option.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Legacy -drive supports a "password-secret" parameter that isn't
available with -blockdev / blockdev-add. That's because we backed out
our first try to provide it there due to interface design doubts, in
commit 577d8c9a81, v2.9.0.
This is the second try. It brings back the parameter, except it's
named "key-secret" now.
Let's review our reasons for backing out the first try, as stated in
the commit message:
* BlockdevOptionsRbd member @password-secret isn't actually a
password, it's a key generated by Ceph.
Addressed by the rename.
* We're not sure where member @password-secret belongs (see the
previous commit).
See previous commit.
* How @password-secret interacts with settings from a configuration
file specified with @conf is undocumented.
Not actually true, the documentation for @conf says "Values in the
configuration file will be overridden by options specified via QAPI",
and we've tested this.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Parameter auth-client-required lets you configure authentication
methods. We tried to provide that in v2.9.0, but backed out due to
interface design doubts (commit 464444fcc1).
This commit is similar to what we backed out, but simpler: we use a
list of enumeration values instead of a list of objects with a member
of enumeration type.
Let's review our reasons for backing out the first try, as stated in
the commit message:
* The implementation uses deprecated rados_conf_set() key
"auth_supported". No biggie.
Fixed: we use "auth-client-required".
* The implementation makes -drive silently ignore invalid parameters
"auth" and "auth-supported.*.X" where X isn't "auth". Fixable (in
fact I'm going to fix similar bugs around parameter server), so
again no biggie.
That fix is commit 2836284db6. This commit doesn't bring the bugs
back.
* BlockdevOptionsRbd member @password-secret applies only to
authentication method cephx. Should it be a variant member of
RbdAuthMethod?
We've had time to ponder, and we decided to stick to the way Ceph
configuration works: the key configured separately, and silently
ignored if the authentication method doesn't use it.
* BlockdevOptionsRbd member @user could apply to both methods cephx
and none, but I'm not sure it's actually used with none. If it
isn't, should it be a variant member of RbdAuthMethod?
Likewise.
* The client offers a *set* of authentication methods, not a list.
Should the methods be optional members of BlockdevOptionsRbd instead
of members of list @auth-supported? The latter begs the question
what multiple entries for the same method mean. Trivial question
now that RbdAuthMethod contains nothing but @type, but less so when
RbdAuthMethod acquires other members, such as the ones discussed above.
Again, we decided to stick to the way Ceph configuration works, except
we make auth-client-required a list of enumeration values instead of a
string containing keywords separated by delimiters.
* How BlockdevOptionsRbd member @auth-supported interacts with
settings from a configuration file specified with @conf is
undocumented. I suspect it's untested, too.
Not actually true, the documentation for @conf says "Values in the
configuration file will be overridden by options specified via QAPI",
and we've tested this.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Remaining uses of qobject_input_visitor_new_keyval() in the block
subsystem:
* block_crypto_open_opts_init()
  Currently doesn't visit any non-string scalars, thus safe. It's
  called from
  - block_crypto_open_luks()
    Creates the QDict with qemu_opts_to_qdict_filtered(), which
    creates only string scalars, but has a TODO asking for other types.
  - qcow_open()
  - qcow2_open(), qcow2_co_invalidate_cache(), qcow2_reopen_prepare()
* block_crypto_create_opts_init(), called from
  - block_crypto_co_create_opts_luks()
    Also creates the QDict with qemu_opts_to_qdict_filtered().
* vdi_co_create_opts()
  Also creates the QDict with qemu_opts_to_qdict_filtered().
Replace these uses by qobject_input_visitor_new_flat_confused() for
robustness. This adds crumpling. Right now, that's a no-op, but if
we ever extend these things in non-flat ways, crumpling will be
needed.
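The replacement is mechanical; a simplified sketch of the new pattern
(assuming qobject_input_visitor_new_flat_confused() takes the QDict and
an Error ** and returns NULL on failure):
    /* Crumple the (possibly dotted) QDict and build a keyval-flavored
     * input visitor in one step. */
    v = qobject_input_visitor_new_flat_confused(qdict, errp);
    if (!v) {
        return -EINVAL;
    }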
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The following pattern occurs in the .bdrv_co_create_opts() methods of
parallels, qcow, qcow2, qed, vhdx and vpc:
    qobj = qdict_crumple_for_keyval_qiv(qdict, errp);
    qobject_unref(qdict);
    qdict = qobject_to(QDict, qobj);
    if (qdict == NULL) {
        ret = -EINVAL;
        goto done;
    }
    v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
    [...]
    ret = 0;
done:
    qobject_unref(qdict);
    [...]
    return ret;
If qobject_to() fails, we return failure without setting errp. That's
wrong. As far as I can tell, it cannot fail here. Clean it up
anyway, by removing the useless conversion.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The previous commit fixed -blockdev breakage due to misuse of the
qobject input visitor's keyval flavor in bdrv_file_open(). The commit
message explains why using the plain flavor would be just as wrong; it
would break -drive. Turns out we break it in three places:
nbd_open(), sd_open() and ssh_file_open(). They are even marked
FIXME. Example breakage:
$ qemu-system-x86 -drive node-name=n1,driver=nbd,server.type=inet,server.host=localhost,server.port=1234,server.numeric=off
qemu-system-x86: -drive node-name=n1,driver=nbd,server.type=inet,server.host=localhost,server.port=1234,server.numeric=off: Invalid parameter type for 'numeric', expected: boolean
Fix it the same way: replace qdict_crumple() by
qdict_crumple_for_keyval_qiv(), and switch from plain to the keyval
flavor.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Configuration flows through the block subsystem in a rather peculiar
way. Configuration made with -drive enters it as QemuOpts.
Configuration made with -blockdev / blockdev-add enters it as QAPI
type BlockdevOptions. The block subsystem uses QDict, QemuOpts and
QAPI types internally. The precise flow is next to impossible to
explain (I tried for this commit message, but gave up after wasting
several hours). What I can explain is a flaw in the BlockDriver
interface that leads to this bug:
$ qemu-system-x86_64 -blockdev node-name=n1,driver=nfs,server.type=inet,server.host=localhost,path=/foo/bar,user=1234
qemu-system-x86_64: -blockdev node-name=n1,driver=nfs,server.type=inet,server.host=localhost,path=/foo/bar,user=1234: Internal error: parameter user invalid
QMP blockdev-add is broken the same way.
Here's what happens. The block layer passes configuration represented
as a flat QDict (with dotted keys) to the BlockDriver method
.bdrv_file_open(). The QDict's members are typed according to the
QAPI schema.
nfs_file_open() converts it to QAPI type BlockdevOptionsNfs, with
qdict_crumple() and a qobject input visitor.
This visitor comes in two flavors. The plain flavor requires scalars
to be typed according to the QAPI schema. That's the case here. The
keyval flavor requires string scalars. That's not the case here.
nfs_file_open() uses the latter, and promptly falls apart for members
@user, @group, @tcp-syn-count, @readahead-size, @page-cache-size,
@debug.
Switching to the plain flavor would fix -blockdev, but break -drive,
because there the scalars arrive in nfs_file_open() as strings.
The proper fix would be to replace the QDict by QAPI type
BlockdevOptions in the BlockDriver interface. Sadly, that's beyond my
reach right now.
Next best would be to fix the block layer to always pass correctly
typed QDicts to the BlockDriver methods. Also beyond my reach.
What I can do is throw another hack onto the pile: have
nfs_file_open() convert all members to string, so use of the keyval
flavor actually works, by replacing qdict_crumple() by new function
qdict_crumple_for_keyval_qiv().
The pattern "pass result of qdict_crumple() to
qobject_input_visitor_new_keyval()" occurs several times more:
* qemu_rbd_open()
Same issue as nfs_file_open(), but since BlockdevOptionsRbd has only
string members, it's only a latent bug. Fix it anyway.
* parallels_co_create_opts(), qcow_co_create_opts(),
qcow2_co_create_opts(), bdrv_qed_co_create_opts(),
sd_co_create_opts(), vhdx_co_create_opts(), vpc_co_create_opts()
These work, because they create the QDict with
qemu_opts_to_qdict_filtered(), which creates only string scalars.
The function sports a TODO comment asking for better typing; that's
going to be fun. Use qdict_crumple_for_keyval_qiv() to be safe.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There are numerous QDict functions that have been introduced for and are
used only by the block layer. Move their declarations into their own
header file to reflect that.
While qdict_extract_subqdict() is in fact used outside of the block
layer (in util/qemu-config.c), it is still a function related very
closely to how the block layer works with nested QDicts, namely by
sometimes flattening them. Therefore, its declaration is put into this
header as well and util/qemu-config.c includes it with a comment stating
exactly which function it needs.
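For illustration, the include then looks something like this (assuming
the new header ends up as include/block/qdict.h):
    /* util/qemu-config.c */
    #include "block/qdict.h" /* for qdict_extract_subqdict() */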
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20180509165530.29561-7-mreitz@redhat.com>
[Copyright note tweaked, superfluous includes dropped]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Parameter "filename" is deprecated since commit 5c3ad1a6a8, v2.10.0.
Time to get rid of it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Parameter "filename" is deprecated since commit 91589d9e5c, v2.10.0.
Time to get rid of it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20180606182449.1607-5-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Add locks and remove comments about the BQL according to the
dirty_bitmap_mutex definition in block_int.h.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20180606182449.1607-2-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
QLIST_REMOVE does not require walking the list, and once the "bitmap"
argument is removed from bdrv_do_release_matching_dirty_bitmap_locked
the code simplifies a lot and it is worth inlining everything in the
callers of bdrv_do_release_matching_dirty_bitmap.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20180326104037.6894-1-pbonzini@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Everything this function does will be repeated by
bdrv_do_release_matching_dirty_bitmap_locked, except for resetting
bm->persistent. But even that does not matter,
because the bitmap will be freed.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20180323164254.26487-1-pbonzini@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
When signaling a corruption on a read-only image, qcow2 already makes
fatal events non-fatal (i.e., they will not result in the image being
closed, and the image header's corrupt flag will not be set). This is
necessary because we cannot set the corrupt flag on read-only images,
and it is possible because further corruption of read-only images is
impossible.
Inactive images are effectively read-only, too, so we should do the same
for them. bdrv_is_writable() can tell us whether an image can actually
be written to, so use its result instead of !bs->read_only.
(Otherwise, the assert(!(bs->open_flags & BDRV_O_INACTIVE)) in
bdrv_co_pwritev() will fail, crashing qemu.)
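The check itself is a one-liner; a simplified sketch of the idea in
qcow2_signal_corruption():
    /* Only escalate to a fatal event (set the corrupt flag and close
     * the image) if we can actually write to the image; read-only and
     * inactive images are handled non-fatally. */
    fatal = fatal && bdrv_is_writable(bs);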
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180606193702.7113-3-mreitz@redhat.com
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
The throttle block filter can be reopened, and with this it is
possible to change the throttle group that the filter belongs to.
The way the code does that is the following:
- On throttle_reopen_prepare(): create a new ThrottleGroupMember
  and attach it to the new throttle group.
- On throttle_reopen_commit(): detach the old ThrottleGroupMember,
  delete it and replace it with the new one.
The problem with this is that by replacing the ThrottleGroupMember the
previous value of io_limits_disabled is lost, causing an assertion
failure in throttle_co_drain_end().
This problem can be reproduced by reopening a throttle node:
   $QEMU -monitor stdio \
         -object throttle-group,id=tg0,x-iops-total=1000 \
         -blockdev node-name=hd0,driver=qcow2,file.driver=file,file.filename=hd.qcow2 \
         -blockdev node-name=root,driver=throttle,throttle-group=tg0,file=hd0,read-only=on
   (qemu) block_stream root
   block/throttle.c:214: throttle_co_drain_end: Assertion `tgm->io_limits_disabled' failed.
Since we only want to change the throttle group on reopen there's no
need to create a ThrottleGroupMember and discard the old one. It's
easier if we simply detach it from its current group and attach it to
the new one.
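A sketch of the simpler approach, using the block/throttle-groups.c
helpers (error handling and the reopen plumbing are omitted, and
new_group_name is just an illustrative variable):
    /* Keep the existing ThrottleGroupMember, with its
     * io_limits_disabled counter intact, and simply move it to the new
     * group. */
    throttle_group_unregister_tgm(tgm);
    throttle_group_register_tgm(tgm, new_group_name,
                                bdrv_get_aio_context(bs));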
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180608151536.7378-1-berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This assert may fail because bitmap_table is not initialized. Just
drop it, as it is obvious that bitmap_table_load() sets the
bitmap_table parameter only when returning zero.
Reported-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20180608101225.2575-1-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Repairing OFLAG_COPIED is usually safe because it is done after the
refcounts have been repaired. Therefore, if we did not find anyone else
referencing a data or L2 cluster, it makes no sense not to set
OFLAG_COPIED -- and the other direction (clearing OFLAG_COPIED) is
always safe anyway; it may just induce leaks.
Furthermore, if OFLAG_COPIED is actually consistent with a wrong (leaky)
refcount, we will decrement the refcount with -r leaks, but OFLAG_COPIED
will then be wrong. qemu-img check should not produce images that are
more corrupted afterwards than they were before.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1527085
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20180509200059.31125-2-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Looking at the qcow2 code that is riddled with error_report() calls,
this is really how it should have been from the start.
Along the way, turn the target_version/current_version comparisons at
the beginning of qcow2_downgrade() into assertions (the caller has to
make sure these conditions are met), and rephrase the error message on
using compat=1.1 to get refcount widths other than 16 bits.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180509210023.20283-3-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
When creating a file, we should take the WRITE and RESIZE permissions.
We do not need either for the creation itself, but we do need them for
clearing and resizing it. So we can take the proper permissions by
replacing O_TRUNC with an explicit truncation to 0, and by taking the
appropriate file locks between those two steps.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180509215336.31304-3-mreitz@redhat.com
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
raw_apply_lock_bytes() and raw_check_lock_bytes() currently take a
BDRVRawState *, but they only use the lock_fd field. During image
creation, we do not have a BDRVRawState, but we do have an FD; so if we
want to reuse the functions there, we should modify them to receive only
the FD.
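A sketch of what the reworked prototypes might look like (the parameter
lists are an assumption; the real ones live in block/file-posix.c):
    static int raw_apply_lock_bytes(int fd, uint64_t perm_lock_bits,
                                    uint64_t shared_perm_lock_bits,
                                    bool unlock, Error **errp);
    static int raw_check_lock_bytes(int fd, uint64_t perm,
                                    uint64_t shared_perm, Error **errp);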
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20180509215336.31304-2-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
block/sheepdog.o has a 4M static variable that is 90% of QEMU's whole .bss
section. Replace it with a heap-allocated block, and make it smaller too
since only the inode header is actually being used.
bss size goes down from 4464280 to 269976.
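An illustrative sketch of the change (the header-size macro name is an
assumption here):
    /* Was: a file-scope "static SheepdogInode inode;" of roughly 4 MB
     * in .bss. Now: allocate only the inode header on the heap. */
    SheepdogInode *inode = g_malloc0(SD_INODE_HEADER_SIZE);
    /* ... use inode ... */
    g_free(inode);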
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180523160721.14018-3-pbonzini@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
The expression "SD_INODE_SIZE - sizeof(inode.data_vdi_id)" already has a macro
defined for the same value (though with a nicer definition using offsetof).
Replace it.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180523160721.14018-2-pbonzini@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>