mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Kevin Wolf	6cd5c9d7b2	block: ignore_bds_parents parameter for drain functions In the future, bdrv_drained_all_begin/end() will drain all invidiual nodes separately rather than whole subtrees. This means that we don't want to propagate the drain to all parents any more: If the parent is a BDS, it will already be drained separately. Recursing to all parents is unnecessary work and would make it an O(n²) operation. Prepare the drain function for the changed drain_all by adding an ignore_bds_parents parameter to the internal implementation that prevents the propagation of the drain to BDS parents. We still (have to) propagate it to non-BDS parents like BlockBackends or Jobs because those are not drained separately. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	c8ca33d06d	block: Move bdrv_drain_all_begin() out of coroutine context Before we can introduce a single polling loop for all nodes in bdrv_drain_all_begin(), we must make sure to run it outside of coroutine context like we already do for bdrv_do_drained_begin(). Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	0109e7e6f8	block: Defer .bdrv_drain_begin callback to polling phase We cannot allow aio_poll() in bdrv_drain_invoke(begin=true) until we're done with propagating the drain through the graph and are doing the single final BDRV_POLL_WHILE(). Just schedule the coroutine with the callback and increase bs->in_flight to make sure that the polling phase will wait for it. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	dcf94a23b1	block: Don't poll in parent drain callbacks bdrv_do_drained_begin() is only safe if we have a single BDRV_POLL_WHILE() after quiescing all affected nodes. We cannot allow that parent callbacks introduce a nested polling loop that could cause graph changes while we're traversing the graph. Split off bdrv_do_drained_begin_quiesce(), which only quiesces a single node without waiting for its requests to complete. These requests will be waited for in the BDRV_POLL_WHILE() call down the call chain. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	fe4f0614ef	block: Drain recursively with a single BDRV_POLL_WHILE() Anything can happen inside BDRV_POLL_WHILE(), including graph changes that may interfere with its callers (e.g. child list iteration in recursive callers of bdrv_do_drained_begin). Switch to a single BDRV_POLL_WHILE() call for the whole subtree at the end of bdrv_do_drained_begin() to avoid such effects. The recursion happens now inside the loop condition. As the graph can only change between bdrv_drain_poll() calls, but not inside of it, doing the recursion here is safe. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	d30b8e64b7	block: Remove bdrv_drain_recurse() For bdrv_drain(), recursively waiting for child node requests is pointless because we didn't quiesce their parents, so new requests could come in anyway. Letting the function work only on a single node makes it more consistent. For subtree drains and drain_all, we already have the recursion in bdrv_do_drained_begin(), so the extra recursion doesn't add anything either. Remove the useless code. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	89bd030533	block: Really pause block jobs on drain We already requested that block jobs be paused in .bdrv_drained_begin, but no guarantee was made that the job was actually inactive at the point where bdrv_drained_begin() returned. This introduces a new callback BdrvChildRole.bdrv_drained_poll() and uses it to make bdrv_drain_poll() consider block jobs using the node to be drained. For the test case to work as expected, we have to switch from block_job_sleep_ns() to qemu_co_sleep_ns() so that the test job is even considered active and must be waited for when draining the node. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	1cc8e54ada	block: Avoid unnecessary aio_poll() in AIO_WAIT_WHILE() Commit `91af091f92` added an additional aio_poll() to BDRV_POLL_WHILE() in order to make sure that all pending BHs are executed on drain. This was the wrong place to make the fix, as it is useless overhead for all other users of the macro and unnecessarily complicates the mechanism. This patch effectively reverts said commit (the context has changed a bit and the code has moved to AIO_WAIT_WHILE()) and instead polls in the loop condition for drain. The effect is probably hard to measure in any real-world use case because actual I/O will dominate, but if I run only the initialisation part of 'qemu-img convert' where it calls bdrv_block_status() for the whole image to find out how much data there is copy, this phase actually needs only roughly half the time after this patch. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	c13ad59f01	block: Don't manually poll in bdrv_drain_all() All involved nodes are already idle, we called bdrv_do_drain_begin() on them. The comment in the code suggested that this was not correct because the completion of a request on one node could spawn a new request on a different node (which might have been drained before, so we wouldn't drain the new request). In reality, new requests to different nodes aren't spawned out of nothing, but only in the context of a parent request, and they aren't submitted to random nodes, but only to child nodes. As long as we still poll for the completion of the parent request (which we do), draining each root node separately is good enough. Remove the additional polling code from bdrv_drain_all_begin() and replace it with an assertion that all nodes are already idle after we drained them separately. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	7d40d9ef9d	block: Remove 'recursive' parameter from bdrv_drain_invoke() All callers pass false for the 'recursive' parameter now. Remove it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	79ab8b21dc	block: Use bdrv_do_drain_begin/end in bdrv_drain_all() bdrv_do_drain_begin/end() implement already everything that bdrv_drain_all_begin/end() need and currently still do manually: Disable external events, call parent drain callbacks, call block driver callbacks. It also does two more things: The first is incrementing bs->quiesce_counter. bdrv_drain_all() already stood out in the test case by behaving different from the other drain variants. Adding this is not only safe, but in fact a bug fix. The second is calling bdrv_drain_recurse(). We already do that later in the same function in a loop, so basically doing an early first iteration doesn't hurt. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	bb67568954	test-bdrv-drain: bdrv_drain() works with cross-AioContext events As long as nobody keeps the other I/O thread from working, there is no reason why bdrv_drain() wouldn't work with cross-AioContext events. The key is that the root request we're waiting for is in the AioContext we're polling (which it always is for bdrv_drain()) so that aio_poll() is woken up in the end. Add a test case that shows that it works. Remove the comment in bdrv_drain() that claims otherwise. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-18 15:03:25 +02:00
Kevin Wolf	b008326744	block: Remove deprecated -drive option serial The -drive option serial was deprecated in QEMU 2.10. It's time to remove it. Tests need to be updated to set the serial number with -global instead of using the -drive option. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com>	2018-06-15 14:49:44 +02:00
Markus Armbruster	d083f954a9	rbd: New parameter key-secret Legacy -drive supports "password-secret" parameter that isn't available with -blockdev / blockdev-add. That's because we backed out our first try to provide it there due to interface design doubts, in commit `577d8c9a81`, v2.9.0. This is the second try. It brings back the parameter, except it's named "key-secret" now. Let's review our reasons for backing out the first try, as stated in the commit message: * BlockdevOptionsRbd member @password-secret isn't actually a password, it's a key generated by Ceph. Addressed by the rename. * We're not sure where member @password-secret belongs (see the previous commit). See previous commit. * How @password-secret interacts with settings from a configuration file specified with @conf is undocumented. Not actually true, the documentation for @conf says "Values in the configuration file will be overridden by options specified via QAPI", and we've tested this. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-15 14:49:44 +02:00
Markus Armbruster	a3699de4dd	rbd: New parameter auth-client-required Parameter auth-client-required lets you configure authentication methods. We tried to provide that in v2.9.0, but backed out due to interface design doubts (commit `464444fcc1`). This commit is similar to what we backed out, but simpler: we use a list of enumeration values instead of a list of objects with a member of enumeration type. Let's review our reasons for backing out the first try, as stated in the commit message: * The implementation uses deprecated rados_conf_set() key "auth_supported". No biggie. Fixed: we use "auth-client-required". * The implementation makes -drive silently ignore invalid parameters "auth" and "auth-supported..X" where X isn't "auth". Fixable (in fact I'm going to fix similar bugs around parameter server), so again no biggie. That fix is commit `2836284db6`. This commit doesn't bring the bugs back. BlockdevOptionsRbd member @password-secret applies only to authentication method cephx. Should it be a variant member of RbdAuthMethod? We've had time to ponder, and we decided to stick to the way Ceph configuration works: the key configured separately, and silently ignored if the authentication method doesn't use it. * BlockdevOptionsRbd member @user could apply to both methods cephx and none, but I'm not sure it's actually used with none. If it isn't, should it be a variant member of RbdAuthMethod? Likewise. * The client offers a set of authentication methods, not a list. Should the methods be optional members of BlockdevOptionsRbd instead of members of list @auth-supported? The latter begs the question what multiple entries for the same method mean. Trivial question now that RbdAuthMethod contains nothing but @type, but less so when RbdAuthMethod acquires other members, such the ones discussed above. Again, we decided to stick to the way Ceph configuration works, except we make auth-client-required a list of enumeration values instead of a string containing keywords separated by delimiters. * How BlockdevOptionsRbd member @auth-supported interacts with settings from a configuration file specified with @conf is undocumented. I suspect it's untested, too. Not actually true, the documentation for @conf says "Values in the configuration file will be overridden by options specified via QAPI", and we've tested this. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-15 14:49:44 +02:00
Markus Armbruster	f853465aac	block: Make remaining uses of qobject input visitor more robust Remaining uses of qobject_input_visitor_new_keyval() in the block subsystem: * block_crypto_open_opts_init() Currently doesn't visit any non-string scalars, thus safe. It's called from - block_crypto_open_luks() Creates the QDict with qemu_opts_to_qdict_filtered(), which creates only string scalars, but has a TODO asking for other types. - qcow_open() - qcow2_open(), qcow2_co_invalidate_cache(), qcow2_reopen_prepare() * block_crypto_create_opts_init(), called from - block_crypto_co_create_opts_luks() Also creates the QDict with qemu_opts_to_qdict_filtered(). * vdi_co_create_opts() Also creates the QDict with qemu_opts_to_qdict_filtered(). Replace these uses by qobject_input_visitor_new_flat_confused() for robustness. This adds crumpling. Right now, that's a no-op, but if we ever extend these things in non-flat ways, crumpling will be needed. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-15 14:49:44 +02:00
Markus Armbruster	af91062ee1	block: Factor out qobject_input_visitor_new_flat_confused() Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-15 14:49:44 +02:00
Markus Armbruster	92adf9dbcd	block: Clean up a misuse of qobject_to() in .bdrv_co_create_opts() The following pattern occurs in the .bdrv_co_create_opts() methods of parallels, qcow, qcow2, qed, vhdx and vpc: qobj = qdict_crumple_for_keyval_qiv(qdict, errp); qobject_unref(qdict); qdict = qobject_to(QDict, qobj); if (qdict == NULL) { ret = -EINVAL; goto done; } v = qobject_input_visitor_new_keyval(QOBJECT(qdict)); [...] ret = 0; done: qobject_unref(qdict); [...] return ret; If qobject_to() fails, we return failure without setting errp. That's wrong. As far as I can tell, it cannot fail here. Clean it up anyway, by removing the useless conversion. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-15 14:49:44 +02:00
Markus Armbruster	374c52467a	block: Fix -drive for certain non-string scalars The previous commit fixed -blockdev breakage due to misuse of the qobject input visitor's keyval flavor in bdrv_file_open(). The commit message explain why using the plain flavor would be just as wrong; it would break -drive. Turns out we break it in three places: nbd_open(), sd_open() and ssh_file_open(). They are even marked FIXME. Example breakage: $ qemu-system-x86 -drive node-name=n1,driver=nbd,server.type=inet,server.host=localhost,server.port=1234,server.numeric=off qemu-system-x86: -drive node-name=n1,driver=nbd,server.type=inet,server.host=localhost,server.port=1234,server.numeric=off: Invalid parameter type for 'numeric', expected: boolean Fix it the same way: replace qdict_crumple() by qdict_crumple_for_keyval_qiv(), and switch from plain to the keyval flavor. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-15 14:49:44 +02:00
Markus Armbruster	e5af0da1dc	block: Fix -blockdev for certain non-string scalars Configuration flows through the block subsystem in a rather peculiar way. Configuration made with -drive enters it as QemuOpts. Configuration made with -blockdev / blockdev-add enters it as QAPI type BlockdevOptions. The block subsystem uses QDict, QemuOpts and QAPI types internally. The precise flow is next to impossible to explain (I tried for this commit message, but gave up after wasting several hours). What I can explain is a flaw in the BlockDriver interface that leads to this bug: $ qemu-system-x86_64 -blockdev node-name=n1,driver=nfs,server.type=inet,server.host=localhost,path=/foo/bar,user=1234 qemu-system-x86_64: -blockdev node-name=n1,driver=nfs,server.type=inet,server.host=localhost,path=/foo/bar,user=1234: Internal error: parameter user invalid QMP blockdev-add is broken the same way. Here's what happens. The block layer passes configuration represented as flat QDict (with dotted keys) to BlockDriver methods .bdrv_file_open(). The QDict's members are typed according to the QAPI schema. nfs_file_open() converts it to QAPI type BlockdevOptionsNfs, with qdict_crumple() and a qobject input visitor. This visitor comes in two flavors. The plain flavor requires scalars to be typed according to the QAPI schema. That's the case here. The keyval flavor requires string scalars. That's not the case here. nfs_file_open() uses the latter, and promptly falls apart for members @user, @group, @tcp-syn-count, @readahead-size, @page-cache-size, @debug. Switching to the plain flavor would fix -blockdev, but break -drive, because there the scalars arrive in nfs_file_open() as strings. The proper fix would be to replace the QDict by QAPI type BlockdevOptions in the BlockDriver interface. Sadly, that's beyond my reach right now. Next best would be to fix the block layer to always pass correctly typed QDicts to the BlockDriver methods. Also beyond my reach. What I can do is throw another hack onto the pile: have nfs_file_open() convert all members to string, so use of the keyval flavor actually works, by replacing qdict_crumple() by new function qdict_crumple_for_keyval_qiv(). The pattern "pass result of qdict_crumple() to qobject_input_visitor_new_keyval()" occurs several times more: * qemu_rbd_open() Same issue as nfs_file_open(), but since BlockdevOptionsRbd has only string members, its only a latent bug. Fix it anyway. * parallels_co_create_opts(), qcow_co_create_opts(), qcow2_co_create_opts(), bdrv_qed_co_create_opts(), sd_co_create_opts(), vhdx_co_create_opts(), vpc_co_create_opts() These work, because they create the QDict with qemu_opts_to_qdict_filtered(), which creates only string scalars. The function sports a TODO comment asking for better typing; that's going to be fun. Use qdict_crumple_for_keyval_qiv() to be safe. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-15 14:49:44 +02:00
Max Reitz	609f45ea95	block: Add block-specific QDict header There are numerous QDict functions that have been introduced for and are used only by the block layer. Move their declarations into an own header file to reflect that. While qdict_extract_subqdict() is in fact used outside of the block layer (in util/qemu-config.c), it is still a function related very closely to how the block layer works with nested QDicts, namely by sometimes flattening them. Therefore, its declaration is put into this header as well and util/qemu-config.c includes it with a comment stating exactly which function it needs. Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20180509165530.29561-7-mreitz@redhat.com> [Copyright note tweaked, superfluous includes dropped] Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-15 14:49:44 +02:00
Markus Armbruster	deadbb8ebb	iscsi: Drop deprecated -drive parameter "filename" Parameter "filename" is deprecated since commit `5c3ad1a6a8`, v2.10.0. Time to get rid of it. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-15 14:49:44 +02:00
Markus Armbruster	bb9f762ff3	rbd: Drop deprecated -drive parameter "filename" Parameter "filename" is deprecated since commit `91589d9e5c`, v2.10.0. Time to get rid of it. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-06-15 14:49:44 +02:00
Vladimir Sementsov-Ogievskiy	b598e531f1	qapi: add x-block-dirty-bitmap-merge Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 20180606182449.1607-5-jsnow@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>	2018-06-11 14:53:32 -04:00
Vladimir Sementsov-Ogievskiy	8b1402ce80	block/dirty-bitmap: add lock to bdrv_enable/disable_dirty_bitmap Add locks and remove comments about BQL accordingly to dirty_bitmap_mutex definition in block_int.h. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Message-id: 20180606182449.1607-2-jsnow@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>	2018-06-11 14:53:31 -04:00
Paolo Bonzini	b133c27f5d	block: simplify code around releasing bitmaps QLIST_REMOVE does not require walking the list, and once the "bitmap" argument is removed from bdrv_do_release_matching_dirty_bitmap_locked the code simplifies a lot and it is worth inlining everything in the callers of bdrv_do_release_matching_dirty_bitmap. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20180326104037.6894-1-pbonzini@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>	2018-06-11 14:53:31 -04:00
Paolo Bonzini	ab41fc4853	block: remove bdrv_dirty_bitmap_make_anon All this function is doing will be repeated by bdrv_do_release_matching_dirty_bitmap_locked, except resetting bm->persistent. But even that does not matter because the bitmap will be freed. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20180323164254.26487-1-pbonzini@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>	2018-06-11 14:53:31 -04:00
Max Reitz	ddf3b47ef4	qcow2: Do not mark inactive images corrupt When signaling a corruption on a read-only image, qcow2 already makes fatal events non-fatal (i.e., they will not result in the image being closed, and the image header's corrupt flag will not be set). This is necessary because we cannot set the corrupt flag on read-only images, and it is possible because further corruption of read-only images is impossible. Inactive images are effectively read-only, too, so we should do the same for them. bdrv_is_writable() can tell us whether an image can actually be written to, so use its result instead of !bs->read_only. (Otherwise, the assert(!(bs->open_flags & BDRV_O_INACTIVE)) in bdrv_co_pwritev() will fail, crashing qemu.) Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20180606193702.7113-3-mreitz@redhat.com Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-06-11 16:18:45 +02:00
Alberto Garcia	bc33c047d1	throttle: Fix crash on reopen The throttle block filter can be reopened, and with this it is possible to change the throttle group that the filter belongs to. The way the code does that is the following: - On throttle_reopen_prepare(): create a new ThrottleGroupMember and attach it to the new throttle group. - On throttle_reopen_commit(): detach the old ThrottleGroupMember, delete it and replace it with the new one. The problem with this is that by replacing the ThrottleGroupMember the previous value of io_limits_disabled is lost, causing an assertion failure in throttle_co_drain_end(). This problem can be reproduced by reopening a throttle node: $QEMU -monitor stdio -object throttle-group,id=tg0,x-iops-total=1000 \ -blockdev node-name=hd0,driver=qcow2,file.driver=file,file.filename=hd.qcow2 \ -blockdev node-name=root,driver=throttle,throttle-group=tg0,file=hd0,read-only=on (qemu) block_stream root block/throttle.c:214: throttle_co_drain_end: Assertion `tgm->io_limits_disabled' failed. Since we only want to change the throttle group on reopen there's no need to create a ThrottleGroupMember and discard the old one. It's easier if we simply detach it from its current group and attach it to the new one. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 20180608151536.7378-1-berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-06-11 16:18:45 +02:00
Vladimir Sementsov-Ogievskiy	7eb24009db	block/qcow2-bitmap: fix free_bitmap_clusters This assert may fail, because bitmap_table is not initialized. Just drop it, as it's obvious, that bitmap_table_load sets bitmap_table parameter only when returning zero. Reported-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20180608101225.2575-1-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-06-11 16:18:45 +02:00
Max Reitz	3cce51c919	qcow2: Repair OFLAG_COPIED when fixing leaks Repairing OFLAG_COPIED is usually safe because it is done after the refcounts have been repaired. Therefore, it we did not find anyone else referencing a data or L2 cluster, it makes no sense to not set OFLAG_COPIED -- and the other direction (clearing OFLAG_COPIED) is always safe, anyway, it may just induce leaks. Furthermore, if OFLAG_COPIED is actually consistent with a wrong (leaky) refcount, we will decrement the refcount with -r leaks, but OFLAG_COPIED will then be wrong. qemu-img check should not produce images that are more corrupted afterwards then they were before. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1527085 Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20180509200059.31125-2-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-06-11 16:18:45 +02:00
Max Reitz	d1402b5026	block: Add Error parameter to bdrv_amend_options Looking at the qcow2 code that is riddled with error_report() calls, this is really how it should have been from the start. Along the way, turn the target_version/current_version comparisons at the beginning of qcow2_downgrade() into assertions (the caller has to make sure these conditions are met), and rephrase the error message on using compat=1.1 to get refcount widths other than 16 bits. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20180509210023.20283-3-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-06-11 16:18:45 +02:00
Max Reitz	b8cf1913a9	block/file-posix: File locking during creation When creating a file, we should take the WRITE and RESIZE permissions. We do not need either for the creation itself, but we do need them for clearing and resizing it. So we can take the proper permissions by replacing O_TRUNC with an explicit truncation to 0, and by taking the appropriate file locks between those two steps. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20180509215336.31304-3-mreitz@redhat.com Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-06-11 16:18:45 +02:00
Max Reitz	d0a96155de	block/file-posix: Pass FD to locking helpers raw_apply_lock_bytes() and raw_check_lock_bytes() currently take a BDRVRawState *, but they only use the lock_fd field. During image creation, we do not have a BDRVRawState, but we do have an FD; so if we want to reuse the functions there, we should modify them to receive only the FD. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20180509215336.31304-2-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-06-11 16:18:45 +02:00
Paolo Bonzini	68acc99f14	sheepdog: remove huge BSS object block/sheepdog.o has a 4M static variable that is 90% of QEMU's whole .bss section. Replace it with a heap-allocated block, and make it smaller too since only the inode header is actually being used. bss size goes down from 4464280 to 269976. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20180523160721.14018-3-pbonzini@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>	2018-06-05 10:15:12 -04:00
Paolo Bonzini	03b036cc0c	sheepdog: cleanup repeated expression The expression "SD_INODE_SIZE - sizeof(inode.data_vdi_id)" already has a macro defined for the same value (though with a nicer definition using offsetof). Replace it. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20180523160721.14018-2-pbonzini@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>	2018-06-05 10:15:12 -04:00
Peter Maydell	0d514fa234	Pull request * Copy offloading for qemu-img convert (iSCSI, raw, and qcow2) If the underlying storage supports copy offloading, qemu-img convert will use it instead of performing reads and writes. This avoids data transfers and thus frees up storage bandwidth for other purposes. SCSI EXTENDED COPY and Linux copy_file_range(2) are used to implement this optimization. * Drop spurious "WARNING: I\/O thread spun for 1000 iterations" warning -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJbFSBoAAoJEJykq7OBq3PISpEIAIcMao4/rzinAWXzS+ncK9LO 6FtRVpgutpHaWX2ayySaz5n2CdR3cNMrpCI7sjY2Kw0lrdkqxPgl5n0SWD+VCl4W 7+JLz/uF0iUV8X+99e7WGAjZbm9LSlxgn5AQKfrrwyPf0ZfzoYQ5nBMcQ6xjEeQP 48j2WqJqN9/u8RBD07o11yn0+CE5g56/f12xVjR5ASVodzsAmcZ2OQRMQbM01isU 1mBekJQkDxJkt5l13Rql8+t+vWz8/9BEW2c/eIDKvoayMqYJpdfKv4DqLloIuHnc 3RkquA0zUuKtl7xEnEkH/We7fi4QPGW/vyBN7ychS/zKzZFQrXmwqrAuFSw3dKU= =vZp+ -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging Pull request * Copy offloading for qemu-img convert (iSCSI, raw, and qcow2) If the underlying storage supports copy offloading, qemu-img convert will use it instead of performing reads and writes. This avoids data transfers and thus frees up storage bandwidth for other purposes. SCSI EXTENDED COPY and Linux copy_file_range(2) are used to implement this optimization. * Drop spurious "WARNING: I\/O thread spun for 1000 iterations" warning # gpg: Signature made Mon 04 Jun 2018 12:20:08 BST # gpg: using RSA key 9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/block-pull-request: main-loop: drop spin_counter qemu-img: Convert with copy offloading block-backend: Add blk_co_copy_range iscsi: Implement copy offloading iscsi: Create and use iscsi_co_wait_for_task iscsi: Query and save device designator when opening file-posix: Implement bdrv_co_copy_range qcow2: Implement copy offloading raw: Implement copy offloading raw: Check byte range uniformly block: Introduce API for copy offloading Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-06-04 18:34:04 +01:00
Peter Maydell	f67c9b693a	acpi, vhost, misc: fixes, features vDPA support, fix to vhost blk RO bit handling, some include path cleanups, NFIT ACPI table. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJbEXNvAAoJECgfDbjSjVRpc8gH/R8xrcFrV+k9wwbgYcOcGb6Y LWjseE31pqJcxRV80vLOdzYEuLStZQKQQY7xBDMlA5vdyvZxIA6FLO2IsiJSbFAk EK8pclwhpwQAahr8BfzenabohBv2UO7zu5+dqSvuJCiMWF3jGtPAIMxInfjXaOZY odc1zY2D2EgsC7wZZ1hfraRbISBOiRaez9BoGDKPOyBY9G1ASEgxJgleFgoBLfsK a1XU+fDM6hAVdxftfkTm0nibyf7PWPDyzqghLqjR9WXLvZP3Cqud4p8N29mY51pR KSTjA4FYk6Z9EVMltyBHfdJs6RQzglKjxcNGdlrvacDfyFi79fGdiosVllrjfJM= =3+V0 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging acpi, vhost, misc: fixes, features vDPA support, fix to vhost blk RO bit handling, some include path cleanups, NFIT ACPI table. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Fri 01 Jun 2018 17:25:19 BST # gpg: using RSA key 281F0DB8D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" # Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67 # Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469 * remotes/mst/tags/for_upstream: (31 commits) vhost-blk: turn on pre-defined RO feature bit ACPI testing: test NFIT platform capabilities nvdimm, acpi: support NFIT platform capabilities tests/.gitignore: add entry for generated file arch_init: sort architectures ui: use local path for local headers qga: use local path for local headers colo: use local path for local headers migration: use local path for local headers usb: use local path for local headers sd: fix up include vhost-scsi: drop an unused include ppc: use local path for local headers rocker: drop an unused include e1000e: use local path for local headers ioapic: fix up includes ide: use local path for local headers display: use local path for local headers trace: use local path for local headers migration: drop an unused include ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-06-04 10:15:16 +01:00
Fam Zheng	b5679fa49c	block-backend: Add blk_co_copy_range It's a BlockBackend wrapper of the BDS interface. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20180601092648.24614-10-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-01 14:41:48 +01:00
Fam Zheng	604dfaaa32	iscsi: Implement copy offloading Issue EXTENDED COPY (LID1) command to implement the copy_range API. The parameter data construction code is modified from libiscsi's iscsi-dd.c. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20180601092648.24614-9-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-01 14:41:48 +01:00
Fam Zheng	66e75c03b2	iscsi: Create and use iscsi_co_wait_for_task This loop is repeated a growing number times. Make a helper. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20180601092648.24614-8-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-01 14:41:48 +01:00
Fam Zheng	cc9743c236	iscsi: Query and save device designator when opening The device designator data returned in INQUIRY command will be useful to fill in source/target fields during copy offloading. Do this when connecting to the target and save the data for later use. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20180601092648.24614-7-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-01 14:41:48 +01:00
Fam Zheng	1efad060d7	file-posix: Implement bdrv_co_copy_range With copy_file_range(2), we can implement the bdrv_co_copy_range semantics. Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 20180601092648.24614-6-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-01 14:41:48 +01:00
Fam Zheng	fd9fcd37a8	qcow2: Implement copy offloading The two callbacks are implemented quite similarly to the read/write functions: bdrv_co_copy_range_from maps for read and calls into bs->file or bs->backing depending on the allocation status; bdrv_co_copy_range_to maps for write and calls into bs->file. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 20180601092648.24614-5-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-01 14:41:48 +01:00
Fam Zheng	72d219e2f9	raw: Implement copy offloading Just pass down to ->file. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20180601092648.24614-4-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-01 14:41:47 +01:00
Fam Zheng	3844553852	raw: Check byte range uniformly We don't verify the request range against s->size in the I/O callbacks except for raw_co_pwritev. This is inconsistent (especially for raw_co_pwrite_zeroes and raw_co_pdiscard), so fix them, in the meanwhile make the helper reusable by the coming new callbacks. Note that in most cases the block layer already verifies the request byte range against our reported image length, before invoking the driver callbacks. The exception is during image creating, after blk_set_allow_write_beyond_eof(blk, true) is called. But in that case, the requests are not directly from the user or guest. So there is no visible behavior change in adding the check code. The int64_t -> uint64_t inconsistency, as shown by the type casting, is pre-existing due to the interface. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 20180601092648.24614-3-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-01 14:41:47 +01:00
Fam Zheng	fcc6767836	block: Introduce API for copy offloading Introduce the bdrv_co_copy_range() API for copy offloading. Block drivers implementing this API support efficient copy operations that avoid reading each block from the source device and writing it to the destination devices. Examples of copy offload primitives are SCSI EXTENDED COPY and Linux copy_file_range(2). Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20180601092648.24614-2-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-06-01 14:41:47 +01:00
Michael S. Tsirkin	0d8c41dae5	block: use local path for local headers When pulling in headers that are in the same directory as the C file (as opposed to one in include/), we should use its relative path, without a directory. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>	2018-05-31 04:16:06 +03:00
Kevin Wolf	3fb588a0f2	block/create: Mark blockdev-create stable We're ready to declare the blockdev-create job stable. This renames the corresponding QMP command from x-blockdev-create to blockdev-create. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com>	2018-05-30 13:31:18 +02:00
Kevin Wolf	e5ab4347f9	block/create: Make x-blockdev-create a job This changes the x-blockdev-create QMP command so that it doesn't block the monitor and the main loop any more, but starts a background job that performs the image creation. The basic job as implemented here is all that is necessary to make image creation asynchronous and to provide a QMP interface that can be marked stable, but it still lacks a few features that jobs usually provide: The job will ignore pause commands and it doesn't publish more than very basic progress yet (total-progress is 1 and current-progress advances from 0 to 1 when the driver callbacks returns). These features can be added later without breaking compatibility. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com>	2018-05-30 13:31:07 +02:00

1 2 3 4 5 ...

3847 Commits