mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Eric Blake	8417e1378c	qemu-img: Make unallocated part of backing chain obvious in map The recently-added NBD context qemu:allocation-depth is able to distinguish between locally-present data (even when that data is sparse) [shown as depth 1 over NBD], and data that could not be found anywhere in the backing chain [shown as depth 0]; and the libnbd project was recently patched to give the human-readable name "absent" to an allocation-depth of 0. But qemu-img map --output=json predates that addition, and has the unfortunate behavior that all portions of the backing chain that resolve without finding a hit in any backing layer report the same depth as the final backing layer. This makes it harder to reconstruct a qcow2 backing chain using just 'qemu-img map' output, especially when using "backing":null to artificially limit a backing chain, because it is impossible to distinguish between a QCOW2_CLUSTER_UNALLOCATED (which defers to a [missing] backing file) and a QCOW2_CLUSTER_ZERO_PLAIN cluster (which would override any backing file), since both types of clusters otherwise show as "data":false,"zero":true" (but note that we can distinguish a QCOW2_CLUSTER_ZERO_ALLOCATED, which would also have an "offset": listing). The task of reconstructing a qcow2 chain was made harder in commit `0da9856851` (nbd: server: Report holes for raw images), because prior to that point, it was possible to abuse NBD's block status command to see which portions of a qcow2 file resulted in BDRV_BLOCK_ALLOCATED (showing up as NBD_STATE_ZERO in isolation) vs. missing from the chain (showing up as NBD_STATE_ZERO\|NBD_STATE_HOLE); but now qemu reports more accurate sparseness information over NBD. An obvious solution is to make 'qemu-img map --output=json' add an additional "present":false designation to any cluster lacking an allocation anywhere in the chain, without any change to the "depth" parameter to avoid breaking existing clients. The iotests have several examples where this distinction demonstrates the additional accuracy. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20210701190655.2131223-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: fix more iotest fallout] Signed-off-by: Eric Blake <eblake@redhat.com>	2021-07-12 11:10:53 -05:00
Eric Blake	b66ff2c298	iotests: Specify explicit backing format where sensible There are many existing qcow2 images that specify a backing file but no format. This has been the source of CVEs in the past, but has become more prominent of a problem now that libvirt has switched to -blockdev. With older -drive, at least the probing was always done by qemu (so the only risk of a changed format between successive boots of a guest was if qemu was upgraded and probed differently). But with newer -blockdev, libvirt must specify a format; if libvirt guesses raw where the image was formatted, this results in data corruption visible to the guest; conversely, if libvirt guesses qcow2 where qemu was using raw, this can result in potential security holes, so modern libvirt instead refuses to use images without explicit backing format. The change in libvirt to reject images without explicit backing format has pointed out that a number of tools have been far too reliant on probing in the past. It's time to set a better example in our own iotests of properly setting this parameter. iotest calls to create, rebase, and convert are all impacted to some degree. It's a bit annoying that we are inconsistent on command line - while all of those accept -o backing_file=...,backing_fmt=..., the shortcuts are different: create and rebase have -b and -F, while convert has -B but no -F. (amend has no shortcuts, but the previous patch just deprecated the use of amend to change backing chains). Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200706203954.341758-9-eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-14 15:18:59 +02:00
Eric Blake	c61e684e44	block: Exploit BDRV_BLOCK_EOF for larger zero blocks When we have a BDS with unallocated clusters, but asking the status of its underlying bs->file or backing layer encounters an end-of-file condition, we know that the rest of the unallocated area will read as zeroes. However, pre-patch, this required two separate calls to bdrv_get_block_status(), as the first call stops at the point where the underlying file ends. Thanks to BDRV_BLOCK_EOF, we can now widen the results of the primary status if the secondary status already includes BDRV_BLOCK_ZERO. In turn, this fixes a TODO mentioned in iotest 154, where we can now see that all sectors in a partial cluster at the end of a file read as zero when coupling the shorter backing file's status along with our knowledge that the remaining sectors came from an unallocated cluster. Also, note that the loop in bdrv_co_get_block_status_above() had an inefficent exit: in cases where the active layer sets BDRV_BLOCK_ZERO but does NOT set BDRV_BLOCK_ALLOCATED (namely, where we know we read zeroes merely because our unallocated clusters lie beyond the backing file's shorter length), we still ended up probing the backing layer even though we already had a good answer. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170505021500.19315-3-eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-06-30 21:48:06 +08:00
Eric Blake	fbaa6bb3d3	qcow2: Optimize write zero of unaligned tail cluster We've already improved discards to operate efficiently on the tail of an unaligned qcow2 image; it's time to make a similar improvement to write zeroes. The special case is only valid at the tail cluster of a file, where we must recognize that any sectors beyond the image end would implicitly read as zero, and therefore should not penalize our logic for widening a partial cluster into writing the whole cluster as zero. However, note that for now, the special case of end-of-file is only recognized if there is no backing file, or if the backing file has the same length; that's because when the backing file is shorter than the active layer, we don't have code in place to recognize that reads of a sector unallocated at the top and beyond the backing end-of-file are implicitly zero. It's not much of a real loss, because most people don't use images that aren't cluster-aligned, or where the active layer is a different size than the backing layer (especially where the difference falls within a single cluster). Update test 154 to cover the new scenarios, using two images of intentionally differing length. While at it, fix the test to gracefully skip when run as ./check -qcow2 -o compat=0.10 154 since the older format lacks zero clusters already required earlier in the test. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170507000552.20847-11-eblake@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-05-11 14:28:07 +02:00
Eric Blake	d9ca2214bd	iotests: Improve _filter_qemu_img_map Although _filter_qemu_img_map documents that it scrubs offsets, it was only doing so for human mode. Of the existing tests using the filter (97, 122, 150, 154, 176), two of them are affected, but it does not hurt the validity of the tests to not require particular mappings (another test, 66, uses offsets but intentionally does not pass through _filter_qemu_img_map, because it checks that offsets are unchanged before and after an operation). Another justification for this patch is that it will allow a future patch to utilize 'qemu-img map --output=json' to check the status of preallocated zero clusters without regards to the mapping (since the qcow2 mapping can be very sensitive to the chosen cluster size, when preallocation is not in use). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170507000552.20847-9-eblake@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-05-11 14:28:07 +02:00
Eric Blake	ebb718a5c7	qcow2: Catch more unaligned write_zero into zero cluster is_zero_cluster() and is_zero_cluster_top_locked() are used only by qcow2_co_write_zeroes(). The former is too broad (we don't care if the sectors we are about to overwrite are non-zero, only that all other sectors in the cluster are zero), so it needs to be called up to twice but with smaller limits - rename it along with adding the neeeded parameter. The latter can be inlined for more compact code. The testsuite change shows that we now have a sparser top file when an unaligned write_zeroes overwrites the only portion of the backing file with data. Based on a patch proposal by Denis V. Lunev. CC: Denis V. Lunev <den@openvz.org> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-06-08 10:21:08 +02:00
Eric Blake	31ad4fdf91	qemu-iotests: Test one more spot for optimizing write_zeroes Add another test to 154, showing that we currently allocate a data cluster in the top layer if any sector of the backing file was allocated. The next patch will optimize this case. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-06-08 10:21:08 +02:00
Denis V. Lunev	443668ca40	block: split write_zeroes always We should split requests even if they are less than write_zeroes_alignment. For example we can have the following request: offset 62k size 4k write_zeroes_alignment 64k The original code sent 1 request covering 2 qcow2 clusters, and resulted in both clusters being allocated. But by splitting the request, we can cater to the case where one of the two clusters can be zeroed as a whole, for only 1 cluster allocated after the operation. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Eric Blake <eblake@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> Message-Id: <1463476543-3087-2-git-send-email-den@openvz.org> [eblake: Avoid exceeding nb_sectors, hoist alignment checks out of loop, and update testsuite to show that patch works] Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-06-08 10:21:08 +02:00
Kevin Wolf	1ef7d01021	qemu-iotests: Some more write_zeroes tests This covers some more write_zeroes cases which are relevant for the recent qcow2 optimisations that check the allocation status of the backing file for partial cluster write_zeroes requests. This needs to be separate from 034 because we can only support qcow2 in this test case for multiple reasons: We check the allocation status after write_zeroes with 'qemu-img map' and the optimised behaviour that produces zero clusters is only implemented in qcow2; second, the map command returns offsets that are qcow2 specific; and finally, we also use 512 byte clusters which aren't supported for formats like qed. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2016-05-19 16:45:31 +02:00

9 Commits