mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Alberto Garcia	8f91d6906c	qcow2: Add calculate_l2_meta() handle_alloc() creates a QCowL2Meta structure in order to update the image metadata and perform the necessary copy-on-write operations. This patch moves that code to a separate function so it can be used from other places. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <e5bc4a648dac31972bfa7a0e554be8064be78799.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-08-25 08:33:20 +02:00
Alberto Garcia	388e581615	qcow2: Convert qcow2_get_cluster_offset() into qcow2_get_host_offset() qcow2_get_cluster_offset() takes an (unaligned) guest offset and returns the (aligned) offset of the corresponding cluster in the qcow2 image. In practice none of the callers need to know where the cluster starts so this patch makes the function calculate and return the final host offset directly. The function is also renamed accordingly. There is a pre-existing exception with compressed clusters: in this case the function returns the complete cluster descriptor (containing the offset and size of the compressed data). This does not change with this patch but it is now documented. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <ffae6cdc5ca8950e8280ac0f696dcc376cb07095.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-08-25 08:33:20 +02:00
Alberto Garcia	9c4269d54b	qcow2: Make Qcow2AioTask store the full host offset The file_cluster_offset field of Qcow2AioTask stores a cluster-aligned host offset. In practice this is not very useful because all users() of this structure need the final host offset into the cluster, which they calculate using host_offset = file_cluster_offset + offset_into_cluster(s, offset) There is no reason why Qcow2AioTask cannot store host_offset directly and that is what this patch does. () compressed clusters are the exception: in this case what file_cluster_offset was storing was the full compressed cluster descriptor (offset + size). This does not change with this patch but it is documented now. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <07c4b15c644dcf06c9459f98846ac1c4ea96e26f.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-08-25 08:33:20 +02:00
Marc-André Lureau	5e5733e599	meson: convert block Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-21 06:30:18 -04:00
Paolo Bonzini	243af0225a	trace: switch position of headers to what Meson requires Meson doesn't enjoy the same flexibility we have with Make in choosing the include path. In particular the tracing headers are using $(build_root)/$(<D). In order to keep the include directives unchanged, the simplest solution is to generate headers with patterns like "trace/trace-audio.h" and place forwarding headers in the source tree such that for example "audio/trace.h" includes "trace/trace-audio.h". This patch is too ugly to be applied to the Makefiles now. It's only a way to separate the changes to the tracing header files from the Meson rewrite of the tracing logic. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-21 06:18:24 -04:00
Stefan Reiter	7661a886a1	block/block-copy: always align copied region to cluster size Since commit `42ac214406` (block/block-copy: refactor task creation) block_copy_task_create calculates the area to be copied via bdrv_dirty_bitmap_next_dirty_area, but that can return an unaligned byte count if the image's last cluster end is not aligned to the bitmap's granularity. Always ALIGN_UP the resulting bytes value to satisfy block_copy_do_copy, which requires the 'bytes' parameter to be aligned to cluster size. Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com> Message-Id: <20200810095523.15071-1-s.reiter@proxmox.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-08-10 17:12:46 +02:00
Tuguoyi	348fcc4f7a	qcow2-cluster: Fix integer left shift error in qcow2_alloc_cluster_link_l2() When calculating the offset, the result of left shift operation will be promoted to type int64 automatically because the left operand of + operator is uint64_t. but the result after integer promotion may be produce an error value for us and trigger the following asserting error. For example, consider i=0x2000, cluster_bits=18, the result of left shift operation will be 0x80000000. Cause argument i is of signed integer type, the result is automatically promoted to 0xffffffff80000000 which is not we expected The way to trigger the assertion error: qemu-img create -f qcow2 -o preallocation=full,cluster_size=256k tmpdisk 10G This patch fix it by casting @i to uint64_t before doing left shift operation Signed-off-by: Guoyi Tu <tu.guoyi@h3c.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 81ba90fe0c014f269621c283269b42ad@h3c.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-08-05 14:56:11 +01:00
Max Reitz	fe16c7ddf8	qcow2: Release read-only bitmaps when inactivated During migration, we release all bitmaps after storing them on disk, as long as they are (1) stored on disk, (2) not read-only, and (3) consistent. (2) seems arbitrary, though. The reason we do not release them is because we do not write them, as there is no need to; and then we just forget about all bitmaps that we have not written to the file. However, read-only persistent bitmaps are still in the file and in sync with their in-memory representation, so we may as well release them just like any R/W bitmap that we have updated. It leads to actual problems, too: After migration, letting the source continue may result in an error if there were any bitmaps on read-only nodes (such as backing images), because those have not been released by bdrv_inactive_all(), but bdrv_invalidate_cache_all() attempts to reload them (which fails, because they are still present in memory). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200730120234.49288-2-mreitz@redhat.com> Tested-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-08-03 08:59:37 -05:00
Peter Maydell	5045be872d	nbd patches for 2020-07-28 - fix NBD handling of trim/zero requests larger than 2G - allow no-op resizes on NBD (in turn fixing qemu-img convert -c into NBD) - several deadlock fixes when using NBD reconnect -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEccLMIrHEYCkn0vOqp6FrSiUnQ2oFAl8gPV4ACgkQp6FrSiUn Q2ozdQgAiDHaHG2NX4jmduID7677/XhsLoVl1MV7UZnU+y9qQ2p+Mbsw1oMneu8P Dtfgx/mlWVGu68gn31f4xVq74VTZH6p3IGV7PMcYZ50xbESoFs6CYUwUWUp1GeC3 +kPOl0EpLvm1W/V93sKmg8FflGmNiJHNkfl/ddfk0gs6Z3EfjkmGJt7IP/pv1UCs 4icWvCJsqw2z8TnEwtTpMX5HZlWth1x37lUOShlPL5kA5hZqU+zYU/bYB5iKx+16 MebYg7C7CXYCCtH9cDH/swUWhOdQLkywA6yBAwc1zENsKy84aIAJIUls/Ji0q6CY A4s5c0FovLBuMDd9oLr0kJbkJQeVZA== =DD6l -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2020-07-28' into staging nbd patches for 2020-07-28 - fix NBD handling of trim/zero requests larger than 2G - allow no-op resizes on NBD (in turn fixing qemu-img convert -c into NBD) - several deadlock fixes when using NBD reconnect # gpg: Signature made Tue 28 Jul 2020 15:59:42 BST # gpg: using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full] # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full] # gpg: aka "[jpeg image of size 6874]" [full] # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-nbd-2020-07-28: block/nbd: nbd_co_reconnect_loop(): don't sleep if drained block/nbd: on shutdown terminate connection attempt block/nbd: allow drain during reconnect attempt block/nbd: split nbd_establish_connection out of nbd_client_connect iotests: Test convert to qcow2 compressed to NBD iotests: Add more qemu_img helpers iotests: Make qemu_nbd_popen() a contextmanager block: nbd: Fix convert qcow2 compressed to nbd nbd: Fix large trim/zero requests Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-07-28 20:43:03 +01:00
Peter Maydell	0c4fa5bc1a	Block patches for 5.1.0: - Fix block I/O for split transfers - Fix iotest 197 for non-qcow2 formats -----BEGIN PGP SIGNATURE----- iQFGBAABCAAwFiEEkb62CjDbPohX0Rgp9AfbAGHVz0AFAl8gK/gSHG1yZWl0ekBy ZWRoYXQuY29tAAoJEPQH2wBh1c9AR+kIALv+Z/A6SPpsAHjpyuRbluuhznfqPuiX mIVX0qNhsFBDAUVw1tOkMtfxOIvuaQW/QWzM0UPaHqB/I4ckzE6Dp98ys9uwHPdq ez23blWvBuB3P3y2ZBAYhhRlCqt3w4uI/lIJMu7VZBghXxj3fGcuTnLlWx8gb1IH 74MiBX8XPt532FiFTnpzxgns8NYkZY8mF6zduGqBPx6bPmdNdDfqAhL68Fv8uKJA k4dVH6ffPLZD+RrCz9GL5rsYQ6NR6tfyEoRMPqtJznhtzWwu5h5EF3p46VkcKheI k0axygEBAr9JbeCwbIK3a4hjQ7eaFQ6j9JR+lPZBRaDbLHv/xGNNuvw= =C4Lq -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2020-07-28' into staging Block patches for 5.1.0: - Fix block I/O for split transfers - Fix iotest 197 for non-qcow2 formats # gpg: Signature made Tue 28 Jul 2020 14:45:28 BST # gpg: using RSA key 91BEB60A30DB3E8857D11829F407DB0061D5CF40 # gpg: issuer "mreitz@redhat.com" # gpg: Good signature from "Max Reitz <mreitz@redhat.com>" [full] # Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40 * remotes/maxreitz/tags/pull-block-2020-07-28: iotests/197: Fix for non-qcow2 formats iotests/028: Add test for cross-base-EOF reads block: Fix bdrv_aligned_p*v() for qiov_offset != 0 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-07-28 18:00:21 +01:00
Vladimir Sementsov-Ogievskiy	12c75e20a2	block/nbd: nbd_co_reconnect_loop(): don't sleep if drained We try to go to wakeable sleep, so that, if drain begins it will break the sleep. But what if nbd_client_co_drain_begin() already called and s->drained is already true? We'll go to sleep, and drain will have to wait for the whole timeout. Let's improve it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727184751.15704-5-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-07-28 09:54:43 -05:00
Vladimir Sementsov-Ogievskiy	fbeb3e63b3	block/nbd: on shutdown terminate connection attempt On shutdown nbd driver may be in a connecting state. We should shutdown it as well, otherwise we may hang in nbd_teardown_connection, waiting for conneciton_co to finish in BDRV_POLL_WHILE(bs, s->connection_co) loop if remote server is down. How to reproduce the dead lock: 1. Create nbd-fault-injector.conf with the following contents: [inject-error "mega1"] event=data io=readwrite when=before 2. In one terminal run nbd-fault-injector in a loop, like this: n=1; while true; do echo $n; ((n++)); ./nbd-fault-injector.py 127.0.0.1:10000 nbd-fault-injector.conf; done 3. In another terminal run qemu-io in a loop, like this: n=1; while true; do echo $n; ((n++)); ./qemu-io -c 'read 0 512' nbd://127.0.0.1:10000; done After some time, qemu-io will hang. Note, that this hang may be triggered by another bug, so the whole case is fixed only together with commit "block/nbd: allow drain during reconnect attempt". Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200727184751.15704-4-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-07-28 09:54:43 -05:00
Vladimir Sementsov-Ogievskiy	dd1ec1a4af	block/nbd: allow drain during reconnect attempt It should be safe to reenter qio_channel_yield() on io/channel read/write path, so it's safe to reduce in_flight and allow attaching new aio context. And no problem to allow drain itself: connection attempt is not a guest request. Moreover, if remote server is down, we can hang in negotiation, blocking drain section and provoking a dead lock. How to reproduce the dead lock: 1. Create nbd-fault-injector.conf with the following contents: [inject-error "mega1"] event=data io=readwrite when=before 2. In one terminal run nbd-fault-injector in a loop, like this: n=1; while true; do echo $n; ((n++)); ./nbd-fault-injector.py 127.0.0.1:10000 nbd-fault-injector.conf; done 3. In another terminal run qemu-io in a loop, like this: n=1; while true; do echo $n; ((n++)); ./qemu-io -c 'read 0 512' nbd://127.0.0.1:10000; done After some time, qemu-io will hang trying to drain, for example, like this: #3 aio_poll (ctx=0x55f006bdd890, blocking=true) at util/aio-posix.c:600 #4 bdrv_do_drained_begin (bs=0x55f006bea710, recursive=false, parent=0x0, ignore_bds_parents=false, poll=true) at block/io.c:427 #5 bdrv_drained_begin (bs=0x55f006bea710) at block/io.c:433 #6 blk_drain (blk=0x55f006befc80) at block/block-backend.c:1710 #7 blk_unref (blk=0x55f006befc80) at block/block-backend.c:498 #8 bdrv_open_inherit (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000", reference=0x0, options=0x55f006be86d0, flags=24578, parent=0x0, child_class=0x0, child_role=0, errp=0x7fffba154620) at block.c:3491 #9 bdrv_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000", reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at block.c:3513 #10 blk_new_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000", reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at block/block-backend.c:421 And connection_co stack like this: #0 qemu_coroutine_switch (from_=0x55f006bf2650, to_=0x7fe96e07d918, action=COROUTINE_YIELD) at util/coroutine-ucontext.c:302 #1 qemu_coroutine_yield () at util/qemu-coroutine.c:193 #2 qio_channel_yield (ioc=0x55f006bb3c20, condition=G_IO_IN) at io/channel.c:472 #3 qio_channel_readv_all_eof (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0, niov=1, errp=0x7fe96d729eb0) at io/channel.c:110 #4 qio_channel_readv_all (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0, niov=1, errp=0x7fe96d729eb0) at io/channel.c:143 #5 qio_channel_read_all (ioc=0x55f006bb3c20, buf=0x7fe96d729d28 "\300.\366\004\360U", buflen=8, errp=0x7fe96d729eb0) at io/channel.c:247 #6 nbd_read (ioc=0x55f006bb3c20, buffer=0x7fe96d729d28, size=8, desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at /work/src/qemu/master/include/block/nbd.h:365 #7 nbd_read64 (ioc=0x55f006bb3c20, val=0x7fe96d729d28, desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at /work/src/qemu/master/include/block/nbd.h:391 #8 nbd_start_negotiate (aio_context=0x55f006bdd890, ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0, outioc=0x55f006bf19f8, structured_reply=true, zeroes=0x7fe96d729dca, errp=0x7fe96d729eb0) at nbd/client.c:904 #9 nbd_receive_negotiate (aio_context=0x55f006bdd890, ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0, outioc=0x55f006bf19f8, info=0x55f006bf1a00, errp=0x7fe96d729eb0) at nbd/client.c:1032 #10 nbd_client_connect (bs=0x55f006bea710, errp=0x7fe96d729eb0) at block/nbd.c:1460 #11 nbd_reconnect_attempt (s=0x55f006bf19f0) at block/nbd.c:287 #12 nbd_co_reconnect_loop (s=0x55f006bf19f0) at block/nbd.c:309 #13 nbd_connection_entry (opaque=0x55f006bf19f0) at block/nbd.c:360 #14 coroutine_trampoline (i0=113190480, i1=22000) at util/coroutine-ucontext.c:173 Note, that the hang may be triggered by another bug, so the whole case is fixed only together with commit "block/nbd: on shutdown terminate connection attempt". Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200727184751.15704-3-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-07-28 09:54:43 -05:00
Vladimir Sementsov-Ogievskiy	fa35591b9c	block/nbd: split nbd_establish_connection out of nbd_client_connect We are going to implement non-blocking version of nbd_establish_connection, which for a while will be used only for nbd_reconnect_attempt, not for nbd_open, so we need to call it separately. Refactor nbd_reconnect_attempt in a way which makes next commit simpler. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200727184751.15704-2-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-07-28 09:54:43 -05:00
Nir Soffer	a2b333c018	block: nbd: Fix convert qcow2 compressed to nbd When converting to qcow2 compressed format, the last step is a special zero length compressed write, ending in a call to bdrv_co_truncate(). This call always fails for the nbd driver since it does not implement bdrv_co_truncate(). For block devices, which have the same limits, the call succeeds since the file driver implements bdrv_co_truncate(). If the caller asked to truncate to the same or smaller size with exact=false, the truncate succeeds. Implement the same logic for nbd. Example failing without this change: In one shell start qemu-nbd: $ truncate -s 1g test.tar $ qemu-nbd --socket=/tmp/nbd.sock --persistent --format=raw --offset 1536 test.tar In another shell convert an image to qcow2 compressed via NBD: $ echo "disk data" > disk.raw $ truncate -s 1g disk.raw $ qemu-img convert -f raw -O qcow2 -c disk1.raw nbd+unix:///?socket=/tmp/nbd.sock; echo $? 1 qemu-img failed, but the conversion was successful: $ qemu-img info nbd+unix:///?socket=/tmp/nbd.sock image: nbd+unix://?socket=/tmp/nbd.sock file format: qcow2 virtual size: 1 GiB (1073741824 bytes) ... $ qemu-img check nbd+unix:///?socket=/tmp/nbd.sock No errors were found on the image. 1/16384 = 0.01% allocated, 100.00% fragmented, 100.00% compressed clusters Image end offset: 393216 $ qemu-img compare disk.raw nbd+unix:///?socket=/tmp/nbd.sock Images are identical. Fixes: https://bugzilla.redhat.com/1860627 Signed-off-by: Nir Soffer <nsoffer@redhat.com> Message-Id: <20200727215846.395443-2-nsoffer@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: typo fixes] Signed-off-by: Eric Blake <eblake@redhat.com>	2020-07-28 09:54:19 -05:00
Peter Maydell	2649915121	bitmaps patches for 2020-07-27 - Improve handling of various post-copy bitmap migration scenarios. A lost bitmap should merely mean that the next backup must be full rather than incremental, rather than abruptly breaking the entire guest migration. - Associated iotest improvements -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEccLMIrHEYCkn0vOqp6FrSiUnQ2oFAl8fPRkACgkQp6FrSiUn Q2qanQf/dRTrqZ7/hs8aENySf44o0dBzOLZr+FBcrqEj2sd0c6jPzV2X5CVtnA1v gBgKJJGLpti3mSeNQDbaXZIQrsesBAuxvJsc6vZ9npDCdMYnK/qPE3Zfw1bx12qR cb39ba28P4izgs216h92ZACtUewnvjkxyJgN7zfmCJdNcwZINMUItAS183tSbQjn n39Wb7a+umsRgV9HQv/6cXlQIPqFMyAOl5kkzV3evuw7EBoHFnNq4cjPrUnjkqiD xf2pcSomaedYd37SpvoH57JxfL3z/90OBcuXhFvbqFk4FgQ63rJ32nRve2ZbIDI0 XPbohnYjYoFv6Xs/jtTzctZCbZ+jTg== =1dmz -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/ericb/tags/pull-bitmaps-2020-07-27' into staging bitmaps patches for 2020-07-27 - Improve handling of various post-copy bitmap migration scenarios. A lost bitmap should merely mean that the next backup must be full rather than incremental, rather than abruptly breaking the entire guest migration. - Associated iotest improvements # gpg: Signature made Mon 27 Jul 2020 21:46:17 BST # gpg: using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full] # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full] # gpg: aka "[jpeg image of size 6874]" [full] # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-bitmaps-2020-07-27: (24 commits) migration: Fix typos in bitmap migration comments iotests: Adjust which migration tests are quick qemu-iotests/199: add source-killed case to bitmaps postcopy qemu-iotests/199: add early shutdown case to bitmaps postcopy qemu-iotests/199: check persistent bitmaps qemu-iotests/199: prepare for new test-cases addition migration/savevm: don't worry if bitmap migration postcopy failed migration/block-dirty-bitmap: cancel migration on shutdown migration/block-dirty-bitmap: relax error handling in incoming part migration/block-dirty-bitmap: keep bitmap state for all bitmaps migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete migration/block-dirty-bitmap: rename finish_lock to just lock migration/block-dirty-bitmap: refactor state global variables migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_init migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup migration/block-dirty-bitmap: rename state structure types migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start qemu-iotests/199: increase postcopy period qemu-iotests/199: change discard patterns qemu-iotests/199: improve performance: set bitmap by discard ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-07-28 14:38:17 +01:00
Max Reitz	134b7dec6e	block: Fix bdrv_aligned_p*v() for qiov_offset != 0 Since these functions take a @qiov_offset, they must always take it into account when working with @qiov. There are a couple of places where they do not, but they should. Fixes: `65cd4424b9` ("block/io: bdrv_aligned_preadv: use and support qiov_offset") Fixes: `28c4da2869` ("block/io: bdrv_aligned_pwritev: use and support qiov_offset") Reported-by: Claudio Fontana <cfontana@suse.de> Reported-by: Bruce Rogers <brogers@suse.com> Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200728120806.265916-2-mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Tested-by: Claudio Fontana <cfontana@suse.de> Tested-by: Bruce Rogers <brogers@suse.com>	2020-07-28 15:28:47 +02:00
Andrey Shinkevich	8098969cf2	qcow2: Fix capitalization of header extension constant. Make the capitalization of the hexadecimal numbers consistent for the QCOW2 header extension constants in docs/interop/qcow2.txt. Suggested-by: Eric Blake <eblake@redhat.com> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <1594973699-781898-2-git-send-email-andrey.shinkevich@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-07-27 15:39:58 -05:00
Max Reitz	984c367814	block/amend: Check whether the node exists We should check whether the user-specified node-name actually refers to a node. The simplest way to do that is to use bdrv_lookup_bs() instead of bdrv_find_node() (the former wraps the latter, and produces an error message if necessary). Reported-by: Coverity (CID 1430268) Fixes: `ced914d0ab` Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200710095037.10885-1-mreitz@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>	2020-07-27 12:37:25 +02:00
Peter Maydell	0c1fd2f41f	Block layer patches: - file-posix: Handle `EINVAL` fallocate return value - qemu-img convert -n: Keep qcow2 v2 target sparse -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAl8XDZgRHGt3b2xmQHJl ZGhhdC5jb20ACgkQfwmycsiPL9bpFQ/9EnS4iV9w0KW/NuJ4FIVdBD/VZFzokDLi 1vXVVEjoxAxxiP8KlGM9HRi5NtvOMgzKhNGias0wOFiBorx8Ppfc+3sqwygc2dnw Vbl/od2D7xQZkddnp4Upo70m+eWRW6xaxX+lAcl6iS3gBPDwExLaYfBN8lFUyRrs T4C0miD+abEEyL3C5A4cEZJ7CIs0n7AqZkqgytWA7clwy79VgDSuMOgP6DOP1tGH 1uK4gMCB0xbn+PHk96lXPORcLwDBOP0PIluo/zBmffzsEZN1Lv5ddVmxMQWSivin UmAbpeEtSw9Py5lRVmLSBYvolVOUleE/Rlzad2iue2be5/G8VP8xiRYMp9mUVpLO +LPMUd9NRkPx7wjUJMPKF0G9FgVO7R0+9J6rC33aKBj2XAlxY6qQlqUN2Jo11/fK 2+9AkU7WVqx3vuW2Zz7wjq3Rjvpg/sK+V3P3Cm6HTwwwPbEwv8GcFe6eKdvJrZ9K hhwiFSUOd90OUAdKOQXKMFSZ/t1TrZhdX882Hvth11/AlQAUY4cxQbSKcc2nrvLu Axk0Va3haOD+ReRTs8W/iYNdrXGZmbr3MCkNiK3QSvnrdj602ompco7xyDTX1/qH 6Hu28q7jUG3p3cApLQIZVjmogfqcGU7SWIY4lp9HZqtGP0z+pmWg46UNqzlKLyv5 Y/fVHHshlRU= =QhOf -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches: - file-posix: Handle `EINVAL` fallocate return value - qemu-img convert -n: Keep qcow2 v2 target sparse # gpg: Signature made Tue 21 Jul 2020 16:45:28 BST # gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6 # gpg: issuer "kwolf@redhat.com" # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: iotests: Test sparseness for qemu-img convert -n qcow2: Implement v2 zero writes with discard if possible file-posix: Handle `EINVAL` fallocate return value Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-07-21 19:25:48 +01:00
Peter Maydell	b50dab9eca	QOM patches for 2020-07-21 -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAl8XDGsSHGFybWJydUBy ZWRoYXQuY29tAAoJEDhwtADrkYZTPTYP/11AHdREDvz+KDl7AXtndxE9wo6GPGUy GhMSXODgOm6a4/nk4xQQR7MU+57C2IMnYbAAdrNXA3YVHjUPMURVsAeTsl9ruchC 1JiNrtXKgaWv5P8WeNzuXcH9B7n3UZ/mV4FH8v0FhjZhsH6EHP0zOhgNScPyQurz AsmYN5hSoSz8jRFQ8xlNzBhNrYX5dG6fOs4M6yBRUgzaCuRcSn/+cgjngfAmcH4k F0KK+rBQA+KUAKYp0QUaUcbD5oj/6KaQMsfqsFROR8m+AAJrhkP2ffmcDCHfCW8A SQsp8k2SkdHXNgRoDJSea1TYMitCFrTBDv+MfBWOLH4ewQrGAzvsuG5o3FlnVeN7 CKHkkxqOEifJ/vnThVBTIsqxf8/HefDXi1B5BXSKYSvnPKnnh8HCZ7VxvAsNGZiR epr4gEGBCcb9/bXFUmVVPz6H+lUWORzhF4P0spNJwH5BT9FLybWgJt9o4KH+pxYA DL4GF5vOkBgIhgUR+vn535vik7M38u6gsB8m22s2FRkZmIxTpxp9eH2ehHGSoVYM Yl1kVzJmFMPakl0gG1dMmM4+DJGRTHLCfBFS4pzMs9DaCNHUF3CB8FjXQs3NIFCS XGnChbri/wF83DEueTIrUAqF2w0XgEy55aVBOZkmFT6DXPXbx+Y3Q/AomaktBLyv FFUe9SfMn63P =HTNj -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/armbru/tags/pull-qom-2020-07-21' into staging QOM patches for 2020-07-21 # gpg: Signature made Tue 21 Jul 2020 16:40:27 BST # gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653 # gpg: issuer "armbru@redhat.com" # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full] # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full] # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-qom-2020-07-21: qom: Make info qom-tree sort children more efficiently qom: Document object_get_canonical_path() returns malloced string qom: Change object_get_canonical_path_component() not to malloc Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-07-21 18:31:52 +01:00
Kevin Wolf	61b3043965	qcow2: Implement v2 zero writes with discard if possible qcow2 version 2 images don't support the zero flag for clusters, so for write_zeroes requests, we return -ENOTSUP and get explicit zero buffer writes. If the image doesn't have a backing file, we can do better: Just discard the respective clusters. This is relevant for 'qemu-img convert -O qcow2 -n', where qemu-img has to assume that the existing target image may contain any data, so it has to write zeroes. Without this patch, this results in a fully allocated target image, even if the source image was empty. Reported-by: Nir Soffer <nsoffer@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20200721135520.72355-2-kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-21 16:28:57 +02:00
Antoine Damhet	bae127d4dc	file-posix: Handle `EINVAL` fallocate return value The `detect-zeroes=unmap` option may issue unaligned `FALLOC_FL_PUNCH_HOLE` requests, raw block devices can (and will) return `EINVAL`, qemu should then write the zeroes to the blockdev instead of issuing an `IO_ERROR`. The problem can be reprodced like this: $ qemu-io -c 'write -P 0 42 1234' --image-opts driver=host_device,filename=/dev/loop0,detect-zeroes=unmap write failed: Invalid argument Signed-off-by: Antoine Damhet <antoine.damhet@blade-group.com> Message-Id: <20200717135603.51180-1-antoine.damhet@blade-group.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-21 16:28:57 +02:00
Markus Armbruster	7a309cc95b	qom: Change object_get_canonical_path_component() not to malloc object_get_canonical_path_component() returns a malloced copy of a property name on success, null on failure. 19 of its 25 callers immediately free the returned copy. Change object_get_canonical_path_component() to return the property name directly. Since modifying the name would be wrong, adjust the return type to const char *. Drop the free from the 19 callers become simpler, add the g_strdup() to the other six. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200714160202.3121879-4-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Li Qiang <liq3ea@gmail.com>	2020-07-21 16:23:43 +02:00
Stefan Hajnoczi	1d719ddc35	block: fix bdrv_aio_cancel() for ENOMEDIUM requests bdrv_aio_cancel() calls aio_poll() on the AioContext for the given I/O request until it has completed. ENOMEDIUM requests are special because there is no BlockDriverState when the drive has no medium! Define a .get_aio_context() function for BlkAioEmAIOCB requests so that bdrv_aio_cancel() can find the AioContext where the completion BH is pending. Without this function bdrv_aio_cancel() aborts on ENOMEDIUM requests! libFuzzer triggered the following assertion: cat << EOF \| qemu-system-i386 -M pc-q35-5.0 \ -nographic -monitor none -serial none \ -qtest stdio -trace ide\* outl 0xcf8 0x8000fa24 outl 0xcfc 0xe106c000 outl 0xcf8 0x8000fa04 outw 0xcfc 0x7 outl 0xcf8 0x8000fb20 write 0x0 0x3 0x2780e7 write 0xe106c22c 0xd 0x1130c218021130c218021130c2 write 0xe106c218 0x15 0x110010110010110010110010110010110010110010 EOF ide_exec_cmd IDE exec cmd: bus 0x56170a77a2b8; state 0x56170a77a340; cmd 0xe7 ide_reset IDEstate 0x56170a77a340 Aborted (core dumped) (gdb) bt #1 0x00007ffff4f93895 in abort () at /lib64/libc.so.6 #2 0x0000555555dc6c00 in bdrv_aio_cancel (acb=0x555556765550) at block/io.c:2745 #3 0x0000555555dac202 in blk_aio_cancel (acb=0x555556765550) at block/block-backend.c:1546 #4 0x0000555555b1bd74 in ide_reset (s=0x555557213340) at hw/ide/core.c:1318 #5 0x0000555555b1e3a1 in ide_bus_reset (bus=0x5555572132b8) at hw/ide/core.c:2422 #6 0x0000555555b2aa27 in ahci_reset_port (s=0x55555720eb50, port=2) at hw/ide/ahci.c:650 #7 0x0000555555b29fd7 in ahci_port_write (s=0x55555720eb50, port=2, offset=44, val=16) at hw/ide/ahci.c:360 #8 0x0000555555b2a564 in ahci_mem_write (opaque=0x55555720eb50, addr=556, val=16, size=1) at hw/ide/ahci.c:513 #9 0x000055555598415b in memory_region_write_accessor (mr=0x55555720eb80, addr=556, value=0x7fffffffb838, size=1, shift=0, mask=255, attrs=...) at softmmu/memory.c:483 Looking at bdrv_aio_cancel: 2728 /* async I/Os / 2729 2730 void bdrv_aio_cancel(BlockAIOCB acb) 2731 { 2732 qemu_aio_ref(acb); 2733 bdrv_aio_cancel_async(acb); 2734 while (acb->refcnt > 1) { 2735 if (acb->aiocb_info->get_aio_context) { 2736 aio_poll(acb->aiocb_info->get_aio_context(acb), true); 2737 } else if (acb->bs) { 2738 /* qemu_aio_ref and qemu_aio_unref are not thread-safe, so 2739 * assert that we're not using an I/O thread. Thread-safe 2740 * code should use bdrv_aio_cancel_async exclusively. 2741 */ 2742 assert(bdrv_get_aio_context(acb->bs) == qemu_get_aio_context()); 2743 aio_poll(bdrv_get_aio_context(acb->bs), true); 2744 } else { 2745 abort(); <=============== 2746 } 2747 } 2748 qemu_aio_unref(acb); 2749 } Fixes: `02c50efe08` ("block: Add bdrv_aio_cancel_async") Reported-by: Alexander Bulekov <alxndr@bu.edu> Buglink: https://bugs.launchpad.net/qemu/+bug/1878255 Originally-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20200720100141.129739-1-stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-07-21 12:00:38 +02:00
Maxim Levitsky	662d0c5392	block/crypto: disallow write sharing by default My commit 'block/crypto: implement the encryption key management' accidently allowed raw luks images to be shared between different qemu processes without share-rw=on explicit override. Fix that. Fixes: `bbfdae91fb` ("block/crypto: implement the encryption key management") Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1857490 Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20200719122059.59843-2-mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-07-21 10:49:02 +02:00
Kevin Wolf	a8c5cf27c9	file-posix: Fix leaked fd in raw_open_common() error path Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20200717105426.51134-4-kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-17 14:20:57 +02:00
Kevin Wolf	bca5283bd4	file-posix: Fix check_hdev_writable() with auto-read-only For Linux block devices, being able to open the device read-write doesn't necessarily mean that the device is actually writable (one example is a read-only LV, as you get with lvchange -pr <device>). We have check_hdev_writable() to check this condition and fail opening the image read-write if it's not actually writable. However, this check doesn't take auto-read-only into account, but results in a hard failure instead of downgrading to read-only where possible. Fix this and do the writable check not based on BDRV_O_RDWR, but only when this actually results in opening the file read-write. A second check is inserted in raw_reconfigure_getfd() to have the same check when dynamic auto-read-only upgrades an image file from read-only to read-write. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20200717105426.51134-3-kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-17 14:20:57 +02:00
Kevin Wolf	20eaf1bf6e	file-posix: Move check_hdev_writable() up We'll need to call it in raw_open_common(), so move the function to avoid a forward declaration. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20200717105426.51134-2-kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-17 14:20:57 +02:00
Kevin Wolf	5edc85571e	file-posix: Allow byte-aligned O_DIRECT with NFS Since commit `a6b257a08e` ('file-posix: Handle undetectable alignment'), we assume that if we open a file with O_DIRECT and alignment probing returns 1, we just couldn't find out the real alignment requirement because some filesystems make the requirement only for allocated blocks. In this case, a safe default of 4k is used. This is too strict for NFS, which does actually allow byte-aligned requests even with O_DIRECT. Because we can't distinguish both cases with generic code, let's just look at the file system magic and disable s->needs_alignment for NFS. This way, O_DIRECT can still be used on NFS for images that are not aligned to 4k. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200716142601.111237-3-kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-17 14:20:57 +02:00
Marc-André Lureau	a08464521c	Remove VXHS block device The vxhs code doesn't compile since v2.12.0. There's no point in fixing and then adding CI for a config that our users have demonstrated that they do not use; better to just remove it. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200711065926.2204721-1-marcandre.lureau@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-17 14:20:57 +02:00
Peter Maydell	d2628b1eb7	Block layer patches: - file-posix: Mitigate file fragmentation with extent size hints - Tighten qemu-img rules on missing backing format - qemu-img map: Don't limit block status request size - Fix crash with virtio-scsi and iothreads -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAl8NsgMRHGt3b2xmQHJl ZGhhdC5jb20ACgkQfwmycsiPL9Z0tA//eqauxD7cTEpwrtLNrRtpiBtMG64BBpxz QfkURzB38bMVahHlwq3Gt7Zcov8V4V7vxK66h688Z/fhw3vmqIeVe8+P6+Y5s9FL jil8lewHuLTa6xELeugoV7SZXH8AAh1W2fQmiR7EPiOmpSE0wf7C5IShVlX8A04E r0n09+61qGjRIe1hNTwTtldqQEfx6UGnxQWcQb81JUPA1lZhX3cnPg/j94Bofr+m v/DbVTfsmUtTMjc0PdU7n4DKTWu8OS5B/X0unF21rTtO//cYBrhAeY3ax2jbFBWi CIZK8HLI5m9/HFyltql1LOsd+B5TtfnXMfSdvDh2jaVUlto7wTeTnWU1fv4wxUB5 hk7XgJo/y203ebFNHpTmW8tvLfGTP8uqCVfOEFxzjy+JHGrarlbWkwL2LMOFFAZ2 s2WcwlfqiYGFTG4+OFdhPf9qPWKSqMr+jTdZJTse64/c6+YXWHk+pP9lfYEUOgSi OYwdQUY9uiZ1K13q5Tif2TbFvs+c118xdTgVhAV7VtfPnWc3c647dX7iaq8Szknc IT93670Iqf/PzEj+L7XUbbLIIsAcmxD0sr7QAQEt7bfiYIDRIQLiVPyzXplETFg2 SEkvtqBovm84ct7pWQzqA6lFvr3oIFDNquR40XFGozHNnlBeNi5s7pXQnqUBLElr wDDuEi+z5QM= =DB0q -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches: - file-posix: Mitigate file fragmentation with extent size hints - Tighten qemu-img rules on missing backing format - qemu-img map: Don't limit block status request size - Fix crash with virtio-scsi and iothreads # gpg: Signature made Tue 14 Jul 2020 14:24:19 BST # gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6 # gpg: issuer "kwolf@redhat.com" # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: block: Avoid stale pointer dereference in blk_get_aio_context() qemu-img: Deprecate use of -b without -F block: Add support to warn on backing file change without format iotests: Specify explicit backing format where sensible qcow2: Deprecate use of qemu-img amend to change backing file block: Error if backing file fails during creation without -u qcow: Tolerate backing_fmt= vmdk: Add trivial backing_fmt support sheepdog: Add trivial backing_fmt support block: Finish deprecation of 'qemu-img convert -n -o' qemu-img: Flush stdout before before potential stderr messages file-posix: Mitigate file fragmentation with extent size hints iotests/059: Filter out disk size with more standard filter qemu-img map: Don't limit block status request size iotests: Simplify _filter_img_create() a bit Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-07-14 19:39:52 +01:00
Greg Kurz	e6cada9231	block: Avoid stale pointer dereference in blk_get_aio_context() It is possible for blk_remove_bs() to race with blk_drain_all(), causing the latter to dereference a stale blk->root pointer: blk_remove_bs(blk) bdrv_root_unref_child(blk->root) child_bs = blk->root->bs bdrv_detach_child(blk->root) ... g_free(blk->root) <============== blk->root becomes stale bdrv_unref(child_bs) <============ yield at some point A blk_drain_all() can be triggered by some guest action in the meantime, eg. on POWER, SLOF might disable bus mastering on a virtio-scsi-pci device: virtio_write_config() virtio_pci_stop_ioeventfd() virtio_bus_stop_ioeventfd() virtio_scsi_dataplane_stop() blk_drain_all() blk_get_aio_context() bs = blk->root ? blk->root->bs : NULL ^^^^^^^^^ stale Then, depending on one's luck, QEMU either crashes with SEGV or hits the assertion in blk_get_aio_context(). blk->root is set by blk_insert_bs() which calls bdrv_root_attach_child() first. The blk_remove_bs() function should rollback the changes made by blk_insert_bs() in the opposite order (or it should be documented somewhere why this isn't the case). Clear blk->root before calling bdrv_root_unref_child() in blk_remove_bs(). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <159430264541.389456.11925072456012783045.stgit@bahia.lan> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-14 15:24:15 +02:00
Eric Blake	e54ee1b385	block: Add support to warn on backing file change without format For now, this is a mechanical addition; all callers pass false. But the next patch will use it to improve 'qemu-img rebase -u' when selecting a backing file with no format. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Message-Id: <20200706203954.341758-10-eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-14 15:18:59 +02:00
Eric Blake	bc5ee6da71	qcow2: Deprecate use of qemu-img amend to change backing file The use of 'qemu-img amend' to change qcow2 backing files is not tested very well. In particular, our implementation has a bug where if a new backing file is provided without a format, then the prior format is blindly reused, even if this results in data corruption, but this is not caught by iotests. There are also situations where amending other options needs access to the original backing file (for example, on a downgrade to a v2 image, knowing whether a v3 zero cluster must be allocated or may be left unallocated depends on knowing whether the backing file already reads as zero), but the command line does not have a nice way to tell us both the backing file to use for opening the image as well as the backing file to install after the operation is complete. Even if we do allow changing the backing file, it is redundant with the existing ability to change backing files via 'qemu-img rebase -u'. It is time to deprecate this support (leaving the existing behavior intact, even if it is buggy), and at a point in the future, require the use of only 'qemu-img rebase' for adjusting backing chain relations, saving 'qemu-img amend' for changes unrelated to the backing chain. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200706203954.341758-8-eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-14 15:18:59 +02:00
Eric Blake	344acbd62f	qcow: Tolerate backing_fmt= qcow has no space in the metadata to store a backing format, and there are existing qcow images backed both by raw or by other formats (usually qcow) images, reliant on probing to tell the difference. On the bright side, because we probe every time, raw files are marked as probed and we thus forbid a commit action into the backing file where guest-controlled contents could change the result of the probe next time around (the iotest added here proves that). Still, allowing the user to specify the backing format during creation, even if we can't record it, is a good thing. This patch blindly allows any value that resolves to a known driver, even if the user's request is a mismatch from what probing finds; then the next patch will further enhance things to verify that the user's request matches what we actually probe. With this and the next patch in place, we will finally be ready to deprecate the creation of images where a backing format was not explicitly specified by the user. Note that this is only for QemuOpts usage; there is no change to the QAPI to allow a format through -blockdev. Add a new iotest 301 just for qcow, to demonstrate the latest behavior, and to make it easier to show the improvements made in the next patch. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200706203954.341758-6-eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-14 15:18:59 +02:00
Eric Blake	d51a814cf4	vmdk: Add trivial backing_fmt support vmdk already requires that if backing_file is present, that it be another vmdk image (see vmdk_co_do_create). Meanwhile, we want to move towards always being explicit about the backing format for other drivers where it matters. So for convenience, make qemu-img create -F vmdk work, while rejecting all other explicit formats (note that this is only for QemuOpts usage; there is no change to the QAPI to allow a format through -blockdev). Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200706203954.341758-5-eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-14 15:18:59 +02:00
Eric Blake	80fa43e7df	sheepdog: Add trivial backing_fmt support Sheepdog already requires that if backing_file is present, that it be another sheepdog image (see sd_co_create). Meanwhile, we want to move towards always being explicit about the backing format for other drivers where it matters. So for convenience, make qemu-img create -F sheepdog work, while rejecting all other explicit formats (note that this is only for QemuOpts usage; there is no change to the QAPI to allow a format through -blockdev). Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200706203954.341758-4-eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-14 15:18:59 +02:00
Kevin Wolf	ffa244c84a	file-posix: Mitigate file fragmentation with extent size hints Especially when O_DIRECT is used with image files so that the page cache indirection can't cause a merge of allocating requests, the file will fragment on the file system layer, with a potentially very small fragment size (this depends on the requests the guest sent). On Linux, fragmentation can be reduced by setting an extent size hint when creating the file (at least on XFS, it can't be set any more after the first extent has been allocated), basically giving raw files a "cluster size" for allocation. This adds a create option to set the extent size hint, and changes the default from not setting a hint to setting it to 1 MB. The main reason why qcow2 defaults to smaller cluster sizes is that COW becomes more expensive, which is not an issue with raw files, so we can choose a larger size. The tradeoff here is only potentially wasted disk space. For qcow2 (or other image formats) over file-posix, the advantage should even be greater because they grow sequentially without leaving holes, so there won't be wasted space. Setting even larger extent size hints for such images may make sense. This can be done with the new option, but let's keep the default conservative for now. The effect is very visible with a test that intentionally creates a badly fragmented file with qemu-img bench (the time difference while creating the file is already remarkable) and then looks at the number of extents and the time a simple "qemu-img map" takes. Without an extent size hint: $ ./qemu-img create -f raw -o extent_size_hint=0 ~/tmp/test.raw 10G Formatting '/home/kwolf/tmp/test.raw', fmt=raw size=10737418240 extent_size_hint=0 $ ./qemu-img bench -f raw -t none -n -w ~/tmp/test.raw -c 1000000 -S 8192 -o 0 Sending 1000000 write requests, 4096 bytes each, 64 in parallel (starting at offset 0, step size 8192) Run completed in 25.848 seconds. $ ./qemu-img bench -f raw -t none -n -w ~/tmp/test.raw -c 1000000 -S 8192 -o 4096 Sending 1000000 write requests, 4096 bytes each, 64 in parallel (starting at offset 4096, step size 8192) Run completed in 19.616 seconds. $ filefrag ~/tmp/test.raw /home/kwolf/tmp/test.raw: 2000000 extents found $ time ./qemu-img map ~/tmp/test.raw Offset Length Mapped to File 0 0x1e8480000 0 /home/kwolf/tmp/test.raw real 0m1,279s user 0m0,043s sys 0m1,226s With the new default extent size hint of 1 MB: $ ./qemu-img create -f raw -o extent_size_hint=1M ~/tmp/test.raw 10G Formatting '/home/kwolf/tmp/test.raw', fmt=raw size=10737418240 extent_size_hint=1048576 $ ./qemu-img bench -f raw -t none -n -w ~/tmp/test.raw -c 1000000 -S 8192 -o 0 Sending 1000000 write requests, 4096 bytes each, 64 in parallel (starting at offset 0, step size 8192) Run completed in 11.833 seconds. $ ./qemu-img bench -f raw -t none -n -w ~/tmp/test.raw -c 1000000 -S 8192 -o 4096 Sending 1000000 write requests, 4096 bytes each, 64 in parallel (starting at offset 4096, step size 8192) Run completed in 10.155 seconds. $ filefrag ~/tmp/test.raw /home/kwolf/tmp/test.raw: 178 extents found $ time ./qemu-img map ~/tmp/test.raw Offset Length Mapped to File 0 0x1e8480000 0 /home/kwolf/tmp/test.raw real 0m0,061s user 0m0,040s sys 0m0,014s Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20200707142329.48303-1-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-14 15:18:59 +02:00
Eric Blake	00d69986da	nbd: Avoid off-by-one in long export name truncation When snprintf returns the same value as the buffer size, the final byte was truncated to ensure a NUL terminator. Fortunately, such long export names are unusual enough, with no real impact other than what is displayed to the user. Fixes: `5c86bdf120` Reported-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200622210355.414941-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2020-07-13 09:01:01 -05:00
Xie Yongji	c58daf76a6	iscsi: return -EIO when sense fields are meaningless When an I/O request failed, now we only return correct value on scsi check condition. We should also have a default errno such as -EIO in other case. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Message-Id: <20200701105444.3226-2-xieyongji@bytedance.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-07-10 18:02:23 -04:00
Xie Yongji	dd3b00202a	iscsi: handle check condition status in retry loop The handling of check condition was incorrect because we would only do it after retries exceed maximum. Fixes: `8c460269aa` ("iscsi: base all handling of check condition on scsi_sense_to_errno") Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Message-Id: <20200701105444.3226-1-xieyongji@bytedance.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-07-10 18:02:23 -04:00
Vladimir Sementsov-Ogievskiy	795d946d07	nbd: Use ERRP_GUARD() If we want to check error after errp-function call, we need to introduce local_err and then propagate it to errp. Instead, use the ERRP_GUARD() macro, benefits are: 1. No need of explicit error_propagate call 2. No need of explicit local_err variable: use errp directly 3. ERRP_GUARD() leaves errp as is if it's not NULL or &error_fatal, this means that we don't break error_abort (we'll abort on error_set, not on error_propagate) If we want to add some info to errp (by error_prepend() or error_append_hint()), we must use the ERRP_GUARD() macro. Otherwise, this info will not be added when errp == &error_fatal (the program will exit prior to the error_append_hint() or error_prepend() call). Fix several such cases, e.g. in nbd_read(). This commit is generated by command sed -n '/^Network Block Device (NBD)$/,/^$/{s/^F: //p}' \ MAINTAINERS \| \ xargs git ls-files \| grep '\.[hc]$' \| \ xargs spatch \ --sp-file scripts/coccinelle/errp-guard.cocci \ --macro-file scripts/cocci-macro-file.h \ --in-place --no-show-diff --max-width 80 Reported-by: Kevin Wolf <kwolf@redhat.com> Reported-by: Greg Kurz <groug@kaod.org> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Commit message tweaked] Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200707165037.1026246-8-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [ERRP_AUTO_PROPAGATE() renamed to ERRP_GUARD(), and auto-propagated-errp.cocci to errp-guard.cocci. Commit message tweaked again.]	2020-07-10 15:18:09 +02:00
Markus Armbruster	386f6c07d2	error: Avoid error_propagate() after migrate_add_blocker() When migrate_add_blocker(blocker, &errp) is followed by error_propagate(errp, err), we can often just as well do migrate_add_blocker(..., errp). Do that with this Coccinelle script: @@ expression blocker, err, errp; expression ret; @@ - ret = migrate_add_blocker(blocker, &err); - if (err) { + ret = migrate_add_blocker(blocker, errp); + if (ret < 0) { ... when != err; - error_propagate(errp, err); ... } @@ expression blocker, err, errp; @@ - migrate_add_blocker(blocker, &err); - if (err) { + if (migrate_add_blocker(blocker, errp) < 0) { ... when != err; - error_propagate(errp, err); ... } Double-check @err is not used afterwards. Dereferencing it would be use after free, but checking whether it's null would be legitimate. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-43-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	b11a093c60	qapi: Smooth another visitor error checking pattern Convert visit_type_FOO(v, ..., &ptr, &err); ... if (err) { ... } to visit_type_FOO(v, ..., &ptr, errp); ... if (!ptr) { ... } for functions that set @ptr to non-null / null on success / error. Eliminate error_propagate() that are now unnecessary. Delete @err that are now unused. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-40-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	4bc6d7ee0e	block/parallels: Simplify parallels_open() after previous commit Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-39-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	a5f9b9df25	error: Reduce unnecessary error propagation When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away, even when we need to keep error_propagate() for other error paths. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-38-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	992861fb1e	error: Eliminate error_propagate() manually When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. The previous two commits did that for sufficiently simple cases with Coccinelle. Do it for several more manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-37-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	af175e85f9	error: Eliminate error_propagate() with Coccinelle, part 2 When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. The previous commit did that with a Coccinelle script I consider fairly trustworthy. This commit uses the same script with the matching of return taken out, i.e. we convert if (!foo(..., &err)) { ... error_propagate(errp, err); ... } to if (!foo(..., errp)) { ... ... } This is unsound: @err could still be read between afterwards. I don't know how to express "no read of @err without an intervening write" in Coccinelle. Instead, I manually double-checked for uses of @err. Suboptimal line breaks tweaked manually. qdev_realize() simplified further to placate scripts/checkpatch.pl. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-36-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	668f62ec62	error: Eliminate error_propagate() with Coccinelle, part 1 When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. Convert if (!foo(..., &err)) { ... error_propagate(errp, err); ... return ... } to if (!foo(..., errp)) { ... ... return ... } where nothing else needs @err. Coccinelle script: @rule1 forall@ identifier fun, err, errp, lbl; expression list args, args2; binary operator op; constant c1, c2; symbol false; @@ if ( ( - fun(args, &err, args2) + fun(args, errp, args2) \| - !fun(args, &err, args2) + !fun(args, errp, args2) \| - fun(args, &err, args2) op c1 + fun(args, errp, args2) op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; \| return c2; \| return false; ) } @rule2 forall@ identifier fun, err, errp, lbl; expression list args, args2; expression var; binary operator op; constant c1, c2; symbol false; @@ - var = fun(args, &err, args2); + var = fun(args, errp, args2); ... when != err if ( ( var \| !var \| var op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; \| return c2; \| return false; \| return var; ) } @depends on rule1 \|\| rule2@ identifier err; @@ - Error *err = NULL; ... when != err Not exactly elegant, I'm afraid. The "when != lbl:" is necessary to avoid transforming if (fun(args, &err)) { goto out } ... out: error_propagate(errp, err); even though other paths to label out still need the error_propagate(). For an actual example, see sclp_realize(). Without the "when strict", Coccinelle transforms vfio_msix_setup(), incorrectly. I don't know what exactly "when strict" does, only that it helps here. The match of return is narrower than what I want, but I can't figure out how to express "return where the operand doesn't use @err". For an example where it's too narrow, see vfio_intx_enable(). Silently fails to convert hw/arm/armsse.c, because Coccinelle gets confused by ARMSSE being used both as typedef and function-like macro there. Converted manually. Line breaks tidied up manually. One nested declaration of @local_err deleted manually. Preexisting unwanted blank line dropped in hw/riscv/sifive_e.c. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-35-armbru@redhat.com>	2020-07-10 15:18:08 +02:00

1 2 3 4 5 ...

4836 Commits