mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Philippe Mathieu-Daudé	74e4a8a961	block/io_uring: Remove superfluous semicolon Fixes: `6663a0a337` Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200218094402.26625-5-philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 10:54:02 +01:00
Kevin Wolf	9ad1e79f3f	commit: Fix is_read for block_job_error_action() block_job_error_action() needs to know if reading from the top node or writing to the base node failed so that it can set the right 'operation' in the BLOCK_JOB_ERROR QMP event. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20200214200812.28180-6-kwolf@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 10:53:56 +01:00
Kevin Wolf	0c42e175fc	commit: Inline commit_populate() commit_populate() is a very short function and only called in a single place. Its return value doesn't tell us whether an error happened while reading or writing, which would be necessary for sending the right data in the BLOCK_JOB_ERROR QMP event. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20200214200812.28180-5-kwolf@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 10:53:56 +01:00
Kevin Wolf	c5507b4d55	commit: Fix argument order for block_job_error_action() The block_job_error_action() error call in the commit job gives the on_err and is_read arguments in the wrong order. Fix this. (Of course, hard-coded is_read = false is wrong, too, but that's a separate problem for a separate patch.) Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20200214200812.28180-4-kwolf@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 10:53:56 +01:00
Kevin Wolf	d71e65ec1d	commit: Remove unused bytes_written The bytes_written variable is only ever written to, it serves no purpose. This has actually been the case since the commit job was first introduced in commit `747ff60263`. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20200214200812.28180-3-kwolf@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 10:53:56 +01:00
Philippe Mathieu-Daudé	5b1405db0f	block/qcow2-bitmap: Remove unneeded variable assignment Fix warning reported by Clang static code analyzer: CC block/qcow2-bitmap.o block/qcow2-bitmap.c:650:5: warning: Value stored to 'ret' is never read ret = -EINVAL; ^ ~~~~~~~ Fixes: `88ddffae8` Reported-by: Clang Static Analyzer Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200215161557.4077-2-philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 10:53:56 +01:00
Kevin Wolf	c3b6658c1a	qcow2: Fix qcow2_alloc_cluster_abort() for external data file For external data file, cluster allocations return an offset in the data file and are not refcounted. In this case, there is nothing to do for qcow2_alloc_cluster_abort(). Freeing the same offset in the qcow2 file is wrong and causes crashes in the better case or image corruption in the worse case. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20200211094900.17315-3-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 10:53:56 +01:00
Kevin Wolf	dea9052ef1	qcow2: update_refcount(): Reset old_table_index after qcow2_cache_put() In the case that update_refcount() frees a refcount block, it evicts it from the metadata cache. Before doing so, however, it returns the currently used refcount block to the cache because it might be the same. Returning the refcount block early means that we need to reset old_table_index so that we reload the refcount block in the next iteration if it is actually still in use. Fixes: `f71c08ea8e` Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20200211094900.17315-2-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 10:53:56 +01:00
Hikaru Nishida	8475ea4854	block/vvfat: Do not unref qcow on closing backing bdrv Before this commit, BDRVVVFATState.qcow is unrefed in write_target_close on closing backing bdrv of vvfat. However, qcow bdrv is opend as a child of vvfat in enable_write_target() so it will be also unrefed on closing vvfat itself. This causes use-after-free of qcow on freeing vvfat which has backing bdrv and qcow bdrv as children in this order because bdrv_close(vvfat) tries to free qcow bdrv after freeing backing bdrv as QLIST_FOREACH_SAFE() loop keeps next pointer, but BdrvChild of qcow is already freed in bdrv_close(backing bdrv). Signed-off-by: Hikaru Nishida <hikarupsp@gmail.com> Message-Id: <20200209175156.85748-1-hikarupsp@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 10:53:56 +01:00
Alberto Garcia	2d4b5256cf	qcow2: Fix alignment checks in encrypted images I/O requests to encrypted media should be aligned to the sector size used by the underlying encryption method, not to BDRV_SECTOR_SIZE. Fortunately this doesn't break anything at the moment because both existing QCRYPTO_BLOCK_*_SECTOR_SIZE have the same value as BDRV_SECTOR_SIZE. The checks in qcow2_co_preadv_encrypted() are also unnecessary because they are repeated immediately afterwards in qcow2_co_encdec(). Signed-off-by: Alberto Garcia <berto@igalia.com> Message-Id: <20200213171646.15876-1-berto@igalia.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-02-18 10:53:56 +01:00
Kevin Wolf	7e6c4ff792	mirror: Don't let an operation wait for itself mirror_wait_for_free_in_flight_slot() just picks a random operation to wait for. However, when mirror_co_read() waits for free slots, its MirrorOp is already in s->ops_in_flight, so if not enough slots are immediately available, an operation can end up waiting for itself to complete, which results in a hang. Fix this by passing the current MirrorOp and skipping this operation when picking an operation to wait for. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1794692 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2020-02-18 10:53:56 +01:00
Kevin Wolf	eed325b92c	mirror: Store MirrorOp.co for debuggability If a coroutine is launched, but the coroutine pointer isn't stored anywhere, debugging any problems inside the coroutine is quite hard. Let's store the coroutine pointer of a mirror operation in MirrorOp to have it available in the debugger. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2020-02-18 10:53:56 +01:00
Vladimir Sementsov-Ogievskiy	ac9d00bf7b	block: fix crash on zero-length unaligned write and read Commit `7a3f542fbd` "block/io: refactor padding" occasionally dropped aligning for zero-length request: bdrv_init_padding() blindly return false if bytes == 0, like there is nothing to align. This leads the following command to crash: ./qemu-io --image-opts -c 'write 1 0' \ driver=blkdebug,align=512,image.driver=null-co,image.size=512 >> qemu-io: block/io.c:1955: bdrv_aligned_pwritev: Assertion `(offset & (align - 1)) == 0' failed. >> Aborted (core dumped) Prior to `7a3f542fbd` we does aligning of such zero requests. Instead of recovering this behavior let's just do nothing on such requests as it is useless. Note that driver may have special meaning of zero-length reqeusts, like qcow2_co_pwritev_compressed_part, so we can't skip any zero-length operation. But for unaligned ones, we can't pass it to driver anyway. This commit also fixes crash in iotest 80 running with -nocache: ./check -nocache -qcow2 80 which crashes on same assertion due to trying to read empty extra data in qcow2_do_read_snapshots(). Cc: qemu-stable@nongnu.org # v4.2 Fixes: `7a3f542fbd` Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20200206164245.17781-1-vsementsov@virtuozzo.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-02-07 16:46:59 +00:00
Vladimir Sementsov-Ogievskiy	0df62f45c1	block/backup-top: fix failure path We can't access top after call bdrv_backup_top_drop, as it is already freed at this time. Also, no needs to unref target child by hand, it will be unrefed on bdrv_close() automatically. So, just do bdrv_backup_top_drop if append succeed and one bdrv_unref otherwise. Note, that in !appended case bdrv_unref(top) moved into drained section on source. It doesn't really matter, but just for code simplicity. Fixes: `7df7868b96` Cc: qemu-stable@nongnu.org # v4.2.0 Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20200121142802.21467-2-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-02-06 13:47:45 +01:00
Alberto Garcia	3afea40243	qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value This replaces all remaining instances in the qcow2 code. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: b5f74b606c2d9873b12d29acdb7fd498029c4025.1579374329.git.berto@igalia.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-02-06 13:47:45 +01:00
Alberto Garcia	25ae71db55	qcow2: Don't require aligned offsets in qcow2_co_copy_range_from() qemu-img's convert_co_copy_range() operates at the sector level and block_copy() operates at the cluster level so this condition is always true, but it is not necessary to restrict this here, so let's leave it to the driver implementation return an error if there is any. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: a4264aaee656910c84161a2965f7a501437379ca.1579374329.git.berto@igalia.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-02-06 13:47:45 +01:00
Alberto Garcia	da86f8cbad	qcow2: Use bs->bl.request_alignment when updating an L1 entry When updating an L1 entry the qcow2 driver writes a (512-byte) sector worth of data to avoid a read-modify-write cycle. Instead of always writing 512 bytes we should follow the alignment requirements of the storage backend. (the only exception is when the alignment is larger than the cluster size because then we could be overwriting data after the L1 table) Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 71f34d4ae4b367b32fb36134acbf4f4f7ee681f4.1579374329.git.berto@igalia.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-02-06 13:47:45 +01:00
Alberto Garcia	344ffea951	qcow2: Tighten cluster_offset alignment assertions qcow2_alloc_cluster_offset() and qcow2_get_cluster_offset() always return offsets that are cluster-aligned so don't just check that they are sector-aligned. The check in qcow2_co_preadv_task() is also replaced by an assertion for the same reason. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 558ba339965f858bede4c73ce3f50f0c0493597d.1579374329.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-02-06 13:47:45 +01:00
Alberto Garcia	ef97d608c7	qcow2: Don't round the L1 table allocation up to the sector size The L1 table is read from disk using the byte-based bdrv_pread() and is never accessed beyond its last element, so there's no need to allocate more memory than that. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: b2e27214ec7b03a585931bcf383ee1ac3a641a10.1579374329.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-02-06 13:47:45 +01:00
Alberto Garcia	7cdca2e233	qcow2: Use a GString in report_unsupported_feature() This is a bit more efficient than having to allocate and free memory for each item. The default size (60) is enough for all the existing incompatible features or the "Unknown incompatible feature" message. Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20200115135626.19442-1-berto@igalia.com Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-02-06 13:47:45 +01:00
Alberto Garcia	3a75a870ef	qcow2: Assert that host cluster offsets fit in L2 table entries The standard cluster descriptor in L2 table entries has a field to store the host cluster offset. When we need to get that offset from an entry we use L2E_OFFSET_MASK to ensure that we only use the bits that belong to that field. But while that mask is used every time we read from an L2 entry, it is never used when we write to it. Due to the QCOW_MAX_CLUSTER_OFFSET limit set in the cluster allocation code QEMU can never produce offsets that don't fit in that field so any such offset would indicate a bug in QEMU. Compressed cluster descriptors contain two fields (host cluster offset and size of the compressed data) and the situation with them is similar. In this case the masks are not constant but are stored in the csize_mask and cluster_offset_mask fields of BDRVQcow2State. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20200113161146.20099-1-berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-02-06 13:47:45 +01:00
Aarushi Mehta	daffeb027b	block/io_uring: adds userspace completion polling Signed-off-by: Aarushi Mehta <mehta.aaru20@gmail.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200120141858.587874-11-stefanha@redhat.com Message-Id: <20200120141858.587874-11-stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-01-30 20:59:42 +00:00
Aarushi Mehta	d803f59050	block: add trace events for io_uring Signed-off-by: Aarushi Mehta <mehta.aaru20@gmail.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200120141858.587874-10-stefanha@redhat.com Message-Id: <20200120141858.587874-10-stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-01-30 20:59:42 +00:00
Aarushi Mehta	c644751069	block/file-posix.c: extend to use io_uring Signed-off-by: Aarushi Mehta <mehta.aaru20@gmail.com> Reviewed-by: Maxim Levitsky <maximlevitsky@gmail.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200120141858.587874-9-stefanha@redhat.com Message-Id: <20200120141858.587874-9-stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-01-30 20:59:42 +00:00
Aarushi Mehta	6663a0a337	block/io_uring: implements interfaces for io_uring Aborts when sqe fails to be set as sqes cannot be returned to the ring. Adds slow path for short reads for older kernels Signed-off-by: Aarushi Mehta <mehta.aaru20@gmail.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200120141858.587874-5-stefanha@redhat.com Message-Id: <20200120141858.587874-5-stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-01-30 20:59:41 +00:00
Paolo Bonzini	3ba0e1a00c	block/io: take bs->reqs_lock in bdrv_mark_request_serialising bdrv_mark_request_serialising is writing the overlap_offset and overlap_bytes fields of BdrvTrackedRequest. Take bs->reqs_lock for the whole duration of it, and not just when waiting for serialising requests, so that tracked_request_overlaps does not look at a half-updated request. The new code does not unlock/relock around retries. This is unnecessary because a retry is always preceded by a CoQueue wait, which already releases and reacquires bs->reqs_lock. Reported-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1578495356-46219-4-git-send-email-pbonzini@redhat.com Message-Id: <1578495356-46219-4-git-send-email-pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-01-30 20:59:41 +00:00
Paolo Bonzini	18fbd0dec7	block/io: wait for serialising requests when a request becomes serialising Marking without waiting would not result in actual serialising behavior. Thus, make a call bdrv_mark_request_serialising sufficient for serialisation to happen. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1578495356-46219-3-git-send-email-pbonzini@redhat.com Message-Id: <1578495356-46219-3-git-send-email-pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-01-30 20:59:41 +00:00
Paolo Bonzini	c53cb42769	block: eliminate BDRV_REQ_NO_SERIALISING It is unused since commit `00e30f0` ("block/backup: use backup-top instead of write notifiers", 2019-10-01), drop it to simplify the code. While at it, drop redundant assertions on flags. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1578495356-46219-2-git-send-email-pbonzini@redhat.com Message-Id: <1578495356-46219-2-git-send-email-pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-01-30 20:59:41 +00:00
Kevin Wolf	5fbf1d56c2	iscsi: Don't access non-existent scsi_lba_status_descriptor In iscsi_co_block_status(), we may have received num_descriptors == 0 from the iscsi server. Therefore, we can't unconditionally access lbas->descriptors[0]. Add the missing check. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Felipe Franciosi <felipe@nutanix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de>	2020-01-27 17:19:53 +01:00
Felipe Franciosi	693fd2acdf	iscsi: Cap block count from GET LBA STATUS (CVE-2020-1711) When querying an iSCSI server for the provisioning status of blocks (via GET LBA STATUS), Qemu only validates that the response descriptor zero's LBA matches the one requested. Given the SCSI spec allows servers to respond with the status of blocks beyond the end of the LUN, Qemu may have its heap corrupted by clearing/setting too many bits at the end of its allocmap for the LUN. A malicious guest in control of the iSCSI server could carefully program Qemu's heap (by selectively setting the bitmap) and then smash it. This limits the number of bits that iscsi_co_block_status() will try to update in the allocmap so it can't overflow the bitmap. Fixes: CVE-2020-1711 Cc: qemu-stable@nongnu.org Signed-off-by: Felipe Franciosi <felipe@nutanix.com> Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com> Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-01-27 17:19:53 +01:00
Eiichi Tsukata	fb574de81b	block/backup: fix memory leak in bdrv_backup_top_append() bdrv_open_driver() allocates bs->opaque according to drv->instance_size. There is no need to allocate it and overwrite opaque in bdrv_backup_top_append(). Reproducer: $ QTEST_QEMU_BINARY=./x86_64-softmmu/qemu-system-x86_64 valgrind -q --leak-check=full tests/test-replication -p /replication/secondary/start ==29792== 24 bytes in 1 blocks are definitely lost in loss record 52 of 226 ==29792== at 0x483AB1A: calloc (vg_replace_malloc.c:762) ==29792== by 0x4B07CE0: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.6000.7) ==29792== by 0x12BAB9: bdrv_open_driver (block.c:1289) ==29792== by 0x12BEA9: bdrv_new_open_driver (block.c:1359) ==29792== by 0x1D15CB: bdrv_backup_top_append (backup-top.c:190) ==29792== by 0x1CC11A: backup_job_create (backup.c:439) ==29792== by 0x1CD542: replication_start (replication.c:544) ==29792== by 0x1401B9: replication_start_all (replication.c:52) ==29792== by 0x128B50: test_secondary_start (test-replication.c:427) ... Fixes: `7df7868b96` ("block: introduce backup-top filter driver") Signed-off-by: Eiichi Tsukata <devel@etsukata.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-01-27 17:19:53 +01:00
Sergio Lopez	0abf258171	block/backup-top: Don't acquire context while dropping top All paths that lead to bdrv_backup_top_drop(), except for the call from backup_clean(), imply that the BDS AioContext has already been acquired, so doing it there too can potentially lead to QEMU hanging on AIO_WAIT_WHILE(). An easy way to trigger this situation is by issuing a two actions transaction, with a proper and a bogus blockdev-backup, so the second one will trigger a rollback. This will trigger a hang with an stack trace like this one: #0 0x00007fb680c75016 in __GI_ppoll (fds=0x55e74580f7c0, nfds=1, timeout=<optimized out>, timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39 #1 0x000055e743386e09 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77 #2 0x000055e743386e09 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:336 #3 0x000055e743388dc4 in aio_poll (ctx=0x55e7458925d0, blocking=blocking@entry=true) at util/aio-posix.c:669 #4 0x000055e743305dea in bdrv_flush (bs=bs@entry=0x55e74593c0d0) at block/io.c:2878 #5 0x000055e7432be58e in bdrv_close (bs=0x55e74593c0d0) at block.c:4017 #6 0x000055e7432be58e in bdrv_delete (bs=<optimized out>) at block.c:4262 #7 0x000055e7432be58e in bdrv_unref (bs=bs@entry=0x55e74593c0d0) at block.c:5644 #8 0x000055e743316b9b in bdrv_backup_top_drop (bs=bs@entry=0x55e74593c0d0) at block/backup-top.c:273 #9 0x000055e74331461f in backup_job_create (job_id=0x0, bs=bs@entry=0x55e7458d5820, target=target@entry=0x55e74589f640, speed=0, sync_mode=MIRROR_SYNC_MODE_FULL, sync_bitmap=sync_bitmap@entry=0x0, bitmap_mode=BITMAP_SYNC_MODE_ON_SUCCESS, compress=false, filter_node_name=0x0, on_source_error=BLOCKDEV_ON_ERROR_REPORT, on_target_error=BLOCKDEV_ON_ERROR_REPORT, creation_flags=0, cb=0x0, opaque=0x0, txn=0x0, errp=0x7ffddfd1efb0) at block/backup.c:478 #10 0x000055e74315bc52 in do_backup_common (backup=backup@entry=0x55e746c066d0, bs=bs@entry=0x55e7458d5820, target_bs=target_bs@entry=0x55e74589f640, aio_context=aio_context@entry=0x55e7458a91e0, txn=txn@entry=0x0, errp=errp@entry=0x7ffddfd1efb0) at blockdev.c:3580 #11 0x000055e74315c37c in do_blockdev_backup (backup=backup@entry=0x55e746c066d0, txn=0x0, errp=errp@entry=0x7ffddfd1efb0) at /usr/src/debug/qemu-kvm-4.2.0-2.module+el8.2.0+5135+ed3b2489.x86_64/./qapi/qapi-types-block-core.h:1492 #12 0x000055e74315c449 in blockdev_backup_prepare (common=0x55e746a8de90, errp=0x7ffddfd1f018) at blockdev.c:1885 #13 0x000055e743160152 in qmp_transaction (dev_list=<optimized out>, has_props=<optimized out>, props=0x55e7467fe2c0, errp=errp@entry=0x7ffddfd1f088) at blockdev.c:2340 #14 0x000055e743287ff5 in qmp_marshal_transaction (args=<optimized out>, ret=<optimized out>, errp=0x7ffddfd1f0f8) at qapi/qapi-commands-transaction.c:44 #15 0x000055e74333de6c in do_qmp_dispatch (errp=0x7ffddfd1f0f0, allow_oob=<optimized out>, request=<optimized out>, cmds=0x55e743c28d60 <qmp_commands>) at qapi/qmp-dispatch.c:132 #16 0x000055e74333de6c in qmp_dispatch (cmds=0x55e743c28d60 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175 #17 0x000055e74325c061 in monitor_qmp_dispatch (mon=0x55e745908030, req=<optimized out>) at monitor/qmp.c:145 #18 0x000055e74325c6fa in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:234 #19 0x000055e743385866 in aio_bh_call (bh=0x55e745807ae0) at util/async.c:117 #20 0x000055e743385866 in aio_bh_poll (ctx=ctx@entry=0x55e7458067a0) at util/async.c:117 #21 0x000055e743388c54 in aio_dispatch (ctx=0x55e7458067a0) at util/aio-posix.c:459 #22 0x000055e743385742 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260 #23 0x00007fb68543e67d in g_main_dispatch (context=0x55e745893a40) at gmain.c:3176 #24 0x00007fb68543e67d in g_main_context_dispatch (context=context@entry=0x55e745893a40) at gmain.c:3829 #25 0x000055e743387d08 in glib_pollfds_poll () at util/main-loop.c:219 #26 0x000055e743387d08 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242 #27 0x000055e743387d08 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518 #28 0x000055e74316a3c1 in main_loop () at vl.c:1828 #29 0x000055e743016a72 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4504 Fix this by not acquiring the AioContext there, and ensuring all paths leading to it have it already acquired (backup_clean()). RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1782111 Signed-off-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-01-27 17:19:53 +01:00
Wangyong	2558cb8dd4	linux-aio: increasing MAX_EVENTS to a larger hardcoded value Since commit `6040aedddb` "virtio-blk: make queue size configurable",if the user set the queue size to more than 128 ,it will not take effect. That's because linux aio's maximum outstanding requests at a time is always less than or equal to 128. This patch simply increase MAX_EVENTS to a larger hardcoded value of 1024 as a shortterm fix. Signed-off-by: wangyong <wang.yongD@h3c.com> Message-id: faa5781afd354a96a0be152b288f636f@h3c.com Message-Id: <faa5781afd354a96a0be152b288f636f@h3c.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-01-13 16:41:45 +00:00
Max Reitz	503ca1262b	backup-top: Begin drain earlier When dropping backup-top, we need to drain the node before freeing the BlockCopyState. Otherwise, requests may still be in flight and then the assertion in shres_destroy() will fail. (This becomes visible in intermittent failure of 056.) Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20191219182638.104621-1-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-01-06 14:26:23 +01:00
Andrey Shinkevich	0d483dce38	qcow2: Allow writing compressed data of multiple clusters QEMU currently supports writing compressed data of the size equal to one cluster. This patch allows writing QCOW2 compressed data that exceed one cluster. Now, we split buffered data into separate clusters and write them compressed using the block/aio_task API. Suggested-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1575288906-551879-3-git-send-email-andrey.shinkevich@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-01-06 13:43:07 +01:00
Andrey Shinkevich	f41388e0fb	block: introduce compress filter driver Allow writing all the data compressed through the filter driver. The written data will be aligned by the cluster size. Based on the QEMU current implementation, that data can be written to unallocated clusters only. May be used for a backup job. Suggested-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 1575288906-551879-2-git-send-email-andrey.shinkevich@virtuozzo.com [mreitz: Replace NULL bdrv_get_format_name() by "(no format)"] Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-01-06 13:43:07 +01:00
Vladimir Sementsov-Ogievskiy	a1db8733d2	qcow2-bitmaps: fix qcow2_can_store_new_dirty_bitmap qcow2_can_store_new_dirty_bitmap works wrong, as it considers only bitmaps already stored in the qcow2 image and ignores persistent BdrvDirtyBitmap objects. So, let's instead count persistent BdrvDirtyBitmaps. We load all qcow2 bitmaps on open, so there should not be any bitmap in the image for which we don't have BdrvDirtyBitmaps version. If it is - it's a kind of corruption, and no reason to check for corruptions here (open() and close() are better places for it). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20191014115126.15360-2-vsementsov@virtuozzo.com Reviewed-by: Max Reitz <mreitz@redhat.com> Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-01-06 13:43:06 +01:00
PanNengyuan	88be15a9e1	throttle-groups: fix memory leak in throttle_group_set_limit: This avoid a memory leak when qom-set is called to set throttle_group limits, here is an easy way to reproduce: 1. run qemu-iotests as follow and check the result with asan: ./check -qcow2 184 Following is the asan output backtrack: Direct leak of 912 byte(s) in 3 object(s) allocated from: #0 0xffff8d7ab3c3 in __interceptor_calloc (/lib64/libasan.so.4+0xd33c3) #1 0xffff8d4c31cb in g_malloc0 (/lib64/libglib-2.0.so.0+0x571cb) #2 0x190c857 in qobject_input_start_struct /mnt/sdc/qemu-master/qemu-4.2.0-rc0/qapi/qobject-input-visitor.c:295 #3 0x19070df in visit_start_struct /mnt/sdc/qemu-master/qemu-4.2.0-rc0/qapi/qapi-visit-core.c:49 #4 0x1948b87 in visit_type_ThrottleLimits qapi/qapi-visit-block-core.c:3759 #5 0x17e4aa3 in throttle_group_set_limits /mnt/sdc/qemu-master/qemu-4.2.0-rc0/block/throttle-groups.c:900 #6 0x1650eff in object_property_set /mnt/sdc/qemu-master/qemu-4.2.0-rc0/qom/object.c:1272 #7 0x1658517 in object_property_set_qobject /mnt/sdc/qemu-master/qemu-4.2.0-rc0/qom/qom-qobject.c:26 #8 0x15880bb in qmp_qom_set /mnt/sdc/qemu-master/qemu-4.2.0-rc0/qom/qom-qmp-cmds.c:74 #9 0x157e3e3 in qmp_marshal_qom_set qapi/qapi-commands-qom.c:154 Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: PanNengyuan <pannengyuan@huawei.com> Message-id: 1574835614-42028-1-git-send-email-pannengyuan@huawei.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-01-06 13:43:06 +01:00
Max Reitz	69c6449ff1	blkdebug: Allow taking/unsharing permissions Sometimes it is useful to be able to add a node to the block graph that takes or unshare a certain set of permissions for debugging purposes. This patch adds this capability to blkdebug. (Note that you cannot make blkdebug release or share permissions that it needs to take or cannot share, because this might result in assertion failures in the block layer. But if the blkdebug node has no parents, it will not take any permissions and share everything by default, so you can then freely choose what permissions to take and share.) Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20191108123455.39445-4-mreitz@redhat.com Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-01-06 13:43:06 +01:00
Peter Maydell	1010af540b	Block layer patches: - qemu-img: fix info --backing-chain --image-opts - Error out on image creation with conflicting size options - Fix external snapshot with VM state - hmp: Allow using qdev ID for qemu-io command - Misc code cleanup - Many iotests improvements -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJd+7H/AAoJEH8JsnLIjy/WwIoP/igzSCfiH7yOzWexk7J7DUh7 SplXRiBQSQAZ8KD2umUk80UO6/W19rHAl3bLLu50D5uzQ5PNz1wSW2S81JZsj0nQ ErUEr5yxLGAI/zHIw/LrjUSEygbqcZ21R7foNmIy3lInEXf6gGwPQEnZnZgGkH0x v5jwv1HpUpEyetnHunX2cK3JpSNEJCYsV+X1nzDwhjYFuGQq0nlhgUZ8BDt6+fEr b/S3TuHmj/FXMH+5wrSr0LiCKaIyAwPdREvC+61aklMNFuA8YwF33tOaJLoWeQCD qhxFA8jVd8b3dQ0NZXXbliXST5d6AzjpeqbNzsyKze1duVfcfF1RYsC6McojpQ7f 9qFIXPC9IHe3sNkUZpTtvONLnmCeHTc9lwVIyVp9B5KXcmcLedvYM04WqhZ+/eeJ BfgCvXOyF9sL2GaVPFD+98E2DOYG4dI4rmzqn98kQj+RAXDFvvJ+zxuDUC9omE6T pVvxczGPXmnxP2ITRNljJyLVCN1YgE2g9S0GAXNM2i4uOXvhFToEwrJLFytVeVsw UwtYkE0F8KJyqjje5N3HiXclBVa++PK3+jjNFA+3B6XZ9wcZZRS2ZXmzsyNOJ7qG AeIPtrqBCz+QKsHtmS9CwYpvLrPafobHA3WS2z0dsevcWBHYBQBEM2EgXRFC3N3a mymYGxMOsG6+8onwqwjJ =laVP -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches: - qemu-img: fix info --backing-chain --image-opts - Error out on image creation with conflicting size options - Fix external snapshot with VM state - hmp: Allow using qdev ID for qemu-io command - Misc code cleanup - Many iotests improvements # gpg: Signature made Thu 19 Dec 2019 17:23:11 GMT # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (30 commits) iotests: Test external snapshot with VM state hmp: Allow using qdev ID for qemu-io command block: Activate recursively even for already active nodes iotests: 211: Remove duplication with VM.blockdev_create() iotests: 207: Remove duplication with VM.blockdev_create() iotests: 266: Convert to VM.blockdev_create() iotests: 237: Convert to VM.blockdev_create() iotests: 213: Convert to VM.blockdev_create() iotests: 212: Convert to VM.blockdev_create() iotests: 210: Convert to VM.blockdev_create() iotests: 206: Convert to VM.blockdev_create() iotests: 255: Drop blockdev_create() iotests: Create VM.blockdev_create() qcow2: Move error check of local_err near its assignment iotests: Fix IMGOPTSSYNTAX for nbd iotests/273: Filter format-specific information iotests: Add more "_require_drivers" checks to the shell-based tests MAINTAINERS: fix qcow2-bitmap.c under Dirty Bitmaps header qcow2: Use offset_into_cluster() iotests: Support job-complete in run_job() ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-12-20 18:25:32 +00:00
Tuguoyi	66be5c3e78	qcow2: Move error check of local_err near its assignment The local_err check outside of the if block was necessary when it was introduced in commit `d1258dd0c8` because it needed to be executed even if qcow2_load_autoloading_dirty_bitmaps() returned false. After some modifications that all required the error check to remain where it is, commit `9c98f145df` finally moved the qcow2_load_dirty_bitmaps() call into the if block, so now the error check should be there, too. Signed-off-by: Guoyi Tu <tu.guoyi@h3c.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-12-19 14:31:52 +01:00
Alberto Garcia	74e60fb56a	qcow2: Use offset_into_cluster() There's a couple of places left in the qcow2 code that still do the calculation manually, so let's replace them. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-12-18 11:21:16 +01:00
Kevin Wolf	3b65081638	qcow2: Declare BDRV_REQ_NO_FALLBACK supported In the common case, qcow2_co_pwrite_zeroes() already only modifies metadata case, so we're fine with or without BDRV_REQ_NO_FALLBACK set. The only exception is when using an external data file, where the request is passed down to the block driver of the external data file. We are forwarding the BDRV_REQ_NO_FALLBACK flag there, though, so this is fine, too. Declare the flag supported therefore. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>	2019-12-18 11:21:16 +01:00
Vladimir Sementsov-Ogievskiy	d936613547	nbd: assert that Error** is not NULL in nbd_iter_channel_error All callers of nbd_iter_channel_error() pass the address of a local_err variable, and only call this function if an error has already occurred, using this function to propagate that error. This is already implied by its name (local_err instead of the classic errp), but it is worth additionally stressing this by adding an assertion to make it part of the function contract. The local_err parameter is not here to return information about nbd_iter_channel_error failure. Instead it's assumed to be filled when passed to the function. This is already stressed by its name (local_err, instead of classic errp). Stress it additionally by assertion. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20191205174635.18758-22-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2019-12-18 08:43:19 +01:00
Vladimir Sementsov-Ogievskiy	e53a578a8b	block/snapshot: rename Error ** parameter to more common errp Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@Redhat.com> Message-Id: <20191205174635.18758-11-vsementsov@virtuozzo.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2019-12-18 08:43:19 +01:00
Vladimir Sementsov-Ogievskiy	f56281abd9	block/qcow2-bitmap: fix crash bug in qcow2_co_remove_persistent_dirty_bitmap Here is double bug: First, return error but not set errp. This may lead to: qmp block-dirty-bitmap-remove may report success when actually failed block-dirty-bitmap-remove used in a transaction will crash, as qmp_transaction will think that it returned success and will call block_dirty_bitmap_remove_commit which will crash, as state->bitmap is NULL Second (like in anecdote), this case is not an error at all. As it is documented in the comment above bdrv_co_remove_persistent_dirty_bitmap definition, absence of bitmap is not an error, and similar case handled at start of qcow2_co_remove_persistent_dirty_bitmap, it returns 0 when there is no bitmaps at all. But when there are some bitmaps, but not the requested one, it return error with errp unset. Fix that. Trigger: 1. create persistent bitmap A 2. shutdown vm (bitmap A is synced) 3. start vm 4. create persistent bitmap B 5. remove bitmap B - it fails (and crashes if in transaction) Potential workaround (rather invasive to ask clients to implement it): 1. create persistent bitmap A 2. shutdown vm 3. start vm 4. create persistent bitmap B 5. remember, that we want to remove bitmap B after vm shutdown ... some other operations ... 6. vm shutdown 7. start vm in stopped mode, and remove all bitmaps marked for removing 8. stop vm Fixes: `b56a1e3175` Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20191205193049.30666-1-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> [eblake: commit message tweaks] Signed-off-by: Eric Blake <eblake@redhat.com>	2019-12-09 09:23:04 -06:00
Markus Armbruster	cb09104ea8	block/file-posix: Fix laio_init() error handling crash bug raw_aio_attach_aio_context() passes uninitialized Error *local_err by reference to laio_init() via aio_setup_linux_aio(). When laio_init() fails, it passes it on to error_setg_errno(), tripping error_setv()'s assertion unless @local_err is null by dumb luck. Fix by initializing @local_err properly. Fixes: `ed6e216171` Cc: Nishanth Aravamudan <naravamudan@digitalocean.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20191130194240.10517-4-armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2019-12-02 16:14:41 +01:00
Vladimir Sementsov-Ogievskiy	a505475b95	block/qcow2-bitmap: fix bitmap migration Fix bitmap migration with dirty-bitmaps capability enabled and shared storage. We should ignore IN_USE bitmaps in the image on target, when migrating bitmaps through migration channel, see new comment below. Fixes: `74da6b9435` Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20191125125229.13531-2-vsementsov@virtuozzo.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-11-26 14:18:01 +01:00
Eric Blake	93676c88d7	nbd: Don't send oversize strings Qemu as server currently won't accept export names larger than 256 bytes, nor create dirty bitmap names longer than 1023 bytes, so most uses of qemu as client or server have no reason to get anywhere near the NBD spec maximum of a 4k limit per string. However, we weren't actually enforcing things, ignoring when the remote side violates the protocol on input, and also having several code paths where we send oversize strings on output (for example, qemu-nbd --description could easily send more than 4k). Tighten things up as follows: client: - Perform bounds check on export name and dirty bitmap request prior to handing it to server - Validate that copied server replies are not too long (ignoring NBD_INFO_* replies that are not copied is not too bad) server: - Perform bounds check on export name and description prior to advertising it to client - Reject client name or metadata query that is too long - Adjust things to allow full 4k name limit rather than previous 256 byte limit Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20191114024635.11363-4-eblake@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2019-11-18 16:01:34 -06:00
Eric Blake	cf7c49cf6a	bitmap: Enforce maximum bitmap name length We document that for qcow2 persistent bitmaps, the name cannot exceed 1023 bytes. It is inconsistent if transient bitmaps do not have to abide by the same limit, and it is unlikely that any existing client even cares about using bitmap names this long. It's time to codify that ALL bitmaps managed by qemu (whether persistent in qcow2 or not) have a documented maximum length. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20191114024635.11363-3-eblake@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2019-11-18 16:01:34 -06:00
Max Reitz	24552feb6a	qcow2: Fix QCOW2_COMPRESSED_SECTOR_MASK Masks for L2 table entries should have 64 bit. Fixes: `b6c246942b` Buglink: https://bugs.launchpad.net/qemu/+bug/1850000 Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20191028161841.1198-2-mreitz@redhat.com Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-11-07 14:37:46 +01:00
Tuguoyi	570542ecb1	qcow2-bitmap: Fix uint64_t left-shift overflow There are two issues in In check_constraints_on_bitmap(), 1) The sanity check on the granularity will cause uint64_t integer left-shift overflow when cluster_size is 2M and the granularity is BIGGER than 32K. 2) The way to calculate image size that the maximum bitmap supported can map to is a bit incorrect. This patch fix it by add a helper function to calculate the number of bytes needed by a normal bitmap in image and compare it to the maximum bitmap bytes supported by qemu. Fixes: `5f72826e7f` Signed-off-by: Guoyi Tu <tu.guoyi@h3c.com> Message-id: 4ba40cd1e7ee4a708b40899952e49f22@h3c.com Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-11-07 14:37:33 +01:00
Max Reitz	292d06b925	block/file-posix: Let post-EOF fallocate serialize The XFS kernel driver has a bug that may cause data corruption for qcow2 images as of qemu commit `c8bb23cbdb`. We can work around it by treating post-EOF fallocates as serializing up until infinity (INT64_MAX in practice). Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20191101152510.11719-4-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-11-04 09:33:51 +01:00
Max Reitz	c28107e9e5	block: Add bdrv_co_get_self_request() Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20191101152510.11719-3-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-11-04 09:32:51 +01:00
Max Reitz	304d9d7f03	block: Make wait/mark serialising requests public Make both bdrv_mark_request_serialising() and bdrv_wait_serialising_requests() public so they can be used from block drivers. Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20191101152510.11719-2-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-11-04 09:29:15 +01:00
Vladimir Sementsov-Ogievskiy	dcfbece684	block/block-copy: fix s->copy_size for compressed cluster `0e2402452f` allowed writes larger than cluster, but that's unsupported for compressed write. Fix it. Fixes: `0e2402452f` Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20191029150934.26416-1-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-11-04 09:21:45 +01:00
Peter Maydell	aaffb85335	Block patches for softfreeze: - iotest patches - Improve performance of the mirror block job in write-blocking mode - Limit memory usage for the backup block job - Add discard and write-zeroes support to the NVMe host block driver - Fix a bug in the mirror job - Prevent the qcow2 driver from creating technically non-compliant qcow2 v3 images (where there is not enough extra data for snapshot table entries) - Allow callers of bdrv_truncate() (etc.) to determine whether the file must be resized to the exact given size or whether it is OK for block devices not to shrink -----BEGIN PGP SIGNATURE----- iQFGBAABCAAwFiEEkb62CjDbPohX0Rgp9AfbAGHVz0AFAl2224ESHG1yZWl0ekBy ZWRoYXQuY29tAAoJEPQH2wBh1c9AeXMH/RXKEX4BZYMRKCe41P18tJC9Bl2x0T20 YeOsZVvpARlr7o/36BF2kGFF4MnL0OQ+9ELuyROX865rk/VL2rWqnHDE5oQM889a dFwMs+0zvNbig3iLNcw0H5OkE2mrdM+a1EUdn/lBe/39Z8dPqPxRGqIYHq38Ugdu emwSy1nWen7o0f71HRJfyVtI3KcrzXx71FrA/FY2yL/eHz+zRYGZj2SpAdFPkXP/ lgaz+m0tWhnSW1QzEOXB0Gh69ULt/DczCinYmv5qUY1noW5TPPtiDNCQTts5O4ba oJsR3AJv5/l9m65JTmiyQSqnQfPcstrQ5FqOcSnP637cfqUFyWsvdks= =L7v1 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2019-10-28' into staging Block patches for softfreeze: - iotest patches - Improve performance of the mirror block job in write-blocking mode - Limit memory usage for the backup block job - Add discard and write-zeroes support to the NVMe host block driver - Fix a bug in the mirror job - Prevent the qcow2 driver from creating technically non-compliant qcow2 v3 images (where there is not enough extra data for snapshot table entries) - Allow callers of bdrv_truncate() (etc.) to determine whether the file must be resized to the exact given size or whether it is OK for block devices not to shrink # gpg: Signature made Mon 28 Oct 2019 12:13:53 GMT # gpg: using RSA key 91BEB60A30DB3E8857D11829F407DB0061D5CF40 # gpg: issuer "mreitz@redhat.com" # gpg: Good signature from "Max Reitz <mreitz@redhat.com>" [full] # Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40 * remotes/maxreitz/tags/pull-block-2019-10-28: (69 commits) qemu-iotests: restrict 264 to qcow2 only Revert "qemu-img: Check post-truncation size" block: Pass truncate exact=true where reasonable block: Let format drivers pass @exact block: Evaluate @exact in protocol drivers block: Add @exact parameter to bdrv_co_truncate() block: Do not truncate file node when formatting block/cor: Drop cor_co_truncate() block: Handle filter truncation like native impl. iotests: Test qcow2's snapshot table handling iotests: Add peek_file* functions qcow2: Fix v3 snapshot table entry compliancy qcow2: Repair snapshot table with too many entries qcow2: Fix overly long snapshot tables qcow2: Keep track of the snapshot table length qcow2: Fix broken snapshot table entries qcow2: Add qcow2_check_fix_snapshot_table() qcow2: Separate qcow2_check_read_snapshot_table() qcow2: Write v3-compliant snapshot list on upgrade qcow2: Put qcow2_upgrade() into its own function ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-10-28 14:40:01 +00:00
Max Reitz	e8d04f9237	block: Pass truncate exact=true where reasonable This is a change in behavior, so all instances need a good justification. The comments added here should explain my reasoning. qed already had a comment that suggests it always expected bdrv_truncate()/blk_truncate() to behave as if exact=true were passed (`c743849bee` came eight months before `55b949c847`), so it was simply broken until now. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190918095144.955-8-mreitz@redhat.com Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> [mreitz: Changed comment in qed.c to explain why a new QED file must be empty, as requested and suggested by Maxim] Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 12:08:45 +01:00
Max Reitz	e61a28a9b6	block: Let format drivers pass @exact When truncating a format node, the @exact parameter is generally handled simply by virtue of the format storing the new size in the image metadata. Such formats do not need to pass on the parameter to their file nodes. There are exceptions, though: - raw and crypto cannot store the image size, and thus must pass on @exact. - When using qcow2 with an external data file, it just makes sense to keep its size in sync with the qcow2 virtual disk (because the external data file is the virtual disk). Therefore, we should pass @exact when truncating it. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190918095144.955-7-mreitz@redhat.com Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 12:05:30 +01:00
Max Reitz	82325ae5f2	block: Evaluate @exact in protocol drivers We have two protocol drivers that return success when trying to shrink a block device even though they cannot shrink it. This behavior is now only allowed with exact=false, so they should return an error with exact=true. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190918095144.955-6-mreitz@redhat.com Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 12:05:24 +01:00
Max Reitz	c80d8b06cf	block: Add @exact parameter to bdrv_co_truncate() We have two drivers (iscsi and file-posix) that (in some cases) return success from their .bdrv_co_truncate() implementation if the block device is larger than the requested offset, but cannot be shrunk. Some callers do not want that behavior, so this patch adds a new parameter that they can use to turn off that behavior. This patch just adds the parameter and lets the block/io.c and block/block-backend.c functions pass it around. All other callers always pass false and none of the implementations evaluate it, so that this patch does not change existing behavior. Future patches take care of that. Suggested-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190918095144.955-5-mreitz@redhat.com Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 12:00:07 +01:00
Max Reitz	26536c7fc2	block: Do not truncate file node when formatting There is no reason why the format drivers need to truncate the protocol node when formatting it. When using the old .bdrv_co_create_ops() interface, the file will be created with no size option anyway, which generally gives it a size of 0. (Exceptions are block devices, which cannot be truncated anyway.) When using blockdev-create, the user must have given the file node some size anyway, so there is no reason why we should override that. qed is an exception, it needs the file to start completely empty (as explained by `c743849bee`). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190918095144.955-4-mreitz@redhat.com Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:59:57 +01:00
Max Reitz	bb8160eb78	block/cor: Drop cor_co_truncate() No other filter driver has a .bdrv_co_truncate() implementation, and there is no need to because the general block layer code can handle it just as well. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190918095144.955-3-mreitz@redhat.com Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:59:51 +01:00
Max Reitz	6b7e8f8b1c	block: Handle filter truncation like native impl. Make the filter truncation (passing it through to bs->file) a first-class citizen and handle it exactly as if it was the filter driver's native implementation of .bdrv_co_truncate(). I do not see a reason not to, it makes the code a bit shorter, and may be even more correct because this gets us to finish the write_req that we prepared before (may be important to e.g. bring dirty bitmaps to the correct size). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190918095144.955-2-mreitz@redhat.com Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:59:45 +01:00
Max Reitz	e40e6e88f6	qcow2: Fix v3 snapshot table entry compliancy qcow2 v3 images require every snapshot table entry to have at least 16 bytes of extra data. If they do not, let qemu-img check -r all fix it. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20191011152814.14791-15-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:54:09 +01:00
Max Reitz	d2b1d1ec73	qcow2: Repair snapshot table with too many entries The user cannot choose which snapshots are removed. This is fine because we have chosen the maximum snapshot table size to be so large (65536 entries) that it cannot be reasonably reached. If the snapshot table exceeds this size, the image has probably been corrupted in some way; in this case, it is most important to just make the image usable such that the user can copy off at least the active layer. (Also note that the snapshots will be removed only with "-r all", so a plain "check" or "check -r leaks" will not delete any data.) Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20191011152814.14791-14-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:54:08 +01:00
Max Reitz	099febf3ac	qcow2: Fix overly long snapshot tables We currently refuse to open qcow2 images with overly long snapshot tables. This patch makes qemu-img check -r all drop all offending entries past what we deem acceptable. The user cannot choose which snapshots are removed. This is fine because we have chosen the maximum snapshot table size to be so large (64 MB) that it cannot be reasonably reached. If the snapshot table exceeds this size, the image has probably been corrupted in some way; in this case, it is most important to just make the image usable such that the user can copy off at least the active layer. (Also note that the snapshots will be removed only with "-r all", so a plain "check" or "check -r leaks" will not delete any data.) Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20191011152814.14791-13-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:54:04 +01:00
Max Reitz	624143355c	qcow2: Keep track of the snapshot table length When repairing the snapshot table, we truncate entries that have too much extra data. This frees up space that we do not have to count towards the snapshot table size. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20191011152814.14791-12-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:54:02 +01:00
Max Reitz	f91f1f159b	qcow2: Fix broken snapshot table entries The only case where we currently reject snapshot table entries is when they have too much extra data. Fix them with qemu-img check -r all by counting it as a corruption, reducing their extra_data_size, and then letting qcow2_check_fix_snapshot_table() do the rest. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20191011152814.14791-11-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:54:02 +01:00
Max Reitz	fe446b5da2	qcow2: Add qcow2_check_fix_snapshot_table() qcow2_check_read_snapshot_table() can perform consistency checks, but it cannot fix everything. Specifically, it cannot allocate new clusters, because that should wait until the refcount structures are known to be consistent (i.e., after qcow2_check_refcounts()). Thus, it cannot call qcow2_write_snapshots(). Do that in qcow2_check_fix_snapshot_table(), which is called after qcow2_check_refcounts(). Currently, there is nothing that would set result->corruptions, so this is a no-op. A follow-up patch will change that. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20191011152814.14791-10-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:54:01 +01:00
Max Reitz	8bc584fe03	qcow2: Separate qcow2_check_read_snapshot_table() Reading the snapshot table can fail. That is a problem when we want to repair the image. Therefore, stop reading the snapshot table in qcow2_do_open() in check mode. Instead, add a new function qcow2_check_read_snapshot_table() that reads the snapshot table at a later point. In the future, we want to handle errors here and fix them. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20191011152814.14791-9-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:54:00 +01:00
Max Reitz	0a85af351d	qcow2: Write v3-compliant snapshot list on upgrade qcow2 v3 requires every snapshot table entry to have two extra data fields: The 64-bit VM state size, and the virtual disk size. Both are optional for v2 images, so they may not be present. qcow2_upgrade() therefore should update the snapshot table to ensure all entries have these extra data fields. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1727347 Reported-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20191011152814.14791-8-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:53:52 +01:00
Max Reitz	722efb0c7c	qcow2: Put qcow2_upgrade() into its own function This does not make sense right now, but it will make sense once we need to do more than to just update s->qcow_version. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20191011152814.14791-7-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:53:20 +01:00
Max Reitz	e0314b56b2	qcow2: Make qcow2_write_snapshots() public Updating the snapshot list will be useful when upgrading a v2 image to v3, so we will need to call this function in qcow2.c. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20191011152814.14791-6-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:53:17 +01:00
Max Reitz	fcf9a6b728	qcow2: Keep unknown extra snapshot data The qcow2 specification says to ignore unknown extra data fields in snapshot table entries. Currently, we discard it whenever we update the image, which is a bit different from "ignore". This patch makes the qcow2 driver keep all unknown extra data fields when updating an image's snapshot table. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20191011152814.14791-5-mreitz@redhat.com [mreitz: Adjusted comments as proposed by Eric] Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:52:57 +01:00
Max Reitz	ecf6c7c0c1	qcow2: Add Error ** to qcow2_read_snapshots() Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20191011152814.14791-4-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:51:09 +01:00
Max Reitz	d8fa8442ad	qcow2: Use endof() Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20191011152814.14791-3-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:51:08 +01:00
Max Reitz	f93c3add3a	mirror: Do not dereference invalid pointers mirror_exit_common() may be called twice (if it is called from mirror_prepare() and fails, it will be called from mirror_abort() again). In such a case, many of the pointers in the MirrorBlockJob object will already be freed. This can be seen most reliably for s->target, which is set to NULL (and then dereferenced by blk_bs()). Cc: qemu-stable@nongnu.org Fixes: `737efc1eda` Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20191014153931.20699-2-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:49:37 +01:00
Maxim Levitsky	e87a09d625	block/nvme: add support for discard Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-id: 20190913133627.28450-3-mlevitsk@redhat.com Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:34:35 +01:00
Maxim Levitsky	e0dd95e373	block/nvme: add support for write zeros Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-id: 20190913133627.28450-2-mlevitsk@redhat.com Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:34:30 +01:00
Vladimir Sementsov-Ogievskiy	0e2402452f	block/block-copy: increase buffered copy request No reason to limit buffered copy to one cluster. Let's allow up to 1 MiB. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20191022111805.3432-7-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:22:31 +01:00
Vladimir Sementsov-Ogievskiy	7f739d0e53	block/block-copy: add memory limit Currently total allocation for parallel requests to block-copy instance is unlimited. Let's limit it to 128 MiB. For now block-copy is used only in backup, so actually we limit total allocation for backup job. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20191022111805.3432-6-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:22:31 +01:00
Vladimir Sementsov-Ogievskiy	e332a726da	block/block-copy: refactor copying Merge copying code into one function block_copy_do_copy, which only calls bdrv_ io functions and don't do any synchronization (like dirty bitmap set/reset). Refactor block_copy() function so that it takes full decision about size of chunk to be copied and does all the synchronization (checking intersecting requests, set/reset dirty bitmaps). It will help: - introduce parallel processing of block_copy iterations: we need to calculate chunk size, start async chunk copying and go to the next iteration - simplify synchronization improvement (like memory limiting in further commit and reducing critical section (now we lock the whole requested range, when actually we need to lock only dirty region which we handle at the moment)) Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20191022111805.3432-4-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:22:31 +01:00
Vladimir Sementsov-Ogievskiy	b3b7036afb	block/block-copy: limit copy_range_size to 16 MiB Large copy range may imply memory allocation and large io effort, so using 2G copy range request may be bad idea. Let's limit it to 16 MiB. It also helps the following patch to refactor copy-with-offload fallback to copy-with-bounce-buffer. Note, that total memory usage of backup is still not limited, it will be fixed in further commit. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20191022111805.3432-3-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:22:31 +01:00
Vladimir Sementsov-Ogievskiy	3816edd2cb	block/block-copy: allocate buffer in block_copy_with_bounce_buffer Move bounce_buffer allocation block_copy_with_bounce_buffer. This commit simplifies further work on implementing copying by larger chunks (of different size) and further asynchronous handling of block_copy iterations (with help of block/aio_task API). Allocation works fast, a lot faster than disk io, so it's not a problem that we now allocate/free bounce_buffer more times. And we anyway will have to allocate several bounce_buffers for parallel execution of loop iterations in future. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20191022111805.3432-2-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:22:31 +01:00
Vladimir Sementsov-Ogievskiy	994b44ab20	Revert "mirror: Only mirror granularity-aligned chunks" This reverts commit `9adc1cb49a`. "mirror: Only mirror granularity-aligned chunks" Since previous commit unaligned chunks are supported by do_sync_target_write. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20191011090711.19940-6-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:22:30 +01:00
Vladimir Sementsov-Ogievskiy	dbdf699cad	block/mirror: support unaligned write in active mirror Prior `9adc1cb49a` do_sync_target_write had a bug: it reset aligned-up region in the dirty bitmap, which means that we may not copy some bytes and assume them copied, which actually leads to producing corrupted target. So `9adc1cb49a` forced dirty bitmap granularity to be request_alignment for mirror-top filter, so we are not working with unaligned requests. However forcing large alignment obviously decreases performance of unaligned requests. This commit provides another solution for the problem: if unaligned padding is already dirty, we can safely ignore it, as 1. It's dirty, it will be copied by mirror_iteration anyway 2. It's dirty, so skipping it now we don't increase dirtiness of the bitmap and therefore don't damage "synchronicity" of the write-blocking mirror. If unaligned padding is not dirty, we just write it, no reason to touch dirty bitmap if we succeed (on failure we'll set the whole region ofcourse, but we loss "synchronicity" on failure anyway). Note: we need to disable dirty_bitmap, otherwise we will not be able to see in do_sync_target_write bitmap state before current operation. We may of course check dirty bitmap before the operation in bdrv_mirror_top_do_write and remember it, but we don't need active dirty bitmap for write-blocking mirror anyway. New code-path is unused until the following commit reverts `9adc1cb49a`. Suggested-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20191011090711.19940-5-vsementsov@virtuozzo.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:22:30 +01:00
Vladimir Sementsov-Ogievskiy	b30168647f	block/block-backend: add blk_co_pwritev_part Add blk write function with qiov_offset parameter. It's needed for the following commit. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20191011090711.19940-4-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:22:30 +01:00
Vladimir Sementsov-Ogievskiy	5c511ac375	block/mirror: simplify do_sync_target_write do_sync_target_write is called from bdrv_mirror_top_do_write after write/discard operation, all inside active_write/active_write_settle protecting us from mirror iteration. So the whole area is dirty for sure, no reason to examine dirty bitmap. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20191011090711.19940-3-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-28 11:22:30 +01:00
Wei Yang	038adc2f58	core: replace getpagesize() with qemu_real_host_page_size There are three page size in qemu: real host page size host page size target page size All of them have dedicate variable to represent. For the last two, we use the same form in the whole qemu project, while for the first one we use two forms: qemu_real_host_page_size and getpagesize(). qemu_real_host_page_size is defined to be a replacement of getpagesize(), so let it serve the role. [Note] Not fully tested for some arch or device. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191013021145.16011-3-richardw.yang@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-26 15:38:06 +02:00
Kevin Wolf	5e97855052	qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation() qcow2_detect_metadata_preallocation() calls qcow2_get_refcount() which requires s->lock to be taken to protect its accesses to the refcount table and refcount blocks. However, nothing in this code path actually took the lock. This could cause the same cache entry to be used by two requests at the same time, for different tables at different offsets, resulting in image corruption. As it would be preferable to base the detection on consistent data (even though it's just heuristics), let's take the lock not only around the qcow2_get_refcount() calls, but around the whole function. This patch takes the lock in qcow2_co_block_status() earlier and asserts in qcow2_detect_metadata_preallocation() that we hold the lock. Fixes: `69f47505ee` Cc: qemu-stable@nongnu.org Reported-by: Michael Weiser <michael.weiser@gmx.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Michael Weiser <michael.weiser@gmx.de> Reviewed-by: Michael Weiser <michael.weiser@gmx.de> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2019-10-25 15:18:55 +02:00
Vladimir Sementsov-Ogievskiy	8ccf458af5	block/backup: drop dead code from backup_job_create After commit `00e30f05de`, there is no more "goto error" points after job creation, so after "error:" @job is always NULL and we don't need roll-back job creation. Reported-by: Coverity (CID 1406402) Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-10-25 15:15:01 +02:00
Vladimir Sementsov-Ogievskiy	f7651539d8	block/nbd: nbd reconnect Implement reconnect. To achieve this: 1. add new modes: connecting-wait: means, that reconnecting is in progress, and there were small number of reconnect attempts, so all requests are waiting for the connection. connecting-nowait: reconnecting is in progress, there were a lot of attempts of reconnect, all requests will return errors. two old modes are used too: connected: normal state quit: exiting after fatal error or on close Possible transitions are: * -> quit connecting-* -> connected connecting-wait -> connecting-nowait (transition is done after reconnect-delay seconds in connecting-wait mode) connected -> connecting-wait 2. Implement reconnect in connection_co. So, in connecting-* mode, connection_co, tries to reconnect unlimited times. 3. Retry nbd queries on channel error, if we are in connecting-wait state. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20191009084158.15614-3-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2019-10-22 09:22:07 -05:00
Vladimir Sementsov-Ogievskiy	4dd09f6223	qcow2-bitmap: move bitmap reopen-rw code to qcow2_reopen_commit The only reason I can imagine for this strange code at the very-end of bdrv_reopen_commit is the fact that bs->read_only updated after calling drv->bdrv_reopen_commit in bdrv_reopen_commit. And in the same time, prior to previous commit, qcow2_reopen_bitmaps_rw did a wrong check for being writable, when actually it only need writable file child not self. So, as it's fixed, let's move things to correct place. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Acked-by: Max Reitz <mreitz@redhat.com> Message-id: 20190927122355.7344-10-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:53:28 -04:00
Vladimir Sementsov-Ogievskiy	f6333cbf8b	block/qcow2-bitmap: fix and improve qcow2_reopen_bitmaps_rw - Correct check for write access to file child, and in correct place (only if we want to write). - Support reopen rw -> rw (which will be used in following commit), for example, !bdrv_dirty_bitmap_readonly() is not a corruption if bitmap is marked IN_USE in the image. - Consider unexpected bitmap as a corruption and check other combinations of in-image and in-RAM bitmaps. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20190927122355.7344-9-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:53:28 -04:00
Vladimir Sementsov-Ogievskiy	644ddbb754	block/qcow2-bitmap: do not remove bitmaps on reopen-ro qcow2_reopen_bitmaps_ro wants to store bitmaps and then mark them all readonly. But the latter don't work, as qcow2_store_persistent_dirty_bitmaps removes bitmaps after storing. It's OK for inactivation but bad idea for reopen-ro. And this leads to the following bug: Assume we have persistent bitmap 'bitmap0'. Create external snapshot bitmap0 is stored and therefore removed Commit snapshot now we have no bitmaps Do some writes from guest () they are not marked in bitmap Shutdown Start bitmap0 is loaded as valid, but it is actually broken! It misses writes () Incremental backup it will be inconsistent So, let's stop removing bitmaps on reopen-ro. But don't rejoice: reopening bitmaps to rw is broken too, so the whole scenario will not work after this patch and we can't enable corresponding test cases in 260 iotests still. Reopening bitmaps rw will be fixed in the following patches. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20190927122355.7344-7-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:02:32 -04:00
Vladimir Sementsov-Ogievskiy	bd429a884c	block/qcow2-bitmap: drop qcow2_reopen_bitmaps_rw_hint() The function is unused, drop it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20190927122355.7344-6-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:02:32 -04:00
Vladimir Sementsov-Ogievskiy	f88676c149	block/qcow2-bitmap: get rid of bdrv_has_changed_persistent_bitmaps Firstly, no reason to optimize failure path. Then, function name is ambiguous: it checks for readonly and similar things, but someone may think that it will ignore normal bitmaps which was just unchanged, and this is in bad relation with the fact that we should drop IN_USE flag for unchanged bitmaps in the image. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20190927122355.7344-5-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:02:32 -04:00
Vladimir Sementsov-Ogievskiy	ef9041a7b8	block/dirty-bitmap: refactor bdrv_dirty_bitmap_next bdrv_dirty_bitmap_next is always used in same pattern. So, split it into _next and _first, instead of combining two functions into one and add FOR_EACH_DIRTY_BITMAP macro. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20190916141911.5255-5-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:02:32 -04:00
Vladimir Sementsov-Ogievskiy	1e63830160	block/dirty-bitmap: drop BdrvDirtyBitmap.mutex mutex field is just a pointer to bs->dirty_bitmap_mutex, so no needs to store it in BdrvDirtyBitmap when we have bs pointer in it (since previous patch). Drop mutex field. Constantly use bdrv_dirty_bitmaps_lock/unlock in block/dirty-bitmap.c to make it more obvious that it's not per-bitmap lock. Still, for simplicity, leave bdrv_dirty_bitmap_lock/unlock functions as an external API. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20190916141911.5255-4-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:02:32 -04:00

1 2 3 4 5 ...

4632 Commits