mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Kevin Wolf	aa269ff888	qcow2: Fix open with 'file' in iothread qcow2_open() doesn't work correctly when opening the 'file' child moves bs to an iothread, for several reasons: - It uses BDRV_POLL_WHILE() to wait for the qcow2_open_entry() coroutine, which involves dropping the AioContext lock for bs when it is not in the main context - but we don't hold it, so this crashes. - It runs the qcow2_open_entry() coroutine in the current thread instead of the new AioContext of bs. - qcow2_open_entry() doesn't notify the main loop when it's done. This patches fixes these issues around delegating work to a coroutine. Temporarily dropping the main AioContext lock is not necessary because we know we run in the main thread. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230525124713.401149-7-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-05-30 17:21:23 +02:00
Kevin Wolf	e3e31dc872	qcow2: Unlock the graph in qcow2_do_open() where necessary qcow2_do_open() calls a few no_co_wrappers that wrap functions taking the graph lock internally as a writer. Therefore, it can't hold the reader lock across these calls, it causes deadlocks. Drop the lock temporarily around the calls. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230510203601.418015-4-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-05-19 19:12:12 +02:00
Kevin Wolf	4db7ba3b87	block: Call .bdrv_co_create(_opts) unlocked These are functions that modify the graph, so they must be able to take a writer lock. This is impossible if they already hold the reader lock. If they need a reader lock for some of their operations, they should take it internally. Many of them go through blk_(), which will always take the lock itself. Direct calls of bdrv_() need to take the reader lock. Note that while locking for bdrv_co_() calls is checked by TSA, this is not the case for the mixed_coroutine_fns bdrv_(). Holding the lock is still required when they are called from coroutine context like here! This effectively reverts `4ec8df0183`, but adds some internal locking instead. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230510203601.418015-2-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-05-19 19:12:12 +02:00
Kevin Wolf	1a30b0f5d7	block: .bdrv_open is non-coroutine and unlocked Drivers were a bit confused about whether .bdrv_open can run in a coroutine and whether or not it holds a graph lock. It cannot keep a graph lock from the caller across the whole function because it both changes the graph (requires a writer lock) and does I/O (requires a reader lock). Therefore, it should take these locks internally as needed. The functions used to be called in coroutine context during image creation. This was buggy for other reasons, and as of commit `32192301`, all block drivers go through no_co_wrappers. So it is not called in coroutine context any more. Fix qcow2 and qed to work with the correct assumptions: The graph lock needs to be taken internally instead of just assuming it's already there, and the coroutine path is dead code that can be removed. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230504115750.54437-9-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-05-10 14:16:53 +02:00
Kevin Wolf	b2ab5f545f	block: bdrv/blk_co_unref() for calls in coroutine context These functions must not be called in coroutine context, because they need write access to the graph. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230504115750.54437-4-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-05-10 14:16:53 +02:00
Kevin Wolf	0050c163ff	qcow2: Don't call bdrv_getlength() in coroutine_fns There is a bdrv_co_getlength() now, which should be used in coroutine context. This requires adding GRAPH_RDLOCK to some functions so that this still compiles with TSA because bdrv_co_getlength() is GRAPH_RDLOCK. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230504115750.54437-2-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-05-10 14:16:53 +02:00
Paolo Bonzini	a39bae4ecd	qcow2: mark various functions as coroutine_fn and GRAPH_RDLOCK Functions that can do I/O (including calling bdrv_is_allocated and bdrv_block_status functions) are prime candidates for being coroutine_fns. Make the change for those that are themselves called only from coroutine_fns. Also annotate that they are called with the graph rdlock taken, thus allowing them to call bdrv_co_*() functions for I/O. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20230309084456.304669-9-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-04-25 13:17:28 +02:00
Kevin Wolf	4ec8df0183	block: Mark bdrv_co_create() and callers GRAPH_RDLOCK This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_create() need to hold a reader lock for the graph. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-17-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-02-23 19:49:23 +01:00
Emanuele Giuseppe Esposito	742bf09b20	block: Mark bdrv_co_copy_range() GRAPH_RDLOCK This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_copy_range() need to hold a reader lock for the graph. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-15-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-02-23 19:49:20 +01:00
Kevin Wolf	b9b10c35e5	block: Mark public read/write functions GRAPH_RDLOCK This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_pread/pwrite() need to hold a reader lock for the graph. For some places, we know that they will hold the lock, but we don't have the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock() with a FIXME comment. These places will be removed once everything is properly annotated. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-12-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-02-23 19:49:17 +01:00
Kevin Wolf	7b1fb72e2c	block: Mark read/write in block/io.c GRAPH_RDLOCK This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_driver_*() need to hold a reader lock for the graph. It doesn't add the annotation to public functions yet. For some places, we know that they will hold the lock, but we don't have the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock() with a FIXME comment. These places will be removed once everything is properly annotated. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-11-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-02-23 19:49:16 +01:00
Kevin Wolf	abaf8b750b	block: Mark bdrv_co_pwrite_zeroes() and callers GRAPH_RDLOCK This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_pwrite_zeroes() need to hold a reader lock for the graph. For some places, we know that they will hold the lock, but we don't have the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock() with a FIXME comment. These places will be removed once everything is properly annotated. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-10-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-02-23 19:49:14 +01:00
Kevin Wolf	c2b8e31516	block: Mark bdrv_co_truncate() and callers GRAPH_RDLOCK This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_co_truncate() need to hold a reader lock for the graph. For some places, we know that they will hold the lock, but we don't have the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock() with a FIXME comment. These places will be removed once everything is properly annotated. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230203152202.49054-4-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-02-23 19:49:03 +01:00
Kevin Wolf	ecbc57caba	qcow2: Fix open/create to open images with no_co_wrapper .bdrv_co_create implementations run in a coroutine, as does qcow2_do_open(). Therefore they are not allowed to open images directly. Fix the calls to use the corresponding no_co_wrappers instead. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230126172432.436111-7-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-02-17 11:22:19 +01:00
Emanuele Giuseppe Esposito	ca5e2ad98d	block: Rename bdrv_load/save_vmstate() to bdrv_co_load/save_vmstate() Since these functions always run in coroutine context, adjust their name to include "_co_", just like all other BlockDriver callbacks. No functional change intended. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230113204212.359076-15-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-02-01 16:52:32 +01:00
Emanuele Giuseppe Esposito	3d47eb0a2a	block: Convert bdrv_get_info() to co_wrapper_mixed bdrv_get_info() is categorized as an I/O function, and it currently doesn't run in a coroutine. We should let it take a graph rdlock since it traverses the block nodes graph, which however is only possible in a coroutine. Therefore turn it into a co_wrapper to move the actual function into a coroutine where the lock can be taken. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230113204212.359076-11-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-02-01 16:52:32 +01:00
Markus Armbruster	e2c1c34f13	include/block: Untangle inclusion loops We have two inclusion loops: block/block.h -> block/block-global-state.h -> block/block-common.h -> block/blockjob.h -> block/block.h block/block.h -> block/block-io.h -> block/block-common.h -> block/blockjob.h -> block/block.h I believe these go back to Emanuele's reorganization of the block API, merged a few months ago in commit `d7e2fe4aac`. Fortunately, breaking them is merely a matter of deleting unnecessary includes from headers, and adding them back in places where they are now missing. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221221133551.3967339-2-armbru@redhat.com>	2023-01-20 07:24:28 +01:00
Emanuele Giuseppe Esposito	2475a0d0f4	block: bdrv_create_file is a coroutine_fn It is always called in coroutine_fn callbacks, therefore it can directly call bdrv_co_create(). Rename it to bdrv_co_create_file too. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20221128142337.657646-9-eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-12-15 16:07:43 +01:00
Markus Armbruster	54fde4ff06	qapi block: Elide redundant has_FOO in generated C The has_FOO for pointer-valued FOO are redundant, except for arrays. They are also a nuisance to work with. Recent commit "qapi: Start to elide redundant has_FOO in generated C" provided the means to elide them step by step. This is the step for qapi/block*.json. Said commit explains the transformation in more detail. There is one instance of the invariant violation mentioned there: qcow2_signal_corruption() passes false, "" when node_name is an empty string. Take care to pass NULL then. The previous two commits cleaned up two more. Additionally, helper bdrv_latency_histogram_stats() loses its output parameters and returns a value instead. Cc: Kevin Wolf <kwolf@redhat.com> Cc: Hanna Reitz <hreitz@redhat.com> Cc: qemu-block@nongnu.org Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221104160712.3005652-11-armbru@redhat.com> [Fixes for #ifndef LIBRBD_SUPPORTS_ENCRYPTION and MacOS squashed in]	2022-12-14 20:03:25 +01:00
Alberto Faria	38505e2a14	qcow2: switch to _co_ functions Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-20-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-10-27 20:14:11 +02:00
Alberto Faria	014688a1b5	qcow2: add coroutine_fn annotation for indirect-called functions Signed-off-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221013123711.620631-10-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy	8393078032	block: introduce bdrv_open_file_child() helper Almost all drivers call bdrv_open_child() similarly. Let's create a helper for this. The only not updated drivers that call bdrv_open_child() to set bs->file are raw-format and snapshot-access: raw-format sometimes want to have filtered child but don't set drv->is_filter to true. snapshot-access wants only DATA \| PRIMARY Possibly we should implement drv->is_filter_func() handler, to consider raw-format as filter when it works as filter.. But it's another story. Note also, that we decrease assignments to bs->file in code: it helps us restrict modifying this field in further commit. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220726201134.924743-3-vsementsov@yandex-team.ru> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-10-27 20:14:11 +02:00
Paolo Bonzini	050ed2e736	qcow2: add missing coroutine_fn annotations Callers of coroutine_fn must be coroutine_fn themselves, or the call must be within "if (qemu_in_coroutine())". Apply coroutine_fn to functions where this holds. Reviewed-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220922084924.201610-15-pbonzini@redhat.com> [kwolf: Fixed up coding style] Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-10-07 12:11:40 +02:00
Hanna Reitz	ec64b1ca08	block/qcow2: Keep auto_backing_file if possible qcow2_do_open() is used by qcow2_co_invalidate_cache(), i.e. may be run on an image that has been opened before. When reading the backing file string from the image header, compare it against the existing bs->backing_file, and update bs->auto_backing_file only if they differ. auto_backing_file should ideally contain the filename the backing BDS will actually have after opening, i.e. a post-bdrv_refresh_filename() version of what is in the image header. So for example, if the image header reports the following backing file string: json:{"driver": "qcow2", "file": { "driver": "file", "filename": "/tmp/backing.qcow2" }} Then auto_backing_file should contain simply "/tmp/backing.qcow2". Because bdrv_refresh_filename() only works on existing BDSs, though, the way how we get this auto_backing_file value is to have the format driver set it to whatever is in the image header, and when the backing BDS is opened based on that, we update it with the filename the backing BDS actually got. However, qcow2's qcow2_co_invalidate_cache() implementation breaks this because it just resets auto_backing_file to whatever is in the image file without opening a BDS based on it, so we never get auto_backing_file back to the "refreshed" version, and in the example above, it would stay "json:{...}". Then, bs->backing->bs->filename will differ from bs->auto_backing_file, making bdrv_backing_overridden(bs) return true, which will lead bdrv_refresh_filename(bs) to generate a json:{} filename for bs, even though that may not have been necessary. This is reported in the issue linked below. Therefore, skip updating auto_backing_file if nothing has changed in the image header since we last read it. Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1117 Signed-off-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220803144446.20723-2-hreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-09-30 18:43:44 +02:00
lu zhipeng	38f034e728	qcow2: fix memory leak in qcow2_read_extensions Free feature_table if it is failed in bdrv_pread. Signed-off-by: lu zhipeng <luzhipeng@cestc.cn> Message-Id: <20220921144515.1166-1-luzhipeng@cestc.cn> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-09-30 18:02:30 +02:00
Alberto Faria	a9262f551e	block: Change blk_{pread,pwrite}() param order Swap 'buf' and 'bytes' around for consistency with blk_co_{pread,pwrite}(), and in preparation to implement these functions using generated_co_wrapper. Callers were updated using this Coccinelle script: @@ expression blk, offset, buf, bytes, flags; @@ - blk_pread(blk, offset, buf, bytes, flags) + blk_pread(blk, offset, bytes, buf, flags) @@ expression blk, offset, buf, bytes, flags; @@ - blk_pwrite(blk, offset, buf, bytes, flags) + blk_pwrite(blk, offset, bytes, buf, flags) It had no effect on hw/block/nand.c, presumably due to the #if, so that file was updated manually. Overly-long lines were then fixed by hand. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220705161527.1054072-4-afaria@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>	2022-07-12 12:14:56 +02:00
Alberto Faria	86da43220c	block/qcow2: Use bdrv_pwrite_sync() in qcow2_mark_dirty() Use bdrv_pwrite_sync() instead of calling bdrv_pwrite() and bdrv_flush() separately. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220609152744.3891847-11-afaria@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>	2022-07-12 12:14:56 +02:00
Alberto Faria	a8f0e83cef	block: Use bdrv_co_pwrite_sync() when caller is coroutine_fn Convert uses of bdrv_pwrite_sync() into bdrv_co_pwrite_sync() when the callers are already coroutine_fn. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220609152744.3891847-10-afaria@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>	2022-07-12 12:14:56 +02:00
Alberto Faria	757dda54b4	crypto: Make block callbacks return 0 on success They currently return the value of their headerlen/buflen parameter on success. Returning 0 instead makes it clear that short reads/writes are not possible. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220609152744.3891847-5-afaria@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>	2022-07-12 12:14:55 +02:00
Alberto Faria	353a5d84b2	block: Make bdrv_{pread,pwrite}() return 0 on success They currently return the value of their 'bytes' parameter on success. Make them return 0 instead, for consistency with other I/O functions and in preparation to implement them using generated_co_wrapper. This also makes it clear that short reads/writes are not possible. The few callers that rely on the previous behavior are adjusted accordingly by hand. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220609152744.3891847-4-afaria@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>	2022-07-12 12:14:55 +02:00
Alberto Faria	32cc71def9	block: Change bdrv_{pread,pwrite,pwrite_sync}() param order Swap 'buf' and 'bytes' around for consistency with bdrv_co_{pread,pwrite}(), and in preparation to implement these functions using generated_co_wrapper. Callers were updated using this Coccinelle script: @@ expression child, offset, buf, bytes, flags; @@ - bdrv_pread(child, offset, buf, bytes, flags) + bdrv_pread(child, offset, bytes, buf, flags) @@ expression child, offset, buf, bytes, flags; @@ - bdrv_pwrite(child, offset, buf, bytes, flags) + bdrv_pwrite(child, offset, bytes, buf, flags) @@ expression child, offset, buf, bytes, flags; @@ - bdrv_pwrite_sync(child, offset, buf, bytes, flags) + bdrv_pwrite_sync(child, offset, bytes, buf, flags) Resulting overly-long lines were then fixed by hand. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20220609152744.3891847-3-afaria@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>	2022-07-12 12:14:55 +02:00
Alberto Faria	53fb7844f0	block: Add a 'flags' param to bdrv_{pread,pwrite,pwrite_sync}() For consistency with other I/O functions, and in preparation to implement them using generated_co_wrapper. Callers were updated using this Coccinelle script: @@ expression child, offset, buf, bytes; @@ - bdrv_pread(child, offset, buf, bytes) + bdrv_pread(child, offset, buf, bytes, 0) @@ expression child, offset, buf, bytes; @@ - bdrv_pwrite(child, offset, buf, bytes) + bdrv_pwrite(child, offset, buf, bytes, 0) @@ expression child, offset, buf, bytes; @@ - bdrv_pwrite_sync(child, offset, buf, bytes) + bdrv_pwrite_sync(child, offset, buf, bytes, 0) Resulting overly-long lines were then fixed by hand. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20220609152744.3891847-2-afaria@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>	2022-07-12 12:14:55 +02:00
Hanna Reitz	06e9cd19a4	qcow2: Do not reopen data_file in invalidate_cache qcow2_co_invalidate_cache() closes and opens the qcow2 file, by calling qcow2_close() and qcow2_do_open(). These two functions must thus be usable from both a global-state and an I/O context. As they are, they are not safe to call in an I/O context, because they use bdrv_unref_child() and bdrv_open_child() to close/open the data_file child, respectively, both of which are global-state functions. When used from qcow2_co_invalidate_cache(), we do not need to close/open the data_file child, though (we do not do this for bs->file or bs->backing either), and so we should skip it in the qcow2_co_invalidate_cache() path. To do so, add a parameter to qcow2_do_open() and qcow2_close() to make them skip handling s->data_file, and have qcow2_co_invalidate_cache() exempt it from the memset() on the BDRVQcow2State. (Note that the QED driver similarly closes/opens the QED image by invoking bdrv_qed_close()+bdrv_qed_do_open(), but both functions seem safe to use in an I/O context.) Fixes: https://gitlab.com/qemu-project/qemu/-/issues/945 Signed-off-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220427114057.36651-3-hreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-05-04 15:55:23 +02:00
Peter Maydell	5df022cf2e	osdep: Move memalign-related functions to their own header Move the various memalign-related functions out of osdep.h and into their own header, which we include only where they are used. While we're doing this, add some brief documentation comments. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org	2022-03-07 13:16:49 +00:00
Vladimir Sementsov-Ogievskiy	083c24561a	qcow2: simple case support for downgrading of qcow2 images with zstd If image doesn't have any compressed cluster we can easily switch to zlib compression, which may allow to downgrade the image. That's mostly needed to support IMGOPTS='compression_type=zstd' in some iotests which do qcow2 downgrade. While being here also fix checkpatch complain against '#' in printf formatting. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20211223160144.1097696-13-vsementsov@virtuozzo.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>	2022-02-01 10:51:39 +01:00
Vladimir Sementsov-Ogievskiy	0c8022876f	block: use int64_t instead of int in driver discard handlers We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, convert driver discard handlers bytes parameter to int64_t. The only caller of all updated function is bdrv_co_pdiscard in block/io.c. It is already prepared to work with 64bit requests, but pass at most max(bs->bl.max_pdiscard, INT_MAX) to the driver. Let's look at all updated functions: blkdebug: all calculations are still OK, thanks to bdrv_check_qiov_request(). both rule_check and bdrv_co_pdiscard are 64bit blklogwrites: pass to blk_loc_writes_co_log which is 64bit blkreplay, copy-on-read, filter-compress: pass to bdrv_co_pdiscard, OK copy-before-write: pass to bdrv_co_pdiscard which is 64bit and to cbw_do_copy_before_write which is 64bit file-posix: one handler calls raw_account_discard() is 64bit and both handlers calls raw_do_pdiscard(). Update raw_do_pdiscard, which pass to RawPosixAIOData::aio_nbytes, which is 64bit (and calls raw_account_discard()) gluster: somehow, third argument of glfs_discard_async is size_t. Let's set max_pdiscard accordingly. iscsi: iscsi_allocmap_set_invalid is 64bit, !is_byte_request_lun_aligned is 64bit. list.num is uint32_t. Let's clarify max_pdiscard and pdiscard_alignment. mirror_top: pass to bdrv_mirror_top_do_write() which is 64bit nbd: protocol limitation. max_pdiscard is alredy set strict enough, keep it as is for now. nvme: buf.nlb is uint32_t and we do shift. So, add corresponding limits to nvme_refresh_limits(). preallocate: pass to bdrv_co_pdiscard() which is 64bit. rbd: pass to qemu_rbd_start_co() which is 64bit. qcow2: calculations are still OK, thanks to bdrv_check_qiov_request(), qcow2_cluster_discard() is 64bit. raw-format: raw_adjust_offset() is 64bit, bdrv_co_pdiscard too. throttle: pass to bdrv_co_pdiscard() which is 64bit and to throttle_group_co_io_limits_intercept() which is 64bit as well. test-block-iothread: bytes argument is unused Great! Now all drivers are prepared to handle 64bit discard requests, or else have explicit max_pdiscard limits. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210903102807.27127-11-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-09-29 13:46:32 -05:00
Vladimir Sementsov-Ogievskiy	f34b2bcf8c	block: use int64_t instead of int in driver write_zeroes handlers We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, convert driver write_zeroes handlers bytes parameter to int64_t. The only caller of all updated function is bdrv_co_do_pwrite_zeroes(). bdrv_co_do_pwrite_zeroes() itself is of course OK with widening of callee parameter type. Also, bdrv_co_do_pwrite_zeroes()'s max_write_zeroes is limited to INT_MAX. So, updated functions all are safe, they will not get "bytes" larger than before. Still, let's look through all updated functions, and add assertions to the ones which are actually unprepared to values larger than INT_MAX. For these drivers also set explicit max_pwrite_zeroes limit. Let's go: blkdebug: calculations can't overflow, thanks to bdrv_check_qiov_request() in generic layer. rule_check() and bdrv_co_pwrite_zeroes() both have 64bit argument. blklogwrites: pass to blk_log_writes_co_log() with 64bit argument. blkreplay, copy-on-read, filter-compress: pass to bdrv_co_pwrite_zeroes() which is OK copy-before-write: Calls cbw_do_copy_before_write() and bdrv_co_pwrite_zeroes, both have 64bit argument. file-posix: both handler calls raw_do_pwrite_zeroes, which is updated. In raw_do_pwrite_zeroes() calculations are OK due to bdrv_check_qiov_request(), bytes go to RawPosixAIOData::aio_nbytes which is uint64_t. Check also where that uint64_t gets handed: handle_aiocb_write_zeroes_block() passes a uint64_t[2] to ioctl(BLKZEROOUT), handle_aiocb_write_zeroes() calls do_fallocate() which takes off_t (and we compile to always have 64-bit off_t), as does handle_aiocb_write_zeroes_unmap. All look safe. gluster: bytes go to GlusterAIOCB::size which is int64_t and to glfs_zerofill_async works with off_t. iscsi: Aha, here we deal with iscsi_writesame16_task() that has uint32_t num_blocks argument and iscsi_writesame16_task() has uint16_t argument. Make comments, add assertions and clarify max_pwrite_zeroes calculation. iscsi_allocmap_() functions already has int64_t argument is_byte_request_lun_aligned is simple to update, do it. mirror_top: pass to bdrv_mirror_top_do_write which has uint64_t argument nbd: Aha, here we have protocol limitation, and NBDRequest::len is uint32_t. max_pwrite_zeroes is cleanly set to 32bit value, so we are OK for now. nvme: Again, protocol limitation. And no inherent limit for write-zeroes at all. But from code that calculates cdw12 it's obvious that we do have limit and alignment. Let's clarify it. Also, obviously the code is not prepared to handle bytes=0. Let's handle this case too. trace events already 64bit preallocate: pass to handle_write() and bdrv_co_pwrite_zeroes(), both 64bit. rbd: pass to qemu_rbd_start_co() which is 64bit. qcow2: offset + bytes and alignment still works good (thanks to bdrv_check_qiov_request()), so tail calculation is OK qcow2_subcluster_zeroize() has 64bit argument, should be OK trace events updated qed: qed_co_request wants int nb_sectors. Also in code we have size_t used for request length which may be 32bit. So, let's just keep INT_MAX as a limit (aligning it down to pwrite_zeroes_alignment) and don't care. raw-format: Is OK. raw_adjust_offset and bdrv_co_pwrite_zeroes are both 64bit. throttle: Both throttle_group_co_io_limits_intercept() and bdrv_co_pwrite_zeroes() are 64bit. vmdk: pass to vmdk_pwritev which is 64bit quorum: pass to quorum_co_pwritev() which is 64bit Hooray! At this point all block drivers are prepared to support 64bit write-zero requests, or have explicitly set max_pwrite_zeroes. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210903102807.27127-8-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: use <= rather than < in assertions relying on max_pwrite_zeroes] Signed-off-by: Eric Blake <eblake@redhat.com>	2021-09-29 13:46:32 -05:00
Vladimir Sementsov-Ogievskiy	485350497b	block: use int64_t instead of uint64_t in copy_range driver handlers We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, convert driver copy_range handlers parameters which are already 64bit to signed type. Now let's consider all callers. Simple git grep '\->bdrv_co_copy_range' shows the only caller: bdrv_co_copy_range_internal(), which does bdrv_check_request32(), so everything is OK. Still, the functions may be called directly, not only by drv->... Let's check: git grep '\.bdrv_co_copy_range_$from\\|to$\s*=' \| \ awk '{print $4}' \| sed 's/,//' \| sed 's/&//' \| sort \| uniq \| \ while read func; do git grep "$func(" \| \ grep -v "$func(BlockDriverState"; done shows no more callers. So, we are done. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210903102807.27127-6-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-09-29 13:46:32 -05:00
Vladimir Sementsov-Ogievskiy	e75abedab7	block: use int64_t instead of uint64_t in driver write handlers We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, convert driver write handlers parameters which are already 64bit to signed type. While being here, convert also flags parameter to be BdrvRequestFlags. Now let's consider all callers. Simple git grep '\->bdrv_$aio\\|co$_pwritev$_part$\?' shows that's there three callers of driver function: bdrv_driver_pwritev() and bdrv_driver_pwritev_compressed() in block/io.c, both pass int64_t, checked by bdrv_check_qiov_request() to be non-negative. qcow2_save_vmstate() does bdrv_check_qiov_request(). Still, the functions may be called directly, not only by drv->... Let's check: git grep '\.bdrv_$aio\\|co$_pwritev$_part$\?\s*=' \| \ awk '{print $4}' \| sed 's/,//' \| sed 's/&//' \| sort \| uniq \| \ while read func; do git grep "$func(" \| \ grep -v "$func(BlockDriverState"; done shows several callers: qcow2: qcow2_co_truncate() write at most up to @offset, which is checked in generic qcow2_co_truncate() by bdrv_check_request(). qcow2_co_pwritev_compressed_task() pass the request (or part of the request) that already went through normal write path, so it should be OK qcow: qcow_co_pwritev_compressed() pass int64_t, it's updated by this patch quorum: quorum_co_pwrite_zeroes() pass int64_t and int - OK throttle: throttle_co_pwritev_compressed() pass int64_t, it's updated by this patch vmdk: vmdk_co_pwritev_compressed() pass int64_t, it's updated by this patch Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210903102807.27127-5-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-09-29 13:46:31 -05:00
Vladimir Sementsov-Ogievskiy	f7ef38dd13	block: use int64_t instead of uint64_t in driver read handlers We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, convert driver read handlers parameters which are already 64bit to signed type. While being here, convert also flags parameter to be BdrvRequestFlags. Now let's consider all callers. Simple git grep '\->bdrv_$aio\\|co$_preadv$_part$\?' shows that's there three callers of driver function: bdrv_driver_preadv() in block/io.c, passes int64_t, checked by bdrv_check_qiov_request() to be non-negative. qcow2_load_vmstate() does bdrv_check_qiov_request(). do_perform_cow_read() has uint64_t argument. And a lot of things in qcow2 driver are uint64_t, so converting it is big job. But we must not work with requests that don't satisfy bdrv_check_qiov_request(), so let's just assert it here. Still, the functions may be called directly, not only by drv->... Let's check: git grep '\.bdrv_$aio\\|co$_preadv$_part$\?\s*=' \| \ awk '{print $4}' \| sed 's/,//' \| sed 's/&//' \| sort \| uniq \| \ while read func; do git grep "$func(" \| \ grep -v "$func(BlockDriverState"; done The only one such caller: QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, &data, 1); ... ret = bdrv_replace_test_co_preadv(bs, 0, 1, &qiov, 0); in tests/unit/test-bdrv-drain.c, and it's OK obviously. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210903102807.27127-4-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: fix typos] Signed-off-by: Eric Blake <eblake@redhat.com>	2021-09-29 13:46:31 -05:00
Vladimir Sementsov-Ogievskiy	558902cc3d	qcow2: check request on vmstate save/load path We modify the request by adding an offset to vmstate. Let's check the modified request. It will help us to safely move .bdrv_co_preadv_part and .bdrv_co_pwritev_part to int64_t type of offset and bytes. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20210903102807.27127-3-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-09-29 13:46:31 -05:00
Vladimir Sementsov-Ogievskiy	a6e098462b	qcow2: introduce qcow2_parse_compressed_l2_entry() helper Add helper to parse compressed l2_entry and use it everywhere instead of open-coding. Note, that in most places we move to precise coffset/csize instead of sector-aligned. Still it should work good enough for updating refcounts. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20210914122454.141075-4-vsementsov@virtuozzo.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>	2021-09-15 18:42:38 +02:00
Vladimir Sementsov-Ogievskiy	9a3978a46b	qcow2: compressed read: simplify cluster descriptor passing Let's pass the whole L2 entry and not bother with L2E_COMPRESSED_OFFSET_SIZE_MASK. It also helps further refactoring that adds generic qcow2_parse_compressed_l2_entry() helper. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20210914122454.141075-3-vsementsov@virtuozzo.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>	2021-09-15 18:42:38 +02:00
Kevin Wolf	bcfd86d6a6	qcow2: Fix dangling pointer after reopen for 'file' Without an external data file, s->data_file is a second pointer with the same value as bs->file. When changing bs->file to a different BdrvChild and freeing the old BdrvChild, s->data_file must also be updated, otherwise it points to freed memory and causes crashes. This problem was caught by iotests case 245. Fixes: df2b7086f169239ebad5d150efa29c9bb6d4f820 Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20210708114709.206487-2-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-07-09 13:19:11 +02:00
Eric Blake	5a385bf5c5	qcow2: Prohibit backing file changes in 'qemu-img amend' This was deprecated back in `bc5ee6da7` (qcow2: Deprecate use of qemu-img amend to change backing file), and no one in the meantime has given any reasons why it should be supported. Time to make change attempts a hard error (but for convenience, specifying the _same_ backing chain is not forbidden). Update a couple of iotests to match. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20210503213600.569128-2-eblake@redhat.com> Reviewed-by: Connor Kuehl <ckuehl@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-07-09 12:26:05 +02:00
Vladimir Sementsov-Ogievskiy	307261b243	block: consistently use bdrv_is_read_only() It's better to use accessor function instead of bs->read_only directly. In some places use bdrv_is_writable() instead of checking both BDRV_O_RDWR set and BDRV_O_INACTIVE not set. In bdrv_open_common() it's a bit strange to add one more variable, but we are going to drop bs->read_only in the next patch, so new ro local variable substitutes it here. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210527154056.70294-2-vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-06-02 14:23:20 +02:00
Vladimir Sementsov-Ogievskiy	38b4409647	qcow2: set bdi->is_dirty Set bdi->is_dirty, so that qemu-img info could show dirty flag. After this commit the following check will show '"dirty-flag": true': ./build/qemu-img create -f qcow2 -o lazy_refcounts=on x 1M ./build/qemu-io x qemu-io> write 0 1M After "write" command success, kill the qemu-io process: kill -9 <qemu-io pid> ./build/qemu-img info --output=json x This will show '"dirty-flag": true' among other things. (before this commit it shows '"dirty-flag": false') Note, that qcow2's dirty-bit is not a "dirty bit for the image". It only protects qcow2 lazy refcounts feature. So, there are a lot of conditions when qcow2 session may be not closed correctly, but bit is 0. Still, when bit is set, the last session is definitely not finished correctly and it's better to report it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210504160656.462836-1-vsementsov@virtuozzo.com> Tested-by: Kirill Tkhai <ktkhai@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-05-18 11:08:13 +02:00
Max Reitz	484108293d	qcow2: Force preallocation with data-file-raw Setting the qcow2 data-file-raw bit means that you can ignore the qcow2 metadata when reading from the external data file. It does not mean that you have to ignore it, though. Therefore, the data read must be the same regardless of whether you interpret the metadata or whether you ignore it, and thus the L1/L2 tables must all be present and give a 1:1 mapping. This patch changes 244's output: First, the qcow2 file is larger right after creation, because of metadata preallocation. Second, the qemu-img map output changes: Everything that was not explicitly discarded or zeroed is now a data area. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20210326145509.163455-2-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2021-03-30 13:02:10 +02:00
Pavel Dovgalyuk	ad0ce64279	qcow2: use external virtual timers Regular virtual timers are used to emulate timings related to vCPU and peripheral states. QCOW2 uses timers to clean the cache. These timers should have external flag. In the opposite case they affect the execution and it can't be recorded and replayed. This patch adds external flag to the timer for qcow2 cache clean. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <161700516327.1141158.8366564693714562536.stgit@pasha-ThinkPad-X280> Signed-off-by: Max Reitz <mreitz@redhat.com>	2021-03-29 18:04:29 +02:00
Vladimir Sementsov-Ogievskiy	1184b41101	block/qcow2: refactor qcow2_update_options_prepare error paths Keep setting ret close to setting errp and don't merge different error paths into one. This way it's more obvious that we don't return error without setting errp. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-Id: <20210202124956.63146-15-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-03-08 16:04:46 -06:00

1 2 3 4 5 ...

739 Commits