There is a bug in the blklogwrites driver pertaining to logging "write
zeroes" operations, causing log corruption. This can be easily observed
by setting detect-zeroes to something other than "off" for the driver.
The issue is caused by a concurrency bug pertaining to the fact that
"write zeroes" operations have to be logged in two parts: first the log
entry metadata, then the zeroed-out region. While the log entry
metadata is being written by bdrv_co_pwritev(), another operation may
begin in the meanwhile and modify the state of the blklogwrites driver.
This is as intended by the coroutine-driven I/O model in QEMU, of
course.
Unfortunately, this specific scenario is mishandled. A short example:
1. Initially, in the current operation (#1), the current log sector
number in the driver state is only incremented by the number of sectors
taken by the log entry metadata, after which the log entry metadata is
written. The current operation yields.
2. Another operation (#2) may start while the log entry metadata is
being written. It uses the current log position as the start offset for
its log entry. This is in the sector right after the operation #1 log
entry metadata, which is bad!
3. After bdrv_co_pwritev() returns (#1), the current log sector
number is reread from the driver state in order to find out the start
offset for bdrv_co_pwrite_zeroes(). This is an obvious blunder, as the
offset will be the sector right after the (misplaced) operation #2 log
entry, which means that the zeroed-out region begins at the wrong
offset.
4. As a result of the above, the log is corrupt.
Fix this by only reading the driver metadata once, computing the
offsets and sizes in one go (including the optional zeroed-out region)
and setting the log sector number to the appropriate value for the next
operation in line.
Signed-off-by: Ari Sundholm <ari@tuxera.com>
Cc: qemu-stable@nongnu.org
Message-ID: <20240109184646.1128475-1-megari@gmx.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit a9c8ea9547)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Using fleecing backup like in [0] on a qcow2 image (with metadata
preallocation) can lead to the following assertion failure:
> bdrv_co_do_block_status: Assertion `!(ret & BDRV_BLOCK_ZERO)' failed.
In the reproducer [0], it happens because the BDRV_BLOCK_RECURSE flag
will be set by the qcow2 driver, so the caller will recursively check
the file child. Then the BDRV_BLOCK_ZERO set too. Later up the call
chain, in bdrv_co_do_block_status() for the snapshot-access driver,
the assertion failure will happen, because both flags are set.
To fix it, clear the recurse flag after the recursive check was done.
In detail:
> #0 qcow2_co_block_status
Returns 0x45 = BDRV_BLOCK_RECURSE | BDRV_BLOCK_DATA |
BDRV_BLOCK_OFFSET_VALID.
> #1 bdrv_co_do_block_status
Because of the data flag, bdrv_co_do_block_status() will now also set
BDRV_BLOCK_ALLOCATED. Because of the recurse flag,
bdrv_co_do_block_status() for the bdrv_file child will be called,
which returns 0x16 = BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_OFFSET_VALID |
BDRV_BLOCK_ZERO. Now the return value inherits the zero flag.
Returns 0x57 = BDRV_BLOCK_RECURSE | BDRV_BLOCK_DATA |
BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_ZERO.
> #2 bdrv_co_common_block_status_above
> #3 bdrv_co_block_status_above
> #4 bdrv_co_block_status
> #5 cbw_co_snapshot_block_status
> #6 bdrv_co_snapshot_block_status
> #7 snapshot_access_co_block_status
> #8 bdrv_co_do_block_status
Return value is propagated all the way up to here, where the assertion
failure happens, because BDRV_BLOCK_RECURSE and BDRV_BLOCK_ZERO are
both set.
> #9 bdrv_co_common_block_status_above
> #10 bdrv_co_block_status_above
> #11 block_copy_block_status
> #12 block_copy_dirty_clusters
> #13 block_copy_common
> #14 block_copy_async_co_entry
> #15 coroutine_trampoline
[0]:
> #!/bin/bash
> rm /tmp/disk.qcow2
> ./qemu-img create /tmp/disk.qcow2 -o preallocation=metadata -f qcow2 1G
> ./qemu-img create /tmp/fleecing.qcow2 -f qcow2 1G
> ./qemu-img create /tmp/backup.qcow2 -f qcow2 1G
> ./qemu-system-x86_64 --qmp stdio \
> --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/disk.qcow2 \
> --blockdev qcow2,node-name=node1,file.driver=file,file.filename=/tmp/fleecing.qcow2 \
> --blockdev qcow2,node-name=node2,file.driver=file,file.filename=/tmp/backup.qcow2 \
> <<EOF
> {"execute": "qmp_capabilities"}
> {"execute": "blockdev-add", "arguments": { "driver": "copy-before-write", "file": "node0", "target": "node1", "node-name": "node3" } }
> {"execute": "blockdev-add", "arguments": { "driver": "snapshot-access", "file": "node3", "node-name": "snap0" } }
> {"execute": "blockdev-backup", "arguments": { "device": "snap0", "target": "node1", "sync": "full", "job-id": "backup0" } }
> EOF
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-id: 20240116154839.401030-1-f.ebner@proxmox.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 8a9be79924)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
bdrv_is_read_only() only checks if the node is configured to be
read-only eventually, but even if it returns false, writing to the node
may not be permitted at the moment (because it's inactive).
bdrv_is_writable() checks that the node can be written to right now, and
this is what the snapshot operations really need.
Change bdrv_can_snapshot() to use bdrv_is_writable() to fix crashes like
the following:
$ ./qemu-system-x86_64 -hda /tmp/test.qcow2 -loadvm foo -incoming defer
qemu-system-x86_64: ../block/io.c:1990: int bdrv_co_write_req_prepare(BdrvChild *, int64_t, int64_t, BdrvTrackedRequest *, int): Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
The resulting error message after this patch isn't perfect yet, but at
least it doesn't crash any more:
$ ./qemu-system-x86_64 -hda /tmp/test.qcow2 -loadvm foo -incoming defer
qemu-system-x86_64: Device 'ide0-hd0' is writable but does not support snapshots
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231201142520.32255-2-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit d3007d348a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If the text description file is larger than DESC_SIZE, we force the last
byte in the buffer to be 0 and write it out.
This results in a corruption.
Try to allocate a big buffer in this case.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1923
Signed-off-by: Fam Zheng <fam@euphon.net>
Message-ID: <20231124115654.3239137-1-fam@euphon.net>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 9fb7b350ba)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: fixups in block/vmdk.c due to missing-in-7.2 v8.0.0-2084-g28944f99c4
"vmdk: mark more functions as coroutine_fns and GRAPH_RDLOCK")
NVMeQueuePair::reqs has length NVME_NUM_REQS, which less than
NVME_QUEUE_SIZE by 1.
Fixes: 1086e95da1 ("block/nvme: switch to a NVMeRequest freelist")
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Maksim Davydov <davydov-max@yandex-team.ru>
Message-id: 20231017125941.810461-5-vsementsov@yandex-team.ru
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit cc8fb0c3ae)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
hmp_commit() calls blk_is_available() from a non-coroutine context (and
in the main loop). blk_is_available() is a co_wrapper_mixed_bdrv_rdlock
function, and in the non-coroutine context it calls AIO_WAIT_WHILE(),
which crashes if the aio_context lock is not taken before.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1615
Signed-off-by: Wang Liang <wangliangzz@inspur.com>
Message-Id: <20230424103902.45265-1-wangliangzz@126.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 8c1e8fb2e7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The corruption occurs when a BAT entry aligned to 4096 bytes is changed.
Specifically, the corruption occurs during the creation of the LOG Data
Descriptor. The incorrect behavior involves copying 4088 bytes from the
original 4096 bytes aligned offset to `tmp[8..4096]` and then copying
the new value for the first BAT entry to the beginning `tmp[0..8]`.
This results in all existing BAT entries inside the 4K region being
incorrectly moved by 8 bytes and the last entry being lost.
This bug did not cause noticeable corruption when only sequentially
writing once to an empty dynamic VHDX (e.g.
using `qemu-img convert -O vhdx -o subformat=dynamic ...`), but it
still resulted in invalid values for the (unused) Sector Bitmap BAT
entries.
Importantly, this corruption would only become noticeable after the
corrupted BAT is re-read from the file.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/727
Cc: qemu-stable@nongnu.org
Signed-off-by: Lukas Tschoke <lukts330@gmail.com>
Message-Id: <6cfb6d6b-adc5-7772-c8a5-6bae9a0ad668@gmail.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 8af037fe4c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
* 7.55.0 deprecates CURLINFO_CONTENT_LENGTH_DOWNLOAD in favour of a *_T
version, which returns curl_off_t instead of a double.
* 7.85.0 deprecates CURLOPT_PROTOCOLS and CURLOPT_REDIR_PROTOCOLS in
favour of *_STR variants, specifying the desired protocols via a
string.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1440
Signed-off-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230123201431.23118-1-anjo@rev.ng>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit e7b8d9d038)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit 8c460269aa ("iscsi: base all handling of check condition on
scsi_sense_to_errno", 2019-07-15) removed a "goto out" so that the
same coroutine is re-entered twice; once from iscsi_co_generic_cb,
once from the timer callback iscsi_retry_timer_expired. This can
cause a crash.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1378
Reported-by: Grzegorz Zdanowski <https://gitlab.com/kiler129>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5080152e2e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When a write request is converted into a write zeroes request by the
detect-zeroes= feature, it is no longer associated with an I/O buffer.
The BDRV_REQ_REGISTERED_BUF flag doesn't make sense without an I/O
buffer and must be cleared because bdrv_co_do_pwrite_zeroes() fails with
-EINVAL when it's set.
Fiona Ebner <f.ebner@proxmox.com> bisected and diagnosed this QEMU 7.2
regression where writes containing zeroes to a blockdev with
discard=unmap,detect-zeroes=unmap fail.
Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1404
Fixes: e8b6535533 ("block: add BDRV_REQ_REGISTERED_BUF request flag")
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230207203719.242926-2-stefanha@redhat.com>
(cherry picked from commit 3c5867156e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In order to write the bitmap table to the image file, it is converted to
big endian. If the write fails, it is passed to clear_bitmap_table() to
free all of the clusters it had allocated before. However, if we don't
convert it back to native endianness first, we'll free things at a wrong
offset.
In practical terms, the offsets will be so high that we won't actually
free any allocated clusters, but just run into an error, but in theory
this can cause image corruption.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230112191454.169353-2-kwolf@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b03dd9613b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
bdrv_*() APIs expect a valid BlockDriverState. Calling them with bs=NULL
leads to undefined behavior.
Jonathan Cameron reported this following NULL pointer dereference when a
VM with a virtio-blk device and a memory-backend-file object is
terminated:
1. qemu_cleanup() closes all drives, setting blk->root to NULL
2. qemu_cleanup() calls user_creatable_cleanup(), which results in a RAM
block notifier callback because the memory-backend-file is destroyed.
3. blk_unregister_buf() is called by virtio-blk's BlockRamRegistrar
notifier callback and undefined behavior occurs.
Fixes: baf422684d ("virtio-blk: use BDRV_REQ_REGISTERED_BUF optimization hint")
Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221121211923.1993171-1-stefanha@redhat.com>
bdrv_parent_drained_{begin,end}_single() are supposed to operate on the
parent, not on the child, so they should not attempt to get the context
to poll from the child but the parent instead. BDRV_POLL_WHILE(c->bs)
does get the context from the child, so we should replace it with
AIO_WAIT_WHILE() on the parent's context instead.
This problem becomes apparent when bdrv_replace_child_noperm() invokes
bdrv_parent_drained_end_single() after removing a child from a subgraph
that is in an I/O thread. By the time bdrv_parent_drained_end_single()
is called, child->bs is NULL, and so BDRV_POLL_WHILE(c->bs, ...) will
poll the main loop instead of the I/O thread; but anything that
bdrv_parent_drained_end_single_no_poll() may have scheduled is going to
want to run in the I/O thread, but because we poll the main loop, the
I/O thread is never unpaused, and nothing is run, resulting in a
deadlock.
Closes: https://gitlab.com/qemu-project/qemu/-/issues/1215
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221107151321.211175-4-hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
blk_get_aio_context() asserts that blk->ctx is always equal to the root
BDS's context (if there is a root BDS). Therefore,
blk_do_set_aio_context() must update blk->ctx immediately after the root
BDS's context has changed.
Without this patch, the next patch would break iotest 238, because
bdrv_drained_begin() (called by blk_do_set_aio_context()) may then
invoke bdrv_child_get_parent_aio_context() on the root child, i.e.
blk_get_aio_context(). However, by this point, blk->ctx would not have
been updated and thus differ from the root node's context. This patch
fixes that.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221107151321.211175-3-hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We want to use bdrv_child_get_parent_aio_context() from
bdrv_parent_drained_{begin,end}_single(), both of which are "I/O or GS"
functions.
Prior to 3ed4f708fe, all the implementations were I/O code anyway.
3ed4f708fe has put block jobs' AioContext field under the job mutex, so
to make child_job_get_parent_aio_context() work in an I/O context, we
need to take that lock there.
Furthermore, blk_root_get_parent_aio_context() is not marked as
anything, but is safe to run in an I/O context, so mark it that way now.
(blk_get_aio_context() is an I/O code function.)
With that done, all implementations explicitly are I/O code, so we can
mark bdrv_child_get_parent_aio_context() as I/O code, too, so callers
know it is safe to run from both GS and I/O contexts.
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221107151321.211175-2-hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Setting it to true can cause the device size to be queried from libblkio
in otherwise fast paths, degrading performance. Set it to false and
require users to refresh the device size explicitly instead.
Fixes: 4c8f4fda05 ("block/blkio: Tolerate device size changes")
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Alberto Faria <afaria@redhat.com>
Message-Id: <20221108144433.1334074-1-afaria@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is a small gap in mirror_start_job() before putting the mirror
filter node into the block graph (bdrv_append() call) and the actual job
being created. Before the job is created, MirrorBDSOpaque.job is NULL.
It is possible that requests come in when bdrv_drained_end() is called,
and those requests would see MirrorBDSOpaque.job == NULL. Have our
filter node handle that case gracefully.
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221109165452.67927-4-hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
mirror_wait_for_free_in_flight_slot() is the only remaining user of
mirror_wait_for_any_operation(), so inline the latter into the former.
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221109165452.67927-3-hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Waiting for all active writes to settle before daring to create a
background copying operation means that we will never do background
operations while the guest does anything (in write-blocking mode), and
therefore cannot converge. Yes, we also will not diverge, but actually
converging would be even nicer.
It is unclear why we did decide to wait for all active writes to settle
before creating a background operation, but it just does not seem
necessary. Active writes will put themselves into the in_flight bitmap
and thus properly block actually conflicting background requests.
It is important for active requests to wait on overlapping background
requests, which we do in active_write_prepare(). However, so far it was
not documented why it is important. Add such documentation now, and
also to the other call of mirror_wait_on_conflicts(), so that it becomes
more clear why and when requests need to actively wait for other
requests to settle.
Another thing to note is that of course we need to ensure that there are
no active requests when the job completes, but that is done by virtue of
the BDS being drained anyway, so there cannot be any active requests at
that point.
With this change, we will need to explicitly keep track of how many
bytes are in flight in active requests so that
job_progress_set_remaining() in mirror_run() can set the correct number
of remaining bytes.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2123297
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221109165452.67927-2-hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220929093035.4231-5-cfontana@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
improve error handling during module load, by changing:
bool module_load(const char *prefix, const char *lib_name);
void module_load_qom(const char *type);
to:
int module_load(const char *prefix, const char *name, Error **errp);
int module_load_qom(const char *type, Error **errp);
where the return value is:
-1 on module load error, and errp is set with the error
0 on module or one of its dependencies are not installed
1 on module load success
2 on module load success (module already loaded or built-in)
module_load_qom_one has been introduced in:
commit 28457744c3 ("module: qom module support"), which built on top of
module_load_one, but discarded the bool return value. Restore it.
Adapt all callers to emit errors, or ignore them, or fail hard,
as appropriate in each context.
Replace the previous emission of errors via fprintf in _some_ error
conditions with Error and error_report, so as to emit to the appropriate
target.
A memory leak is also fixed as part of the module_load changes.
audio: when attempting to load an audio module, report module load errors.
Note that still for some callers, a single issue may generate multiple
error reports, and this could be improved further.
Regarding the audio code itself, audio_add() seems to ignore errors,
and this should probably be improved.
block: when attempting to load a block module, report module load errors.
For the code paths that already use the Error API, take advantage of those
to report module load errors into the Error parameter.
For the other code paths, we currently emit the error, but this could be
improved further by adding Error parameters to all possible code paths.
console: when attempting to load a display module, report module load errors.
qdev: when creating a new qdev Device object (DeviceState), report load errors.
If a module cannot be loaded to create that device, now abort execution
(if no CONFIG_MODULE) or exit (if CONFIG_MODULE).
qom/object.c: when initializing a QOM object, or looking up class_by_name,
report module load errors.
qtest: when processing the "module_load" qtest command, report errors
in the load of the module.
Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220929093035.4231-4-cfontana@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Note that we're still discussing "block/blkio: Make driver nvme-io_uring take a
"path" instead of a "filename"". I have sent the pull request now so everything
is ready for the soft freeze tomorrow if we decide to go ahead with the patch.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmNgGQkACgkQnKSrs4Gr
c8hLFgf/dnszoHO02hjoJCN2LPAxDalyYKzog+ZU8U5VdzJn2gione1jVlf3Xb0l
mhTgrioSbKLKXavGZTSwWUki/xRgCJMtG3m07EFmMsLX0QiSOIyzLr0DslQawYdZ
FlXyCCyAVTUILz7oUXBqORlfTKsGPHms6nlXQYhitTOsDbPyqbT9nNPKAlfGkqfj
Pwn+oWJmjLC0aARpcrB1bXCMbqQrtZGh4bBgfIXRUJmprWqk227bkFvXNCuXU16x
PC4oH552+6nyQyRxGpHc3o1W/8gqlxU9DTBb5arDUQaDvsDTKVkuGe2HdDI7knAT
/m57/BFVUnA35SYOxX+0piiEbawI6Q==
=UWL7
-----END PGP SIGNATURE-----
Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging
Pull request
Note that we're still discussing "block/blkio: Make driver nvme-io_uring take a
"path" instead of a "filename"". I have sent the pull request now so everything
is ready for the soft freeze tomorrow if we decide to go ahead with the patch.
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmNgGQkACgkQnKSrs4Gr
# c8hLFgf/dnszoHO02hjoJCN2LPAxDalyYKzog+ZU8U5VdzJn2gione1jVlf3Xb0l
# mhTgrioSbKLKXavGZTSwWUki/xRgCJMtG3m07EFmMsLX0QiSOIyzLr0DslQawYdZ
# FlXyCCyAVTUILz7oUXBqORlfTKsGPHms6nlXQYhitTOsDbPyqbT9nNPKAlfGkqfj
# Pwn+oWJmjLC0aARpcrB1bXCMbqQrtZGh4bBgfIXRUJmprWqk227bkFvXNCuXU16x
# PC4oH552+6nyQyRxGpHc3o1W/8gqlxU9DTBb5arDUQaDvsDTKVkuGe2HdDI7knAT
# /m57/BFVUnA35SYOxX+0piiEbawI6Q==
# =UWL7
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 31 Oct 2022 14:50:49 EDT
# gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [ultimate]
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [ultimate]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* tag 'block-pull-request' of https://gitlab.com/stefanha/qemu:
block/blkio: Make driver nvme-io_uring take a "path" instead of a "filename"
block/blkio: Tolerate device size changes
block/blkio: Add virtio-blk-vfio-pci BlockDriver
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
There is a difference in the mkdir() call for win32 and non-win32
platforms, and currently is handled in the codes with #ifdefs.
glib provides a portable g_mkdir() API and we can use it to unify
the codes without #ifdefs.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221006151927.2079583-6-bmeng.cn@gmail.com>
Message-Id: <20221027183637.2772968-14-alex.bennee@linaro.org>
The nvme-io_uring driver expects a character special file such as
/dev/ng0n1. Follow the convention of having a "filename" option when a
regular file is expected, and a "path" option otherwise.
This makes io_uring the only libblkio-based driver with a "filename"
option, as it accepts a regular file (even though it can also take a
block special file).
Signed-off-by: Alberto Faria <afaria@redhat.com>
Message-id: 20221028233854.839933-1-afaria@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Some libblkio drivers may be able to work with regular files (e.g.,
io_uring) or otherwise resizable devices. Conservatively set
BlockDriver::has_variable_length to true to ensure bdrv_nb_sectors()
always gives up-to-date results.
Also implement BlockDriver::bdrv_co_truncate for the case where no
preallocation is needed and the device already has a size compatible
with what was requested.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Message-id: 20221029122031.975273-1-afaria@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
blk_set_enable_write_cache() is defined as GLOBAL_STATE_CODE
but can be invoked from iothreads when handling scsi requests.
This triggers an assertion failure:
0x00007fd6c3515ce1 in raise () from /lib/x86_64-linux-gnu/libc.so.6
0x00007fd6c34ff537 in abort () from /lib/x86_64-linux-gnu/libc.so.6
0x00007fd6c34ff40f in ?? () from /lib/x86_64-linux-gnu/libc.so.6
0x00007fd6c350e662 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
0x000056149e2cea03 in blk_set_enable_write_cache (wce=true, blk=0x5614a01c27f0)
at ../src/block/block-backend.c:1949
0x000056149e2d0a67 in blk_set_enable_write_cache (blk=0x5614a01c27f0,
wce=<optimized out>) at ../src/block/block-backend.c:1951
0x000056149dfe9c59 in scsi_disk_apply_mode_select (p=0x7fd6b400c00e "\004",
page=<optimized out>, s=<optimized out>) at ../src/hw/scsi/scsi-disk.c:1520
mode_select_pages (change=true, len=18, p=0x7fd6b400c00e "\004", r=0x7fd6b4001ff0)
at ../src/hw/scsi/scsi-disk.c:1570
scsi_disk_emulate_mode_select (inbuf=<optimized out>, r=0x7fd6b4001ff0) at
../src/hw/scsi/scsi-disk.c:1640
scsi_disk_emulate_write_data (req=0x7fd6b4001ff0) at ../src/hw/scsi/scsi-disk.c:1934
0x000056149e18ff16 in virtio_scsi_handle_cmd_req_submit (req=<optimized out>,
req=<optimized out>, s=0x5614a12f16b0) at ../src/hw/scsi/virtio-scsi.c:719
virtio_scsi_handle_cmd_vq (vq=0x7fd6bab92140, s=0x5614a12f16b0) at
../src/hw/scsi/virtio-scsi.c:761
virtio_scsi_handle_cmd (vq=<optimized out>, vdev=<optimized out>) at
../src/hw/scsi/virtio-scsi.c:775
virtio_scsi_handle_cmd (vdev=0x5614a12f16b0, vq=0x7fd6bab92140) at
../src/hw/scsi/virtio-scsi.c:765
0x000056149e1a8aa6 in virtio_queue_notify_vq (vq=0x7fd6bab92140) at
../src/hw/virtio/virtio.c:2365
0x000056149e3ccea5 in aio_dispatch_handler (ctx=ctx@entry=0x5614a01babe0,
node=<optimized out>) at ../src/util/aio-posix.c:369
0x000056149e3cd868 in aio_dispatch_ready_handlers (ready_list=0x7fd6c09b2680,
ctx=0x5614a01babe0) at ../src/util/aio-posix.c:399
aio_poll (ctx=0x5614a01babe0, blocking=blocking@entry=true) at
../src/util/aio-posix.c:713
0x000056149e2a7796 in iothread_run (opaque=opaque@entry=0x56149ffde500) at
../src/iothread.c:67
0x000056149e3d0859 in qemu_thread_start (args=0x7fd6c09b26f0) at
../src/util/qemu-thread-posix.c:504
0x00007fd6c36b9ea7 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
0x00007fd6c35d9aef in clone () from /lib/x86_64-linux-gnu/libc.so.6
Changing GLOBAL_STATE_CODE in IO_CODE is allowed, since GSC callers are
allowed to call IO_CODE.
Resolves: #1272
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20221027072726.2681500-1-eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Tested-by: Antoine Damhet <antoine.damhet@shadow.tech>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-24-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-23-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-22-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-21-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-20-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-19-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-18-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-17-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-16-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-15-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The validity of these was double-checked with Alberto Faria's static analyzer.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-14-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The validity of these was double-checked with Alberto Faria's static
analyzer.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-13-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
get_cluster_offset() and decompress_cluster() are only called from
the read and write paths.
The validity of these was double-checked with Alberto Faria's static analyzer.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-12-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-11-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-10-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-9-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
ssh_write is only called from ssh_co_writev.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-5-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
hmp_block_resize and hmp_screendump are defined as a ".coroutine = true" command,
so they must be coroutine_fn.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-4-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-3-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The .set_speed callback is not called from coroutine.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-2-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
libnfs.h declares nfs_fstat() as the following for win32:
int nfs_fstat(struct nfs_context *nfs, struct nfsfh *nfsfh,
struct __stat64 *st);
The 'st' parameter should be of type 'struct __stat64'. The
codes happen to build successfully for 64-bit Windows, but it
does not build for 32-bit Windows.
Fixes: 6542aa9c75 ("block: add native support for NFS")
Fixes: 18a8056e0b ("block/nfs: cache allocated filesize for read-only files")
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <20220908132817.1831008-6-bmeng.cn@gmail.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>