libnfs.h declares nfs_fstat() as the following for win32:
int nfs_fstat(struct nfs_context *nfs, struct nfsfh *nfsfh,
struct __stat64 *st);
The 'st' parameter should be of type 'struct __stat64'. The
codes happen to build successfully for 64-bit Windows, but it
does not build for 32-bit Windows.
Fixes: 6542aa9c75 ("block: add native support for NFS")
Fixes: 18a8056e0b ("block/nfs: cache allocated filesize for read-only files")
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <20220908132817.1831008-6-bmeng.cn@gmail.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
No functional change intended.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221025084952.2139888-11-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
No functional changes intended.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221025084952.2139888-10-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Together with all _can_set_ and _set_ APIs, as they are not needed
anymore.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221025084952.2139888-9-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Replace all direct usage of ->can_set_aio_ctx and ->set_aio_ctx,
and call bdrv_child_try_change_aio_context() in
bdrv_try_set_aio_context(), the main function called through
the whole block layer.
From this point onwards, ->can_set_aio_ctx and ->set_aio_ctx
won't be used anymore.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221025084952.2139888-8-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
blk_root_change_aio_ctx() is very similar to blk_root_can_set_aio_ctx(),
but implements a new transaction so that if all check pass, the new
transaction's .commit will take care of changing the BlockBackend
AioContext. blk_root_set_aio_ctx_commit() is the same as
blk_root_set_aio_ctx().
Note: bdrv_child_try_change_aio_context() is not called by
anyone at this point.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221025084952.2139888-7-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now the indirection is not actually used, we can safely reduce it to
simple pointer. For consistency do a bit of refactoring to get rid of
_ptr suffixes that become meaningless.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-15-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bs->file and bs->backing are a kind of duplication of part of
bs->children. But very useful diplication, so let's not drop them at
all:)
We should manage bs->file and bs->backing in same place, where we
manage bs->children, to keep them in sync.
Moreover, generic io paths are unprepared to BdrvChild without a bs, so
it's double good to clear bs->file / bs->backing when we detach the
child.
Detach is simple: if we detach bs->file or bs->backing child, just
set corresponding field to NULL.
Attach is a bit more complicated. But we still can precisely detect
should we set one of bs->file / bs->backing or not:
- if role is BDRV_CHILD_COW, we definitely deal with bs->backing
- else, if role is BDRV_CHILD_FILTERED (it must be also
BDRV_CHILD_PRIMARY), it's a filtered child. Use
bs->drv->filtered_child_is_backing to chose the pointer field to
modify.
- else, if role is BDRV_CHILD_PRIMARY, we deal with bs->file
- in all other cases, it's neither bs->backing nor bs->file. It's some
other child and we shouldn't care
OK. This change brings one more good thing: we can (and should) get rid
of all indirect pointers in the block-graph-change transactions:
bdrv_attach_child_common() stores BdrvChild** into transaction to clear
it on abort.
bdrv_attach_child_common() has two callers: bdrv_attach_child_noperm()
just pass-through this feature, bdrv_root_attach_child() doesn't need
the feature.
Look at bdrv_attach_child_noperm() callers:
- bdrv_attach_child() doesn't need the feature
- bdrv_set_file_or_backing_noperm() uses the feature to manage
bs->file and bs->backing, we don't want it anymore
- bdrv_append() uses the feature to manage bs->backing, again we
don't want it anymore
So, we should drop this stuff! Great!
We could probably keep BdrvChild** argument to keep the int return
value, but it seems not worth the complexity.
Finally, we now set .file / .backing automatically in generic code and
want to restring setting them by hand outside of .attach/.detach.
So, this patch cleanups all remaining places where they were set.
To find such places I use:
git grep '\->file ='
git grep '\->backing ='
git grep '&.*\<backing\>'
git grep '&.*\<file\>'
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-14-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Actually what we chose is a primary child. Let's stress it in the code.
We are going to drop indirect pointer logic here in future. Actually
this commit simplifies the future work: we drop use of indirection in
the assertion now.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-9-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We don't need to remove bs->file, generic layer takes care of it. No
other driver cares to remove bs->file on failure by hand.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-4-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Almost all drivers call bdrv_open_child() similarly. Let's create a
helper for this.
The only not updated drivers that call bdrv_open_child() to set
bs->file are raw-format and snapshot-access:
raw-format sometimes want to have filtered child but
don't set drv->is_filter to true.
snapshot-access wants only DATA | PRIMARY
Possibly we should implement drv->is_filter_func() handler, to consider
raw-format as filter when it works as filter.. But it's another story.
Note also, that we decrease assignments to bs->file in code: it helps
us restrict modifying this field in further commit.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-3-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Unfortunately not all filters use .file child as filtered child. Two
exclusions are mirror_top and commit_top. Happily they both are private
filters. Bad thing is that this inconsistency is observable through qmp
commands query-block / query-named-block-nodes. So, could we just
change mirror_top and commit_top to use file child as all other filter
driver is an open question. Probably, we could do that with some kind
of deprecation period, but how to warn users during it?
For now, let's just add a field so we can distinguish them in generic
code, it will be used in further commits.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-2-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1193
The commit "Use io_uring_register_ring_fd() to skip fd operations" broke
when booting a guest with iothread and io_uring. That is because the
io_uring_register_ring_fd() call is made from the main thread instead of
IOThread where io_uring_submit() is called. It can not be guaranteed
to register the ring fd in the correct thread or unregister the same ring
fd if the IOThread is disabled. This optimization is not critical so we
will revert previous commit.
This reverts commit e2848bc574
and 77e3f038af.
Cc: qemu-stable@nongnu.org
Signed-off-by: Sam Li <faithilikerun@gmail.com>
Message-Id: <20220924144815.5591-1-faithilikerun@gmail.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Tested-by: Dario Faggioli <dfaggioli@suse.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In R/W mode, files with spaces were never created on host side.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1176
Fixes: c79e243ed6
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221010175511.3414357-3-hpoussin@reactos.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
'reserved1' field in bootsector is used to mark volume dirty, or need to verify.
Allow writes to bootsector which only changes the 'reserved1' field.
This fixes I/O errors on Windows guests.
Resolves: https://bugs.launchpad.net/qemu/+bug/1889421
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-Id: <20221010175511.3414357-2-hpoussin@reactos.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
At present there are two callers of get_tmp_filename() and they are
inconsistent.
One does:
/* TODO: extra byte is a hack to ensure MAX_PATH space on Windows. */
char *tmp_filename = g_malloc0(PATH_MAX + 1);
...
ret = get_tmp_filename(tmp_filename, PATH_MAX + 1);
while the other does:
s->qcow_filename = g_malloc(PATH_MAX);
ret = get_tmp_filename(s->qcow_filename, PATH_MAX);
As we can see different 'size' arguments are passed. There are also
platform specific implementations inside the function, and the use
of snprintf is really undesirable.
The function name is also misleading. It creates a temporary file,
not just a filename.
Refactor this routine by changing its name and signature to:
char *create_tmp_file(Error **errp)
and use g_get_tmp_dir() / g_mkstemp() for a consistent implementation.
While we are here, add some comments to mention that /var/tmp is
preferred over /tmp on non-win32 hosts.
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <20221010040432.3380478-2-bin.meng@windriver.com>
[kwolf: Fixed incorrect errno negation and iotest 051]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Avoid bounce buffers when QEMUIOVector elements are within previously
registered bdrv_register_buf() buffers.
The idea is that emulated storage controllers will register guest RAM
using bdrv_register_buf() and set the BDRV_REQ_REGISTERED_BUF on I/O
requests. Therefore no blkio_map_mem_region() calls are necessary in the
performance-critical I/O code path.
This optimization doesn't apply if the I/O buffer is internally
allocated by QEMU (e.g. qcow2 metadata). There we still take the slow
path because BDRV_REQ_REGISTERED_BUF is not set.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20221013185908.1297568-13-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Emulated devices and other BlockBackend users wishing to take advantage
of blk_register_buf() all have the same repetitive job: register
RAMBlocks with the BlockBackend using RAMBlockNotifier.
Add a BlockRAMRegistrar API to do this. A later commit will use this
from hw/block/virtio-blk.c.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20221013185908.1297568-10-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Registering an I/O buffer is only a performance optimization hint but it
is still necessary to return errors when it fails.
Later patches will need to detect errors when registering buffers but an
immediate advantage is that error_report() calls are no longer needed in
block driver .bdrv_register_buf() functions.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20221013185908.1297568-8-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Block drivers may optimize I/O requests accessing buffers previously
registered with bdrv_register_buf(). Checking whether all elements of a
request's QEMUIOVector are within previously registered buffers is
expensive, so we need a hint from the user to avoid costly checks.
Add a BDRV_REQ_REGISTERED_BUF request flag to indicate that all
QEMUIOVector elements in an I/O request are known to be within
previously registered buffers.
Always pass the flag through to driver read/write functions. There is
little harm in passing the flag to a driver that does not use it.
Passing the flag to drivers avoids changes across many block drivers.
Filter drivers would need to explicitly support the flag and pass
through to their children when the children support it. That's a lot of
code changes and it's hard to remember to do that everywhere, leading to
silent reduced performance when the flag is accidentally dropped.
The only problematic scenario with the approach in this patch is when a
driver passes the flag through to internal I/O requests that don't use
the same I/O buffer. In that case the hint may be set when it should
actually be clear. This is a rare case though so the risk is low.
Some drivers have assert(!flags), which no longer works when
BDRV_REQ_REGISTERED_BUF is passed in. These assertions aren't very
useful anyway since the functions are called almost exclusively by
bdrv_driver_preadv/pwritev() so if we get flags handling right there
then the assertion is not needed.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20221013185908.1297568-7-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The only implementor of bdrv_register_buf() is block/nvme.c, where the
size is not needed when unregistering a buffer. This is because
util/vfio-helpers.c can look up mappings by address.
Future block drivers that implement bdrv_register_buf() may not be able
to do their job given only the buffer address. Add a size argument to
bdrv_unregister_buf().
Also document the assumptions about
bdrv_register_buf()/bdrv_unregister_buf() calls. The same <host, size>
values that were given to bdrv_register_buf() must be given to
bdrv_unregister_buf().
gcc 11.2.1 emits a spurious warning that img_bench()'s buf_size local
variable might be uninitialized, so it's necessary to silence the
compiler.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20221013185908.1297568-5-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
libblkio (https://gitlab.com/libblkio/libblkio/) is a library for
high-performance disk I/O. It currently supports io_uring,
virtio-blk-vhost-user, and virtio-blk-vhost-vdpa with additional drivers
under development.
One of the reasons for developing libblkio is that other applications
besides QEMU can use it. This will be particularly useful for
virtio-blk-vhost-user which applications may wish to use for connecting
to qemu-storage-daemon.
libblkio also gives us an opportunity to develop in Rust behind a C API
that is easy to consume from QEMU.
This commit adds io_uring, nvme-io_uring, virtio-blk-vhost-user, and
virtio-blk-vhost-vdpa BlockDrivers to QEMU using libblkio. It will be
easy to add other libblkio drivers since they will share the majority of
code.
For now I/O buffers are copied through bounce buffers if the libblkio
driver requires it. Later commits add an optimization for
pre-registering guest RAM to avoid bounce buffers.
The syntax is:
--blockdev io_uring,node-name=drive0,filename=test.img,readonly=on|off,cache.direct=on|off
--blockdev nvme-io_uring,node-name=drive0,filename=/dev/ng0n1,readonly=on|off,cache.direct=on
--blockdev virtio-blk-vhost-vdpa,node-name=drive0,path=/dev/vdpa...,readonly=on|off,cache.direct=on
--blockdev virtio-blk-vhost-user,node-name=drive0,path=vhost-user-blk.sock,readonly=on|off,cache.direct=on
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20221013185908.1297568-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The field is unused (only ever set, but never read) since commit
ac9185603. Additionally, the commit message of commit 34fa110e already
explained earlier why it's unreliable. Remove it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20220923142838.91043-1-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Change the job_{lock/unlock} and macros to use job_mutex.
Now that they are not nop anymore, remove the aiocontext
to avoid deadlocks.
Therefore:
- when possible, remove completely the aiocontext lock/unlock pair
- if it is used by some other function too, reduce the locking
section as much as possible, leaving the job API outside.
- change AIO_WAIT_WHILE in AIO_WAIT_WHILE_UNLOCKED, since we
are not using the aiocontext lock anymore
The only functions that still need the aiocontext lock are:
- the JobDriver callbacks, already documented in job.h
- job_cancel_sync() in replication.c is called with aio_context_lock
taken, but now job is using AIO_WAIT_WHILE_UNLOCKED so we need to
release the lock.
Reduce the locking section to only cover the callback invocation
and document the functions that take the AioContext lock,
to avoid taking it twice.
Also remove real_job_{lock/unlock}, as they are replaced by the
public functions.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220926093214.506243-19-eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
iostatus is the only field (together with .job) that needs
protection using the job mutex.
It is set in the main loop (GLOBAL_STATE functions) but read
in I/O code (block_job_error_action).
In order to protect it, change block_job_iostatus_set_err
to block_job_iostatus_set_err_locked(), always called under
job lock.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20220926093214.506243-17-eesposit@redhat.com>
[kwolf: Fixed up type of iostatus]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In order to make it thread safe, implement a "fake rwlock",
where we allow reads under BQL *or* job_mutex held, but
writes only under BQL *and* job_mutex.
The only write we have is in child_job_set_aio_ctx, which always
happens under drain (so the job is paused).
For this reason, introduce job_set_aio_context and make sure that
the context is set under BQL, job_mutex and drain.
Also make sure all other places where the aiocontext is read
are protected.
The reads in commit.c and mirror.c are actually safe, because always
done under BQL.
Note: at this stage, job_{lock/unlock} and job lock guard macros
are *nop*.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220926093214.506243-14-eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Once job lock is used and aiocontext is removed, mirror has
to perform job operations under the same critical section,
Note: at this stage, job_{lock/unlock} and job lock guard macros
are *nop*.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20220926093214.506243-11-eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221006122607.162769-1-kwolf@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Reviewed-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-24-pbonzini@redhat.com>
[kwolf: Fixed up coding style]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Reviewed-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-21-pbonzini@redhat.com>
[kwolf: Fixed up coding style]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Reviewed-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-20-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Reviewed-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-19-pbonzini@redhat.com>
[kwolf: Fixed up coding style]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Reviewed-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-18-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Reviewed-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-17-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Reviewed-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-16-pbonzini@redhat.com>
[kwolf: Fixed up coding style]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Reviewed-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-15-pbonzini@redhat.com>
[kwolf: Fixed up coding style]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Reviewed-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-14-pbonzini@redhat.com>
[kwolf: Fixed up coding style]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Reviewed-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-13-pbonzini@redhat.com>
[kwolf: Fixed up coding style]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Reviewed-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-12-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Reviewed-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-11-pbonzini@redhat.com>
[kwolf: Fixed up coding style]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Reviewed-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-10-pbonzini@redhat.com>
[kwolf: Fixed up coding style]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Reviewed-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-9-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Reviewed-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-8-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is incorrect because qcow2_mark_clean() calls qcow2_flush_caches().
qcow2_mark_clean() is called from non-coroutine context in
qcow2_inactivate() and qcow2_amend_options().
Reviewed-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-4-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())". Apply coroutine_fn to
functions where this holds.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-3-pbonzini@redhat.com>
[kwolf: Fixed up coding style]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
nvme_get_free_req has very difference semantics when called in
coroutine context (where it waits) and in non-coroutine context
(where it doesn't). Split the two cases to make it clear what
is being requested.
Cc: qemu-block@nongnu.org
Reviewed-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-2-pbonzini@redhat.com>
[kwolf: Fixed up coding style]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This helps us construct strings elsewhere before echoing to the
monitor. It avoids having to jump through hoops like:
monitor_printf(mon, "%s", s->str);
It will be useful in following patches but for now convert all
existing plain "%s" printfs to use the _puts api.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220929114231.583801-33-alex.bennee@linaro.org>
An iov length needs to be aligned to the logical block size, which may
be larger than the memory alignment.
Tested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20220929200523.3218710-3-kbusch@meta.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is only user of bdrv_qiov_is_aligned(), so move the alignment
function to there and make it static.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20220929200523.3218710-2-kbusch@meta.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Just like qcow2, qed invokes its open function in its
.bdrv_co_invalidate_cache() implementation. Therefore, just like done
for qcow2 in HEAD^, update auto_backing_file only if the backing file
string in the image header differs from the one we have read before.
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220803144446.20723-3-hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2_do_open() is used by qcow2_co_invalidate_cache(), i.e. may be run
on an image that has been opened before. When reading the backing file
string from the image header, compare it against the existing
bs->backing_file, and update bs->auto_backing_file only if they differ.
auto_backing_file should ideally contain the filename the backing BDS
will actually have after opening, i.e. a post-bdrv_refresh_filename()
version of what is in the image header. So for example, if the image
header reports the following backing file string:
json:{"driver": "qcow2", "file": {
"driver": "file", "filename": "/tmp/backing.qcow2"
}}
Then auto_backing_file should contain simply "/tmp/backing.qcow2".
Because bdrv_refresh_filename() only works on existing BDSs, though, the
way how we get this auto_backing_file value is to have the format driver
set it to whatever is in the image header, and when the backing BDS is
opened based on that, we update it with the filename the backing BDS
actually got.
However, qcow2's qcow2_co_invalidate_cache() implementation breaks this
because it just resets auto_backing_file to whatever is in the image
file without opening a BDS based on it, so we never get
auto_backing_file back to the "refreshed" version, and in the example
above, it would stay "json:{...}".
Then, bs->backing->bs->filename will differ from bs->auto_backing_file,
making bdrv_backing_overridden(bs) return true, which will lead
bdrv_refresh_filename(bs) to generate a json:{} filename for bs, even
though that may not have been necessary. This is reported in the issue
linked below.
Therefore, skip updating auto_backing_file if nothing has changed in the
image header since we last read it.
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1117
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220803144446.20723-2-hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The gluster protocol driver used to parse URIs (filenames) but was
extended with a richer JSON syntax in commit 6c7189bb29
("block/gluster: add support for multiple gluster servers"). The gluster
drivers that have JSON parsing set .bdrv_needs_filename to false.
The gluster+unix and gluster+rdma drivers still to require a filename
even though the JSON parser is equipped to parse the same
volume/path/sockaddr details as the URI parser. Let's allow JSON parsing
for these drivers too.
Note that the gluster+rdma driver actually uses TCP because RDMA support
is not available, so the JSON server.type field must be "inet".
Drop .bdrv_needs_filename since both the filename and the JSON parsers
can handle gluster+unix and gluster+rdma. This change is in preparation
for eventually removing .bdrv_needs_filename across the entire codebase.
Cc: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220811164905.430834-1-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Return codes of the following functions are never used in the code:
* bdrv_wait_serialising_requests_locked
* bdrv_wait_serialising_requests
* bdrv_make_request_serialising
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Hanna Reitz <hreitz@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Fam Zheng <fam@euphon.net>
CC: Ronnie Sahlberg <ronniesahlberg@gmail.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Peter Lieven <pl@kamp.de>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20220817083736.40981-3-den@openvz.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
I believe that if the helper exists, it must be used always for reading
of the value. It breaks expectations in the other case.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Hanna Reitz <hreitz@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Fam Zheng <fam@euphon.net>
CC: Ronnie Sahlberg <ronniesahlberg@gmail.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Peter Lieven <pl@kamp.de>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20220817083736.40981-2-den@openvz.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 5f76a7aac1 is looking harmless from
the first glance, but it has changed things a lot. 'libvirt' uses it to
detect that it should follow new initialization way and this changes
things considerably. With this procedure followed, blockdev_init() is
not called anymore and thus block_acct_setup() helper is not called.
This means in particular that defaults for block accounting statistics
are changed and account_invalid/account_failed are actually initialized
as false instead of true originally.
This commit changes things to match original world. There are the following
constraints:
* new default value in block_acct_init() is set to true
* block_acct_setup() inside blockdev_init() is called before
blkconf_apply_backend_options()
* thus newly created option in block device properties has precedence if
specified
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Peter Krempa <pkrempa@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: John Snow <jsnow@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220824095044.166009-3-den@openvz.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We would have one more place for block_acct_setup() calling, which should
not corrupt original value.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Peter Krempa <pkrempa@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: John Snow <jsnow@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220824095044.166009-2-den@openvz.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit d1258dd0c8 ("qcow2: autoloading dirty bitmaps") added the
set_readonly_helper() GFunc handler, correctly casting the gpointer
user_data in both the g_slist_foreach() caller and the handler.
Few commits later (commit 1b6b0562db), the handler is reused in
qcow2_reopen_bitmaps_rw() but missing the gpointer cast, resulting
in the following error when using Homebrew GCC 12.2.0:
[2/658] Compiling C object libblock.fa.p/block_qcow2-bitmap.c.o
../../block/qcow2-bitmap.c: In function 'qcow2_reopen_bitmaps_rw':
../../block/qcow2-bitmap.c:1211:60: error: incompatible type for argument 3 of 'g_slist_foreach'
1211 | g_slist_foreach(ro_dirty_bitmaps, set_readonly_helper, false);
| ^~~~~
| |
| _Bool
In file included from /opt/homebrew/Cellar/glib/2.72.3_1/include/glib-2.0/glib/gmain.h:26,
from /opt/homebrew/Cellar/glib/2.72.3_1/include/glib-2.0/glib/giochannel.h:33,
from /opt/homebrew/Cellar/glib/2.72.3_1/include/glib-2.0/glib.h:54,
from /Users/philmd/source/qemu/include/glib-compat.h:32,
from /Users/philmd/source/qemu/include/qemu/osdep.h:144,
from ../../block/qcow2-bitmap.c:28:
/opt/homebrew/Cellar/glib/2.72.3_1/include/glib-2.0/glib/gslist.h:127:61: note: expected 'gpointer' {aka 'void *'} but argument is of type '_Bool'
127 | gpointer user_data);
| ~~~~~~~~~~~~~~~~~~^~~~~~~~~
At top level:
FAILED: libblock.fa.p/block_qcow2-bitmap.c.o
Fix by adding the missing gpointer cast.
Fixes: 1b6b0562db ("qcow2: support .bdrv_reopen_bitmaps_rw")
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220919182755.51967-1-f4bug@amsat.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Free feature_table if it is failed in bdrv_pread.
Signed-off-by: lu zhipeng <luzhipeng@cestc.cn>
Message-Id: <20220921144515.1166-1-luzhipeng@cestc.cn>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The commit "Use io_uring_register_ring_fd() to skip fd operations" uses
warn_report but did not include the header file "qemu/error-report.h".
This causes "error: implicit declaration of function ‘warn_report’".
Include this header file.
Fixes: e2848bc574 ("Use io_uring_register_ring_fd() to skip fd operations")
Signed-off-by: Jinhao Fan <fanjinhao21s@ict.ac.cn>
Message-Id: <20220721065645.577404-1-fanjinhao21s@ict.ac.cn>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20220707163720.1421716-5-berrange@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Commit a4072543cc has changed the I/O here
from working on a local one-element I/O vector to just using the buffer
directly (using the bdrv_co_pread()/bdrv_co_pwrite() helper functions
introduced shortly before).
However, it only changed the bdrv_co_preadv() call to bdrv_co_pread() -
the subsequent bdrv_co_pwritev() call stayed this way, and so still
expects a QEMUIOVector pointer instead of a plain buffer. We must
change that to be a bdrv_co_pwrite() call.
Fixes: a4072543cc ("block/parallels: use buffer-based io")
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20220714132801.72464-2-hreitz@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Some can be made static, others are unused generated_co_wrappers.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-19-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Keep generated_co_wrapper and coroutine_fn pairs together. This should
make it clear that each I/O function has these two versions.
Also move blk_co_{pread,pwrite}()'s implementations out of the header
file for consistency.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220705161527.1054072-18-afaria@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Also convert blk_truncate() into a generated_co_wrapper.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-17-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Also convert blk_ioctl() into a generated_co_wrapper.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-16-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-15-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-14-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-13-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Also convert blk_pwrite_compressed() into a generated_co_wrapper.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-12-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Swap 'buf' and 'bytes' around for consistency with other I/O functions.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-11-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Also convert it into a generated_co_wrapper.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-10-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Implement blk_preadv_part() using generated_co_wrapper.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-9-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
We need to add include/sysemu/block-backend-io.h to the inputs of the
block-gen.c target defined in block/meson.build.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-7-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
For consistency with other I/O functions, and in preparation to
implement them using generated_co_wrapper.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-5-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Swap 'buf' and 'bytes' around for consistency with
blk_co_{pread,pwrite}(), and in preparation to implement these functions
using generated_co_wrapper.
Callers were updated using this Coccinelle script:
@@ expression blk, offset, buf, bytes, flags; @@
- blk_pread(blk, offset, buf, bytes, flags)
+ blk_pread(blk, offset, bytes, buf, flags)
@@ expression blk, offset, buf, bytes, flags; @@
- blk_pwrite(blk, offset, buf, bytes, flags)
+ blk_pwrite(blk, offset, bytes, buf, flags)
It had no effect on hw/block/nand.c, presumably due to the #if, so that
file was updated manually.
Overly-long lines were then fixed by hand.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-4-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
For consistency with other I/O functions, and in preparation to
implement it using generated_co_wrapper.
Callers were updated using this Coccinelle script:
@@ expression blk, offset, buf, bytes; @@
- blk_pread(blk, offset, buf, bytes)
+ blk_pread(blk, offset, buf, bytes, 0)
It had no effect on hw/block/nand.c, presumably due to the #if, so that
file was updated manually.
Overly-long lines were then fixed by hand.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-3-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
They currently return the value of their 'bytes' parameter on success.
Make them return 0 instead, for consistency with other I/O functions and
in preparation to implement them using generated_co_wrapper. This also
makes it clear that short reads/writes are not possible.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Message-Id: <20220705161527.1054072-2-afaria@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Use bdrv_pwrite_sync() instead of calling bdrv_pwrite() and bdrv_flush()
separately.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220609152744.3891847-11-afaria@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Convert uses of bdrv_pwrite_sync() into bdrv_co_pwrite_sync() when the
callers are already coroutine_fn.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220609152744.3891847-10-afaria@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Also convert bdrv_pwrite_sync() to being implemented using
generated_co_wrapper.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220609152744.3891847-9-afaria@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
bdrv_{pread,pwrite}() now return -EIO instead of -EINVAL when 'bytes' is
negative, making them consistent with bdrv_{preadv,pwritev}() and
bdrv_co_{pread,pwrite,preadv,pwritev}().
bdrv_pwrite_zeroes() now also calls trace_bdrv_co_pwrite_zeroes() and
clears the BDRV_REQ_MAY_UNMAP flag when appropriate, which it didn't
previously.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Message-Id: <20220609152744.3891847-8-afaria@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
For consistency with other I/O functions, and in preparation to
implement bdrv_{pread,pwrite}() using generated_co_wrapper.
unsigned int fits in int64_t, so all callers remain correct.
bdrv_check_request32() is called further down the stack and causes -EIO
to be returned if 'bytes' is negative or greater than
BDRV_REQUEST_MAX_BYTES, which in turns never exceeds SIZE_MAX.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Message-Id: <20220609152744.3891847-7-afaria@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
They currently return the value of their headerlen/buflen parameter on
success. Returning 0 instead makes it clear that short reads/writes are
not possible.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220609152744.3891847-5-afaria@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
They currently return the value of their 'bytes' parameter on success.
Make them return 0 instead, for consistency with other I/O functions and
in preparation to implement them using generated_co_wrapper. This also
makes it clear that short reads/writes are not possible.
The few callers that rely on the previous behavior are adjusted
accordingly by hand.
Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220609152744.3891847-4-afaria@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Jens Axboe has confirmed that short reads are rare but can happen:
https://lore.kernel.org/io-uring/YsU%2FCGkl9ZXUI+Tj@stefanha-x1.localdomain/T/#m729963dc577d709b709c191922e98ec79d7eef54
The luring_resubmit_short_read() comment claimed they were only due to a
specific io_uring bug that was fixed in Linux commit 9d93a3f5a0c
("io_uring: punt short reads to async context"), which is wrong.
Dominique Martinet found that a btrfs bug also causes short reads. There
may be more kernel code paths that result in short reads.
Let's consider short reads fair game.
Cc: Dominique Martinet <dominique.martinet@atmark-techno.com>
Based-on: <20220630010137.2518851-1-dominique.martinet@atmark-techno.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20220706080341.1206476-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
sqeq.off here is the offset to read within the disk image, so obviously
not 'nread' (the amount we just read), but as the author meant to write
its current value incremented by the amount we just read.
Normally recent versions of linux will not issue short reads,
but it can happen so we should fix this.
This lead to weird image corruptions when short read happened
Fixes: 6663a0a337 ("block/io_uring: implements interfaces for io_uring")
Link: https://lkml.kernel.org/r/YrrFGO4A1jS0GI0G@atmark-techno.com
Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
Message-Id: <20220630010137.2518851-1-dominique.martinet@atmark-techno.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch makes in_flight field 'unsigned' for BDRVNBDState and
MirrorBlockJob. This matches the definition of this field on BDS
and is generically correct - we should never get negative value here.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: John Snow <jsnow@redhat.com>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Hanna Reitz <hreitz@redhat.com>
CC: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
At the moment there are 2 sources of lengthy operations if configured:
* open connection, which could retry inside and
* reconnect of already opened connection
These operations could be quite lengthy and cumbersome to catch thus
it would be quite natural to add trace points for them.
This patch is based on the original downstream work made by Vladimir.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Hanna Reitz <hreitz@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
In some scenarios, when copy-before-write operations lasts too long
time, it's better to cancel it.
Most useful would be to use the new option together with
on-cbw-error=break-snapshot: this way if cbw operation takes too long
time we'll just cancel backup process but do not disturb the guest too
much.
Note the tricky point of realization: we keep additional point in
bs->in_flight during block_copy operation even if it's timed-out.
Background "cancelled" block_copy operations will finish at some point
and will want to access state. We should care to not free the state in
.bdrv_close() earlier.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
[vsementsov: use bdrv_inc_in_flight()/bdrv_dec_in_flight() instead of
direct manipulation on bs->in_flight]
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Add possibility to limit block_copy() call in time. To be used in the
next commit.
As timed-out block_copy() call will continue in background anyway (we
can't immediately cancel IO operation), it's important also give user a
possibility to pass a callback, to do some additional actions on
block-copy call finish.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Currently, behavior on copy-before-write operation failure is simple:
report error to the guest.
Let's implement alternative behavior: break the whole copy-before-write
process (and corresponding backup job or NBD client) but keep guest
working. It's needed if we consider guest stability as more important.
The realisation is simple: on copy-before-write failure we set
s->snapshot_ret and continue guest operations. s->snapshot_ret being
set will lead to all further snapshot API requests. Note that all
in-flight snapshot-API requests may still success: we do wait for them
on BREAK_SNAPSHOT-failure path in cbw_do_copy_before_write().
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
We are going to add one more option of enum type. Let's refactor option
parsing so that we can simply work with BlockdevOptionsCbw object.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Currently we use 'id' option as the name of VDUSE device.
It's a bit confusing since we use one value for two different
purposes: the ID to identfy the export within QEMU (must be
distinct from any other exports in the same QEMU process, but
can overlap with names used by other processes), and the VDUSE
name to uniquely identify it on the host (must be distinct from
other VDUSE devices on the same host, but can overlap with other
export types like NBD in the same process). To make it clear,
this patch adds a separate 'name' option to specify the VDUSE
name for the vduse-blk export instead.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Message-Id: <20220614051532.92-7-xieyongji@bytedance.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add a 'serial' option to allow user to specify this value
explicitly. And the default value is changed to an empty
string as what we did in "hw/block/virtio-blk.c".
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Message-Id: <20220614051532.92-6-xieyongji@bytedance.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
CID 1488362 points out that the second 'rc >= 0' check is now dead
code.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Fixes: 172f5f1a40(nbd: remove peppering of nbd_client_connected)
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20220516210519.76135-1-eblake@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On 64-bit platforms, assigning SIZE_MAX to the int64_t max_pdiscard
results in a negative value, and the following assertion would trigger
down the line (it's not the same max_pdiscard, but computed from the
other one):
qemu-system-x86_64: ../block/io.c:3166: bdrv_co_pdiscard: Assertion
`max_pdiscard >= bs->bl.request_alignment' failed.
On 32-bit platforms, it's fine to keep using SIZE_MAX.
The assertion in qemu_gluster_co_pdiscard() is checking that the value
of 'bytes' can safely be passed to glfs_discard_async(), which takes a
size_t for the argument in question, so it is kept as is. And since
max_pdiscard is still <= SIZE_MAX, relying on max_pdiscard is still
fine.
Fixes: 0c8022876f ("block: use int64_t instead of int in driver discard handlers")
Cc: qemu-stable@nongnu.org
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Message-Id: <20220520075922.43972-1-f.ebner@proxmox.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If the namespace does not exist, rbd_create() fails with -ENOENT and
QEMU reports a generic "error rbd create: No such file or directory":
$ qemu-img create rbd:rbd/namespace/image 1M
Formatting 'rbd:rbd/namespace/image', fmt=raw size=1048576
qemu-img: rbd:rbd/namespace/image: error rbd create: No such file or directory
Unfortunately rados_ioctx_set_namespace() does not fail if the namespace
does not exist, so let's use rbd_namespace_exists() in qemu_rbd_connect()
to check if the namespace exists, reporting a more understandable error:
$ qemu-img create rbd:rbd/namespace/image 1M
Formatting 'rbd:rbd/namespace/image', fmt=raw size=1048576
qemu-img: rbd:rbd/namespace/image: namespace 'namespace' does not exist
Reported-by: Tingting Mao <timao@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220517071012.6120-1-sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
To support reconnecting after restart or crash, VDUSE backend
might need to resubmit inflight I/Os. This stores the metadata
such as the index of inflight I/O's descriptors to a shm file so
that VDUSE backend can restore them during reconnecting.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Message-Id: <20220523084611.91-9-xieyongji@bytedance.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>