Commit Graph

456 Commits

Author SHA1 Message Date
Emanuele Giuseppe Esposito
ca5e2ad98d block: Rename bdrv_load/save_vmstate() to bdrv_co_load/save_vmstate()
Since these functions always run in coroutine context, adjust
their name to include "_co_", just like all other BlockDriver callbacks.

No functional change intended.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230113204212.359076-15-kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-01 16:52:32 +01:00
Emanuele Giuseppe Esposito
c834dc0586 block: Convert bdrv_debug_event() to co_wrapper_mixed
bdrv_debug_event() is categorized as an I/O function, and it currently
doesn't run in a coroutine. We should let it take a graph rdlock since
it traverses the block nodes graph, which however is only possible in a
coroutine.

Therefore turn it into a co_wrapper_mixed to move the actual function
into a coroutine where the lock can be taken.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230113204212.359076-14-kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-01 16:52:32 +01:00
Emanuele Giuseppe Esposito
3d47eb0a2a block: Convert bdrv_get_info() to co_wrapper_mixed
bdrv_get_info() is categorized as an I/O function, and it currently
doesn't run in a coroutine. We should let it take a graph rdlock since
it traverses the block nodes graph, which however is only possible in a
coroutine.

Therefore turn it into a co_wrapper to move the actual function into a
coroutine where the lock can be taken.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230113204212.359076-11-kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-01 16:52:32 +01:00
Emanuele Giuseppe Esposito
bd53086e82 block: use bdrv_co_refresh_total_sectors when possible
In some places we are sure we are always running in a
coroutine, therefore it's useless to call the generated_co_wrapper,
instead call directly the _co_ function.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230113204212.359076-9-kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-01 16:52:32 +01:00
Emanuele Giuseppe Esposito
c057960c4e block: Rename refresh_total_sectors to bdrv_refresh_total_sectors
The name is not good, not the least because we are going to convert this
to a generated co_wrapper, which adds a _co infix after the first part
of the name.

No functional change intended.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230113204212.359076-6-kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-01 16:52:32 +01:00
Emanuele Giuseppe Esposito
1e97be9156 block: Convert bdrv_is_inserted() to co_wrapper
bdrv_is_inserted() is categorized as an I/O function, and it currently
doesn't run in a coroutine. We should let it take a graph rdlock since
it traverses the block nodes graph, which however is only possible in a
coroutine.

Therefore turn it into a co_wrapper to move the actual function into a
coroutine where the lock can be taken.

At the same time, add also blk_is_inserted as co_wrapper_mixed, since it
is called in both coroutine and non-coroutine contexts.

Because now this function creates a new coroutine and polls, we need to
take the AioContext lock where it is missing, for the only reason that
internally c_w_mixed_bdrv_rdlock calls AIO_WAIT_WHILE and it expects to
release the AioContext lock. Once the rwlock is ultimated and placed in
every place it needs to be, we will poll using AIO_WAIT_WHILE_UNLOCKED
and remove the AioContext lock.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230113204212.359076-5-kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-01 16:52:32 +01:00
Emanuele Giuseppe Esposito
09d9fc97f8 block: Convert bdrv_io_unplug() to co_wrapper
BlockDriver->bdrv_io_unplug is categorized as IO callback, and it
currently doesn't run in a coroutine. We should let it take a graph
rdlock since the callback traverses the block nodes graph, which however
is only possible in a coroutine.

The only caller of this function is blk_io_unplug(), therefore make
blk_io_unplug() a co_wrapper, so that we're always running in a
coroutine where the lock can be taken.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230113204212.359076-4-kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-01 16:52:32 +01:00
Emanuele Giuseppe Esposito
8f49745432 block: Convert bdrv_io_plug() to co_wrapper
BlockDriver->bdrv_io_plug is categorized as IO callback, and it
currently doesn't run in a coroutine. We should let it take a graph
rdlock since the callback traverses the block nodes graph, which however
is only possible in a coroutine.

The only caller of this function is blk_io_plug(), therefore make
blk_io_plug() a co_wrapper, so that we're always running in a coroutine
where the lock can be taken.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230113204212.359076-3-kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-02-01 16:52:32 +01:00
Markus Armbruster
e2c1c34f13 include/block: Untangle inclusion loops
We have two inclusion loops:

       block/block.h
    -> block/block-global-state.h
    -> block/block-common.h
    -> block/blockjob.h
    -> block/block.h

       block/block.h
    -> block/block-io.h
    -> block/block-common.h
    -> block/blockjob.h
    -> block/block.h

I believe these go back to Emanuele's reorganization of the block API,
merged a few months ago in commit d7e2fe4aac.

Fortunately, breaking them is merely a matter of deleting unnecessary
includes from headers, and adding them back in places where they are
now missing.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221221133551.3967339-2-armbru@redhat.com>
2023-01-20 07:24:28 +01:00
Peter Maydell
6394578984 block/io: Check for replay-enabled in bdrv_drain_all_begin()
In commit da0bd74434 we refactored bdrv_drain_all_begin() to pull out
the non-polling part into bdrv_drain_all_begin_nopoll().  This change
broke record-and-replay, because the "return early if replay enabled"
check is now in the sub-function bdrv_drain_all_begin_nopoll(), and
so it only causes us to return from that function, and not from the
calling bdrv_drain_all_begin().

Fix the regression by checking whether replay is enabled in both
functions.

The breakage and fix can be tested via 'make check-avocado': the
tests/avocado/reverse_debugging.py:ReverseDebugging_X86_64.test_x86_64_pc
tests/avocado/reverse_debugging.py:ReverseDebugging_AArch64.test_aarch64_virt
tests were both broken by this.

Fixes: da0bd74434 ("block: Factor out bdrv_drain_all_begin_nopoll()")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Fabiano Rosas <farosas@suse.de>
Message-id: 20221220174638.2156308-1-peter.maydell@linaro.org
2022-12-21 12:38:17 +00:00
Kevin Wolf
1b3ff9feb9 block: GRAPH_RDLOCK for functions only called by co_wrappers
The generated coroutine wrappers already take care to take the lock in
the non-coroutine path, and assume that the lock is already taken in the
coroutine path.

The only thing we need to do for the wrapped function is adding the
GRAPH_RDLOCK annotation. Doing so also allows us to mark the
corresponding callbacks in BlockDriver as GRAPH_RDLOCK_PTR.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221207131838.239125-19-kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15 16:08:23 +01:00
Kevin Wolf
da0bd74434 block: Factor out bdrv_drain_all_begin_nopoll()
Provide a separate function that just quiesces the users of a node to
prevent new requests from coming in, but without waiting for the already
in-flight I/O to complete.

This function can be used in contexts where polling is not allowed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221207131838.239125-2-kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15 16:07:43 +01:00
Emanuele Giuseppe Esposito
7b52a921c1 block-io: introduce coroutine_fn duplicates for bdrv_common_block_status_above callers
bdrv_common_block_status_above() is a g_c_w, and it is being called by
many "wrapper" functions like bdrv_is_allocated(),
bdrv_is_allocated_above() and bdrv_block_status_above().

Because we want to eventually split the coroutine from non-coroutine
case in g_c_w, create duplicate wrappers that take care of directly
calling the same coroutine functions called in the g_c_w.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20221128142337.657646-2-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15 16:07:43 +01:00
Kevin Wolf
606ed756c1 block: Remove poll parameter from bdrv_parent_drained_begin_single()
All callers of bdrv_parent_drained_begin_single() pass poll=false now,
so we don't need the parameter any more.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221118174110.55183-16-kwolf@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15 16:07:42 +01:00
Kevin Wolf
2398747128 block: Don't poll in bdrv_replace_child_noperm()
In order to make sure that bdrv_replace_child_noperm() doesn't have to
poll any more, get rid of the bdrv_parent_drained_begin_single() call.

This is possible now because we can require that the parent is already
drained through the child in question when the function is called and we
don't call the parent drain callbacks more than once.

The additional drain calls needed in callers cause the test case to run
its code in the drain handler too early (bdrv_attach_child() drains
now), so modify it to only enable the code after the test setup has
completed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221118174110.55183-15-kwolf@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15 16:07:42 +01:00
Kevin Wolf
05c272ff0c block: Drop out of coroutine in bdrv_do_drained_begin_quiesce()
The next patch adds a parent drain to bdrv_attach_child_common(), which
shouldn't be, but is currently called from coroutines in some cases (e.g.
.bdrv_co_create implementations generally open new nodes). Therefore,
the assertion that we're not in a coroutine doesn't hold true any more.

We could just remove the assertion because there is nothing in the
function that should be in conflict with running in a coroutine, but
just to be on the safe side, we can reverse the caller relationship
between bdrv_do_drained_begin() and bdrv_do_drained_begin_quiesce() so
that the latter also just drops out of coroutine context and we can
still be certain in the future that any drain code doesn't run in
coroutines.

As a nice side effect, the structure of bdrv_do_drained_begin() is now
symmetrical with bdrv_do_drained_end().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221118174110.55183-14-kwolf@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15 16:07:42 +01:00
Kevin Wolf
a82a3bd135 block: Remove ignore_bds_parents parameter from drain_begin/end.
ignore_bds_parents is now ignored during drain_begin and drain_end, so
we can just remove it there. It is still a valid optimisation for
drain_all in bdrv_drained_poll(), so leave it around there.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221118174110.55183-13-kwolf@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15 16:07:42 +01:00
Kevin Wolf
57e05be343 block: Call drain callbacks only once
We only need to call both the BlockDriver's callback and the parent
callbacks when going from undrained to drained or vice versa. A second
drain section doesn't make a difference for the driver or the parent,
they weren't supposed to send new requests before and after the second
drain.

One thing that gets in the way is the 'ignore_bds_parents' parameter in
bdrv_do_drained_begin_quiesce() and bdrv_do_drained_end(): It means that
bdrv_drain_all_begin() increases bs->quiesce_counter, but does not
quiesce the parent through BdrvChildClass callbacks. If an additional
drain section is started now, bs->quiesce_counter will be non-zero, but
we would still need to quiesce the parent through BdrvChildClass in
order to keep things consistent (and unquiesce it on the matching
bdrv_drained_end(), even though the counter would not reach 0 yet as
long as the bdrv_drain_all() section is still active).

Instead of keeping track of this, let's just get rid of the parameter.
It was introduced in commit 6cd5c9d7b2 as an optimisation so that
during bdrv_drain_all(), we wouldn't recursively drain all parents up to
the root for each node, resulting in quadratic complexity. As it happens,
calling the callbacks only once solves the same problem, so as of this
patch, we'll still have O(n) complexity and ignore_bds_parents is not
needed any more.

This patch only ignores the 'ignore_bds_parents' parameter. It will be
removed in a separate patch.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221118174110.55183-12-kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15 16:07:42 +01:00
Kevin Wolf
299403aeda block: Remove subtree drains
Subtree drains are not used any more. Remove them.

After this, BdrvChildClass.attach/detach() don't poll any more.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221118174110.55183-11-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15 16:07:42 +01:00
Kevin Wolf
c7bc05f78a block: Inline bdrv_drain_invoke()
bdrv_drain_invoke() has now two entirely separate cases that share no
code any more and are selected depending on a bool parameter. Each case
has only one caller. Just inline the function.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221118174110.55183-6-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15 16:07:42 +01:00
Kevin Wolf
2f65df6e16 block: Remove drained_end_counter
drained_end_counter is unused now, nobody changes its value any more. It
can be removed.

In cases where we had two almost identical functions that only differed
in whether the caller passes drained_end_counter, or whether they would
poll for a local drained_end_counter to reach 0, these become a single
function.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20221118174110.55183-5-kwolf@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15 16:07:42 +01:00
Kevin Wolf
5e8ac21717 block: Revert .bdrv_drained_begin/end to non-coroutine_fn
Polling during bdrv_drained_end() can be problematic (and in the future,
we may get cases for bdrv_drained_begin() where polling is forbidden,
and we don't care about already in-flight requests, but just want to
prevent new requests from arriving).

The .bdrv_drained_begin/end callbacks running in a coroutine is the only
reason why we have to do this polling, so make them non-coroutine
callbacks again. None of the callers actually yield any more.

This means that bdrv_drained_end() effectively doesn't poll any more,
even if AIO_WAIT_WHILE() loops are still there (their condition is false
from the beginning). This is generally not a problem, but in
test-bdrv-drain, some additional explicit aio_poll() calls need to be
added because the test case wants to verify the final state after BHs
have executed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221118174110.55183-4-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-12-15 16:07:42 +01:00
Hanna Reitz
ace5a161ea block: Start/end drain on correct AioContext
bdrv_parent_drained_{begin,end}_single() are supposed to operate on the
parent, not on the child, so they should not attempt to get the context
to poll from the child but the parent instead.  BDRV_POLL_WHILE(c->bs)
does get the context from the child, so we should replace it with
AIO_WAIT_WHILE() on the parent's context instead.

This problem becomes apparent when bdrv_replace_child_noperm() invokes
bdrv_parent_drained_end_single() after removing a child from a subgraph
that is in an I/O thread.  By the time bdrv_parent_drained_end_single()
is called, child->bs is NULL, and so BDRV_POLL_WHILE(c->bs, ...) will
poll the main loop instead of the I/O thread; but anything that
bdrv_parent_drained_end_single_no_poll() may have scheduled is going to
want to run in the I/O thread, but because we poll the main loop, the
I/O thread is never unpaused, and nothing is run, resulting in a
deadlock.

Closes: https://gitlab.com/qemu-project/qemu/-/issues/1215
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221107151321.211175-4-hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-11-10 14:58:43 +01:00
Stefan Hajnoczi
d5ab9490cd Block layer patches
- Cleanup bs->backing and bs->file handling
 - Refactor bdrv_try_set_aio_context using transactions
 - Changes for improved coroutine_fn consistency
 - vhost-user-blk: fix the resize crash
 - io_uring: Use of io_uring_register_ring_fd() led to breakage, revert
 - vvfat: Fix some problems with r/w mode
 - Code cleanup
 - MAINTAINERS: Fold "Block QAPI, monitor, ..." into "Block layer core"
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmNazhIRHGt3b2xmQHJl
 ZGhhdC5jb20ACgkQfwmycsiPL9ZyTw/8Dfck/SuxfyeLlnQItkjaV4cnqWOU8vHs
 9x0KhlptCs+HXdF/3iicpA0lHojn7mNnbdFGjPRY4E0LriQv91TQ5ycdEmrseFPf
 sgeQlgdKCVU/pHjZ2wYarm2pE43Cx85a5xuufmw+7w49dNNZn14l4t+DgviuClVM
 nuVaogfZFbYyetre+Qd2TgLl+gJ+0d4o7Zs5lSWLrT8t0L9AGkcWPA7Nrbl6loIE
 dOautV4G7jLjuMiCeJZOGcnuRVe3gCQ5rCGBFzzH4DUtz4BmiYx4hd3LMEsP0PMM
 CrsfDZS04Ztybl9M7TmJuwkAm1gx1JDMOuJuh18lbJocIOBvhkKKxY2wI5LIdZVI
 ZntmU36RowkX+GGu/PYpYyMjBDClJppZCl7vnjyLYsVt6r0Vu6SmlHpJhcRYabhe
 96Kv1LXH9A6+ogKPU3Layw6JGjg01GNr1ALuT7PO3pGto/JshmOuBEJJDucoF84M
 5AfxFCohMROVldwblA6M0eKnlQBgtr5BvtgbV54BBo88VlFJgDJFQn7R09cTFUEo
 UwaJoS+nIaiZ0bQQVZhZloVppUaTdVJojzfVRCZZctga96/tu1HSFnGLnbEFpUN3
 KOf+XnVNS6Ro+nPSDf9bMjbIom2JicGFfV+6yMgIoxY/d5UA2dTZfefil4TAlSod
 6PsTgg+jrm8=
 =/Fw0
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into staging

Block layer patches

- Cleanup bs->backing and bs->file handling
- Refactor bdrv_try_set_aio_context using transactions
- Changes for improved coroutine_fn consistency
- vhost-user-blk: fix the resize crash
- io_uring: Use of io_uring_register_ring_fd() led to breakage, revert
- vvfat: Fix some problems with r/w mode
- Code cleanup
- MAINTAINERS: Fold "Block QAPI, monitor, ..." into "Block layer core"

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmNazhIRHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9ZyTw/8Dfck/SuxfyeLlnQItkjaV4cnqWOU8vHs
# 9x0KhlptCs+HXdF/3iicpA0lHojn7mNnbdFGjPRY4E0LriQv91TQ5ycdEmrseFPf
# sgeQlgdKCVU/pHjZ2wYarm2pE43Cx85a5xuufmw+7w49dNNZn14l4t+DgviuClVM
# nuVaogfZFbYyetre+Qd2TgLl+gJ+0d4o7Zs5lSWLrT8t0L9AGkcWPA7Nrbl6loIE
# dOautV4G7jLjuMiCeJZOGcnuRVe3gCQ5rCGBFzzH4DUtz4BmiYx4hd3LMEsP0PMM
# CrsfDZS04Ztybl9M7TmJuwkAm1gx1JDMOuJuh18lbJocIOBvhkKKxY2wI5LIdZVI
# ZntmU36RowkX+GGu/PYpYyMjBDClJppZCl7vnjyLYsVt6r0Vu6SmlHpJhcRYabhe
# 96Kv1LXH9A6+ogKPU3Layw6JGjg01GNr1ALuT7PO3pGto/JshmOuBEJJDucoF84M
# 5AfxFCohMROVldwblA6M0eKnlQBgtr5BvtgbV54BBo88VlFJgDJFQn7R09cTFUEo
# UwaJoS+nIaiZ0bQQVZhZloVppUaTdVJojzfVRCZZctga96/tu1HSFnGLnbEFpUN3
# KOf+XnVNS6Ro+nPSDf9bMjbIom2JicGFfV+6yMgIoxY/d5UA2dTZfefil4TAlSod
# 6PsTgg+jrm8=
# =/Fw0
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 27 Oct 2022 14:29:38 EDT
# gpg:                using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg:                issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (58 commits)
  block/block-backend: blk_set_enable_write_cache is IO_CODE
  monitor: switch to *_co_* functions
  vmdk: switch to *_co_* functions
  vhdx: switch to *_co_* functions
  vdi: switch to *_co_* functions
  qed: switch to *_co_* functions
  qcow2: switch to *_co_* functions
  qcow: switch to *_co_* functions
  parallels: switch to *_co_* functions
  mirror: switch to *_co_* functions
  block: switch to *_co_* functions
  commit: switch to *_co_* functions
  vmdk: manually add more coroutine_fn annotations
  qcow2: manually add more coroutine_fn annotations
  qcow: manually add more coroutine_fn annotations
  blkdebug: add missing coroutine_fn annotation for indirect-called functions
  qcow2: add coroutine_fn annotation for indirect-called functions
  block: add missing coroutine_fn annotation to BlockDriverState callbacks
  coroutine-io: add missing coroutine_fn annotation to prototypes
  coroutine-lock: add missing coroutine_fn annotation to prototypes
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-30 15:15:12 -04:00
Alberto Faria
ce47ff20b9 block: switch to *_co_* functions
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-16-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Alberto Faria
7c85803c49 block: remove incorrect coroutine_fn annotation
Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20221013123711.620631-3-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Stefan Hajnoczi
f4ec04bae9 block: return errors from bdrv_register_buf()
Registering an I/O buffer is only a performance optimization hint but it
is still necessary to return errors when it fails.

Later patches will need to detect errors when registering buffers but an
immediate advantage is that error_report() calls are no longer needed in
block driver .bdrv_register_buf() functions.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20221013185908.1297568-8-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-26 14:56:42 -04:00
Stefan Hajnoczi
e8b6535533 block: add BDRV_REQ_REGISTERED_BUF request flag
Block drivers may optimize I/O requests accessing buffers previously
registered with bdrv_register_buf(). Checking whether all elements of a
request's QEMUIOVector are within previously registered buffers is
expensive, so we need a hint from the user to avoid costly checks.

Add a BDRV_REQ_REGISTERED_BUF request flag to indicate that all
QEMUIOVector elements in an I/O request are known to be within
previously registered buffers.

Always pass the flag through to driver read/write functions. There is
little harm in passing the flag to a driver that does not use it.
Passing the flag to drivers avoids changes across many block drivers.
Filter drivers would need to explicitly support the flag and pass
through to their children when the children support it. That's a lot of
code changes and it's hard to remember to do that everywhere, leading to
silent reduced performance when the flag is accidentally dropped.

The only problematic scenario with the approach in this patch is when a
driver passes the flag through to internal I/O requests that don't use
the same I/O buffer. In that case the hint may be set when it should
actually be clear. This is a rare case though so the risk is low.

Some drivers have assert(!flags), which no longer works when
BDRV_REQ_REGISTERED_BUF is passed in. These assertions aren't very
useful anyway since the functions are called almost exclusively by
bdrv_driver_preadv/pwritev() so if we get flags handling right there
then the assertion is not needed.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20221013185908.1297568-7-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-26 14:56:42 -04:00
Stefan Hajnoczi
4f384011c5 block: pass size to bdrv_unregister_buf()
The only implementor of bdrv_register_buf() is block/nvme.c, where the
size is not needed when unregistering a buffer. This is because
util/vfio-helpers.c can look up mappings by address.

Future block drivers that implement bdrv_register_buf() may not be able
to do their job given only the buffer address. Add a size argument to
bdrv_unregister_buf().

Also document the assumptions about
bdrv_register_buf()/bdrv_unregister_buf() calls. The same <host, size>
values that were given to bdrv_register_buf() must be given to
bdrv_unregister_buf().

gcc 11.2.1 emits a spurious warning that img_bench()'s buf_size local
variable might be uninitialized, so it's necessary to silence the
compiler.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20221013185908.1297568-5-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-26 14:56:42 -04:00
Paolo Bonzini
881a4c553c block: add missing coroutine_fn annotations
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())".  Apply coroutine_fn to
functions where this holds.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-3-pbonzini@redhat.com>
[kwolf: Fixed up coding style]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-07 12:11:40 +02:00
Keith Busch
a7c5f67a78 block: move bdrv_qiov_is_aligned to file-posix
There is only user of bdrv_qiov_is_aligned(), so move the alignment
function to there and make it static.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Message-Id: <20220929200523.3218710-2-kbusch@meta.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-09-30 18:43:44 +02:00
Denis V. Lunev
131498f775 block: make serializing requests functions 'void'
Return codes of the following functions are never used in the code:
* bdrv_wait_serialising_requests_locked
* bdrv_wait_serialising_requests
* bdrv_make_request_serialising

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Hanna Reitz <hreitz@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Fam Zheng <fam@euphon.net>
CC: Ronnie Sahlberg <ronniesahlberg@gmail.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Peter Lieven <pl@kamp.de>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20220817083736.40981-3-den@openvz.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-09-30 18:43:44 +02:00
Alberto Faria
e97190a405 block: Add bdrv_co_pwrite_sync()
Also convert bdrv_pwrite_sync() to being implemented using
generated_co_wrapper.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220609152744.3891847-9-afaria@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-07-12 12:14:56 +02:00
Alberto Faria
1d39c7098b block: Implement bdrv_{pread,pwrite,pwrite_zeroes}() using generated_co_wrapper
bdrv_{pread,pwrite}() now return -EIO instead of -EINVAL when 'bytes' is
negative, making them consistent with bdrv_{preadv,pwritev}() and
bdrv_co_{pread,pwrite,preadv,pwritev}().

bdrv_pwrite_zeroes() now also calls trace_bdrv_co_pwrite_zeroes() and
clears the BDRV_REQ_MAY_UNMAP flag when appropriate, which it didn't
previously.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Message-Id: <20220609152744.3891847-8-afaria@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-07-12 12:14:56 +02:00
Alberto Faria
353a5d84b2 block: Make bdrv_{pread,pwrite}() return 0 on success
They currently return the value of their 'bytes' parameter on success.

Make them return 0 instead, for consistency with other I/O functions and
in preparation to implement them using generated_co_wrapper. This also
makes it clear that short reads/writes are not possible.

The few callers that rely on the previous behavior are adjusted
accordingly by hand.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220609152744.3891847-4-afaria@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-07-12 12:14:55 +02:00
Alberto Faria
32cc71def9 block: Change bdrv_{pread,pwrite,pwrite_sync}() param order
Swap 'buf' and 'bytes' around for consistency with
bdrv_co_{pread,pwrite}(), and in preparation to implement these
functions using generated_co_wrapper.

Callers were updated using this Coccinelle script:

    @@ expression child, offset, buf, bytes, flags; @@
    - bdrv_pread(child, offset, buf, bytes, flags)
    + bdrv_pread(child, offset, bytes, buf, flags)

    @@ expression child, offset, buf, bytes, flags; @@
    - bdrv_pwrite(child, offset, buf, bytes, flags)
    + bdrv_pwrite(child, offset, bytes, buf, flags)

    @@ expression child, offset, buf, bytes, flags; @@
    - bdrv_pwrite_sync(child, offset, buf, bytes, flags)
    + bdrv_pwrite_sync(child, offset, bytes, buf, flags)

Resulting overly-long lines were then fixed by hand.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20220609152744.3891847-3-afaria@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-07-12 12:14:55 +02:00
Alberto Faria
53fb7844f0 block: Add a 'flags' param to bdrv_{pread,pwrite,pwrite_sync}()
For consistency with other I/O functions, and in preparation to
implement them using generated_co_wrapper.

Callers were updated using this Coccinelle script:

    @@ expression child, offset, buf, bytes; @@
    - bdrv_pread(child, offset, buf, bytes)
    + bdrv_pread(child, offset, buf, bytes, 0)

    @@ expression child, offset, buf, bytes; @@
    - bdrv_pwrite(child, offset, buf, bytes)
    + bdrv_pwrite(child, offset, buf, bytes, 0)

    @@ expression child, offset, buf, bytes; @@
    - bdrv_pwrite_sync(child, offset, buf, bytes)
    + bdrv_pwrite_sync(child, offset, buf, bytes, 0)

Resulting overly-long lines were then fixed by hand.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20220609152744.3891847-2-afaria@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-07-12 12:14:55 +02:00
Stefan Hajnoczi
3399848b7f block: drop unused bdrv_co_drain() API
bdrv_co_drain() has not been used since commit 9a0cec664e ("mirror:
use bdrv_drained_begin/bdrv_drained_end") in 2016. Remove it so there
are fewer drain scenarios to worry about.

Use bdrv_drained_begin()/bdrv_drained_end() instead. They are "mixed"
functions that can be called from coroutine context. Unlike
bdrv_co_drain(), these functions provide control of the length of the
drained section, which is usually the right thing.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220521122714.3837731-1-stefanha@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-06-24 17:07:06 +02:00
Paolo Bonzini
f0d43b1ece coroutine-lock: qemu_co_queue_restart_all is a coroutine-only qemu_co_enter_all
qemu_co_queue_restart_all is basically the same as qemu_co_enter_all
but without a QemuLockable argument.  That's perfectly fine, but only as
long as the function is marked coroutine_fn.  If used outside coroutine
context, qemu_co_queue_wait will attempt to take the lock and that
is just broken: if you are calling qemu_co_queue_restart_all outside
coroutine context, the lock is going to be a QemuMutex which cannot be
taken twice by the same thread.

The patch adds the marker to qemu_co_queue_restart_all and to its sole
non-coroutine_fn caller; it then reimplements the function in terms of
qemu_co_enter_all_impl, to remove duplicated code and to clarify that the
latter also works in coroutine context.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20220427130830.150180-4-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-12 12:29:44 +02:00
Marc-André Lureau
8e3b0cbb72 Replace qemu_real_host_page variables with inlined functions
Replace the global variables with inlined helper functions. getpagesize() is very
likely annotated with a "const" function attribute (at least with glibc), and thus
optimization should apply even better.

This avoids the need for a constructor initialization too.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06 10:50:38 +02:00
Peter Maydell
9740b907a5 target-arm queue:
* cleanups of qemu_oom_check() and qemu_memalign()
  * target/arm/translate-neon: UNDEF if VLD1/VST1 stride bits are non-zero
  * target/arm/translate-neon: Simplify align field check for VLD3
  * GICv3 ITS: add more trace events
  * GICv3 ITS: implement 8-byte accesses properly
  * GICv3: fix minor issues with some trace/log messages
  * ui/cocoa: Use the standard about panel
  * target/arm: Provide cpu property for controling FEAT_LPA2
  * hw/arm/virt: Disable LPA2 for -machine virt-6.2
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmImNs4ZHHBldGVyLm1h
 eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3q87D/0cMQeF00uVRNqftrQg2SDI
 txJIG2QYUOPMCDfGWlGTfXv2TUc5y3XwA77C9vTcJcIWJlZ30DUa95DNYqA0BbOH
 TEOzRuZME64wA/JndHadz7oh+xb3HYn+6aSr63LeQCI3/h1eXVHknnEcyF1danOb
 YNB1T308THTEwJHQuKHYksIasgVwcjOf8FvMRYFozVkAKEx1SlabpFXST+aVNyx4
 ASsC2PTiJYAqwnYrTX8lWOYKMiKfkNrQcTd6x7rkoDw1pV7ZDMw2/69tpkhdJ5Fa
 lwxhwZ3+40x49eFGAhfuZWZmGLd4c+76u64pmWW429uk1JhaoXgErJM3xfHbI1er
 d7XSQYkMhDrY5SFuoE5XYwOuxanPtn3f7luM236Uzgf4ZR6qTrf6x+R1xLPZVYa9
 fWbjvR3g5sltTOzyc+9UsBq1OPCbRUbmhJtJDvojj5sWmNvgOwZnSkTu5kMAqvFP
 T2cQIi6phRBo3oMN/fhEZi3g828JjYEA9QlpWZ74JOyiXjYUq9VVNpoe/dtAv4Yy
 wZ+XhVNIK82/4Mxjr9SEeYeNzYrsEEvFAUqe9Bil2CpuIMV5ONEzs+UfQ/gyk4eq
 QnGPiojCrpf6PPAfci0Y6b4RzO+loMFpLjCpurngB4g4cBdmThKip0sVZdTZAI9Y
 lnusB8MR1sESoqYdPZsAfQ==
 =ix0J
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20220307' into staging

target-arm queue:
 * cleanups of qemu_oom_check() and qemu_memalign()
 * target/arm/translate-neon: UNDEF if VLD1/VST1 stride bits are non-zero
 * target/arm/translate-neon: Simplify align field check for VLD3
 * GICv3 ITS: add more trace events
 * GICv3 ITS: implement 8-byte accesses properly
 * GICv3: fix minor issues with some trace/log messages
 * ui/cocoa: Use the standard about panel
 * target/arm: Provide cpu property for controling FEAT_LPA2
 * hw/arm/virt: Disable LPA2 for -machine virt-6.2

# gpg: Signature made Mon 07 Mar 2022 16:46:06 GMT
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20220307:
  hw/arm/virt: Disable LPA2 for -machine virt-6.2
  target/arm: Provide cpu property for controling FEAT_LPA2
  ui/cocoa: Use the standard about panel
  hw/intc/arm_gicv3_cpuif: Fix register names in ICV_HPPIR read trace event
  hw/intc/arm_gicv3: Fix missing spaces in error log messages
  hw/intc/arm_gicv3: Specify valid and impl in MemoryRegionOps
  hw/intc/arm_gicv3_its: Add trace events for table reads and writes
  hw/intc/arm_gicv3_its: Add trace events for commands
  target/arm/translate-neon: Simplify align field check for VLD3
  target/arm/translate-neon: UNDEF if VLD1/VST1 stride bits are non-zero
  osdep: Move memalign-related functions to their own header
  util: Put qemu_vfree() in memalign.c
  util: Use meson checks for valloc() and memalign() presence
  util: Share qemu_try_memalign() implementation between POSIX and Windows
  meson.build: Don't misdetect posix_memalign() on Windows
  util: Return valid allocation for qemu_try_memalign() with zero size
  util: Unify implementations of qemu_memalign()
  util: Make qemu_oom_check() a static function

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-03-08 15:26:10 +00:00
Peter Maydell
5df022cf2e osdep: Move memalign-related functions to their own header
Move the various memalign-related functions out of osdep.h and into
their own header, which we include only where they are used.
While we're doing this, add some brief documentation comments.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org
2022-03-07 13:16:49 +00:00
Vladimir Sementsov-Ogievskiy
ce14f3b407 block/io: introduce block driver snapshot-access API
Add new block driver handlers and corresponding generic wrappers.
It will be used to allow copy-before-write filter to provide
reach fleecing interface in further commit.

In future this approach may be used to allow reading qcow2 internal
snapshots, for example to export them through NBD.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220303194349.2304213-11-vsementsov@virtuozzo.com>
[hreitz: Rebased on block GS/IO split]
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:33:31 +01:00
Vladimir Sementsov-Ogievskiy
45e62b464a block: fix preallocate filter: don't do unaligned preallocate requests
There is a bug in handling BDRV_REQ_NO_WAIT flag: we still may wait in
wait_serialising_requests() if request is unaligned. And this is
possible for the only user of this flag (preallocate filter) if
underlying file is unaligned to its request_alignment on start.

So, we have to fix preallocate filter to do only aligned preallocate
requests.

Next, we should fix generic block/io.c somehow. Keeping in mind that
preallocate is the only user of BDRV_REQ_NO_WAIT and that we have to
fix its behavior now, it seems more safe to just assert that we never
use BDRV_REQ_NO_WAIT with unaligned requests and add corresponding
comment. Let's do so.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Message-Id: <20220215121609.38570-1-vsementsov@virtuozzo.com>
[hreitz: Rebased on block GS/IO split]
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-03-07 09:19:20 +01:00
Hanna Reitz
4d378bbd83 block: Make bdrv_refresh_limits() non-recursive
bdrv_refresh_limits() recurses down to the node's children.  That does
not seem necessary: When we refresh limits on some node, and then
recurse down and were to change one of its children's BlockLimits, then
that would mean we noticed the changed limits by pure chance.  The fact
that we refresh the parent's limits has nothing to do with it, so the
reason for the change probably happened before this point in time, and
we should have refreshed the limits then.

Consequently, we should actually propagate block limits changes upwards,
not downwards.  That is a separate and pre-existing issue, though, and
so will not be addressed in this patch.

The problem with recursing is that bdrv_refresh_limits() is not atomic.
It begins with zeroing BDS.bl, and only then sets proper, valid limits.
If we do not drain all nodes whose limits are refreshed, then concurrent
I/O requests can encounter invalid request_alignment values and crash
qemu.  Therefore, a recursing bdrv_refresh_limits() requires the whole
subtree to be drained, which is currently not ensured by most callers.

A non-recursive bdrv_refresh_limits() only requires the node in question
to not receive I/O requests, and this is done by most callers in some
way or another:
- bdrv_open_driver() deals with a new node with no parents yet
- bdrv_set_file_or_backing_noperm() acts on a drained node
- bdrv_reopen_commit() acts only on drained nodes
- bdrv_append() should in theory require the node to be drained; in
  practice most callers just lock the AioContext, which should at least
  be enough to prevent concurrent I/O requests from accessing invalid
  limits

So we can resolve the bug by making bdrv_refresh_limits() non-recursive.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1879437
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20220216105355.30729-2-hreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:26 +01:00
Emanuele Giuseppe Esposito
1581a70ddd block/coroutines: I/O and "I/O or GS" API
block coroutines functions run in different aiocontext, and are
not protected by the BQL. Therefore are I/O.

On the other side, generated_co_wrapper functions use BDRV_POLL_WHILE,
meaning the caller can either be the main loop or a specific iothread.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-25-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito
967d7905d1 IO_CODE and IO_OR_GS_CODE for block_int I/O API
Mark all I/O functions with IO_CODE, and all "I/O OR GS" with
IO_OR_GS_CODE.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-14-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito
b4ad82aab1 assertions for block_int global state API
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-13-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito
384a48fb74 IO_CODE and IO_OR_GS_CODE for block I/O API
Mark all I/O functions with IO_CODE, and all "I/O OR GS" with
IO_OR_GS_CODE.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-6-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito
f791bf7f93 assertions for block global state API
All the global state (GS) API functions will check that
qemu_in_main_thread() returns true. If not, it means
that the safety of BQL cannot be guaranteed, and
they need to be moved to I/O.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-5-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:25 +01:00