This is the big patch that removes
aio_context_acquire()/aio_context_release() from the block layer and
affected block layer users.
There isn't a clean way to split this patch and the reviewers are likely
the same group of people, so I decided to do it in one patch.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-ID: <20231205182011.1976568-7-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Stop acquiring/releasing the AioContext lock in
bdrv_graph_wrlock()/bdrv_graph_unlock() since the lock no longer has any
effect.
The distinction between bdrv_graph_wrunlock() and
bdrv_graph_wrunlock_ctx() becomes meaningless and they can be collapsed
into one function.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231205182011.1976568-6-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_graph_wrunlock() calls aio_poll(), which may run callbacks that
have a nested event loop. Nested event loops can depend on other
iothreads making progress, so in order to allow them to make progress it
must not hold the AioContext lock of another thread while calling
aio_poll().
This introduces a @bs parameter to bdrv_graph_wrunlock() whose
AioContext is temporarily dropped (which matches bdrv_graph_wrlock()),
and a bdrv_graph_wrunlock_ctx() that can be used if the BlockDriverState
doesn't necessarily exist any more when unlocking.
This also requires a change to bdrv_schedule_unref(), which was relying
on the incorrectly taken lock. It needs to take the lock itself now.
While this is a separate bug, it can't be fixed a separate patch because
otherwise the intermediate state would either deadlock or try to release
a lock that we don't even hold.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231115172012.112727-3-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[kwolf: Fixed up bdrv_schedule_unref()]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Almost all functions that access bs->backing already take the graph
lock now. Add locking to the remaining users and finally annotate the
struct field itself as protected by the graph lock.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-18-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of taking the writer lock internally, require callers to already
hold it when calling bdrv_replace_node(). Its callers may already want
to hold the graph lock and so wouldn't be able to call functions that
take it internally.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231027155333.420094-17-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The function reads the parents list, so it needs to hold the graph lock.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230911094620.45040-14-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of taking the writer lock internally, require callers to already
hold it when calling bdrv_attach_child_common(). These callers will
typically already hold the graph lock once the locking work is
completed, which means that they can't call functions that take it
internally.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230911094620.45040-13-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The documentation for bdrv_append() says that the caller must hold the
AioContext lock for bs_top. Change all callers to actually adhere to the
contract.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230911094620.45040-5-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
with some rewording in
tests/qemu-iotests/298
tests/qtest/fuzz/generic_fuzz.c
tests/unit/test-throttle.c
as suggested by Eric.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
bdrv_pass_through is used as filter, even all node variables has
corresponding names. We want to append it, so it should be
backing-child-based filter like mirror_top.
So, in test_update_perm_tree, first child should be DATA, as we don't
want filters with two filtered children.
bdrv_exclusive_writer is used as a filter once. So it should be filter
anyway. We want to append it, so it should be backing-child-based
fitler too.
Make all FILTERED children to be PRIMARY as well. We are going to force
this rule by assertion soon.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-7-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
test_parallel_perm_update() does two things that we are going to
restrict in the near future:
1. It updates bs->file field by hand. bs->file will be managed
automatically by generic code (together with bs->children list).
Let's better refactor our "tricky" bds to have own state where one
of children is linked as "selected".
This also looks less "tricky", so avoid using this word.
2. It create FILTERED children that are not PRIMARY. Except for tests
all FILTERED children in the Qemu block layer are always PRIMARY as
well. We are going to formalize this rule, so let's better use DATA
children here.
3. It creates more than one FILTERED child, which is already abandoned
in BDRV_CHILD_FILTERED's description.
While being here, update the picture to better correspond to the test
code.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-5-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Move supports_backing check of bdrv_reopen_parse_backing to called
(through bdrv_set_backing_noperm()) bdrv_set_file_or_backing_noperm()
function. The check applies to general case, so it's appropriate for
bdrv_set_file_or_backing_noperm().
We have to declare backing support for two test drivers, otherwise new
check fails.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610120537.196183-7-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_append is not very good for inserting filters: it does extra
permission update as part of bdrv_set_backing_hd(). During this update
filter may conflict with other parents of top_bs.
Instead, let's first do all graph modifications and after it update
permissions.
append-greedy-filter test-case in test-bdrv-graph-mod is now works, so
move it out of debug option.
Note: bdrv_append() is still only works for backing-child based
filters. It's something to improve later.
Note2: we use the fact that bdrv_append() is used to append new nodes,
without backing child, so we don't need frozen check and inherits_from
logic from bdrv_set_backing_hd().
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-22-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
inore_children thing doesn't help to track all propagated permissions
of children we want to ignore. The simplest way to correctly update
permissions is update graph first and then do permission update. In
this case we just referesh permissions for the whole subgraph (in
topological-sort defined order) and everything is correctly calculated
automatically without any ignore_children.
So, refactor bdrv_replace_node_common to first do graph update and then
refresh the permissions.
Test test_parallel_exclusive_write() now pass, so move it out of
debugging "if".
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-18-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Rewrite bdrv_check_perm(), bdrv_abort_perm_update() and bdrv_set_perm()
to update nodes in topological sort order instead of simple DFS. With
topologically sorted nodes, we update a node only when all its parents
already updated. With DFS it's not so.
Consider the following example:
A -+
| |
| v
| B
| |
v |
C<-+
A is parent for B and C, B is parent for C.
Obviously, to update permissions, we should go in order A B C, so, when
we update C, all parent permissions already updated. But with current
approach (simple recursion) we can update in sequence A C B C (C is
updated twice). On first update of C, we consider old B permissions, so
doing wrong thing. If it succeed, all is OK, on second C update we will
finish with correct graph. But if the wrong thing failed, we break the
whole process for no reason (it's possible that updated B permission
will be less strict, but we will never check it).
Also new approach gives a way to simultaneously and correctly update
several nodes, we just need to run bdrv_topological_dfs() several times
to add all nodes and their subtrees into one topologically sorted list
(next patch will update bdrv_replace_node() in this manner).
Test test_parallel_perm_update() is now passing, so move it out of
debugging "if".
We also need to support ignore_children in
bdrv_parent_perms_conflict()
For test 283 order of conflicting parents check is changed.
Note also that in bdrv_check_perm() we don't check for parents conflict
at root bs, as we may be in the middle of permission update in
bdrv_reopen_multiple(). bdrv_reopen_multiple() will be updated soon.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-14-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We have too much comments for this feature. It seems better just don't
do it. Most of real users (tests don't count) have to create additional
reference.
Drop also comment in external_snapshot_prepare:
- bdrv_append doesn't "remove" old bs in common sense, it sounds
strange
- the fact that bdrv_append can fail is obvious from the context
- the fact that we must rollback all changes in transaction abort is
known (it's the direct role of abort)
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-5-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_append() is not quite good for inserting filters: it does extra
permission update in intermediate state, where filter get it filtered
child but is not yet replace it in a backing chain.
Some filters (for example backup-top) may want permissions even when
have no parents. And described intermediate state becomes invalid.
That's (half a) reason, why we need "inactive" state for backup-top
filter.
bdrv_append() will be improved later, now let's add a unit test.
Now test fails, so it runs only with -d flag. To run do
./test-bdrv-graph-mod -d -p /bdrv-graph-mod/append-greedy-filter
from <build-directory>/tests.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-4-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add test to show that simple DFS recursion order is not correct for
permission update. Correct order is topological-sort order, which will
be introduced later.
Consider the block driver which has two filter children: one active
with exclusive write access and one inactive with no specific
permissions.
And, these two children has a common base child, like this:
┌─────┐ ┌──────┐
│ fl2 │ ◀── │ top │
└─────┘ └──────┘
│ │
│ │ w
│ ▼
│ ┌──────┐
│ │ fl1 │
│ └──────┘
│ │
│ │ w
│ ▼
│ ┌──────┐
└───────▶ │ base │
└──────┘
So, exclusive write is propagated.
Assume, we want to make fl2 active instead of fl1.
So, we set some option for top driver and do permission update.
If permission update (remember, it's DFS) goes first through
top->fl1->base branch it will succeed: it firstly drop exclusive write
permissions and than apply them for another BdrvChildren.
But if permission update goes first through top->fl2->base branch it
will fail, as when we try to update fl2->base child, old not yet
updated fl1->base child will be in conflict.
Now test fails, so it runs only with -d flag. To run do
./test-bdrv-graph-mod -d -p /bdrv-graph-mod/parallel-perm-update
from <build-directory>/tests.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-3-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add the test that shows that concept of ignore_children is incomplete.
Actually, when we want to update something, ignoring permission of some
existing BdrvChild, we should ignore also the propagated effect of this
child to the other children. But that's not done. Better approach
(update permissions on already updated graph) will be implemented
later.
Now the test fails, so it's added with -d argument to not break make
check.
Test fails with
"Conflicts with use by fl1 as 'backing', which does not allow 'write' on base"
because when updating permissions we can ignore original top->fl1
BdrvChild. But we don't ignore exclusive write permission in fl1->base
BdrvChild, which is propagated. Correct thing to do is make graph
change first and then do permission update from the top node.
To run test do
./test-bdrv-graph-mod -d -p /bdrv-graph-mod/parallel-exclusive-write
from <build-directory>/tests.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-2-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The main tests directory still looks very crowded, and it's not
clear which files are part of a unit tests and which belong to
a different test subsystem. Let's clean up the mess and move the
unit tests to a separate directory.
Message-Id: <20210310063314.1049838-1-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>