Commit Graph

99375 Commits

Author SHA1 Message Date
Emanuele Giuseppe Esposito
f8be48adf0 block: use the new _change_ API instead of _can_set_ and _set_
Replace all direct usage of ->can_set_aio_ctx and ->set_aio_ctx,
and call bdrv_child_try_change_aio_context() in
bdrv_try_set_aio_context(), the main function called through
the whole block layer.

From this point onwards, ->can_set_aio_ctx and ->set_aio_ctx
won't be used anymore.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221025084952.2139888-8-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Emanuele Giuseppe Esposito
3394939621 block-backend: implement .change_aio_ctx in child_root
blk_root_change_aio_ctx() is very similar to blk_root_can_set_aio_ctx(),
but implements a new transaction so that if all check pass, the new
transaction's .commit will take care of changing the BlockBackend
AioContext. blk_root_set_aio_ctx_commit() is the same as
blk_root_set_aio_ctx().

Note: bdrv_child_try_change_aio_context() is not called by
anyone at this point.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221025084952.2139888-7-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Emanuele Giuseppe Esposito
27633e740e block: implement .change_aio_ctx in child_of_bds
bdrv_child_cb_change_aio_ctx() is identical to
bdrv_child_cb_can_set_aio_ctx(), as we only need
to recursively go on the parent bs.

Note: bdrv_child_try_change_aio_context() is not called by
anyone at this point.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221025084952.2139888-6-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Emanuele Giuseppe Esposito
3428b100dc blockjob: implement .change_aio_ctx in child_job
child_job_change_aio_ctx() is very similar to
child_job_can_set_aio_ctx(), but it implements a new transaction
so that if all check pass, the new transaction's .commit()
will take care of changin the BlockJob AioContext.
child_job_set_aio_ctx_commit() is similar to child_job_set_aio_ctx(),
but it doesn't need to invoke the recursion, as this is already
taken care by child_job_change_aio_ctx().

Note: bdrv_child_try_change_aio_context() is not called by
anyone at this point.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221025084952.2139888-5-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Emanuele Giuseppe Esposito
e08cc001d5 bdrv_change_aio_context: use hash table instead of list of visited nodes
Minor performance improvement, but given that we have hash tables
available, avoid iterating in the visited nodes list every time just
to check if a node has been already visited.

The data structure is not actually a proper hash map, but an hash set,
as we are just adding nodes and not key,value pairs.

Suggested-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221025084952.2139888-4-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Emanuele Giuseppe Esposito
7e8c182fb5 block: use transactions as a replacement of ->{can_}set_aio_context()
Simplify the way the aiocontext can be changed in a BDS graph.
There are currently two problems in bdrv_try_set_aio_context:
- There is a confusion of AioContext locks taken and released, because
  we assume that old aiocontext is always taken and new one is
  taken inside.

- It doesn't look very safe to call bdrv_drained_begin while some
  nodes have already switched to the new aiocontext and others haven't.
  This could be especially dangerous because bdrv_drained_begin polls, so
  something else could be executed while graph is in an inconsistent
  state.

Additional minor nitpick: can_set and set_ callbacks both traverse the
graph, both using the ignored list of visited nodes in a different way.

Therefore, get rid of all of this and introduce a new callback,
change_aio_context, that uses transactions to efficiently, cleanly
and most importantly safely change the aiocontext of a graph.

This new callback is a "merge" of the two previous ones:
- Just like can_set_aio_context, recursively traverses the graph.
  Marks all nodes that are visited using a GList, and checks if
  they *could* change the aio_context.
- For each node that passes the above check, drain it and add a new transaction
  that implements a callback that effectively changes the aiocontext.
- Once done, the recursive function returns if *all* nodes can change
  the AioContext. If so, commit the above transactions.
  Regardless of the outcome, call transaction.clean() to undo all drains
  done in the recursion.
- The transaction list is scanned only after all nodes are being drained, so
  we are sure that they all are in the same context, and then
  we switch their AioContext, concluding the drain only after all nodes
  switched to the new AioContext. In this way we make sure that
  bdrv_drained_begin() is always called under the old AioContext, and
  bdrv_drained_end() under the new one.
- Because of the above, we don't need to release and re-acquire the
  old AioContext every time, as everything is done once (and not
  per-node drain and aiocontext change).

Note that the "change" API is not yet invoked anywhere.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20221025084952.2139888-3-eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Emanuele Giuseppe Esposito
7f898610f6 block.c: assert bs->aio_context is written under BQL and drains
Also here ->aio_context is read by I/O threads and written
under BQL.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221025084952.2139888-2-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
57f08941d3 block: refactor bdrv_remove_file_or_backing_child to bdrv_remove_child
Now the function can remove any child, so give it more common name.
Drop assertions and drop bs argument which becomes unused. Function
would be reused in a further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-16-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
c5c2174146 block/snapshot: drop indirection around bdrv_snapshot_fallback_ptr
Now the indirection is not actually used, we can safely reduce it to
simple pointer. For consistency do a bit of refactoring to get rid of
_ptr suffixes that become meaningless.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-15-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
5bb0474778 block: Manipulate bs->file / bs->backing pointers in .attach/.detach
bs->file and bs->backing are a kind of duplication of part of
bs->children. But very useful diplication, so let's not drop them at
all:)

We should manage bs->file and bs->backing in same place, where we
manage bs->children, to keep them in sync.

Moreover, generic io paths are unprepared to BdrvChild without a bs, so
it's double good to clear bs->file / bs->backing when we detach the
child.

Detach is simple: if we detach bs->file or bs->backing child, just
set corresponding field to NULL.

Attach is a bit more complicated. But we still can precisely detect
should we set one of bs->file / bs->backing or not:

- if role is BDRV_CHILD_COW, we definitely deal with bs->backing
- else, if role is BDRV_CHILD_FILTERED (it must be also
  BDRV_CHILD_PRIMARY), it's a filtered child. Use
  bs->drv->filtered_child_is_backing to chose the pointer field to
  modify.
- else, if role is BDRV_CHILD_PRIMARY, we deal with bs->file
- in all other cases, it's neither bs->backing nor bs->file. It's some
  other child and we shouldn't care

OK. This change brings one more good thing: we can (and should) get rid
of all indirect pointers in the block-graph-change transactions:

bdrv_attach_child_common() stores BdrvChild** into transaction to clear
it on abort.

bdrv_attach_child_common() has two callers: bdrv_attach_child_noperm()
just pass-through this feature, bdrv_root_attach_child() doesn't need
the feature.

Look at bdrv_attach_child_noperm() callers:
  - bdrv_attach_child() doesn't need the feature
  - bdrv_set_file_or_backing_noperm() uses the feature to manage
    bs->file and bs->backing, we don't want it anymore
  - bdrv_append() uses the feature to manage bs->backing, again we
    don't want it anymore

So, we should drop this stuff! Great!

We could probably keep BdrvChild** argument to keep the int return
value, but it seems not worth the complexity.

Finally, we now set .file / .backing automatically in generic code and
want to restring setting them by hand outside of .attach/.detach.
So, this patch cleanups all remaining places where they were set.
To find such places I use:

  git grep '\->file ='
  git grep '\->backing ='
  git grep '&.*\<backing\>'
  git grep '&.*\<file\>'

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-14-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
544acc7d1e Revert "block: Pass BdrvChild ** to replace_child_noperm"
That's a preparation to previously reverted
"block: Let replace_child_noperm free children". Drop it too, we don't
need it for a new approach.

This reverts commit be64bbb014.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-13-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
a2c37a3042 Revert "block: Restructure remove_file_or_backing_child()"
That's a preparation to previously reverted
"block: Let replace_child_noperm free children". Drop it too, we don't
need it for a new approach.

This reverts commit 562bda8bb4.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-12-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
0f0b1e29d3 Revert "block: Let replace_child_tran keep indirect pointer"
That's a preparation to previously reverted
"block: Let replace_child_noperm free children". Drop it too, we don't
need it for a new approach.

This reverts commit 82b54cf516.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-11-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
4eba825a82 Revert "block: Let replace_child_noperm free children"
We are going to reimplement this behavior (clear bs->file / bs->backing
pointers automatically when child->bs is cleared) in a nicer way, see
further commit
"block: Manipulate bs->file / bs->backing pointers in .attach/.detach".

With this revert we bring back a problem that was fixed by b0a9f6fed3.
Still the problem was mostly theoretical, we don't have concrete bugs
fixed by b0a9f6fed3, we don't have a specific test. Probably some
accidental failures of iotests are related.

Alternatively, we may merge this and following three reverts into final
"block: Manipulate ..." to avoid any kind of regression. But seems that
in this case having separate clear revert commits is better.

This reverts commit b0a9f6fed3.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-10-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
0c6100a7ff block/snapshot: stress that we fallback to primary child
Actually what we chose is a primary child. Let's stress it in the code.

We are going to drop indirect pointer logic here in future. Actually
this commit simplifies the future work: we drop use of indirection in
the assertion now.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-9-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
71ca43852a block: document connection between child roles and bs->backing/bs->file
Make the informal rules formal. In further commit we'll add
corresponding assertions.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-8-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
1921b4f786 test-bdrv-graph-mod: fix filters to be filters
bdrv_pass_through is used as filter, even all node variables has
corresponding names. We want to append it, so it should be
backing-child-based filter like mirror_top.
So, in test_update_perm_tree, first child should be DATA, as we don't
want filters with two filtered children.

bdrv_exclusive_writer is used as a filter once. So it should be filter
anyway. We want to append it, so it should be backing-child-based
fitler too.

Make all FILTERED children to be PRIMARY as well. We are going to force
this rule by assertion soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-7-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
9ebfc111a1 tests-bdrv-drain: bdrv_replace_test driver: declare supports_backing
We do add COW child to the node.  In future we are going to forbid
adding COW child to the node that doesn't support backing. So, fix it
here now.

Don't worry about setting bs->backing itself: in further commit we'll
update the block-layer to automatically set/unset this field in generic
code.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-6-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
1dcea71979 test-bdrv-graph-mod: update test_parallel_perm_update test case
test_parallel_perm_update() does two things that we are going to
restrict in the near future:

1. It updates bs->file field by hand. bs->file will be managed
   automatically by generic code (together with bs->children list).

   Let's better refactor our "tricky" bds to have own state where one
   of children is linked as "selected".
   This also looks less "tricky", so avoid using this word.

2. It create FILTERED children that are not PRIMARY. Except for tests
   all FILTERED children in the Qemu block layer are always PRIMARY as
   well.  We are going to formalize this rule, so let's better use DATA
   children here.

3. It creates more than one FILTERED child, which is already abandoned
   in BDRV_CHILD_FILTERED's description.

While being here, update the picture to better correspond to the test
code.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-5-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
a987aa7d3c block/blklogwrites: don't care to remove bs->file child on failure
We don't need to remove bs->file, generic layer takes care of it. No
other driver cares to remove bs->file on failure by hand.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-4-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
8393078032 block: introduce bdrv_open_file_child() helper
Almost all drivers call bdrv_open_child() similarly. Let's create a
helper for this.

The only not updated drivers that call bdrv_open_child() to set
bs->file are raw-format and snapshot-access:
    raw-format sometimes want to have filtered child but
        don't set drv->is_filter to true.
    snapshot-access wants only DATA | PRIMARY

Possibly we should implement drv->is_filter_func() handler, to consider
raw-format as filter when it works as filter.. But it's another story.

Note also, that we decrease assignments to bs->file in code: it helps
us restrict modifying this field in further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-3-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
046fd84fac block: BlockDriver: add .filtered_child_is_backing field
Unfortunately not all filters use .file child as filtered child. Two
exclusions are mirror_top and commit_top. Happily they both are private
filters. Bad thing is that this inconsistency is observable through qmp
commands query-block / query-named-block-nodes. So, could we just
change mirror_top and commit_top to use file child as all other filter
driver is an open question. Probably, we could do that with some kind
of deprecation period, but how to warn users during it?

For now, let's just add a field so we can distinguish them in generic
code, it will be used in further commits.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-2-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Li Feng
ab6075d849 vhost-user-blk: fix the resize crash
If the os is not installed and doesn't have the virtio guest driver,
the vhost dev isn't started, so the dev->vdev is NULL.

Reproduce: mount a Win 2019 iso, go into the install ui, then resize
the virtio-blk device, qemu crash.

Signed-off-by: Li Feng <fengli@smartx.com>
Message-Id: <20220919121816.3252223-1-fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Sam Li
7845e73147 block/io_uring: revert "Use io_uring_register_ring_fd() to skip fd operations"
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1193

The commit "Use io_uring_register_ring_fd() to skip fd operations" broke
when booting a guest with iothread and io_uring. That is because the
io_uring_register_ring_fd() call is made from the main thread instead of
IOThread where io_uring_submit() is called. It can not be guaranteed
to register the ring fd in the correct thread or unregister the same ring
fd if the IOThread is disabled. This optimization is not critical so we
will revert previous commit.

This reverts commit e2848bc574
and 77e3f038af.

Cc: qemu-stable@nongnu.org
Signed-off-by: Sam Li <faithilikerun@gmail.com>
Message-Id: <20220924144815.5591-1-faithilikerun@gmail.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Tested-by: Dario Faggioli <dfaggioli@suse.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Hervé Poussineau
1e85c7259b vvfat: allow spaces in file names
In R/W mode, files with spaces were never created on host side.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1176
Fixes: c79e243ed6
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221010175511.3414357-3-hpoussin@reactos.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Hervé Poussineau
d0f95b6ca0 vvfat: allow some writes to bootsector
'reserved1' field in bootsector is used to mark volume dirty, or need to verify.
Allow writes to bootsector which only changes the 'reserved1' field.

This fixes I/O errors on Windows guests.

Resolves: https://bugs.launchpad.net/qemu/+bug/1889421
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-Id: <20221010175511.3414357-2-hpoussin@reactos.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Bin Meng
69fbfff95e block: Refactor get_tmp_filename()
At present there are two callers of get_tmp_filename() and they are
inconsistent.

One does:

    /* TODO: extra byte is a hack to ensure MAX_PATH space on Windows. */
    char *tmp_filename = g_malloc0(PATH_MAX + 1);
    ...
    ret = get_tmp_filename(tmp_filename, PATH_MAX + 1);

while the other does:

    s->qcow_filename = g_malloc(PATH_MAX);
    ret = get_tmp_filename(s->qcow_filename, PATH_MAX);

As we can see different 'size' arguments are passed. There are also
platform specific implementations inside the function, and the use
of snprintf is really undesirable.

The function name is also misleading. It creates a temporary file,
not just a filename.

Refactor this routine by changing its name and signature to:

    char *create_tmp_file(Error **errp)

and use g_get_tmp_dir() / g_mkstemp() for a consistent implementation.

While we are here, add some comments to mention that /var/tmp is
preferred over /tmp on non-win32 hosts.

Signed-off-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <20221010040432.3380478-2-bin.meng@windriver.com>
[kwolf: Fixed incorrect errno negation and iotest 051]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:13:32 +02:00
Bin Meng
6b6471eee1 block: Ignore close() failure in get_tmp_filename()
The temporary file has been created and is ready for use. Checking
return value of close() does not seem useful. The file descriptor
is almost certainly closed; see close(2) under "Dealing with error
returns from close()".

Let's simply ignore close() failure here.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221010040432.3380478-1-bin.meng@windriver.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 18:33:32 +02:00
Markus Armbruster
b885cdda79 MAINTAINERS: Fold "Block QAPI, monitor, ..." into "Block layer core"
Section "Block QAPI, monitor, command line" is about the external
interfaces we provide for block devices.  It covers the relevant QAPI
schema parts, monitor and command line code, more or less.

The section's files are also covered by section "Block layer core",
except for the QAPI schema files.

I haven't acted as maintainer in this area for a long time.  Make it
official: add the QAPI schema files to section "Block layer core", and
delete section "Block QAPI, monitor, command line".

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221020120541.80757-1-armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 18:33:32 +02:00
Stefan Hajnoczi
d0d8d5707d target-arm queue:
* Implement FEAT_E0PD
  * Implement FEAT_HAFDBS
  * honor HCR_E2H and HCR_TGE in arm_excp_unmasked()
  * hw/arm/virt: Fix devicetree warnings about the virtio-iommu node
  * hw/core/resettable: fix reset level counting
  * hw/hyperv/hyperv.c: Use device_cold_reset() instead of device_legacy_reset()
  * imx: reload cmp timer outside of the reload ptimer transaction
  * x86: do not re-randomize RNG seed on snapshot load
  * m68k/virt: do not re-randomize RNG seed on snapshot load
  * m68k/q800: do not re-randomize RNG seed on snapshot load
  * arm: re-randomize rng-seed on reboot
  * riscv: re-randomize rng-seed on reboot
  * mips/boston: re-randomize rng-seed on reboot
  * openrisc: re-randomize rng-seed on reboot
  * rx: re-randomize rng-seed on reboot
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmNagAQZHHBldGVyLm1h
 eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3sv6D/0VXf61t6IcmQ342L5IeUeA
 jixouWQhma3WwFDjbEo3BehgBhdwH2gxF8XWZNudV1x5P4JbCwiD/sm9FKtNY3IX
 lOpcg4F7Ge6EHCEQ5PM75G4TNQBw1BTwGuNrXm8kpVZ7i7C4Zo3gzbqVYv59d406
 fMwZBZwwavn9xYI/ZOUq3CKv2W/xrveFIEfafQB1mmcu4azZRLlOdMXvsMY/Te1/
 GQ+0RPcemNfvfFwYfMKT9dqiCWgqzAoiGQNH2944mTnoJJMsI0JLcXP2z/4fFfYv
 J1m7mhOO9KiqUWzxJofQOgQIic1q6AY0lLw272mA/rbwwlmlm/bNl1DGE5Lyw64d
 t/dDWE6X8IHPqPzqqrOd8vpKIKUriDSL83D5uULpPXaQwyckTFDsAMu5VX4uswbm
 B+SizTghSNwMbOq1XsQg6DDiHEelbwwrltsLOSQujXrrngtSxjWXuFgWem4gT8HL
 uVQtrfrASV/gNBLRNX73vuL6pJaTEVqk53JI8MamZEIRLO1s6/nreOR13E+0611T
 iMywoOhAQA3RDe9NU0zgg6EGyskRZQG1CRTDQAz1sAt8WcHokg7Yj7LlfGE+/+Bh
 4cIuJI56Uf3DJF51A52+roaQkZDJZZkfE1EG8uMDIWszP5v2GDcwx3AS3FLuaDfH
 QHPsecbzEURFTmdt5VrKzg==
 =RD6C
 -----END PGP SIGNATURE-----

Merge tag 'pull-target-arm-20221027' of https://git.linaro.org/people/pmaydell/qemu-arm into staging

target-arm queue:
 * Implement FEAT_E0PD
 * Implement FEAT_HAFDBS
 * honor HCR_E2H and HCR_TGE in arm_excp_unmasked()
 * hw/arm/virt: Fix devicetree warnings about the virtio-iommu node
 * hw/core/resettable: fix reset level counting
 * hw/hyperv/hyperv.c: Use device_cold_reset() instead of device_legacy_reset()
 * imx: reload cmp timer outside of the reload ptimer transaction
 * x86: do not re-randomize RNG seed on snapshot load
 * m68k/virt: do not re-randomize RNG seed on snapshot load
 * m68k/q800: do not re-randomize RNG seed on snapshot load
 * arm: re-randomize rng-seed on reboot
 * riscv: re-randomize rng-seed on reboot
 * mips/boston: re-randomize rng-seed on reboot
 * openrisc: re-randomize rng-seed on reboot
 * rx: re-randomize rng-seed on reboot

# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmNagAQZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3sv6D/0VXf61t6IcmQ342L5IeUeA
# jixouWQhma3WwFDjbEo3BehgBhdwH2gxF8XWZNudV1x5P4JbCwiD/sm9FKtNY3IX
# lOpcg4F7Ge6EHCEQ5PM75G4TNQBw1BTwGuNrXm8kpVZ7i7C4Zo3gzbqVYv59d406
# fMwZBZwwavn9xYI/ZOUq3CKv2W/xrveFIEfafQB1mmcu4azZRLlOdMXvsMY/Te1/
# GQ+0RPcemNfvfFwYfMKT9dqiCWgqzAoiGQNH2944mTnoJJMsI0JLcXP2z/4fFfYv
# J1m7mhOO9KiqUWzxJofQOgQIic1q6AY0lLw272mA/rbwwlmlm/bNl1DGE5Lyw64d
# t/dDWE6X8IHPqPzqqrOd8vpKIKUriDSL83D5uULpPXaQwyckTFDsAMu5VX4uswbm
# B+SizTghSNwMbOq1XsQg6DDiHEelbwwrltsLOSQujXrrngtSxjWXuFgWem4gT8HL
# uVQtrfrASV/gNBLRNX73vuL6pJaTEVqk53JI8MamZEIRLO1s6/nreOR13E+0611T
# iMywoOhAQA3RDe9NU0zgg6EGyskRZQG1CRTDQAz1sAt8WcHokg7Yj7LlfGE+/+Bh
# 4cIuJI56Uf3DJF51A52+roaQkZDJZZkfE1EG8uMDIWszP5v2GDcwx3AS3FLuaDfH
# QHPsecbzEURFTmdt5VrKzg==
# =RD6C
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 27 Oct 2022 08:56:36 EDT
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg:                 aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* tag 'pull-target-arm-20221027' of https://git.linaro.org/people/pmaydell/qemu-arm: (31 commits)
  mips/malta: pass RNG seed via env var and re-randomize on reboot
  rx: re-randomize rng-seed on reboot
  openrisc: re-randomize rng-seed on reboot
  mips/boston: re-randomize rng-seed on reboot
  m68k/q800: do not re-randomize RNG seed on snapshot load
  m68k/virt: do not re-randomize RNG seed on snapshot load
  riscv: re-randomize rng-seed on reboot
  arm: re-randomize rng-seed on reboot
  x86: do not re-randomize RNG seed on snapshot load
  device-tree: add re-randomization helper function
  reset: allow registering handlers that aren't called by snapshot loading
  target/arm: Use the max page size in a 2-stage ptw
  target/arm: Implement FEAT_HAFDBS, dirty bit portion
  target/arm: Implement FEAT_HAFDBS, access flag portion
  target/arm: Tidy merging of attributes from descriptor and table
  target/arm: Consider GP an attribute in get_phys_addr_lpae
  target/arm: Don't shift attrs in get_phys_addr_lpae
  target/arm: Fix fault reporting in get_phys_addr_lpae
  target/arm: Remove loop from get_phys_addr_lpae
  target/arm: Add ARMFault_UnsuppAtomicUpdate
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-27 10:38:23 -04:00
Daniel P. Berrangé
da0ab2c4c4 crypto: add test cases for many malformed LUKS header scenarios
Validate that we diagnose each malformed LUKS header scenario with a
distinct error report.

Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-27 13:06:12 +01:00
Daniel P. Berrangé
741c314a33 crypto: ensure LUKS tests run with GNUTLS crypto provider
GNUTLS is supported as a crypto provider since

  commit cc4c7c7382
  Author: Daniel P. Berrangé <berrange@redhat.com>
  Date:   Wed Jun 30 17:20:02 2021 +0100

    crypto: introduce build system for gnutls crypto backend

So enable the LUKS tests in this config.

Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-27 13:06:11 +01:00
Daniel P. Berrangé
6c1989321e crypto: quote algorithm names in error messages
If given a malformed LUKS header, it is possible that the algorithm
names end up being an empty string. This leads to confusing error
messages unless quoting is used to highlight where the empty string
is subsituted in the error message.

Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-27 12:55:27 +01:00
Daniel P. Berrangé
98c72dfb71 crypto: split off helpers for converting LUKS header endianess
The unit test suite is shortly going to want to convert header
endianness separately from the main I/O functions.

Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-27 12:55:27 +01:00
Daniel P. Berrangé
36445acebd crypto: split LUKS header definitions off into file
This will allow unit testing code to use the structs.

Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-27 12:55:27 +01:00
Daniel P. Berrangé
b57151ac03 crypto: check that LUKS PBKDF2 iterations count is non-zero
Both the master key and key slot passphrases are run through the PBKDF2
algorithm. The iterations count is expected to be generally very large
(many 10's or 100's of 1000s). It is hard to define a low level cutoff,
but we can certainly say that iterations count should be non-zero. A
zero count likely indicates an initialization mistake so reject it.

Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-27 12:55:27 +01:00
Daniel P. Berrangé
c5f6962801 crypto: strengthen the check for key slots overlapping with LUKS header
The LUKS header data on disk is a fixed size, however, there's expected
to be a gap between the end of the header and the first key slot to get
alignment with the 2nd sector on 4k drives. This wasn't originally part
of the LUKS spec, but was always part of the reference implementation,
so it is worth validating this.

Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-27 12:55:27 +01:00
Daniel P. Berrangé
d233fbc327 crypto: validate that LUKS payload doesn't overlap with header
We already validate that LUKS keyslots don't overlap with the
header, or with each other. This closes the remaining hole in
validation of LUKS file regions.

Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-27 12:55:27 +01:00
Daniel P. Berrangé
93569c3730 crypto: enforce that key material doesn't overlap with LUKS header
We already check that key material doesn't overlap between key slots,
and that it doesn't overlap with the payload. We didn't check for
overlap with the LUKS header.

Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-27 12:55:27 +01:00
Daniel P. Berrangé
f1195961f3 crypto: enforce that LUKS stripes is always a fixed value
Although the LUKS stripes are encoded in the keyslot header and so
potentially configurable, in pratice the cryptsetup impl mandates
this has the fixed value 4000. To avoid incompatibility apply the
same enforcement in QEMU too. This also caps the memory usage for
key material when QEMU tries to open a LUKS volume.

Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-27 12:55:27 +01:00
Daniel P. Berrangé
c1d8634c20 crypto: sanity check that LUKS header strings are NUL-terminated
The LUKS spec requires that header strings are NUL-terminated, and our
code relies on that. Protect against maliciously crafted headers by
adding validation.

Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-27 12:55:27 +01:00
Daniel P. Berrangé
f1018ea0a3 tests: avoid DOS line endings in PSK file
Using FILE * APIs for writing the PSK file results in translation from
UNIX to DOS line endings on Windows. When the crypto PSK code later
loads the credentials the stray \r will result in failure to load the
PSK credentials into GNUTLS.

Rather than switching the FILE* APIs to open in binary format, just
switch to the more concise g_file_set_contents API.

Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-27 11:55:41 +01:00
Daniel P. Berrangé
3983bf1b41 crypto: check for and report errors setting PSK credentials
If setting credentials fails, the handshake will later fail to complete
with an obscure error message which is hard to diagnose.

Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-27 11:55:41 +01:00
Daniel P. Berrangé
dd84a906e0 scripts: check if .git exists before checking submodule status
Currently we check status of each submodule, before actually checking
if we're in a git repo. These status commands will all fail, but we
are hiding their output so we don't see it currently.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2022-10-27 11:54:37 +01:00
Jason A. Donenfeld
6233a13859 mips/malta: pass RNG seed via env var and re-randomize on reboot
As of the kernel commit linked below, Linux ingests an RNG seed
passed as part of the environment block by the bootloader or firmware.
This mechanism works across all different environment block types,
generically, which pass some block via the second firmware argument. On
malta, this has been tested to work when passed as an argument from
U-Boot's linux_env_set.

As is the case on most other architectures (such as boston), when
booting with `-kernel`, QEMU, acting as the bootloader, should pass the
RNG seed, so that the machine has good entropy for Linux to consume. So
this commit implements that quite simply by using the guest random API,
which is what is used on nearly all other archs too. It also
reinitializes the seed on reboot, so that it is always fresh.

Link: https://git.kernel.org/torvalds/c/056a68cea01
Cc: Aleksandar Rikalo <aleksandar.rikalo@syrmia.com>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-10-27 11:47:45 +01:00
Jason A. Donenfeld
a76b911c0d rx: re-randomize rng-seed on reboot
When the system reboots, the rng-seed that the FDT has should be
re-randomized, so that the new boot gets a new seed. Since the FDT is in
the ROM region at this point, we add a hook right after the ROM has been
added, so that we have a pointer to that copy of the FDT.

Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Message-id: 20221025004327.568476-12-Jason@zx2c4.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-10-27 11:34:31 +01:00
Jason A. Donenfeld
2db07d0506 openrisc: re-randomize rng-seed on reboot
When the system reboots, the rng-seed that the FDT has should be
re-randomized, so that the new boot gets a new seed. Since the FDT is in
the ROM region at this point, we add a hook right after the ROM has been
added, so that we have a pointer to that copy of the FDT.

Cc: Stafford Horne <shorne@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Message-id: 20221025004327.568476-11-Jason@zx2c4.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-10-27 11:34:31 +01:00
Jason A. Donenfeld
4fbae24450 mips/boston: re-randomize rng-seed on reboot
When the system reboots, the rng-seed that the FDT has should be
re-randomized, so that the new boot gets a new seed. Since the FDT is in
the ROM region at this point, we add a hook right after the ROM has been
added, so that we have a pointer to that copy of the FDT.

Cc: Aleksandar Rikalo <aleksandar.rikalo@syrmia.com>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Message-id: 20221025004327.568476-9-Jason@zx2c4.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-10-27 11:34:31 +01:00
Jason A. Donenfeld
fbbbe7eb23 m68k/q800: do not re-randomize RNG seed on snapshot load
Snapshot loading is supposed to be deterministic, so we shouldn't
re-randomize the various seeds used.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Message-id: 20221025004327.568476-8-Jason@zx2c4.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-10-27 11:34:31 +01:00
Jason A. Donenfeld
1ffd007c9c m68k/virt: do not re-randomize RNG seed on snapshot load
Snapshot loading is supposed to be deterministic, so we shouldn't
re-randomize the various seeds used.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Message-id: 20221025004327.568476-7-Jason@zx2c4.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-10-27 11:34:31 +01:00