Commit Graph

107861 Commits

Author SHA1 Message Date
Vladimir Sementsov-Ogievskiy
3a8736cf1e iotests: QemuStorageDaemon: add cmd() method like in QEMUMachine.
Add similar method for consistency.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20231006154125.1068348-8-vsementsov@yandex-team.ru
Signed-off-by: John Snow <jsnow@redhat.com>
2023-10-12 14:21:43 -04:00
Vladimir Sementsov-Ogievskiy
4e620ff48f python/machine.py: upgrade vm.cmd() method
The method is not popular in iotests, we prefer use vm.qmp() and then
check success by hand. But that's not optimal. To simplify movement to
vm.cmd() let's support same interface improvements like in vm.qmp().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20231006154125.1068348-7-vsementsov@yandex-team.ru
Signed-off-by: John Snow <jsnow@redhat.com>
2023-10-12 14:21:43 -04:00
Vladimir Sementsov-Ogievskiy
684750ab4f python/qemu: rename command() to cmd()
Use a shorter name. We are going to move in iotests from qmp() to
command() where possible. But command() is longer than qmp() and don't
look better. Let's rename.

You can simply grep for '\.command(' and for 'def command(' to check
that everything is updated (command() in tests/docker/docker.py is
unrelated).

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-id: 20231006154125.1068348-6-vsementsov@yandex-team.ru
[vsementsov: also update three occurrences in
   tests/avocado/machine_aspeed.py and keep r-b]
Signed-off-by: John Snow <jsnow@redhat.com>
2023-10-12 14:21:43 -04:00
Vladimir Sementsov-Ogievskiy
37274707f6 python: rename QEMUMonitorProtocol.cmd() to cmd_raw()
Having cmd() and command() methods in one class doesn't look good.
Rename cmd() to cmd_raw(), to show its meaning better.

We also want to rename command() to cmd() in future, so this commit is
a necessary step.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20231006154125.1068348-5-vsementsov@yandex-team.ru
Signed-off-by: John Snow <jsnow@redhat.com>
2023-10-12 14:21:43 -04:00
Vladimir Sementsov-Ogievskiy
7f521b023b scripts/cpu-x86-uarch-abi.py: use .command() instead of .cmd()
Here we don't expect a failure. In case of failure we'll crash on
trying to access ['return']. Better is to use .command() that clearly
raises on failure.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20231006154125.1068348-4-vsementsov@yandex-team.ru
Signed-off-by: John Snow <jsnow@redhat.com>
2023-10-12 14:21:43 -04:00
Vladimir Sementsov-Ogievskiy
2cee9ca97d qmp_shell.py: _fill_completion() use .command() instead of .cmd()
We just want to ignore failure, so we don't need low level .cmd(). This
helps further renaming .command() to .cmd().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20231006154125.1068348-3-vsementsov@yandex-team.ru
Signed-off-by: John Snow <jsnow@redhat.com>
2023-10-12 14:21:43 -04:00
Vladimir Sementsov-Ogievskiy
f187cfefd2 python/qemu/qmp/legacy: cmd(): drop cmd_id unused argument
The argument is unused, let's drop it for now, as we are going to
refactor the interface and don't want to refactor unused things.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20231006154125.1068348-2-vsementsov@yandex-team.ru
Signed-off-by: John Snow <jsnow@redhat.com>
2023-10-12 14:21:43 -04:00
Paolo Bonzini
c35b2fb1fd target/i386: fix shadowed variable pasto
Commit a908985971 ("target/i386/seg_helper: introduce tss_set_busy",
2023-09-26) failed to use the tss_selector argument of the new function,
which was therefore unused.

This shows up as a #GP fault when booting old versions of 32-bit
Linux.

Fixes: a908985971 ("target/i386/seg_helper: introduce tss_set_busy", 2023-09-26)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20231011135350.438492-1-pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2023-10-12 16:37:31 +02:00
Thomas Huth
cc46a7ef3b contrib/vhost-user-gpu: Fix compiler warning when compiling with -Wshadow
Rename some variables to avoid compiler warnings when compiling
with -Wshadow=local.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20231009083726.30301-1-thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2023-10-12 16:37:16 +02:00
Kevin Wolf
e6e964b8b0 block: Add assertion for bdrv_graph_wrlock()
bdrv_graph_wrlock() can't run in a coroutine (because it polls) and
requires holding the BQL. We already have GLOBAL_STATE_CODE() to assert
the latter. Assert the former as well and add a no_coroutine_fn marker.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-23-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
680e0cc40c block: Protect bs->children with graph_lock
Almost all functions that access the child links already take the graph
lock now. Add locking to the remaining users and finally annotate the
struct field itself as protected by the graph lock.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-22-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
b59b466071 block: Protect bs->parents with graph_lock
Almost all functions that access the parent link already take the graph
lock now. Add locking to the remaining user in a test case and finally
annotate the struct field itself as protected by the graph lock.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-21-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
3574499a1e block: Mark bdrv_get_specific_info() and callers GRAPH_RDLOCK
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_get_specific_info() need to hold a reader lock for the graph.
This removes an assume_graph_lock() call in vmdk's implementation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-20-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
018f9dea9c block: Mark bdrv_apply_auto_read_only() and callers GRAPH_RDLOCK
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_apply_auto_read_only() need to hold a reader lock for the graph
because it calls bdrv_can_set_read_only(), which indirectly accesses the
parents list of a node.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-19-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
277f2007ce block: Mark bdrv_op_is_blocked() and callers GRAPH_RDLOCK
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_op_is_blocked() need to hold a reader lock for the graph
because it calls bdrv_get_device_or_node_name(), which accesses the
parents list of a node.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-18-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
5155853e90 qcow2: Mark check_constraints_on_bitmap() GRAPH_RDLOCK
It still has an assume_graph_lock() call, but all of its callers are now
properly annotated to hold the graph lock. Update the function to be
GRAPH_RDLOCK as well and remove the assume_graph_lock().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-17-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
de4fed6f4e qcow2: Mark qcow2_inactivate() and callers GRAPH_RDLOCK
This adds GRAPH_RDLOCK annotations to declare that callers of
qcow2_inactivate() need to hold a reader lock for the graph because it
calls bdrv_get_device_or_node_name(), which accesses the parents list of
a node.

qcow2_do_close() is a bit strange because it is called from different
contexts. If close_data_file = true, we know that we were called from
non-coroutine main loop context (more specifically, we're coming from
qcow2_close()) and can safely drop the reader lock temporarily with
bdrv_graph_rdunlock_main_loop() and acquire the writer lock.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-16-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
0bb79c97fd qcow2: Mark qcow2_signal_corruption() and callers GRAPH_RDLOCK
This adds GRAPH_RDLOCK annotations to declare that callers of
qcow2_signal_corruption() need to hold a reader lock for the graph
because it calls bdrv_get_node_name(), which accesses the parents list
of a node.

For some places, we know that they will hold the lock, but we don't have
the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock()
with a FIXME comment. These places will be removed once everything is
properly annotated.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-15-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
bd131d6705 block: Mark bdrv_amend_options() and callers GRAPH_RDLOCK
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_amend_options() need to hold a reader lock for the graph. This
removes an assume_graph_lock() call in crypto's implementation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-14-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
4026f1c4f3 block: Mark bdrv_get_parent_name() and callers GRAPH_RDLOCK
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_get_parent_name() need to hold a reader lock for the graph
because it accesses the parents list of a node.

For some places, we know that they will hold the lock, but we don't have
the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock()
with a FIXME comment. These places will be removed once everything is
properly annotated.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-13-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
c0fc5123ad block: Mark bdrv_primary_child() and callers GRAPH_RDLOCK
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_primary_child() need to hold a reader lock for the graph
because it accesses the children list of a node.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-12-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
b7cfc7d58e block: Mark bdrv_refresh_filename() and callers GRAPH_RDLOCK
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_refresh_filename() need to hold a reader lock for the graph
because it accesses the children list of a node.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-11-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
15f3f1fe57 block: Mark bdrv_get_xdbg_block_graph() and callers GRAPH_RDLOCK
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_get_xdbg_block_graph() need to hold a reader lock for the graph
because it accesses the children list of a node.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-10-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
ce433d2942 block: Take graph rdlock in parts of reopen
Reopen isn't easy with respect to locking because many of its functions
need to iterate the graph, some change it, and then you get some drains
in the middle where you can't hold any locks.

Therefore just documents most of the functions to be unlocked, and take
locks internally before accessing the graph.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-9-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
a32e781838 block: Mark bdrv_snapshot_fallback() and callers GRAPH_RDLOCK
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_snapshot_fallback() need to hold a reader lock for the graph
because it accesses the children list of a node.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-8-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
7859c45a46 block: Mark bdrv_parent_cb_resize() and callers GRAPH_RDLOCK
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_parent_cb_resize() need to hold a reader lock for the graph.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-7-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Emanuele Giuseppe Esposito
d05ab380db block: Mark drain related functions GRAPH_RDLOCK
Draining recursively traverses the graph, therefore we need to make sure
that also such accesses to the graph are protected by the graph rdlock.

There are 3 different drain callers to consider:
1. drain in the main loop: no issue at all, rdlock is nop.
2. drain in an iothread: rdlock only works in main loop or coroutines,
   so disallow it.
3. drain in a coroutine (regardless of AioContext): the drain mechanism
   takes care of scheduling a BH in the bs->aio_context that will
   then take care of perform the actual draining. This is wrong,
   because as pointed in (2) if bs->aio_context is an iothread then
   rdlock won't work. Therefore change bdrv_co_yield_to_drain to
   schedule the BH in the main loop.

Caller (2) also implies that we need to modify test-bdrv-drain.c to
disallow draining in the iothreads.

For some places, we know that they will hold the lock, but we don't have
the GRAPH_RDLOCK annotations yet. In this case, add assume_graph_lock()
with a FIXME comment. These places will be removed once everything is
properly annotated.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-6-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
2b3912f135 block: Mark bdrv_first_blk() and bdrv_is_root_node() GRAPH_RDLOCK
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_first_blk() and bdrv_is_root_node() need to hold a reader lock
for the graph. These functions are the only functions in block-backend.c
that access the parent list of a node.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-5-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
0e6bad1f21 block: Take graph rdlock in bdrv_inactivate_all()
The function reads the parents list, so it needs to hold the graph lock.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-4-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
e84c07bc73 block-coroutine-wrapper: Add no_co_wrapper_bdrv_rdlock functions
Add a new wrapper type for GRAPH_RDLOCK functions that should be called
from coroutine context.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-3-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
903df115aa test-bdrv-drain: Don't call bdrv_graph_wrlock() in coroutine context
AIO callbacks are effectively coroutine_mixed_fn. If AIO requests don't
return immediately, their callback is called from the request coroutine.
This means that in AIO callbacks, we can't call no_coroutine_fns such as
bdrv_graph_wrlock(). Unfortunately test-bdrv-drain does so.

Change the test to use a BH to drop out of coroutine context, and add
coroutine_mixed_fn and no_coroutine_fn markers to clarify the context
each function runs in.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-2-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Paolo Bonzini
cc32399773 block: convert more bdrv_is_allocated* and bdrv_block_status* calls to coroutine versions
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230904100306.156197-5-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Paolo Bonzini
578ffa9ffb block: switch to co_wrapper for bdrv_is_allocated_*
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230904100306.156197-4-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Paolo Bonzini
1b88457eaa block: complete public block status API
Include both coroutine and non-coroutine versions, the latter being
co_wrapper_mixed_bdrv_rdlock of the former.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230904100306.156197-3-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Paolo Bonzini
b170e92982 block: rename the bdrv_co_block_status static function
bdrv_block_status exists as a wrapper for bdrv_block_status_above, but
the name of the (hypothetical) coroutine version, bdrv_co_block_status,
is squatted by a random static function.  Rename it to
bdrv_co_do_block_status.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230904100306.156197-2-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:32 +02:00
Stefan Hajnoczi
63011373ad Second RISC-V PR for 8.2
* Add support for the max CPU
  * Detect user choice in TCG
  * Clear CSR values at reset and sync MPSTATE with host
  * Fix the typo of inverted order of pmpaddr13 and pmpaddr14
  * Split TCG/KVM accelerators from cpu.c
  * Add extension properties for all cpus
  * Replace GDB exit calls with proper shutdown
  * Support KVM_GET_REG_LIST
  * Remove RVG warning
  * Use env_archcpu for better performance
  * Deprecate capital 'Z' CPU properties
  * Fix vfwmaccbf16.vf
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmUncYAACgkQr3yVEwxT
 gBPQ3g/9Fi4uYRK7dymHHAQbOO9NPlmVPPSxmQ8fNUhoZUkbHfm56JEl42Xr02rA
 Lg2ORRQxJhAinANV8CotnbyLRHNCAvouCMCQEjHo1YEHzdXc0tQzp+rIOHT7v9rH
 6OQpI6RuCjO+0LQPMgzJx8yokMw/9b0uma3+RkNKod1XsSySo6JvDkMZGGZZWuVX
 Que3TMHzc4513PWEwRS9NaAHqRdy/ax0aPu9khswTYBxeJ/mBTLvGj4wBq5wnS7+
 JPvq0M5ScUMl4K5o884wsAzOdxRk8QZOMx3duMCbqXw0xFmYZj/EzcIeHdnXwuDB
 lcANd6LcESMNUb8iDBaFRjLnZ/gNiu20/P/LPWyTirfoZXzZ+h6WPnSeli36xtzO
 KKWtvS1YggCjsDvh9/PLYAvUGBcS/kUhIynN10YKnoKB+wSDxxyvBS1GU6c8czgc
 WDf3V4P3Z8oPKDA/24Qd9Uiho1Gq9FED4eBQPb9PuvkfboKE/g7lUp708XXDFVld
 hkJMsYROSRvk54RHITrD9Z+XFQ2TfC8wHLH0IwlyynQnc1sKvXaR6U1hZTAVtE4f
 yley/xCQ7OUV+hrx1sQLURcN6A+SPummOY5jdHiD29QcJnOZnkSy5j2KOlnHSa5i
 6v/6EFCgxwr69N6Q6X34VDv6+DZqLO2dNncQCInYFfupRhQ7t1E=
 =SUon
 -----END PGP SIGNATURE-----

Merge tag 'pull-riscv-to-apply-20231012-1' of https://github.com/alistair23/qemu into staging

Second RISC-V PR for 8.2

 * Add support for the max CPU
 * Detect user choice in TCG
 * Clear CSR values at reset and sync MPSTATE with host
 * Fix the typo of inverted order of pmpaddr13 and pmpaddr14
 * Split TCG/KVM accelerators from cpu.c
 * Add extension properties for all cpus
 * Replace GDB exit calls with proper shutdown
 * Support KVM_GET_REG_LIST
 * Remove RVG warning
 * Use env_archcpu for better performance
 * Deprecate capital 'Z' CPU properties
 * Fix vfwmaccbf16.vf

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmUncYAACgkQr3yVEwxT
# gBPQ3g/9Fi4uYRK7dymHHAQbOO9NPlmVPPSxmQ8fNUhoZUkbHfm56JEl42Xr02rA
# Lg2ORRQxJhAinANV8CotnbyLRHNCAvouCMCQEjHo1YEHzdXc0tQzp+rIOHT7v9rH
# 6OQpI6RuCjO+0LQPMgzJx8yokMw/9b0uma3+RkNKod1XsSySo6JvDkMZGGZZWuVX
# Que3TMHzc4513PWEwRS9NaAHqRdy/ax0aPu9khswTYBxeJ/mBTLvGj4wBq5wnS7+
# JPvq0M5ScUMl4K5o884wsAzOdxRk8QZOMx3duMCbqXw0xFmYZj/EzcIeHdnXwuDB
# lcANd6LcESMNUb8iDBaFRjLnZ/gNiu20/P/LPWyTirfoZXzZ+h6WPnSeli36xtzO
# KKWtvS1YggCjsDvh9/PLYAvUGBcS/kUhIynN10YKnoKB+wSDxxyvBS1GU6c8czgc
# WDf3V4P3Z8oPKDA/24Qd9Uiho1Gq9FED4eBQPb9PuvkfboKE/g7lUp708XXDFVld
# hkJMsYROSRvk54RHITrD9Z+XFQ2TfC8wHLH0IwlyynQnc1sKvXaR6U1hZTAVtE4f
# yley/xCQ7OUV+hrx1sQLURcN6A+SPummOY5jdHiD29QcJnOZnkSy5j2KOlnHSa5i
# 6v/6EFCgxwr69N6Q6X34VDv6+DZqLO2dNncQCInYFfupRhQ7t1E=
# =SUon
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 12 Oct 2023 00:09:36 EDT
# gpg:                using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65  9296 AF7C 9513 0C53 8013

* tag 'pull-riscv-to-apply-20231012-1' of https://github.com/alistair23/qemu: (54 commits)
  target/riscv: Fix vfwmaccbf16.vf
  target/riscv: deprecate capital 'Z' CPU properties
  target/riscv: Use env_archcpu for better performance
  target/riscv/tcg: remove RVG warning
  target/riscv/kvm: support KVM_GET_REG_LIST
  target/riscv/kvm: improve 'init_multiext_cfg' error msg
  gdbstub: replace exit calls with proper shutdown for softmmu
  hw/char: riscv_htif: replace exit calls with proper shutdown
  hw/misc/sifive_test.c: replace exit calls with proper shutdown
  softmmu: pass the main loop status to gdb "Wxx" packet
  softmmu: add means to pass an exit code when requesting a shutdown
  target/riscv/tcg-cpu.c: add extension properties for all cpus
  target/riscv: add riscv_cpu_get_name()
  target/riscv/cpu: move priv spec functions to tcg-cpu.c
  target/riscv/cpu.c: export isa_edata_arr[]
  target/riscv/tcg: move riscv_cpu_add_misa_properties() to tcg-cpu.c
  target/riscv/cpu.c: make misa_ext_cfgs[] 'const'
  target/riscv/tcg: introduce tcg_cpu_instance_init()
  target/riscv/cpu.c: export set_misa()
  target/riscv/kvm: do not use riscv_cpu_add_misa_properties()
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-12 10:24:44 -04:00
Stefan Hajnoczi
40886c4cf5 trivial patches for 2023-10-12
-----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEe3O61ovnosKJMUsicBtPaxppPlkFAmUnFa8PHG1qdEB0bHMu
 bXNrLnJ1AAoJEHAbT2saaT5ZBv8H/0MtWL6FqTzvz5yLn2WSbj2ng1RG1Deh36Sy
 1PCpFKy85ZSBKLOzgvbpn4VfpEdsvD/+sX4C4CVde+vR3oCjdUM14hnzEWX86gFl
 O8Ct8++MLPqnwgu6Rg6Z+Ie2yBtsQ5VABH/1q36T7+XHHh19bgEw6tW34/f2Ncxw
 8UQO2lm9tAMAOEfXoutoj8K8ch3FvbsEic9L0ORc7ntWc7NIauc3zizogtPHAzR8
 elB3BiLn4sMHLBj+IunndOiLadUAVOKTJ5PKi4b8iRa6aE8E6bjtLxdiPr4XEx/g
 7rSGvNM+Lm7mEgJSyyik+u0MshKjfRi+SrbvId9FIqACG1GCKeI=
 =rFns
 -----END PGP SIGNATURE-----

Merge tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu into staging

trivial patches for 2023-10-12

# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEe3O61ovnosKJMUsicBtPaxppPlkFAmUnFa8PHG1qdEB0bHMu
# bXNrLnJ1AAoJEHAbT2saaT5ZBv8H/0MtWL6FqTzvz5yLn2WSbj2ng1RG1Deh36Sy
# 1PCpFKy85ZSBKLOzgvbpn4VfpEdsvD/+sX4C4CVde+vR3oCjdUM14hnzEWX86gFl
# O8Ct8++MLPqnwgu6Rg6Z+Ie2yBtsQ5VABH/1q36T7+XHHh19bgEw6tW34/f2Ncxw
# 8UQO2lm9tAMAOEfXoutoj8K8ch3FvbsEic9L0ORc7ntWc7NIauc3zizogtPHAzR8
# elB3BiLn4sMHLBj+IunndOiLadUAVOKTJ5PKi4b8iRa6aE8E6bjtLxdiPr4XEx/g
# 7rSGvNM+Lm7mEgJSyyik+u0MshKjfRi+SrbvId9FIqACG1GCKeI=
# =rFns
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 11 Oct 2023 17:37:51 EDT
# gpg:                using RSA key 7B73BAD68BE7A2C289314B22701B4F6B1A693E59
# gpg:                issuer "mjt@tls.msk.ru"
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>" [full]
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>" [full]
# gpg:                 aka "Michael Tokarev <mjt@debian.org>" [full]
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931  4B22 701B 4F6B 1A69 3E59

* tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu:
  cpus: Remove unused smp_cores/smp_threads declarations
  scripts/xml-preprocess: Make sure this script is invoked via the right Python
  roms: use PYTHON to invoke python
  MAINTAINERS: Add some unowned files to the SBSA-REF section
  MAINTAINERS: Add section for overall sensors
  MAINTAINERS: add standard-headers to Hosts/LINUX
  MAINTAINERS: Add the CI-related doc files to the CI section
  MAINTAINERS: Add include folder to the hw/char/ section
  MAINTAINERS: Add unowned RISC-V related files to the right sections
  MAINTAINERS: Add g364fb and ds1225y to the Jazz section
  Fix compilation when UFFDIO_REGISTER is not set.
  Update AMD memory encryption document links.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-12 10:24:06 -04:00
Stefan Hajnoczi
ab3ec1586a qga-pull-2023-10-11
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEwsLBCepDxjwUI+uE711egWG6hOcFAmUmqx4ACgkQ711egWG6
 hOcgsBAAkmFQgGfxNGIXB+Y8QIWVHPlzhng6F3bzHXs+7t7RC7ITPcebsvmRZWKr
 Kn1etUhF22UneI+ipBh0JK3BkTZF5qnc4089F5Gh9uHfUS8aT7M+gDAcJoJDZpv3
 xr11iSlW9hooclAg6qV4RZ0bvjXbwgRHhI6I0P/1aNx7lKalWzZpvdnBOliG56rG
 oWf9vrWIC1nkoRg4vtKLqjouKzYsPR/kdXonr5bg8s+zV6Oqk6bC2FiN6TU/1QRI
 7FvC1mTESqWACtSqSzooGQVEhBVGZZOzzIufPCDxj9evgKZE+0Yd761IjcfXt3JS
 T4+C5Q2fZPL6hTPXnQF3YsRRJwn5tK8r1747dc2HsmLtX7EPXhu7H4gcPJLim/cc
 Kuln9+BVF+oqnLmkAgSP7Ss13lZPBFpEGhiREAYZrbD+lPfwv1ufTb1EkLWTWIpD
 MCTGW4ZBQsA+H/XMVCIf2dRYfCHgVslElmBq0hJTUvklQtrpVuGuHJzNis8W/qTq
 4AxMkh+3sS0lGDioq8fsIinlprP9XnF7hPvuJmLGQuV/wUHRmQ0L8twZiPB1g6nm
 8mwa/r+5lc04RXQ/cCLEtj6H+3XKHn8+QrsuknOnNEQ80lZtPQqM9/iz3ThSXd5Z
 zFcd4mOo5VwiGcS6KgmXpL9ZbYL6U8HHwAGuu1akky90rRxSK1A=
 =mHqt
 -----END PGP SIGNATURE-----

Merge tag 'qga-pull-2023-10-11' of https://github.com/kostyanf14/qemu into staging

qga-pull-2023-10-11

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEwsLBCepDxjwUI+uE711egWG6hOcFAmUmqx4ACgkQ711egWG6
# hOcgsBAAkmFQgGfxNGIXB+Y8QIWVHPlzhng6F3bzHXs+7t7RC7ITPcebsvmRZWKr
# Kn1etUhF22UneI+ipBh0JK3BkTZF5qnc4089F5Gh9uHfUS8aT7M+gDAcJoJDZpv3
# xr11iSlW9hooclAg6qV4RZ0bvjXbwgRHhI6I0P/1aNx7lKalWzZpvdnBOliG56rG
# oWf9vrWIC1nkoRg4vtKLqjouKzYsPR/kdXonr5bg8s+zV6Oqk6bC2FiN6TU/1QRI
# 7FvC1mTESqWACtSqSzooGQVEhBVGZZOzzIufPCDxj9evgKZE+0Yd761IjcfXt3JS
# T4+C5Q2fZPL6hTPXnQF3YsRRJwn5tK8r1747dc2HsmLtX7EPXhu7H4gcPJLim/cc
# Kuln9+BVF+oqnLmkAgSP7Ss13lZPBFpEGhiREAYZrbD+lPfwv1ufTb1EkLWTWIpD
# MCTGW4ZBQsA+H/XMVCIf2dRYfCHgVslElmBq0hJTUvklQtrpVuGuHJzNis8W/qTq
# 4AxMkh+3sS0lGDioq8fsIinlprP9XnF7hPvuJmLGQuV/wUHRmQ0L8twZiPB1g6nm
# 8mwa/r+5lc04RXQ/cCLEtj6H+3XKHn8+QrsuknOnNEQ80lZtPQqM9/iz3ThSXd5Z
# zFcd4mOo5VwiGcS6KgmXpL9ZbYL6U8HHwAGuu1akky90rRxSK1A=
# =mHqt
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 11 Oct 2023 10:03:10 EDT
# gpg:                using RSA key C2C2C109EA43C63C1423EB84EF5D5E8161BA84E7
# gpg: Good signature from "Kostiantyn Kostiuk (Upstream PR sign) <kkostiuk@redhat.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: C2C2 C109 EA43 C63C 1423  EB84 EF5D 5E81 61BA 84E7

* tag 'qga-pull-2023-10-11' of https://github.com/kostyanf14/qemu:
  qapi: qga: Clarify when out-data and err-data are populated
  qga: Fix memory leak when output stream is unused
  qga: Remove platform GUID definitions

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-12 10:23:21 -04:00
Thomas Huth
f51f90c65e gitlab-ci: Disable the riscv64-debian-cross-container by default
This job is failing since weeks. Let's mark it as manual until
it gets fixed.

Message-Id: <82aa015a-ca94-49ce-beec-679cc175b726@redhat.com>
Acked-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-10-12 14:18:03 +02:00
David Hildenbrand
ee6398d862 virtio-mem: Mark memslot alias memory regions unmergeable
Let's mark the memslot alias memory regions as unmergable, such that
flatview and vhost won't merge adjacent memory region aliases and we can
atomically map/unmap individual aliases without affecting adjacent
alias memory regions.

This handles vhost and vfio in multiple-memslot mode correctly (which do
not support atomic memslot updates) and avoids the temporary removal of
large memslots, which can be an expensive operation. For example, vfio
might have to unpin + repin a lot of memory, which is undesired.

Message-ID: <20230926185738.277351-19-david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
533f5d6679 memory,vhost: Allow for marking memory device memory regions unmergeable
Let's allow for marking memory regions unmergeable, to teach
flatview code and vhost to not merge adjacent aliases to the same memory
region into a larger memory section; instead, we want separate aliases to
stay separate such that we can atomically map/unmap aliases without
affecting other aliases.

This is desired for virtio-mem mapping device memory located on a RAM
memory region via multiple aliases into a memory region container,
resulting in separate memslots that can get (un)mapped atomically.

As an example with virtio-mem, the layout would look something like this:
  [...]
  0000000240000000-00000020bfffffff (prio 0, i/o): device-memory
    0000000240000000-000000043fffffff (prio 0, i/o): virtio-mem
      0000000240000000-000000027fffffff (prio 0, ram): alias memslot-0 @mem2 0000000000000000-000000003fffffff
      0000000280000000-00000002bfffffff (prio 0, ram): alias memslot-1 @mem2 0000000040000000-000000007fffffff
      00000002c0000000-00000002ffffffff (prio 0, ram): alias memslot-2 @mem2 0000000080000000-00000000bfffffff
  [...]

Without unmergable memory regions, all three memslots would get merged into
a single memory section. For example, when mapping another alias (e.g.,
virtio-mem-memslot-3) or when unmapping any of the mapped aliases,
memory listeners will first get notified about the removal of the big
memory section to then get notified about re-adding of the new
(differently merged) memory section(s).

In an ideal world, memory listeners would be able to deal with that
atomically, like KVM nowadays does. However, (a) supporting this for other
memory listeners (vhost-user, vfio) is fairly hard: temporary removal
can result in all kinds of issues on concurrent access to guest memory;
and (b) this handling is undesired, because temporarily removing+readding
can consume quite some time on bigger memslots and is not efficient
(e.g., vfio unpinning and repinning pages ...).

Let's allow for marking a memory region unmergeable, such that we
can atomically (un)map aliases to the same memory region, similar to
(un)mapping individual DIMMs.

Similarly, teach vhost code to not redo what flatview core stopped doing:
don't merge such sections. Merging in vhost code is really only relevant
for handling random holes in boot memory where; without this merging,
the vhost-user backend wouldn't be able to mmap() some boot memory
backed on hugetlb.

We'll use this for virtio-mem next.

Message-ID: <20230926185738.277351-18-david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
177f9b1ee4 virtio-mem: Expose device memory dynamically via multiple memslots if enabled
Having large virtio-mem devices that only expose little memory to a VM
is currently a problem: we map the whole sparse memory region into the
guest using a single memslot, resulting in one gigantic memslot in KVM.
KVM allocates metadata for the whole memslot, which can result in quite
some memory waste.

Assuming we have a 1 TiB virtio-mem device and only expose little (e.g.,
1 GiB) memory, we would create a single 1 TiB memslot and KVM has to
allocate metadata for that 1 TiB memslot: on x86, this implies allocating
a significant amount of memory for metadata:

(1) RMAP: 8 bytes per 4 KiB, 8 bytes per 2 MiB, 8 bytes per 1 GiB
    -> For 1 TiB: 2147483648 + 4194304 + 8192 = ~ 2 GiB (0.2 %)

    With the TDP MMU (cat /sys/module/kvm/parameters/tdp_mmu) this gets
    allocated lazily when required for nested VMs
(2) gfn_track: 2 bytes per 4 KiB
    -> For 1 TiB: 536870912 = ~512 MiB (0.05 %)
(3) lpage_info: 4 bytes per 2 MiB, 4 bytes per 1 GiB
    -> For 1 TiB: 2097152 + 4096 = ~2 MiB (0.0002 %)
(4) 2x dirty bitmaps for tracking: 2x 1 bit per 4 KiB page
    -> For 1 TiB: 536870912 = 64 MiB (0.006 %)

So we primarily care about (1) and (2). The bad thing is, that the
memory consumption *doubles* once SMM is enabled, because we create the
memslot once for !SMM and once for SMM.

Having a 1 TiB memslot without the TDP MMU consumes around:
* With SMM: 5 GiB
* Without SMM: 2.5 GiB
Having a 1 TiB memslot with the TDP MMU consumes around:
* With SMM: 1 GiB
* Without SMM: 512 MiB

... and that's really something we want to optimize, to be able to just
start a VM with small boot memory (e.g., 4 GiB) and a virtio-mem device
that can grow very large (e.g., 1 TiB).

Consequently, using multiple memslots and only mapping the memslots we
really need can significantly reduce memory waste and speed up
memslot-related operations. Let's expose the sparse RAM memory region using
multiple memslots, mapping only the memslots we currently need into our
device memory region container.

The feature can be enabled using "dynamic-memslots=on" and requires
"unplugged-inaccessible=on", which is nowadays the default.

Once enabled, we'll auto-detect the number of memslots to use based on the
memslot limit provided by the core. We'll use at most 1 memslot per
gigabyte. Note that our global limit of memslots accross all memory devices
is currently set to 256: even with multiple large virtio-mem devices,
we'd still have a sane limit on the number of memslots used.

The default is to not dynamically map memslot for now
("dynamic-memslots=off"). The optimization must be enabled manually,
because some vhost setups (e.g., hotplug of vhost-user devices) might be
problematic until we support more memslots especially in vhost-user backends.

Note that "dynamic-memslots=on" is just a hint that multiple memslots
*may* be used for internal optimizations, not that multiple memslots
*must* be used. The actual number of memslots that are used is an
internal detail: for example, once memslot metadata is no longer an
issue, we could simply stop optimizing for that. Migration source and
destination can differ on the setting of "dynamic-memslots".

Message-ID: <20230926185738.277351-17-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
884a0c20e6 virtio-mem: Update state to match bitmap as soon as it's been migrated
It's cleaner and future-proof to just have other state that depends on the
bitmap state to be updated as soon as possible when restoring the bitmap.

So factor out informing RamDiscardListener into a functon and call it in
case of early migration right after we restored the bitmap.

Message-ID: <20230926185738.277351-16-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
a45171dba7 virtio-mem: Pass non-const VirtIOMEM via virtio_mem_range_cb
Let's prepare for a user that has to modify the VirtIOMEM device state.

Message-ID: <20230926185738.277351-15-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
aa5317ef7c memory: Clarify mapping requirements for RamDiscardManager
We really only care about the RAM memory region not being mapped into
an address space yet as long as we're still setting up the
RamDiscardManager. Once mapped into an address space, memory notifiers
would get notified about such a region and any attempts to modify the
RamDiscardManager would be wrong.

While "mapped into an address space" is easy to check for RAM regions that
are mapped directly (following the ->container links), it's harder to
check when such regions are mapped indirectly via aliases. For now, we can
only detect that a region is mapped through an alias (->mapped_via_alias),
but we don't have a handle on these aliases to follow all their ->container
links to test if they are eventually mapped into an address space.

So relax the assertion in memory_region_set_ram_discard_manager(),
remove the check in memory_region_get_ram_discard_manager() and clarify
the doc.

Message-ID: <20230926185738.277351-14-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
a2335113ae memory-device,vhost: Support automatic decision on the number of memslots
We want to support memory devices that can automatically decide how many
memslots they will use. In the worst case, they have to use a single
memslot.

The target use cases are virtio-mem and the hyper-v balloon.

Let's calculate a reasonable limit such a memory device may use, and
instruct the device to make a decision based on that limit. Use a simple
heuristic that considers:
* A memslot soft-limit for all memory devices of 256; also, to not
  consume too many memslots -- which could harm performance.
* Actually still free and unreserved memslots
* The percentage of the remaining device memory region that memory device
  will occupy.

Further, while we properly check before plugging a memory device whether
there still is are free memslots, we have other memslot consumers (such as
boot memory, PCI BARs) that don't perform any checks and might dynamically
consume memslots without any prior reservation. So we might succeed in
plugging a memory device, but once we dynamically map a PCI BAR we would
be in trouble. Doing accounting / reservation / checks for all such
users is problematic (e.g., sometimes we might temporarily split boot
memory into two memslots, triggered by the BIOS).

We use the historic magic memslot number of 509 as orientation to when
supporting 256 memory devices -> memslots (leaving 253 for boot memory and
other devices) has been proven to work reliable. We'll fallback to
suggesting a single memslot if we don't have at least 509 total memslots.

Plugging vhost devices with less than 509 memslots available while we
have memory devices plugged that consume multiple memslots due to
automatic decisions can be problematic. Most configurations might just fail
due to "limit < used + reserved", however, it can also happen that these
memory devices would suddenly consume memslots that would actually be
required by other memslot consumers (boot, PCI BARs) later. Note that this
has always been sketchy with vhost devices that support only a small number
of memslots; but we don't want to make it any worse.So let's keep it simple
and simply reject plugging such vhost devices in such a configuration.

Eventually, all vhost devices that want to be fully compatible with such
memory devices should support a decent number of memslots (>= 509).

Message-ID: <20230926185738.277351-13-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
cd89c065b0 vhost: Add vhost_get_max_memslots()
Let's add vhost_get_max_memslots().

Message-ID: <20230926185738.277351-12-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
16ab2eda57 kvm: Add stub for kvm_get_max_memslots()
We'll need the stub soon from memory device context.

While at it, use "unsigned int" as return value and place the
declaration next to kvm_get_free_memslots().

Message-ID: <20230926185738.277351-11-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
766aa0a654 memory-device,vhost: Support memory devices that dynamically consume memslots
We want to support memory devices that have a dynamically managed memory
region container as device memory region. This device memory region maps
multiple RAM memory subregions (e.g., aliases to the same RAM memory
region), whereby these subregions can be (un)mapped on demand.

Each RAM subregion will consume a memslot in KVM and vhost, resulting in
such a new device consuming memslots dynamically, and initially usually
0. We already track the number of used vs. required memslots for all
memslots. From that, we can derive the number of reserved memslots that
must not be used otherwise.

The target use case is virtio-mem and the hyper-v balloon, which will
dynamically map aliases to RAM memory region into their device memory
region container.

Properly document what's supported and what's not and extend the vhost
memslot check accordingly.

Message-ID: <20230926185738.277351-10-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
f9716f4b0d memory-device: Track required and actually used memslots in DeviceMemoryState
Let's track how many memslots are required by plugged memory devices and
how many are currently actually getting used by plugged memory
devices.

"required - used" is the number of reserved memslots. For now, the number
of used and required memslots is always equal, and there are no
reservations. This is a preparation for memory devices that want to
dynamically consume memslots after initially specifying how many they
require -- where we'll end up with reserved memslots.

To track the number of used memslots, create a new address space for
our device memory and register a memory listener (add/remove) for that
address space.

Message-ID: <20230926185738.277351-9-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00