Commit Graph

107834 Commits

Author SHA1 Message Date
Kevin Wolf
2b3912f135 block: Mark bdrv_first_blk() and bdrv_is_root_node() GRAPH_RDLOCK
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_first_blk() and bdrv_is_root_node() need to hold a reader lock
for the graph. These functions are the only functions in block-backend.c
that access the parent list of a node.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-5-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
0e6bad1f21 block: Take graph rdlock in bdrv_inactivate_all()
The function reads the parents list, so it needs to hold the graph lock.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-4-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
e84c07bc73 block-coroutine-wrapper: Add no_co_wrapper_bdrv_rdlock functions
Add a new wrapper type for GRAPH_RDLOCK functions that should be called
from coroutine context.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-3-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Kevin Wolf
903df115aa test-bdrv-drain: Don't call bdrv_graph_wrlock() in coroutine context
AIO callbacks are effectively coroutine_mixed_fn. If AIO requests don't
return immediately, their callback is called from the request coroutine.
This means that in AIO callbacks, we can't call no_coroutine_fns such as
bdrv_graph_wrlock(). Unfortunately test-bdrv-drain does so.

Change the test to use a BH to drop out of coroutine context, and add
coroutine_mixed_fn and no_coroutine_fn markers to clarify the context
each function runs in.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230929145157.45443-2-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Paolo Bonzini
cc32399773 block: convert more bdrv_is_allocated* and bdrv_block_status* calls to coroutine versions
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230904100306.156197-5-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Paolo Bonzini
578ffa9ffb block: switch to co_wrapper for bdrv_is_allocated_*
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230904100306.156197-4-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Paolo Bonzini
1b88457eaa block: complete public block status API
Include both coroutine and non-coroutine versions, the latter being
co_wrapper_mixed_bdrv_rdlock of the former.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230904100306.156197-3-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:33 +02:00
Paolo Bonzini
b170e92982 block: rename the bdrv_co_block_status static function
bdrv_block_status exists as a wrapper for bdrv_block_status_above, but
the name of the (hypothetical) coroutine version, bdrv_co_block_status,
is squatted by a random static function.  Rename it to
bdrv_co_do_block_status.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20230904100306.156197-2-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-12 16:31:32 +02:00
Stefan Hajnoczi
63011373ad Second RISC-V PR for 8.2
* Add support for the max CPU
  * Detect user choice in TCG
  * Clear CSR values at reset and sync MPSTATE with host
  * Fix the typo of inverted order of pmpaddr13 and pmpaddr14
  * Split TCG/KVM accelerators from cpu.c
  * Add extension properties for all cpus
  * Replace GDB exit calls with proper shutdown
  * Support KVM_GET_REG_LIST
  * Remove RVG warning
  * Use env_archcpu for better performance
  * Deprecate capital 'Z' CPU properties
  * Fix vfwmaccbf16.vf
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmUncYAACgkQr3yVEwxT
 gBPQ3g/9Fi4uYRK7dymHHAQbOO9NPlmVPPSxmQ8fNUhoZUkbHfm56JEl42Xr02rA
 Lg2ORRQxJhAinANV8CotnbyLRHNCAvouCMCQEjHo1YEHzdXc0tQzp+rIOHT7v9rH
 6OQpI6RuCjO+0LQPMgzJx8yokMw/9b0uma3+RkNKod1XsSySo6JvDkMZGGZZWuVX
 Que3TMHzc4513PWEwRS9NaAHqRdy/ax0aPu9khswTYBxeJ/mBTLvGj4wBq5wnS7+
 JPvq0M5ScUMl4K5o884wsAzOdxRk8QZOMx3duMCbqXw0xFmYZj/EzcIeHdnXwuDB
 lcANd6LcESMNUb8iDBaFRjLnZ/gNiu20/P/LPWyTirfoZXzZ+h6WPnSeli36xtzO
 KKWtvS1YggCjsDvh9/PLYAvUGBcS/kUhIynN10YKnoKB+wSDxxyvBS1GU6c8czgc
 WDf3V4P3Z8oPKDA/24Qd9Uiho1Gq9FED4eBQPb9PuvkfboKE/g7lUp708XXDFVld
 hkJMsYROSRvk54RHITrD9Z+XFQ2TfC8wHLH0IwlyynQnc1sKvXaR6U1hZTAVtE4f
 yley/xCQ7OUV+hrx1sQLURcN6A+SPummOY5jdHiD29QcJnOZnkSy5j2KOlnHSa5i
 6v/6EFCgxwr69N6Q6X34VDv6+DZqLO2dNncQCInYFfupRhQ7t1E=
 =SUon
 -----END PGP SIGNATURE-----

Merge tag 'pull-riscv-to-apply-20231012-1' of https://github.com/alistair23/qemu into staging

Second RISC-V PR for 8.2

 * Add support for the max CPU
 * Detect user choice in TCG
 * Clear CSR values at reset and sync MPSTATE with host
 * Fix the typo of inverted order of pmpaddr13 and pmpaddr14
 * Split TCG/KVM accelerators from cpu.c
 * Add extension properties for all cpus
 * Replace GDB exit calls with proper shutdown
 * Support KVM_GET_REG_LIST
 * Remove RVG warning
 * Use env_archcpu for better performance
 * Deprecate capital 'Z' CPU properties
 * Fix vfwmaccbf16.vf

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmUncYAACgkQr3yVEwxT
# gBPQ3g/9Fi4uYRK7dymHHAQbOO9NPlmVPPSxmQ8fNUhoZUkbHfm56JEl42Xr02rA
# Lg2ORRQxJhAinANV8CotnbyLRHNCAvouCMCQEjHo1YEHzdXc0tQzp+rIOHT7v9rH
# 6OQpI6RuCjO+0LQPMgzJx8yokMw/9b0uma3+RkNKod1XsSySo6JvDkMZGGZZWuVX
# Que3TMHzc4513PWEwRS9NaAHqRdy/ax0aPu9khswTYBxeJ/mBTLvGj4wBq5wnS7+
# JPvq0M5ScUMl4K5o884wsAzOdxRk8QZOMx3duMCbqXw0xFmYZj/EzcIeHdnXwuDB
# lcANd6LcESMNUb8iDBaFRjLnZ/gNiu20/P/LPWyTirfoZXzZ+h6WPnSeli36xtzO
# KKWtvS1YggCjsDvh9/PLYAvUGBcS/kUhIynN10YKnoKB+wSDxxyvBS1GU6c8czgc
# WDf3V4P3Z8oPKDA/24Qd9Uiho1Gq9FED4eBQPb9PuvkfboKE/g7lUp708XXDFVld
# hkJMsYROSRvk54RHITrD9Z+XFQ2TfC8wHLH0IwlyynQnc1sKvXaR6U1hZTAVtE4f
# yley/xCQ7OUV+hrx1sQLURcN6A+SPummOY5jdHiD29QcJnOZnkSy5j2KOlnHSa5i
# 6v/6EFCgxwr69N6Q6X34VDv6+DZqLO2dNncQCInYFfupRhQ7t1E=
# =SUon
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 12 Oct 2023 00:09:36 EDT
# gpg:                using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65  9296 AF7C 9513 0C53 8013

* tag 'pull-riscv-to-apply-20231012-1' of https://github.com/alistair23/qemu: (54 commits)
  target/riscv: Fix vfwmaccbf16.vf
  target/riscv: deprecate capital 'Z' CPU properties
  target/riscv: Use env_archcpu for better performance
  target/riscv/tcg: remove RVG warning
  target/riscv/kvm: support KVM_GET_REG_LIST
  target/riscv/kvm: improve 'init_multiext_cfg' error msg
  gdbstub: replace exit calls with proper shutdown for softmmu
  hw/char: riscv_htif: replace exit calls with proper shutdown
  hw/misc/sifive_test.c: replace exit calls with proper shutdown
  softmmu: pass the main loop status to gdb "Wxx" packet
  softmmu: add means to pass an exit code when requesting a shutdown
  target/riscv/tcg-cpu.c: add extension properties for all cpus
  target/riscv: add riscv_cpu_get_name()
  target/riscv/cpu: move priv spec functions to tcg-cpu.c
  target/riscv/cpu.c: export isa_edata_arr[]
  target/riscv/tcg: move riscv_cpu_add_misa_properties() to tcg-cpu.c
  target/riscv/cpu.c: make misa_ext_cfgs[] 'const'
  target/riscv/tcg: introduce tcg_cpu_instance_init()
  target/riscv/cpu.c: export set_misa()
  target/riscv/kvm: do not use riscv_cpu_add_misa_properties()
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-12 10:24:44 -04:00
Stefan Hajnoczi
40886c4cf5 trivial patches for 2023-10-12
-----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEe3O61ovnosKJMUsicBtPaxppPlkFAmUnFa8PHG1qdEB0bHMu
 bXNrLnJ1AAoJEHAbT2saaT5ZBv8H/0MtWL6FqTzvz5yLn2WSbj2ng1RG1Deh36Sy
 1PCpFKy85ZSBKLOzgvbpn4VfpEdsvD/+sX4C4CVde+vR3oCjdUM14hnzEWX86gFl
 O8Ct8++MLPqnwgu6Rg6Z+Ie2yBtsQ5VABH/1q36T7+XHHh19bgEw6tW34/f2Ncxw
 8UQO2lm9tAMAOEfXoutoj8K8ch3FvbsEic9L0ORc7ntWc7NIauc3zizogtPHAzR8
 elB3BiLn4sMHLBj+IunndOiLadUAVOKTJ5PKi4b8iRa6aE8E6bjtLxdiPr4XEx/g
 7rSGvNM+Lm7mEgJSyyik+u0MshKjfRi+SrbvId9FIqACG1GCKeI=
 =rFns
 -----END PGP SIGNATURE-----

Merge tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu into staging

trivial patches for 2023-10-12

# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEe3O61ovnosKJMUsicBtPaxppPlkFAmUnFa8PHG1qdEB0bHMu
# bXNrLnJ1AAoJEHAbT2saaT5ZBv8H/0MtWL6FqTzvz5yLn2WSbj2ng1RG1Deh36Sy
# 1PCpFKy85ZSBKLOzgvbpn4VfpEdsvD/+sX4C4CVde+vR3oCjdUM14hnzEWX86gFl
# O8Ct8++MLPqnwgu6Rg6Z+Ie2yBtsQ5VABH/1q36T7+XHHh19bgEw6tW34/f2Ncxw
# 8UQO2lm9tAMAOEfXoutoj8K8ch3FvbsEic9L0ORc7ntWc7NIauc3zizogtPHAzR8
# elB3BiLn4sMHLBj+IunndOiLadUAVOKTJ5PKi4b8iRa6aE8E6bjtLxdiPr4XEx/g
# 7rSGvNM+Lm7mEgJSyyik+u0MshKjfRi+SrbvId9FIqACG1GCKeI=
# =rFns
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 11 Oct 2023 17:37:51 EDT
# gpg:                using RSA key 7B73BAD68BE7A2C289314B22701B4F6B1A693E59
# gpg:                issuer "mjt@tls.msk.ru"
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>" [full]
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>" [full]
# gpg:                 aka "Michael Tokarev <mjt@debian.org>" [full]
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931  4B22 701B 4F6B 1A69 3E59

* tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu:
  cpus: Remove unused smp_cores/smp_threads declarations
  scripts/xml-preprocess: Make sure this script is invoked via the right Python
  roms: use PYTHON to invoke python
  MAINTAINERS: Add some unowned files to the SBSA-REF section
  MAINTAINERS: Add section for overall sensors
  MAINTAINERS: add standard-headers to Hosts/LINUX
  MAINTAINERS: Add the CI-related doc files to the CI section
  MAINTAINERS: Add include folder to the hw/char/ section
  MAINTAINERS: Add unowned RISC-V related files to the right sections
  MAINTAINERS: Add g364fb and ds1225y to the Jazz section
  Fix compilation when UFFDIO_REGISTER is not set.
  Update AMD memory encryption document links.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-12 10:24:06 -04:00
Stefan Hajnoczi
ab3ec1586a qga-pull-2023-10-11
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEwsLBCepDxjwUI+uE711egWG6hOcFAmUmqx4ACgkQ711egWG6
 hOcgsBAAkmFQgGfxNGIXB+Y8QIWVHPlzhng6F3bzHXs+7t7RC7ITPcebsvmRZWKr
 Kn1etUhF22UneI+ipBh0JK3BkTZF5qnc4089F5Gh9uHfUS8aT7M+gDAcJoJDZpv3
 xr11iSlW9hooclAg6qV4RZ0bvjXbwgRHhI6I0P/1aNx7lKalWzZpvdnBOliG56rG
 oWf9vrWIC1nkoRg4vtKLqjouKzYsPR/kdXonr5bg8s+zV6Oqk6bC2FiN6TU/1QRI
 7FvC1mTESqWACtSqSzooGQVEhBVGZZOzzIufPCDxj9evgKZE+0Yd761IjcfXt3JS
 T4+C5Q2fZPL6hTPXnQF3YsRRJwn5tK8r1747dc2HsmLtX7EPXhu7H4gcPJLim/cc
 Kuln9+BVF+oqnLmkAgSP7Ss13lZPBFpEGhiREAYZrbD+lPfwv1ufTb1EkLWTWIpD
 MCTGW4ZBQsA+H/XMVCIf2dRYfCHgVslElmBq0hJTUvklQtrpVuGuHJzNis8W/qTq
 4AxMkh+3sS0lGDioq8fsIinlprP9XnF7hPvuJmLGQuV/wUHRmQ0L8twZiPB1g6nm
 8mwa/r+5lc04RXQ/cCLEtj6H+3XKHn8+QrsuknOnNEQ80lZtPQqM9/iz3ThSXd5Z
 zFcd4mOo5VwiGcS6KgmXpL9ZbYL6U8HHwAGuu1akky90rRxSK1A=
 =mHqt
 -----END PGP SIGNATURE-----

Merge tag 'qga-pull-2023-10-11' of https://github.com/kostyanf14/qemu into staging

qga-pull-2023-10-11

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEwsLBCepDxjwUI+uE711egWG6hOcFAmUmqx4ACgkQ711egWG6
# hOcgsBAAkmFQgGfxNGIXB+Y8QIWVHPlzhng6F3bzHXs+7t7RC7ITPcebsvmRZWKr
# Kn1etUhF22UneI+ipBh0JK3BkTZF5qnc4089F5Gh9uHfUS8aT7M+gDAcJoJDZpv3
# xr11iSlW9hooclAg6qV4RZ0bvjXbwgRHhI6I0P/1aNx7lKalWzZpvdnBOliG56rG
# oWf9vrWIC1nkoRg4vtKLqjouKzYsPR/kdXonr5bg8s+zV6Oqk6bC2FiN6TU/1QRI
# 7FvC1mTESqWACtSqSzooGQVEhBVGZZOzzIufPCDxj9evgKZE+0Yd761IjcfXt3JS
# T4+C5Q2fZPL6hTPXnQF3YsRRJwn5tK8r1747dc2HsmLtX7EPXhu7H4gcPJLim/cc
# Kuln9+BVF+oqnLmkAgSP7Ss13lZPBFpEGhiREAYZrbD+lPfwv1ufTb1EkLWTWIpD
# MCTGW4ZBQsA+H/XMVCIf2dRYfCHgVslElmBq0hJTUvklQtrpVuGuHJzNis8W/qTq
# 4AxMkh+3sS0lGDioq8fsIinlprP9XnF7hPvuJmLGQuV/wUHRmQ0L8twZiPB1g6nm
# 8mwa/r+5lc04RXQ/cCLEtj6H+3XKHn8+QrsuknOnNEQ80lZtPQqM9/iz3ThSXd5Z
# zFcd4mOo5VwiGcS6KgmXpL9ZbYL6U8HHwAGuu1akky90rRxSK1A=
# =mHqt
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 11 Oct 2023 10:03:10 EDT
# gpg:                using RSA key C2C2C109EA43C63C1423EB84EF5D5E8161BA84E7
# gpg: Good signature from "Kostiantyn Kostiuk (Upstream PR sign) <kkostiuk@redhat.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: C2C2 C109 EA43 C63C 1423  EB84 EF5D 5E81 61BA 84E7

* tag 'qga-pull-2023-10-11' of https://github.com/kostyanf14/qemu:
  qapi: qga: Clarify when out-data and err-data are populated
  qga: Fix memory leak when output stream is unused
  qga: Remove platform GUID definitions

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-12 10:23:21 -04:00
Thomas Huth
f51f90c65e gitlab-ci: Disable the riscv64-debian-cross-container by default
This job is failing since weeks. Let's mark it as manual until
it gets fixed.

Message-Id: <82aa015a-ca94-49ce-beec-679cc175b726@redhat.com>
Acked-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-10-12 14:18:03 +02:00
David Hildenbrand
ee6398d862 virtio-mem: Mark memslot alias memory regions unmergeable
Let's mark the memslot alias memory regions as unmergable, such that
flatview and vhost won't merge adjacent memory region aliases and we can
atomically map/unmap individual aliases without affecting adjacent
alias memory regions.

This handles vhost and vfio in multiple-memslot mode correctly (which do
not support atomic memslot updates) and avoids the temporary removal of
large memslots, which can be an expensive operation. For example, vfio
might have to unpin + repin a lot of memory, which is undesired.

Message-ID: <20230926185738.277351-19-david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
533f5d6679 memory,vhost: Allow for marking memory device memory regions unmergeable
Let's allow for marking memory regions unmergeable, to teach
flatview code and vhost to not merge adjacent aliases to the same memory
region into a larger memory section; instead, we want separate aliases to
stay separate such that we can atomically map/unmap aliases without
affecting other aliases.

This is desired for virtio-mem mapping device memory located on a RAM
memory region via multiple aliases into a memory region container,
resulting in separate memslots that can get (un)mapped atomically.

As an example with virtio-mem, the layout would look something like this:
  [...]
  0000000240000000-00000020bfffffff (prio 0, i/o): device-memory
    0000000240000000-000000043fffffff (prio 0, i/o): virtio-mem
      0000000240000000-000000027fffffff (prio 0, ram): alias memslot-0 @mem2 0000000000000000-000000003fffffff
      0000000280000000-00000002bfffffff (prio 0, ram): alias memslot-1 @mem2 0000000040000000-000000007fffffff
      00000002c0000000-00000002ffffffff (prio 0, ram): alias memslot-2 @mem2 0000000080000000-00000000bfffffff
  [...]

Without unmergable memory regions, all three memslots would get merged into
a single memory section. For example, when mapping another alias (e.g.,
virtio-mem-memslot-3) or when unmapping any of the mapped aliases,
memory listeners will first get notified about the removal of the big
memory section to then get notified about re-adding of the new
(differently merged) memory section(s).

In an ideal world, memory listeners would be able to deal with that
atomically, like KVM nowadays does. However, (a) supporting this for other
memory listeners (vhost-user, vfio) is fairly hard: temporary removal
can result in all kinds of issues on concurrent access to guest memory;
and (b) this handling is undesired, because temporarily removing+readding
can consume quite some time on bigger memslots and is not efficient
(e.g., vfio unpinning and repinning pages ...).

Let's allow for marking a memory region unmergeable, such that we
can atomically (un)map aliases to the same memory region, similar to
(un)mapping individual DIMMs.

Similarly, teach vhost code to not redo what flatview core stopped doing:
don't merge such sections. Merging in vhost code is really only relevant
for handling random holes in boot memory where; without this merging,
the vhost-user backend wouldn't be able to mmap() some boot memory
backed on hugetlb.

We'll use this for virtio-mem next.

Message-ID: <20230926185738.277351-18-david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
177f9b1ee4 virtio-mem: Expose device memory dynamically via multiple memslots if enabled
Having large virtio-mem devices that only expose little memory to a VM
is currently a problem: we map the whole sparse memory region into the
guest using a single memslot, resulting in one gigantic memslot in KVM.
KVM allocates metadata for the whole memslot, which can result in quite
some memory waste.

Assuming we have a 1 TiB virtio-mem device and only expose little (e.g.,
1 GiB) memory, we would create a single 1 TiB memslot and KVM has to
allocate metadata for that 1 TiB memslot: on x86, this implies allocating
a significant amount of memory for metadata:

(1) RMAP: 8 bytes per 4 KiB, 8 bytes per 2 MiB, 8 bytes per 1 GiB
    -> For 1 TiB: 2147483648 + 4194304 + 8192 = ~ 2 GiB (0.2 %)

    With the TDP MMU (cat /sys/module/kvm/parameters/tdp_mmu) this gets
    allocated lazily when required for nested VMs
(2) gfn_track: 2 bytes per 4 KiB
    -> For 1 TiB: 536870912 = ~512 MiB (0.05 %)
(3) lpage_info: 4 bytes per 2 MiB, 4 bytes per 1 GiB
    -> For 1 TiB: 2097152 + 4096 = ~2 MiB (0.0002 %)
(4) 2x dirty bitmaps for tracking: 2x 1 bit per 4 KiB page
    -> For 1 TiB: 536870912 = 64 MiB (0.006 %)

So we primarily care about (1) and (2). The bad thing is, that the
memory consumption *doubles* once SMM is enabled, because we create the
memslot once for !SMM and once for SMM.

Having a 1 TiB memslot without the TDP MMU consumes around:
* With SMM: 5 GiB
* Without SMM: 2.5 GiB
Having a 1 TiB memslot with the TDP MMU consumes around:
* With SMM: 1 GiB
* Without SMM: 512 MiB

... and that's really something we want to optimize, to be able to just
start a VM with small boot memory (e.g., 4 GiB) and a virtio-mem device
that can grow very large (e.g., 1 TiB).

Consequently, using multiple memslots and only mapping the memslots we
really need can significantly reduce memory waste and speed up
memslot-related operations. Let's expose the sparse RAM memory region using
multiple memslots, mapping only the memslots we currently need into our
device memory region container.

The feature can be enabled using "dynamic-memslots=on" and requires
"unplugged-inaccessible=on", which is nowadays the default.

Once enabled, we'll auto-detect the number of memslots to use based on the
memslot limit provided by the core. We'll use at most 1 memslot per
gigabyte. Note that our global limit of memslots accross all memory devices
is currently set to 256: even with multiple large virtio-mem devices,
we'd still have a sane limit on the number of memslots used.

The default is to not dynamically map memslot for now
("dynamic-memslots=off"). The optimization must be enabled manually,
because some vhost setups (e.g., hotplug of vhost-user devices) might be
problematic until we support more memslots especially in vhost-user backends.

Note that "dynamic-memslots=on" is just a hint that multiple memslots
*may* be used for internal optimizations, not that multiple memslots
*must* be used. The actual number of memslots that are used is an
internal detail: for example, once memslot metadata is no longer an
issue, we could simply stop optimizing for that. Migration source and
destination can differ on the setting of "dynamic-memslots".

Message-ID: <20230926185738.277351-17-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
884a0c20e6 virtio-mem: Update state to match bitmap as soon as it's been migrated
It's cleaner and future-proof to just have other state that depends on the
bitmap state to be updated as soon as possible when restoring the bitmap.

So factor out informing RamDiscardListener into a functon and call it in
case of early migration right after we restored the bitmap.

Message-ID: <20230926185738.277351-16-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
a45171dba7 virtio-mem: Pass non-const VirtIOMEM via virtio_mem_range_cb
Let's prepare for a user that has to modify the VirtIOMEM device state.

Message-ID: <20230926185738.277351-15-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
aa5317ef7c memory: Clarify mapping requirements for RamDiscardManager
We really only care about the RAM memory region not being mapped into
an address space yet as long as we're still setting up the
RamDiscardManager. Once mapped into an address space, memory notifiers
would get notified about such a region and any attempts to modify the
RamDiscardManager would be wrong.

While "mapped into an address space" is easy to check for RAM regions that
are mapped directly (following the ->container links), it's harder to
check when such regions are mapped indirectly via aliases. For now, we can
only detect that a region is mapped through an alias (->mapped_via_alias),
but we don't have a handle on these aliases to follow all their ->container
links to test if they are eventually mapped into an address space.

So relax the assertion in memory_region_set_ram_discard_manager(),
remove the check in memory_region_get_ram_discard_manager() and clarify
the doc.

Message-ID: <20230926185738.277351-14-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
a2335113ae memory-device,vhost: Support automatic decision on the number of memslots
We want to support memory devices that can automatically decide how many
memslots they will use. In the worst case, they have to use a single
memslot.

The target use cases are virtio-mem and the hyper-v balloon.

Let's calculate a reasonable limit such a memory device may use, and
instruct the device to make a decision based on that limit. Use a simple
heuristic that considers:
* A memslot soft-limit for all memory devices of 256; also, to not
  consume too many memslots -- which could harm performance.
* Actually still free and unreserved memslots
* The percentage of the remaining device memory region that memory device
  will occupy.

Further, while we properly check before plugging a memory device whether
there still is are free memslots, we have other memslot consumers (such as
boot memory, PCI BARs) that don't perform any checks and might dynamically
consume memslots without any prior reservation. So we might succeed in
plugging a memory device, but once we dynamically map a PCI BAR we would
be in trouble. Doing accounting / reservation / checks for all such
users is problematic (e.g., sometimes we might temporarily split boot
memory into two memslots, triggered by the BIOS).

We use the historic magic memslot number of 509 as orientation to when
supporting 256 memory devices -> memslots (leaving 253 for boot memory and
other devices) has been proven to work reliable. We'll fallback to
suggesting a single memslot if we don't have at least 509 total memslots.

Plugging vhost devices with less than 509 memslots available while we
have memory devices plugged that consume multiple memslots due to
automatic decisions can be problematic. Most configurations might just fail
due to "limit < used + reserved", however, it can also happen that these
memory devices would suddenly consume memslots that would actually be
required by other memslot consumers (boot, PCI BARs) later. Note that this
has always been sketchy with vhost devices that support only a small number
of memslots; but we don't want to make it any worse.So let's keep it simple
and simply reject plugging such vhost devices in such a configuration.

Eventually, all vhost devices that want to be fully compatible with such
memory devices should support a decent number of memslots (>= 509).

Message-ID: <20230926185738.277351-13-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
cd89c065b0 vhost: Add vhost_get_max_memslots()
Let's add vhost_get_max_memslots().

Message-ID: <20230926185738.277351-12-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
16ab2eda57 kvm: Add stub for kvm_get_max_memslots()
We'll need the stub soon from memory device context.

While at it, use "unsigned int" as return value and place the
declaration next to kvm_get_free_memslots().

Message-ID: <20230926185738.277351-11-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
766aa0a654 memory-device,vhost: Support memory devices that dynamically consume memslots
We want to support memory devices that have a dynamically managed memory
region container as device memory region. This device memory region maps
multiple RAM memory subregions (e.g., aliases to the same RAM memory
region), whereby these subregions can be (un)mapped on demand.

Each RAM subregion will consume a memslot in KVM and vhost, resulting in
such a new device consuming memslots dynamically, and initially usually
0. We already track the number of used vs. required memslots for all
memslots. From that, we can derive the number of reserved memslots that
must not be used otherwise.

The target use case is virtio-mem and the hyper-v balloon, which will
dynamically map aliases to RAM memory region into their device memory
region container.

Properly document what's supported and what's not and extend the vhost
memslot check accordingly.

Message-ID: <20230926185738.277351-10-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
f9716f4b0d memory-device: Track required and actually used memslots in DeviceMemoryState
Let's track how many memslots are required by plugged memory devices and
how many are currently actually getting used by plugged memory
devices.

"required - used" is the number of reserved memslots. For now, the number
of used and required memslots is always equal, and there are no
reservations. This is a preparation for memory devices that want to
dynamically consume memslots after initially specifying how many they
require -- where we'll end up with reserved memslots.

To track the number of used memslots, create a new address space for
our device memory and register a memory listener (add/remove) for that
address space.

Message-ID: <20230926185738.277351-9-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
759bac673a stubs: Rename qmp_memory_device.c to memory_device.c
We want to place non-qmp stubs in there, so let's rename it. While at
it, put it into the MAINTAINERS file under "Memory devices".

Message-ID: <20230926185738.277351-8-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
7975feece9 memory-device: Support memory devices with multiple memslots
We want to support memory devices that have a memory region container as
device memory region that maps multiple RAM memory regions. Let's start
by supporting memory devices that statically map multiple RAM memory
regions and, thereby, consume multiple memslots.

We already have one device that uses a container as device memory region:
NVDIMMs. However, a NVDIMM always ends up consuming exactly one memslot.

Let's add support for that by asking the memory device via a new
callback how many memslots it requires.

Message-ID: <20230926185738.277351-7-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
8c49951c4a vhost: Return number of free memslots
Let's return the number of free slots instead of only checking if there
is a free slot. Required to support memory devices that consume multiple
memslots.

This is a preparation for memory devices that consume multiple memslots.

Message-ID: <20230926185738.277351-6-david@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
5b23186a95 kvm: Return number of free memslots
Let's return the number of free slots instead of only checking if there
is a free slot. While at it, check all address spaces, which will also
consider SMM under x86 correctly.

This is a preparation for memory devices that consume multiple memslots.

Message-ID: <20230926185738.277351-5-david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:22 +02:00
David Hildenbrand
022f033bd7 softmmu/physmem: Fixup qemu_ram_block_from_host() documentation
Let's fixup the documentation (e.g., removing traces of the ram_addr
parameter that no longer exists) and move it to the header file while at
it.

Message-ID: <20230926185738.277351-4-david@redhat.com>
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:21 +02:00
David Hildenbrand
309ebfa691 vhost: Remove vhost_backend_can_merge() callback
Checking whether the memory regions are equal is sufficient: if they are
equal, then most certainly the contained fd is equal.

The whole vhost-user memslot handling is suboptimal and overly
complicated. We shouldn't have to lookup a RAM memory regions we got
notified about in vhost_user_get_mr_data() using a host pointer. But that
requires a bigger rework -- especially an alternative vhost_set_mem_table()
backend call that simply consumes MemoryRegionSections.

For now, let's just drop vhost_backend_can_merge().

Message-ID: <20230926185738.277351-3-david@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:21 +02:00
David Hildenbrand
552b25229c vhost: Rework memslot filtering and fix "used_memslot" tracking
Having multiple vhost devices, some filtering out fd-less memslots and
some not, can mess up the "used_memslot" accounting. Consequently our
"free memslot" checks become unreliable and we might run out of free
memslots at runtime later.

An example sequence which can trigger a potential issue that involves
different vhost backends (vhost-kernel and vhost-user) and hotplugged
memory devices can be found at [1].

Let's make the filtering mechanism less generic and distinguish between
backends that support private memslots (without a fd) and ones that only
support shared memslots (with a fd). Track the used_memslots for both
cases separately and use the corresponding value when required.

Note: Most probably we should filter out MAP_PRIVATE fd-based RAM regions
(for example, via memory-backend-memfd,...,shared=off or as default with
 memory-backend-file) as well. When not using MAP_SHARED, it might not work
as expected. Add a TODO for now.

[1] https://lkml.kernel.org/r/fad9136f-08d3-3fd9-71a1-502069c000cf@redhat.com

Message-ID: <20230926185738.277351-2-david@redhat.com>
Fixes: 988a27754b ("vhost: allow backends to filter memory sections")
Cc: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-10-12 14:15:21 +02:00
Thomas Huth
abf8c47f44 MAINTAINERS: Add include/sysemu/qtest.h to the qtest section
We already list system/qtest.c in the qtest section, so the
corresponding header file should be listed here, too.

Message-ID: <20231012111401.871711-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-10-12 14:12:45 +02:00
Klaus Jensen
a8500f8043 hw/misc/Kconfig: add switch for i2c-echo
Associate i2c-echo with TEST_DEVICES and add a dependency on I2C.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20230823-i2c-echo-fixes-v1-2-ccc05a6028f0@samsung.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-10-12 14:11:44 +02:00
Klaus Jensen
f912f1bdb6 hw/misc/i2c-echo: add copyright/license note
Add missing copyright and license notice. Also add a short description
of the device.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Message-ID: <20230823-i2c-echo-fixes-v1-1-ccc05a6028f0@samsung.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-10-12 14:11:44 +02:00
Chris Rauer
d0353b6e7b tests/qtest: Fix npcm7xx_timer-test.c flaky test
npcm7xx_timer-test occasionally fails due to the state of the timers
from the previous test iteration.  Advancing the clock step after the
reset resolves this issue.

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1897
Signed-off-by: Chris Rauer <crauer@google.com>
Message-ID: <20230929000831.691559-1-crauer@google.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-10-12 14:11:44 +02:00
Thomas Huth
e9a54265f5 hw/rdma: Deprecate the pvrdma device and the rdma subsystem
This subsystem is said to be in a bad shape (see e.g. [1], [2]
and [3]), and nobody seems to feel responsible to pick up patches
for this and send them via a pull request. For example there is
a patch for a CVE-worthy bug posted more than half a year ago [4]
which has never been merged. Thus let's mark it as deprecated and
finally remove it unless somebody steps up and improves the code
quality and adds proper regression tests.

[1] https://lore.kernel.org/qemu-devel/20230918144206.560120-1-armbru@redhat.com/
[2] https://lore.kernel.org/qemu-devel/ZQnojJOqoFu73995@redhat.com/
[3] https://lore.kernel.org/qemu-devel/1054981c-e8ae-c676-3b04-eeb030e11f65@tls.msk.ru/
[4] https://lore.kernel.org/qemu-devel/20230301142926.18686-1-yuval.shaia.ml@gmail.com/

Message-ID: <20230927133019.228495-1-thuth@redhat.com>
Acked-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-10-12 14:11:44 +02:00
Yuval Shaia
85fc35afa9 hw/pvrdma: Protect against buggy or malicious guest driver
Guest driver allocates and initialize page tables to be used as a ring
of descriptors for CQ and async events.
The page table that represents the ring, along with the number of pages
in the page table is passed to the device.
Currently our device supports only one page table for a ring.

Let's make sure that the number of page table entries the driver
reports, do not exceeds the one page table size.

Reported-by: Soul Chen <soulchen8650@gmail.com>
Signed-off-by: Yuval Shaia <yuval.shaia.ml@gmail.com>
Fixes: CVE-2023-1544
Message-ID: <20230301142926.18686-1-yuval.shaia.ml@gmail.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-10-12 14:11:44 +02:00
Thomas Huth
9e7d33941f hw/virtio/virtio-gpu: Fix compiler warning when compiling with -Wshadow
Avoid using trivial variable names in macros, otherwise we get
the following compiler warning when compiling with -Wshadow=local:

In file included from ../../qemu/hw/display/virtio-gpu-virgl.c:19:
../../home/thuth/devel/qemu/hw/display/virtio-gpu-virgl.c:
 In function ‘virgl_cmd_submit_3d’:
../../qemu/include/hw/virtio/virtio-gpu.h:228:16: error: declaration of ‘s’
 shadows a previous local [-Werror=shadow=compatible-local]
  228 |         size_t s;
      |                ^
../../qemu/hw/display/virtio-gpu-virgl.c:215:5: note: in expansion of macro
 ‘VIRTIO_GPU_FILL_CMD’
  215 |     VIRTIO_GPU_FILL_CMD(cs);
      |     ^~~~~~~~~~~~~~~~~~~
../../qemu/hw/display/virtio-gpu-virgl.c:213:12: note: shadowed declaration
 is here
  213 |     size_t s;
      |            ^
cc1: all warnings being treated as errors

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20231009084559.41427-1-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2023-10-12 14:09:58 +02:00
Thomas Huth
61499d87f4 libvhost-user: Fix compiler warning with -Wshadow=local
Rename shadowing variables to make this code compilable
with -Wshadow=local.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20231006121129.487251-1-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2023-10-12 14:09:58 +02:00
Thomas Huth
3cc72cdbb2 libvduse: Fix compiler warning with -Wshadow=local
No need to declare a new variable with the same name here,
we can simple re-use the one from the top of the function.
With this change, the file now compiles fine with -Wshadow=local.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20231006120819.480792-1-thuth@redhat.com>
Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2023-10-12 14:09:58 +02:00
Max Chou
837570cef2 target/riscv: Fix vfwmaccbf16.vf
The operator (fwmacc16) of vfwmaccbf16.vf helper function should be
replaced by fwmaccbf16.

Fixes: adf772b0f7 ("target/riscv: Add support for Zvfbfwma extension")
Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231005095734.567575-1-max.chou@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-10-12 12:50:13 +10:00
Daniel Henrique Barboza
8043effd9b target/riscv: deprecate capital 'Z' CPU properties
At this moment there are eleven CPU extension properties that starts
with capital 'Z': Zifencei, Zicsr, Zihintntl, Zihintpause, Zawrs, Zfa,
Zfh, Zfhmin, Zve32f, Zve64f and Zve64d. All other extensions are named
with lower-case letters.

We want all properties to be named with lower-case letters since it's
consistent with the riscv-isa string that we create in the FDT. Having
these 11 properties to be exceptions can be confusing.

Deprecate all of them. Create their lower-case counterpart to be used as
maintained CPU properties. When trying to use any deprecated property a
warning message will be displayed, recommending users to switch to the
lower-case variant:

./build/qemu-system-riscv64 -M virt -cpu rv64,Zifencei=true --nographic
qemu-system-riscv64: warning: CPU property 'Zifencei' is deprecated. Please use 'zifencei' instead

This will give users some time to change their scripts before we remove
the capital 'Z' properties entirely.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20231009112817.8896-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-10-12 12:42:41 +10:00
Richard W.M. Jones
614c9466a2 target/riscv: Use env_archcpu for better performance
RISCV_CPU(cs) uses a checked cast.  When QOM cast debugging is enabled
this adds about 5% total overhead when emulating RV64 on x86-64 host.

Using a RISC-V guest with 16 vCPUs, 16 GB of guest RAM, virtio-blk
disk.  The guest has a copy of the qemu source tree.  The test
involves compiling the qemu source tree with 'make clean; time make -j16'.

Before making this change the compile step took 449 & 447 seconds over
two consecutive runs.

After making this change: 428 & 421 seconds.

The saving is over 5%.

Thanks: Paolo Bonzini
Thanks: Philippe Mathieu-Daudé
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20231009124859.3373696-2-rjones@redhat.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-10-12 12:40:50 +10:00
Daniel Henrique Barboza
9b9741c38f target/riscv/tcg: remove RVG warning
Vendor CPUs that set RVG are displaying user warnings about other
extensions that RVG must enable, one warning per CPU. E.g.:

$ ./build/qemu-system-riscv64 -smp 8 -M virt -cpu veyron-v1 -nographic
qemu-system-riscv64: warning: Setting G will also set IMAFD_Zicsr_Zifencei
qemu-system-riscv64: warning: Setting G will also set IMAFD_Zicsr_Zifencei
qemu-system-riscv64: warning: Setting G will also set IMAFD_Zicsr_Zifencei
qemu-system-riscv64: warning: Setting G will also set IMAFD_Zicsr_Zifencei
qemu-system-riscv64: warning: Setting G will also set IMAFD_Zicsr_Zifencei
qemu-system-riscv64: warning: Setting G will also set IMAFD_Zicsr_Zifencei
qemu-system-riscv64: warning: Setting G will also set IMAFD_Zicsr_Zifencei
qemu-system-riscv64: warning: Setting G will also set IMAFD_Zicsr_Zifencei

This happens because we decided a while ago that, for simplicity, vendor
CPUs could set RVG instead of setting each G extension individually in
their cpu_init(). Our warning isn't taking that into account, and we're
bugging users with a warning that we're causing ourselves.

In a closer look we conclude that this warning is not warranted in any
other circumstance since we're just following the ISA [1], which states
in chapter 24:

"One goal of the RISC-V project is that it be used as a stable software
development target. For this purpose, we define a combination of a base
ISA (RV32I or RV64I) plus selected standard extensions (IMAFD, Zicsr,
Zifencei) as a 'general-purpose' ISA, and we use the abbreviation G for
the IMAFDZicsr Zifencei combination of instruction-set extensions."

With this in mind, enabling IMAFD_Zicsr_Zifencei if the user explicitly
enables 'G' is an expected behavior and the warning is unneeded. Any
user caught by surprise should refer to the ISA.

Remove the warning when handling RVG.

[1] https://github.com/riscv/riscv-isa-manual/releases/download/Ratified-IMAFDQC/riscv-spec-20191213.pdf

Reported-by: Paul A. Clarke <pclarke@ventanamicro.com>
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20231003122539.775932-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-10-12 12:39:45 +10:00
Daniel Henrique Barboza
608bdebb60 target/riscv/kvm: support KVM_GET_REG_LIST
KVM for RISC-V started supporting KVM_GET_REG_LIST in Linux 6.6. It
consists of a KVM ioctl() that retrieves a list of all available regs
for get_one_reg/set_one_reg. Regs that aren't present in the list aren't
supported in the host.

This simplifies our lives when initing the KVM regs since we don't have
to always attempt a KVM_GET_ONE_REG for all regs QEMU knows. We'll only
attempt a get_one_reg() if we're sure the reg is supported, i.e. it was
retrieved by KVM_GET_REG_LIST. Any error in get_one_reg() will then
always considered fatal, instead of having to handle special error codes
that might indicate a non-fatal failure.

Start by moving the current kvm_riscv_init_multiext_cfg() logic into a
new kvm_riscv_read_multiext_legacy() helper. We'll prioritize using
KVM_GET_REG_LIST, so check if we have it available and, in case we
don't, use the legacy() logic.

Otherwise, retrieve the available reg list and use it to check if the
host supports our known KVM regs, doing the usual get_one_reg() for
the supported regs and setting cpu->cfg accordingly.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20231003132148.797921-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-10-12 12:38:39 +10:00
Daniel Henrique Barboza
082e9e4a58 target/riscv/kvm: improve 'init_multiext_cfg' error msg
Our error message is returning the value of 'ret', which will be always
-1 in case of error, and will not be that useful:

qemu-system-riscv64: Unable to read ISA_EXT KVM register ssaia, error -1

Improve the error message by outputting 'errno' instead of 'ret'. Use
strerrorname_np() to output the error name instead of the error code.
This will give us what we need to know right away:

qemu-system-riscv64: Unable to read ISA_EXT KVM register ssaia, error code: ENOENT

Given that we're going to exit(1) in this condition instead of
attempting to recover, remove the 'kvm_riscv_destroy_scratch_vcpu()'
call.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231003132148.797921-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-10-12 12:37:38 +10:00
Clément Chigot
e216256ae9 gdbstub: replace exit calls with proper shutdown for softmmu
This replaces the exit calls by shutdown requests, ensuring a proper
cleanup of Qemu. Features like net/vhost-vdpa.c are expecting
qemu_cleanup to be called to remove their last residuals.

Signed-off-by: Clément Chigot <chigot@adacore.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231003071427.188697-6-chigot@adacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-10-12 12:36:37 +10:00
Clément Chigot
354c96069c hw/char: riscv_htif: replace exit calls with proper shutdown
This replaces the exit calls by shutdown requests, ensuring a proper
cleanup of Qemu. Otherwise, some connections like gdb could be broken
before its final packet ("Wxx") is being sent. This part, being done
inside qemu_cleanup function, can be reached only when the main loop
exits after a shutdown request.

Signed-off-by: Clément Chigot <chigot@adacore.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231003071427.188697-5-chigot@adacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-10-12 12:35:36 +10:00
Clément Chigot
215128e44b hw/misc/sifive_test.c: replace exit calls with proper shutdown
This replaces the exit calls by shutdown requests, ensuring a proper
cleanup of Qemu. Otherwise, some connections like gdb could be broken
before its final packet ("Wxx") is being sent. This part, being done
inside qemu_cleanup function, can be reached only when the main loop
exits after a shutdown request.

Signed-off-by: Clément Chigot <chigot@adacore.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231003071427.188697-4-chigot@adacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-10-12 12:34:30 +10:00
Clément Chigot
66bbe3e9b4 softmmu: pass the main loop status to gdb "Wxx" packet
gdb_exit function aims to close gdb sessions and sends the exit code of
the current execution. It's being called by qemu_cleanup once the main
loop is over.
Until now, the exit code sent was always 0. Now that hardware can
shutdown this main loop with custom exit codes, these codes must be
transfered to gdb as well.

Signed-off-by: Clément Chigot <chigot@adacore.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231003071427.188697-3-chigot@adacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-10-12 12:33:24 +10:00
Clément Chigot
0386f39b46 softmmu: add means to pass an exit code when requesting a shutdown
As of now, the exit code was either EXIT_FAILURE when a panic shutdown
was requested or EXIT_SUCCESS otherwise.
However, some hardware could want to pass more complex exit codes. Thus,
introduce a new shutdown request function allowing that.

Signed-off-by: Clément Chigot <chigot@adacore.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20231003071427.188697-2-chigot@adacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-10-12 12:32:20 +10:00