Commit Graph

1353 Commits

Author SHA1 Message Date
Paolo Bonzini
586d708c1e accel/kvm: check for KVM_CAP_MEMORY_ATTRIBUTES on vm
The exact set of available memory attributes can vary by VM.  In the
future it might vary depending on enabled capabilities, too.  Query the
extension on the VM level instead of on the KVM level, and only after
architecture-specific initialization.

Inspired by an analogous patch by Tom Dohrmann.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-17 19:41:30 +02:00
Paolo Bonzini
60de433d4c accel/kvm: check for KVM_CAP_MULTI_ADDRESS_SPACE on vm
KVM_CAP_MULTI_ADDRESS_SPACE used to be a global capability, but with the
introduction of AMD SEV-SNP confidential VMs, the number of address spaces
can vary by VM type.

Query the extension on the VM level instead of on the KVM level.

Inspired by an analogous patch by Tom Dohrmann.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-17 19:41:30 +02:00
Tom Dohrmann
64e0e63ea1 accel/kvm: check for KVM_CAP_READONLY_MEM on VM
KVM_CAP_READONLY_MEM used to be a global capability, but with the
introduction of AMD SEV-SNP confidential VMs, this extension is not
always available on all VM types [1,2].

Query the extension on the VM level instead of on the KVM level.

[1] https://patchwork.kernel.org/project/kvm/patch/20240809190319.1710470-2-seanjc@google.com/
[2] https://patchwork.kernel.org/project/kvm/patch/20240902144219.3716974-1-erbse.13@gmx.de/

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>
Link: https://lore.kernel.org/r/20240903062953.3926498-1-erbse.13@gmx.de
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-17 19:41:30 +02:00
Peter Xu
943c742868 KVM: Rename KVMState->nr_slots to nr_slots_max
This value used to reflect the maximum supported memslots from KVM kernel.
Rename it to be clearer.

Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240917163835.194664-5-peterx@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-17 19:41:30 +02:00
Peter Xu
dbdc00ba5b KVM: Rename KVMMemoryListener.nr_used_slots to nr_slots_used
This will make all nr_slots counters to be named in the same manner.

Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240917163835.194664-4-peterx@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-17 19:41:30 +02:00
Peter Xu
b34a908c8f KVM: Define KVM_MEMSLOTS_NUM_MAX_DEFAULT
Make the default max nr_slots a macro, it's only used when KVM reports
nothing.

Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240917163835.194664-3-peterx@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-17 19:41:30 +02:00
Peter Xu
5504a81261 KVM: Dynamic sized kvm memslots array
Zhiyi reported an infinite loop issue in VFIO use case.  The cause of that
was a separate discussion, however during that I found a regression of
dirty sync slowness when profiling.

Each KVMMemoryListerner maintains an array of kvm memslots.  Currently it's
statically allocated to be the max supported by the kernel.  However after
Linux commit 4fc096a99e ("KVM: Raise the maximum number of user memslots"),
the max supported memslots reported now grows to some number large enough
so that it may not be wise to always statically allocate with the max
reported.

What's worse, QEMU kvm code still walks all the allocated memslots entries
to do any form of lookups.  It can drastically slow down all memslot
operations because each of such loop can run over 32K times on the new
kernels.

Fix this issue by making the memslots to be allocated dynamically.

Here the initial size was set to 16 because it should cover the basic VM
usages, so that the hope is the majority VM use case may not even need to
grow at all (e.g. if one starts a VM with ./qemu-system-x86_64 by default
it'll consume 9 memslots), however not too large to waste memory.

There can also be even better way to address this, but so far this is the
simplest and should be already better even than before we grow the max
supported memslots.  For example, in the case of above issue when VFIO was
attached on a 32GB system, there are only ~10 memslots used.  So it could
be good enough as of now.

In the above VFIO context, measurement shows that the precopy dirty sync
shrinked from ~86ms to ~3ms after this patch applied.  It should also apply
to any KVM enabled VM even without VFIO.

NOTE: we don't have a FIXES tag for this patch because there's no real
commit that regressed this in QEMU. Such behavior existed for a long time,
but only start to be a problem when the kernel reports very large
nr_slots_max value.  However that's pretty common now (the kernel change
was merged in 2021) so we attached cc:stable because we'll want this change
to be backported to stable branches.

Cc: qemu-stable <qemu-stable@nongnu.org>
Reported-by: Zhiyi Guo <zhguo@redhat.com>
Tested-by: Zhiyi Guo <zhguo@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240917163835.194664-2-peterx@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-17 19:41:30 +02:00
Peter Maydell
51483f6c84 include: Move QemuLockCnt APIs to their own header
Currently the QemuLockCnt data structure and associated functions are
in the include/qemu/thread.h header.  Move them to their own
qemu/lockcnt.h.  The main reason for doing this is that it means we
can autogenerate the documentation comments into the docs/devel
documentation.

The copyright/author in the new header is drawn from lockcnt.c,
since the header changes were added in the same commit as
lockcnt.c; since neither thread.h nor lockcnt.c state an explicit
license, the standard default of GPL-2-or-later applies.

We include the new header (and the .c file, which was accidentally
omitted previously) in the "RCU" part of MAINTAINERS, since that
is where the lockcnt.rst documentation is categorized.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20240816132212.3602106-7-peter.maydell@linaro.org
2024-10-15 15:16:17 +01:00
Richard Henderson
795592fef7 accel/tcg: Use the alignment test in tlb_fill_align
When we have a tlb miss, defer the alignment check to
the new tlb_fill_align hook.  Move the existing alignment
check so that we only perform it with a tlb hit.

Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-13 11:27:05 -07:00
Richard Henderson
f168808d7d accel/tcg: Add TCGCPUOps.tlb_fill_align
Add a new callback to handle softmmu paging.  Return the page
details directly, instead of passing them indirectly to
tlb_set_page.  Handle alignment simultaneously with paging so
that faults are handled with target-specific priority.

Route all calls of the two hooks through a tlb_fill_align
function local to cputlb.c.

As yet no targets implement the new hook.
As yet cputlb.c does not use the new alignment check.

Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-13 11:27:05 -07:00
Richard Henderson
e5b063e81f include/exec/memop: Introduce memop_atomicity_bits
Split out of mmu_lookup.

Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-13 11:27:03 -07:00
Richard Henderson
c5809eee45 include/exec/memop: Rename get_alignment_bits
Rename to use "memop_" prefix, like other functions
that operate on MemOp.

Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-13 11:27:03 -07:00
Richard Henderson
49d1866a6e accel/tcg: Assert noreturn from write-only page for atomics
There should be no "just in case"; the page is already
in the tlb, and known to be not readable.

Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-13 11:27:03 -07:00
Paolo Bonzini
fe678c45d2 tcg: remove singlestep_enabled from DisasContextBase
It is used in a couple of places only, both within the same target.
Those can use the cflags just as well, so remove the separate field.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20241010083641.1785069-1-pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-13 10:04:29 -07:00
Philippe Mathieu-Daudé
25f4e71722 accel/tcg: Make page_set_flags() documentation public
Commit e505a063ba ("translate-all: Add assert_(memory|tb)_lock
annotations") states page_set_flags() is "public APIs and [is]
documented as needing them held for linux-user mode".
Document the prototype.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240822095045.72643-2-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-08 06:40:31 -07:00
Julia Suvorova
a1676bb304 kvm: Allow kvm_arch_get/put_registers to accept Error**
This is necessary to provide discernible error messages to the caller.

Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240927104743.218468-2-jusual@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-03 22:04:19 +02:00
Ani Sinha
28ed7f9761 accel/kvm: refactor dirty ring setup
Refactor setting up of dirty ring code in kvm_init() so that is can be
reused in the future patchsets.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20240912061838.4501-1-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-03 19:33:55 +02:00
Ani Sinha
67388078da kvm: refactor core virtual machine creation into its own function
Refactoring the core logic around KVM_CREATE_VM into its own separate function
so that it can be called from other functions in subsequent patches. There is
no functional change in this patch.

CC: pbonzini@redhat.com
CC: zhao1.liu@intel.com
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20240808113838.1697366-1-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-03 19:33:23 +02:00
Ani Sinha
804dfbe3ef kvm: replace fprintf with error_report()/printf() in kvm_init()
error_report() is more appropriate for error situations. Replace fprintf with
error_report() and error_printf() as appropriate. Some improvement in error
reporting also happens as a part of this change. For example:

From:
$ ./qemu-system-x86_64 --accel kvm
Could not access KVM kernel module: No such file or directory

To:
$ ./qemu-system-x86_64 --accel kvm
qemu-system-x86_64: --accel kvm: Could not access KVM kernel module: No such file or directory

CC: qemu-trivial@nongnu.org
CC: zhao1.liu@intel.com
CC: armbru@redhat.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20240828124539.62672-1-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-02 12:58:46 +02:00
Peter Maydell
173c427eb5 * Convert more Avocado tests to the new functional test framework
* Clean up assert() statements, use g_assert_not_reached() when possible
 * Improve output of the gitlab CI jobs
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmbz7xgRHHRodXRoQHJl
 ZGhhdC5jb20ACgkQLtnXdP5wLbWm6A//eVn+tzyyKCX/xdXlf7XyVpezvRpTFPOS
 HyO0WMkCf2kGmu6qYKx/fDZg86opdQzPLH2gPkuVrGOMZ0Z2630DjH0jNih8lL9Q
 J1oRX5YlU92chlzNmq59WB/j9CKd91ILtOoaPBuZkDob57yGEYVzCPqetVvF7L2+
 +rbnccrNPumGJFt035fxUGiGfgsmp28MHQzDwQdyr38uGjyNlqvqidfC8Vj1qzqP
 B7HvhGB/vkF0eHaanMt2el/ZuLKf+qeCi//F/CiXGMYnuKXyShA/Db6xvMElw1jB
 aQdwphP71IO+cxjJLaNjDHKGFstArsM/E21qlaSTBi+FTmPiwVULpVTiBmWsjhOh
 /klpdgRHf0hL2MciYKyOWgjlTocx3rEKjCTe2U5tpta9fp9CrlgMQotjDZIbohGI
 ULNahrW3Zmg4EmXDApfhYMXsQsSgWas9QSkmxzJzDp0VC7tf2Oq7RxeySrlw9MCx
 OG2qQY+rNcJ3NnpATjfAJpT1kg/IahDOCNHfLEaj1u13XVQIthVADvHwy5WxbwRP
 mwp3V9e9sUoznkM2eV646lzmkMim/WdYBF0YpT7eBs80+GoXZ0thx9IqWmwzX/ox
 rndBczVN+RY6PydJP40yljdvS7ArRT73wHqL6yKHfDpvFc4/p5mxTWwLQ3yJbXbE
 T3I+wtgfBU8=
 =FH7b
 -----END PGP SIGNATURE-----

Merge tag 'pull-request-2024-09-25' of https://gitlab.com/thuth/qemu into staging

* Convert more Avocado tests to the new functional test framework
* Clean up assert() statements, use g_assert_not_reached() when possible
* Improve output of the gitlab CI jobs

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmbz7xgRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbWm6A//eVn+tzyyKCX/xdXlf7XyVpezvRpTFPOS
# HyO0WMkCf2kGmu6qYKx/fDZg86opdQzPLH2gPkuVrGOMZ0Z2630DjH0jNih8lL9Q
# J1oRX5YlU92chlzNmq59WB/j9CKd91ILtOoaPBuZkDob57yGEYVzCPqetVvF7L2+
# +rbnccrNPumGJFt035fxUGiGfgsmp28MHQzDwQdyr38uGjyNlqvqidfC8Vj1qzqP
# B7HvhGB/vkF0eHaanMt2el/ZuLKf+qeCi//F/CiXGMYnuKXyShA/Db6xvMElw1jB
# aQdwphP71IO+cxjJLaNjDHKGFstArsM/E21qlaSTBi+FTmPiwVULpVTiBmWsjhOh
# /klpdgRHf0hL2MciYKyOWgjlTocx3rEKjCTe2U5tpta9fp9CrlgMQotjDZIbohGI
# ULNahrW3Zmg4EmXDApfhYMXsQsSgWas9QSkmxzJzDp0VC7tf2Oq7RxeySrlw9MCx
# OG2qQY+rNcJ3NnpATjfAJpT1kg/IahDOCNHfLEaj1u13XVQIthVADvHwy5WxbwRP
# mwp3V9e9sUoznkM2eV646lzmkMim/WdYBF0YpT7eBs80+GoXZ0thx9IqWmwzX/ox
# rndBczVN+RY6PydJP40yljdvS7ArRT73wHqL6yKHfDpvFc4/p5mxTWwLQ3yJbXbE
# T3I+wtgfBU8=
# =FH7b
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 25 Sep 2024 12:08:08 BST
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* tag 'pull-request-2024-09-25' of https://gitlab.com/thuth/qemu: (44 commits)
  .gitlab-ci.d: Make separate collapsible log sections for build and test
  .gitlab-ci.d: Split build and test in cross build job templates
  scripts/checkpatch.pl: emit error when using assert(false)
  tests/qtest: remove return after g_assert_not_reached()
  qom: remove return after g_assert_not_reached()
  qobject: remove return after g_assert_not_reached()
  migration: remove return after g_assert_not_reached()
  hw/ppc: remove return after g_assert_not_reached()
  hw/pci: remove return after g_assert_not_reached()
  hw/net: remove return after g_assert_not_reached()
  hw/hyperv: remove return after g_assert_not_reached()
  include/qemu: remove return after g_assert_not_reached()
  tcg/loongarch64: remove break after g_assert_not_reached()
  fpu: remove break after g_assert_not_reached()
  target/riscv: remove break after g_assert_not_reached()
  target/arm: remove break after g_assert_not_reached()
  hw/tpm: remove break after g_assert_not_reached()
  hw/scsi: remove break after g_assert_not_reached()
  hw/net: remove break after g_assert_not_reached()
  hw/acpi: remove break after g_assert_not_reached()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-09-28 12:34:38 +01:00
Pierrick Bouvier
1484a04283 accel/tcg: remove break after g_assert_not_reached()
This patch is part of a series that moves towards a consistent use of
g_assert_not_reached() rather than an ad hoc mix of different
assertion mechanisms.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20240919044641.386068-16-pierrick.bouvier@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-24 13:53:35 +02:00
Peter Maydell
a5dd9ee060 TCG plugin memory instrumentation updates
- deprecate plugins on 32 bit hosts
   - deprecate plugins with TCI
   - extend memory API to save value
   - add check-tcg tests to exercise new memory API
   - fix timer deadlock with non-changing timer
   - add basic block vector plugin to contrib
   - add cflow plugin to contrib
   - extend syscall plugin to dump write memory
   - validate ips plugin arguments meet minimum slice value
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmbsPCUACgkQ+9DbCVqe
 KkTm1gf9Hs5Zfdng0E+7sr5Dpa5F+cJOXU9QJhoTWJ4XC16CygWByqMXbyeX/kvm
 HXJEm6OnkADJhikIUCoBko8uK4/96iWSrDL0sEdzASX4SM/tXu684KeL+j9G/Ql8
 iqxm6tIjaJqmbSZRMp0l5jD+ZBltRMCzBNdK1suJR2ppQgqfKj3qMLVLtq2hhqPH
 qPgwKm44hk9BEpHYqXaivzSWN5GKCgvp5ECcFXCBhDcM+8W7Dl3Mv6X0pWOpYcKZ
 d2a5KUt+Xp7WB2jkOgJYr0zKCOQCiCjGSfm/30qRDOUnwiLRWbfamRI9jUDNUtfy
 RYR+GaspurGCwSkwICdlvj+vFp/16Q==
 =5wfo
 -----END PGP SIGNATURE-----

Merge tag 'pull-tcg-plugin-memory-190924-1' of https://gitlab.com/stsquad/qemu into staging

TCG plugin memory instrumentation updates

  - deprecate plugins on 32 bit hosts
  - deprecate plugins with TCI
  - extend memory API to save value
  - add check-tcg tests to exercise new memory API
  - fix timer deadlock with non-changing timer
  - add basic block vector plugin to contrib
  - add cflow plugin to contrib
  - extend syscall plugin to dump write memory
  - validate ips plugin arguments meet minimum slice value

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmbsPCUACgkQ+9DbCVqe
# KkTm1gf9Hs5Zfdng0E+7sr5Dpa5F+cJOXU9QJhoTWJ4XC16CygWByqMXbyeX/kvm
# HXJEm6OnkADJhikIUCoBko8uK4/96iWSrDL0sEdzASX4SM/tXu684KeL+j9G/Ql8
# iqxm6tIjaJqmbSZRMp0l5jD+ZBltRMCzBNdK1suJR2ppQgqfKj3qMLVLtq2hhqPH
# qPgwKm44hk9BEpHYqXaivzSWN5GKCgvp5ECcFXCBhDcM+8W7Dl3Mv6X0pWOpYcKZ
# d2a5KUt+Xp7WB2jkOgJYr0zKCOQCiCjGSfm/30qRDOUnwiLRWbfamRI9jUDNUtfy
# RYR+GaspurGCwSkwICdlvj+vFp/16Q==
# =5wfo
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 19 Sep 2024 15:58:45 BST
# gpg:                using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* tag 'pull-tcg-plugin-memory-190924-1' of https://gitlab.com/stsquad/qemu:
  contrib/plugins: avoid hanging program
  plugins: add option to dump write argument to syscall plugin
  plugins: add plugin API to read guest memory
  contrib/plugins: Add a plugin to generate basic block vectors
  util/timer: avoid deadlock when shutting down
  tests/tcg: add a system test to check memory instrumentation
  tests/tcg: ensure s390x-softmmu output redirected
  tests/tcg: only read/write 64 bit words on 64 bit systems
  tests/tcg: clean up output of memory system test
  tests/tcg/multiarch: add test for plugin memory access
  tests/tcg/plugins/mem: add option to print memory accesses
  tests/tcg: allow to check output of plugins
  tests/tcg: add mechanism to run specific tests with plugins
  plugins: extend API to get latest memory value accessed
  plugins: save value during memory accesses
  contrib/plugins: control flow plugin
  deprecation: don't enable TCG plugins by default with TCI
  deprecation: don't enable TCG plugins by default on 32 bit hosts

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-09-19 21:31:35 +01:00
Pierrick Bouvier
b709da5d29 plugins: save value during memory accesses
Different code paths handle memory accesses:
- tcg generated code
- load/store helpers
- atomic helpers

This value is saved in cpu->neg.plugin_mem_value_{high,low}. Values are
written only for accessed word size (upper bits are not set).

Atomic operations are doing read/write at the same time, so we generate
two memory callbacks instead of one, to allow plugins to access distinct
values.

For now, we can have access only up to 128 bits, thus split this in two
64 bits words. When QEMU will support wider operations, we'll be able to
reconsider this.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240724194708.1843704-2-pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240916085400.1046925-5-alex.bennee@linaro.org>
2024-09-19 15:58:01 +01:00
Peter Maydell
c4d16d4168 kvm: Remove unreachable code in kvm_dirty_ring_reaper_thread()
The code at the tail end of the loop in kvm_dirty_ring_reaper_thread()
is unreachable, because there is no way for execution to leave the
loop. Replace it with a g_assert_not_reached().

(The code has always been unreachable, right from the start
when the function was added in commit b4420f198dd8.)

Resolves: Coverity CID 1547687
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240815131206.3231819-3-peter.maydell@linaro.org
2024-09-19 13:13:58 +01:00
Peter Maydell
28d2d03c9c kvm: Make 'mmap_size' be 'int' in kvm_init_vcpu(), do_kvm_destroy_vcpu()
In kvm_init_vcpu()and do_kvm_destroy_vcpu(), the return value from
  kvm_ioctl(..., KVM_GET_VCPU_MMAP_SIZE, ...)
is an 'int', but we put it into a 'long' logal variable mmap_size.
Coverity then complains that there might be a truncation when we copy
that value into the 'int ret' which we use for returning a value in
an error-exit codepath. This can't ever actually overflow because
the value was in an 'int' to start with, but it makes more sense
to use 'int' for mmap_size so we don't do the widen-then-narrow
sequence in the first place.

Resolves: Coverity CID 1547515
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240815131206.3231819-2-peter.maydell@linaro.org
2024-09-19 13:13:58 +01:00
Danny Canter
d54ffa54fb hvf: arm: Implement and use hvf_get_physical_address_range
This patch's main focus is to use the previously added
hvf_get_physical_address_range to inform VM creation
about the IPA size we need for the VM, so we can extend
the default 36b IPA size and support VMs with 64+GB of
RAM. This is done by freezing the memory map, computing
the highest GPA and then (depending on if the platform
supports an IPA size that large) telling the kernel to
use a size >= for the VM. In pursuit of this a couple of
things related to how we handle the physical address range
we expose to guests were altered, but for an explanation of
what we were doing:

Today, to get the IPA size we were reading id_aa64mmfr0_el1's
PARange field from a newly made vcpu. Unfortunately, HVF just
returns the hosts PARange directly for the initial value and
not the IPA size that will actually back the VM, so we believe
we have much more address space than we actually do today it seems.

Starting in macOS 13.0 some APIs were introduced to be able to
query the maximum IPA size the kernel supports, and to set the IPA
size for a given VM. However, this still has a couple of issues
on < macOS 15. Up until macOS 15 (and if the hardware supported
it) the max IPA size was 39 bits which is not a valid PARange
value, so we can't clamp down what we advertise in the vcpu's
id_aa64mmfr0_el1 to our IPA size. Starting in macOS 15 however,
the maximum IPA size is 40 bits (if it's supported in the hardware
as well) which is also a valid PARange value so we can set our IPA
size to the maximum as well as clamp down the PARange we advertise
to the guest. This allows VMs with 64+ GB of RAM and should fix the
oddness of the PARange situation as well.

Signed-off-by: Danny Canter <danny_canter@apple.com>
Message-id: 20240828111552.93482-4-danny_canter@apple.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-09-13 15:31:47 +01:00
Danny Canter
2c760670af hvf: Split up hv_vm_create logic per arch
This is preliminary work to split up hv_vm_create
logic per platform so we can support creating VMs
with > 64GB of RAM on Apple Silicon machines. This
is done via ARM HVF's hv_vm_config_create() (and
other APIs that modify this config that will be
coming in future patches). This should have no
behavioral difference at all as hv_vm_config_create()
just assigns the same default values as if you just
passed NULL to the function.

Signed-off-by: Danny Canter <danny_canter@apple.com>
Message-id: 20240828111552.93482-3-danny_canter@apple.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-09-13 15:31:46 +01:00
Johannes Stoelp
6a8703aecb kvm: Use 'unsigned long' for request argument in functions wrapping ioctl()
Change the data type of the ioctl _request_ argument from 'int' to
'unsigned long' for the various accel/kvm functions which are
essentially wrappers around the ioctl() syscall.

The correct type for ioctl()'s 'request' argument is confused:
 * POSIX defines the request argument as 'int'
 * glibc uses 'unsigned long' in the prototype in sys/ioctl.h
 * the glibc info documentation uses 'int'
 * the Linux manpage uses 'unsigned long'
 * the Linux implementation of the syscall uses 'unsigned int'

If we wrap ioctl() with another function which uses 'int' as the
type for the request argument, then requests with the 0x8000_0000
bit set will be sign-extended when the 'int' is cast to
'unsigned long' for the call to ioctl().

On x86_64 one such example is the KVM_IRQ_LINE_STATUS request.
Bit requests with the _IOC_READ direction bit set, will have the high
bit set.

Fortunately the Linux Kernel truncates the upper 32bit of the request
on 64bit machines (because it uses 'unsigned int', and see also Linus
Torvalds' comments in
  https://sourceware.org/bugzilla/show_bug.cgi?id=14362 )
so this doesn't cause active problems for us.  However it is more
consistent to follow the glibc ioctl() prototype when we define
functions that are essentially wrappers around ioctl().

This resolves a Coverity issue where it points out that in
kvm_get_xsave() we assign a value (KVM_GET_XSAVE or KVM_GET_XSAVE2)
to an 'int' variable which can't hold it without overflow.

Resolves: Coverity CID 1547759
Signed-off-by: Johannes Stoelp <johannes.stoelp@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20240815122747.3053871-1-peter.maydell@linaro.org
[PMM: Rebased patch, adjusted commit message, included note about
 Coverity fix, updated the type of the local var in kvm_get_xsave,
 updated the comment in the KVMState struct definition]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-09-13 15:31:46 +01:00
Peter Maydell
da7510b720 accel/tcg: Remove dead code from rr_cpu_thread_fn()
The main loop in rr_cpu_thread_fn() can never terminate, so the
code at the end of the function to clean up the RCU subsystem is
dead code. Replace it with g_assert_not_reached().

(This is different from the other cpu_thread_fn for e.g. MTTCG or
for the KVM accelerator -- those can exit, if the vCPU they
are responsible for is unplugged. But the RR cpu thread fn
handles all CPUs in the system in a round-robin way, so even
if one is unplugged it keeps looping.)

Resolves: Coverity CID 1547782
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240815143634.3413679-1-peter.maydell@linaro.org
2024-09-05 13:12:36 +01:00
Nicholas Piggin
94962ff00d Revert "replay: stop us hanging in rr_wait_io_event"
This reverts commit 1f881ea4a4.

That commit causes reverse_debugging.py test failures, and does
not seem to solve the root cause of the problem x86-64 still
hangs in record/replay tests.

The problem with short-cutting the iowait that was taken during
record phase is that related events will not get consumed at the
same points (e.g., reading the clock).

A hang with zero icount always seems to be a symptom of an earlier
problem that has caused the recording to become out of synch with
the execution and consumption of events by replay.

Acked-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20240813050638.446172-6-npiggin@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240813202329.1237572-14-alex.bennee@linaro.org>
2024-08-16 14:04:19 +01:00
Salil Mehta
036144cff2 accel/kvm/kvm-all: Fixes the missing break in vCPU unpark logic
Loop should exit prematurely on successfully finding out the parked vCPU (struct
KVMParkedVcpu) in the 'struct KVMState' maintained 'kvm_parked_vcpus' list of
parked vCPUs.

Fixes: Coverity CID 1558552
Fixes: 08c3286822 ("accel/kvm: Extract common KVM vCPU {creation,parking} code")
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20240725145132.99355-1-salil.mehta@huawei.com
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <CAFEAcA-3_d1c7XSXWkFubD-LsW5c5i95e6xxV09r2C9yGtzcdA@mail.gmail.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-08-01 10:15:03 +01:00
Harsh Prateek Bora
c6a3d7bc9e accel/kvm: Introduce kvm_create_and_park_vcpu() helper
There are distinct helpers for creating and parking a KVM vCPU.
However, there can be cases where a platform needs to create and
immediately park the vCPU during early stages of vcpu init which
can later be reused when vcpu thread gets initialized. This would
help detect failures with kvm_create_vcpu at an early stage.

Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-07-26 09:21:06 +10:00
Richard Henderson
6410f877f5 Misc HW patch queue
- Restrict probe_access*() functions to TCG (Phil)
 - Extract do_invalidate_device_tlb from vtd_process_device_iotlb_desc (Clément)
 - Fixes in Loongson IPI model (Bibo & Phil)
 - Make docs/interop/firmware.json compatible with qapi-gen.py script (Thomas)
 - Correct MPC I2C MMIO region size (Zoltan)
 - Remove useless cast in Loongson3 Virt machine (Yao)
 - Various uses of range overlap API (Yao)
 - Use ERRP_GUARD macro in nubus_virtio_mmio_realize (Zhao)
 - Use DMA memory API in Goldfish UART model (Phil)
 - Expose fifo8_pop_buf and introduce fifo8_drop (Phil)
 - MAINTAINERS updates (Zhao, Phil)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmagFF8ACgkQ4+MsLN6t
 wN5bKg//f5TwUhsy2ff0FJpHheDOj/9Gc2nZ1U/Fp0E5N3sz3A7MGp91wye6Xwi3
 XG34YN9LK1AVzuCdrEEs5Uaxs1ZS1R2mV+fZaGHwYYxPDdnXxGyp/2Q0eyRxzbcN
 zxE2hWscYSZbPVEru4HvZJKfp4XnE1cqA78fJKMAdtq0IPq38tmQNRlJ+gWD9dC6
 ZUHXPFf3DnucvVuwqb0JYO/E+uJpcTtgR6pc09Xtv/HFgMiS0vKZ1I/6LChqAUw9
 eLMpD/5V2naemVadJe98/dL7gIUnhB8GTjsb4ioblG59AO/uojutwjBSQvFxBUUw
 U5lX9OSn20ouwcGiqimsz+5ziwhCG0R6r1zeQJFqUxrpZSscq7NQp9ygbvirm+wS
 edLc8yTPf4MtYOihzPP9jLPcXPZjEV64gSnJISDDFYWANCrysX3suaFEOuVYPl+s
 ZgQYRVSSYOYHgNqBSRkPKKVUxskSQiqLY3SfGJG4EA9Ktt5lD1cLCXQxhdsqphFm
 Ws3zkrVVL0EKl4v/4MtCgITIIctN1ZJE9u3oPJjASqSvK6EebFqAJkc2SidzKHz0
 F3iYX2AheWNHCQ3HFu023EvFryjlxYk95fs2f6Uj2a9yVbi813qsvd3gcZ8t0kTT
 +dmQwpu1MxjzZnA6838R6OCMnC+UpMPqQh3dPkU/5AF2fc3NnN8=
 =J/I2
 -----END PGP SIGNATURE-----

Merge tag 'hw-misc-20240723' of https://github.com/philmd/qemu into staging

Misc HW patch queue

- Restrict probe_access*() functions to TCG (Phil)
- Extract do_invalidate_device_tlb from vtd_process_device_iotlb_desc (Clément)
- Fixes in Loongson IPI model (Bibo & Phil)
- Make docs/interop/firmware.json compatible with qapi-gen.py script (Thomas)
- Correct MPC I2C MMIO region size (Zoltan)
- Remove useless cast in Loongson3 Virt machine (Yao)
- Various uses of range overlap API (Yao)
- Use ERRP_GUARD macro in nubus_virtio_mmio_realize (Zhao)
- Use DMA memory API in Goldfish UART model (Phil)
- Expose fifo8_pop_buf and introduce fifo8_drop (Phil)
- MAINTAINERS updates (Zhao, Phil)

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmagFF8ACgkQ4+MsLN6t
# wN5bKg//f5TwUhsy2ff0FJpHheDOj/9Gc2nZ1U/Fp0E5N3sz3A7MGp91wye6Xwi3
# XG34YN9LK1AVzuCdrEEs5Uaxs1ZS1R2mV+fZaGHwYYxPDdnXxGyp/2Q0eyRxzbcN
# zxE2hWscYSZbPVEru4HvZJKfp4XnE1cqA78fJKMAdtq0IPq38tmQNRlJ+gWD9dC6
# ZUHXPFf3DnucvVuwqb0JYO/E+uJpcTtgR6pc09Xtv/HFgMiS0vKZ1I/6LChqAUw9
# eLMpD/5V2naemVadJe98/dL7gIUnhB8GTjsb4ioblG59AO/uojutwjBSQvFxBUUw
# U5lX9OSn20ouwcGiqimsz+5ziwhCG0R6r1zeQJFqUxrpZSscq7NQp9ygbvirm+wS
# edLc8yTPf4MtYOihzPP9jLPcXPZjEV64gSnJISDDFYWANCrysX3suaFEOuVYPl+s
# ZgQYRVSSYOYHgNqBSRkPKKVUxskSQiqLY3SfGJG4EA9Ktt5lD1cLCXQxhdsqphFm
# Ws3zkrVVL0EKl4v/4MtCgITIIctN1ZJE9u3oPJjASqSvK6EebFqAJkc2SidzKHz0
# F3iYX2AheWNHCQ3HFu023EvFryjlxYk95fs2f6Uj2a9yVbi813qsvd3gcZ8t0kTT
# +dmQwpu1MxjzZnA6838R6OCMnC+UpMPqQh3dPkU/5AF2fc3NnN8=
# =J/I2
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 24 Jul 2024 06:36:47 AM AEST
# gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]

* tag 'hw-misc-20240723' of https://github.com/philmd/qemu: (28 commits)
  MAINTAINERS: Add myself as a reviewer of machine core
  MAINTAINERS: Cover guest-agent in QAPI schema
  util/fifo8: Introduce fifo8_drop()
  util/fifo8: Expose fifo8_pop_buf()
  util/fifo8: Rename fifo8_pop_buf() -> fifo8_pop_bufptr()
  util/fifo8: Rename fifo8_peek_buf() -> fifo8_peek_bufptr()
  util/fifo8: Use fifo8_reset() in fifo8_create()
  util/fifo8: Fix style
  chardev/char-fe: Document returned value on error
  hw/char/goldfish: Use DMA memory API
  hw/nubus/virtio-mmio: Fix missing ERRP_GUARD() in realize handler
  dump: make range overlap check more readable
  crypto/block-luks: make range overlap check more readable
  system/memory_mapping: make range overlap check more readable
  sparc/ldst_helper: make range overlap check more readable
  cxl/mailbox: make range overlap check more readable
  util/range: Make ranges_overlap() return bool
  hw/mips/loongson3_virt: remove useless type cast
  hw/i2c/mpc_i2c: Fix mmio region size
  docs/interop/firmware.json: convert "Example" section
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-24 15:39:43 +10:00
Richard Henderson
43f59bf765 * target/i386/kvm: support for reading RAPL MSRs using a helper program
* hpet: emulation improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaelL4UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMXoQf+K77lNlHLETSgeeP3dr7yZPOmXjjN
 qFY/18jiyLw7MK1rZC09fF+n9SoaTH8JDKupt0z9M1R10HKHLIO04f8zDE+dOxaE
 Rou3yKnlTgFPGSoPPFr1n1JJfxtYlLZRoUzaAcHUaa4W7JR/OHJX90n1Rb9MXeDk
 jV6P0v1FWtIDdM6ERm9qBGoQdYhj6Ra2T4/NZKJFXwIhKEkxgu4yO7WXv8l0dxQz
 jE4fKotqAvrkYW1EsiVZm30lw/19duhvGiYeQXoYhk8KKXXjAbJMblLITSNWsCio
 3l6Uud/lOxekkJDAq5nH3H9hCBm0WwvwL+0vRf3Mkr+/xRGvrhtmUdp8NQ==
 =00mB
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* target/i386/kvm: support for reading RAPL MSRs using a helper program
* hpet: emulation improvements

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaelL4UHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroMXoQf+K77lNlHLETSgeeP3dr7yZPOmXjjN
# qFY/18jiyLw7MK1rZC09fF+n9SoaTH8JDKupt0z9M1R10HKHLIO04f8zDE+dOxaE
# Rou3yKnlTgFPGSoPPFr1n1JJfxtYlLZRoUzaAcHUaa4W7JR/OHJX90n1Rb9MXeDk
# jV6P0v1FWtIDdM6ERm9qBGoQdYhj6Ra2T4/NZKJFXwIhKEkxgu4yO7WXv8l0dxQz
# jE4fKotqAvrkYW1EsiVZm30lw/19duhvGiYeQXoYhk8KKXXjAbJMblLITSNWsCio
# 3l6Uud/lOxekkJDAq5nH3H9hCBm0WwvwL+0vRf3Mkr+/xRGvrhtmUdp8NQ==
# =00mB
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 03:19:58 AM AEST
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
  hpet: avoid timer storms on periodic timers
  hpet: store full 64-bit target value of the counter
  hpet: accept 64-bit reads and writes
  hpet: place read-only bits directly in "new_val"
  hpet: remove unnecessary variable "index"
  hpet: ignore high bits of comparator in 32-bit mode
  hpet: fix and cleanup persistence of interrupt status
  Add support for RAPL MSRs in KVM/Qemu
  tools: build qemu-vmsr-helper
  qio: add support for SO_PEERCRED for socket channel
  target/i386: do not crash if microvm guest uses SGX CPUID leaves

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-24 11:25:40 +10:00
Richard Henderson
5885bcef3d virtio,pci,pc: features,fixes
pci: Initial support for SPDM Responders
 cxl: Add support for scan media, feature commands, device patrol scrub
     control, DDR5 ECS control, firmware updates
 virtio: in-order support
 virtio-net: support for SR-IOV emulation (note: known issues on s390,
                                           might get reverted if not fixed)
 smbios: memory device size is now configurable per Machine
 cpu: architecture agnostic code to support vCPU Hotplug
 
 Fixes, cleanups all over the place.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmae9l8PHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRp8fYH/impBH9nViO/WK48io4mLSkl0EUL8Y/xrMvH
 zKFCKaXq8D96VTt1Z4EGKYgwG0voBKZaCEKYU/0ARGnSlSwxINQ8ROCnBWMfn2sx
 yQt08EXVMznNLtXjc6U5zCoCi6SaV85GH40No3MUFXBQt29ZSlFqO/fuHGZHYBwS
 wuVKvTjjNF4EsGt3rS4Qsv6BwZWMM+dE6yXpKWk68kR8IGp+6QGxkMbWt9uEX2Md
 VuemKVnFYw0XGCGy5K+ZkvoA2DGpEw0QxVSOMs8CI55Oc9SkTKz5fUSzXXGo1if+
 M1CTjOPJu6pMym6gy6XpFa8/QioDA/jE2vBQvfJ64TwhJDV159s=
 =k8e9
 -----END PGP SIGNATURE-----

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pci,pc: features,fixes

pci: Initial support for SPDM Responders
cxl: Add support for scan media, feature commands, device patrol scrub
    control, DDR5 ECS control, firmware updates
virtio: in-order support
virtio-net: support for SR-IOV emulation (note: known issues on s390,
                                          might get reverted if not fixed)
smbios: memory device size is now configurable per Machine
cpu: architecture agnostic code to support vCPU Hotplug

Fixes, cleanups all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmae9l8PHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp8fYH/impBH9nViO/WK48io4mLSkl0EUL8Y/xrMvH
# zKFCKaXq8D96VTt1Z4EGKYgwG0voBKZaCEKYU/0ARGnSlSwxINQ8ROCnBWMfn2sx
# yQt08EXVMznNLtXjc6U5zCoCi6SaV85GH40No3MUFXBQt29ZSlFqO/fuHGZHYBwS
# wuVKvTjjNF4EsGt3rS4Qsv6BwZWMM+dE6yXpKWk68kR8IGp+6QGxkMbWt9uEX2Md
# VuemKVnFYw0XGCGy5K+ZkvoA2DGpEw0QxVSOMs8CI55Oc9SkTKz5fUSzXXGo1if+
# M1CTjOPJu6pMym6gy6XpFa8/QioDA/jE2vBQvfJ64TwhJDV159s=
# =k8e9
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 10:16:31 AM AEST
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (61 commits)
  hw/nvme: Add SPDM over DOE support
  backends: Initial support for SPDM socket support
  hw/pci: Add all Data Object Types defined in PCIe r6.0
  tests/acpi: Add expected ACPI AML files for RISC-V
  tests/qtest/bios-tables-test.c: Enable basic testing for RISC-V
  tests/acpi: Add empty ACPI data files for RISC-V
  tests/qtest/bios-tables-test.c: Remove the fall back path
  tests/acpi: update expected DSDT blob for aarch64 and microvm
  acpi/gpex: Create PCI link devices outside PCI root bridge
  tests/acpi: Allow DSDT acpi table changes for aarch64
  hw/riscv/virt-acpi-build.c: Update the HID of RISC-V UART
  hw/riscv/virt-acpi-build.c: Add namespace devices for PLIC and APLIC
  virtio-iommu: Add trace point on virtio_iommu_detach_endpoint_from_domain
  hw/vfio/common: Add vfio_listener_region_del_iommu trace event
  virtio-iommu: Remove the end point on detach
  virtio-iommu: Free [host_]resv_ranges on unset_iommu_devices
  virtio-iommu: Remove probe_done
  Revert "virtio-iommu: Clear IOMMUDevice when VFIO device is unplugged"
  gdbstub: Add helper function to unregister GDB register space
  physmem: Add helper function to destroy CPU AddressSpace
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-24 09:32:04 +10:00
Philippe Mathieu-Daudé
99481a0988 accel: Restrict probe_access*() functions to TCG
This API is specific to TCG (already handled by hardware
accelerators), so restrict it with #ifdef'ry. Remove
unnecessary stubs.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240529155918.6221-1-philmd@linaro.org>
2024-07-23 18:08:44 +02:00
Richard Henderson
71bce0e1fb accel/tcg: Export set/clear_helper_retaddr
target/arm: Use set_helper_retaddr for dc_zva, sve and sme
 target/ppc: Tidy dcbz helpers
 target/ppc: Use set_helper_retaddr for dcbz
 target/s390x: Use set_helper_retaddr in mem_helper.c
 -----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmafJKIdHHJpY2hhcmQu
 aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+FBAf7Bup+karxeGHZx2rN
 cPeF248bcCWTxBWHK7dsYze4KqzsrlNIJlPeOKErU2bbbRDZGhOp1/N95WVz+P8V
 6Ny63WTsAYkaFWKxE6Jf0FWJlGw92btk75pTV2x/TNZixg7jg0vzVaYkk0lTYc5T
 m5e4WycYEbzYm0uodxI09i+wFvpd+7WCnl6xWtlJPWZENukvJ36Ss43egFMDtuMk
 vTJuBkS9wpwZ9MSi6EY6M+Raieg8bfaotInZeDvE/yRPNi7CwrA7Dgyc1y626uBA
 joGkYRLzhRgvT19kB3bvFZi1AXa0Pxr+j0xJqwspP239Gq5qezlS5Bv/DrHdmGHA
 jaqSwg==
 =XgUE
 -----END PGP SIGNATURE-----

Merge tag 'pull-tcg-20240723' of https://gitlab.com/rth7680/qemu into staging

accel/tcg: Export set/clear_helper_retaddr
target/arm: Use set_helper_retaddr for dc_zva, sve and sme
target/ppc: Tidy dcbz helpers
target/ppc: Use set_helper_retaddr for dcbz
target/s390x: Use set_helper_retaddr in mem_helper.c

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmafJKIdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+FBAf7Bup+karxeGHZx2rN
# cPeF248bcCWTxBWHK7dsYze4KqzsrlNIJlPeOKErU2bbbRDZGhOp1/N95WVz+P8V
# 6Ny63WTsAYkaFWKxE6Jf0FWJlGw92btk75pTV2x/TNZixg7jg0vzVaYkk0lTYc5T
# m5e4WycYEbzYm0uodxI09i+wFvpd+7WCnl6xWtlJPWZENukvJ36Ss43egFMDtuMk
# vTJuBkS9wpwZ9MSi6EY6M+Raieg8bfaotInZeDvE/yRPNi7CwrA7Dgyc1y626uBA
# joGkYRLzhRgvT19kB3bvFZi1AXa0Pxr+j0xJqwspP239Gq5qezlS5Bv/DrHdmGHA
# jaqSwg==
# =XgUE
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 01:33:54 PM AEST
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]

* tag 'pull-tcg-20240723' of https://gitlab.com/rth7680/qemu:
  target/riscv: Simplify probing in vext_ldff
  target/s390x: Use set/clear_helper_retaddr in mem_helper.c
  target/s390x: Use user_or_likely in access_memmove
  target/s390x: Use user_or_likely in do_access_memset
  target/ppc: Improve helper_dcbz for user-only
  target/ppc: Merge helper_{dcbz,dcbzep}
  target/ppc: Split out helper_dbczl for 970
  target/ppc: Hoist dcbz_size out of dcbz_common
  target/ppc/mem_helper.c: Remove a conditional from dcbz_common()
  target/arm: Use set/clear_helper_retaddr in SVE and SME helpers
  target/arm: Use set/clear_helper_retaddr in helper-a64.c
  accel/tcg: Move {set,clear}_helper_retaddr to cpu_ldst.h

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23 15:19:39 +10:00
Richard Henderson
3d75856d1a accel/tcg: Move {set,clear}_helper_retaddr to cpu_ldst.h
Use of these in helpers goes hand-in-hand with tlb_vaddr_to_host
and other probing functions.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-23 10:56:04 +10:00
Salil Mehta
08c3286822 accel/kvm: Extract common KVM vCPU {creation,parking} code
KVM vCPU creation is done once during the vCPU realization when Qemu vCPU thread
is spawned. This is common to all the architectures as of now.

Hot-unplug of vCPU results in destruction of the vCPU object in QOM but the
corresponding KVM vCPU object in the Host KVM is not destroyed as KVM doesn't
support vCPU removal. Therefore, its representative KVM vCPU object/context in
Qemu is parked.

Refactor architecture common logic so that some APIs could be reused by vCPU
Hotplug code of some architectures likes ARM, Loongson etc. Update new/old APIs
with trace events. New APIs qemu_{create,park,unpark}_vcpu() can be externally
called. No functional change is intended here.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Xianglai Li <lixianglai@loongson.cn>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240716111502.202344-2-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-22 20:15:41 -04:00
Anthony Harivel
0418f90809 Add support for RAPL MSRs in KVM/Qemu
Starting with the "Sandy Bridge" generation, Intel CPUs provide a RAPL
interface (Running Average Power Limit) for advertising the accumulated
energy consumption of various power domains (e.g. CPU packages, DRAM,
etc.).

The consumption is reported via MSRs (model specific registers) like
MSR_PKG_ENERGY_STATUS for the CPU package power domain. These MSRs are
64 bits registers that represent the accumulated energy consumption in
micro Joules. They are updated by microcode every ~1ms.

For now, KVM always returns 0 when the guest requests the value of
these MSRs. Use the KVM MSR filtering mechanism to allow QEMU handle
these MSRs dynamically in userspace.

To limit the amount of system calls for every MSR call, create a new
thread in QEMU that updates the "virtual" MSR values asynchronously.

Each vCPU has its own vMSR to reflect the independence of vCPUs. The
thread updates the vMSR values with the ratio of energy consumed of
the whole physical CPU package the vCPU thread runs on and the
thread's utime and stime values.

All other non-vCPU threads are also taken into account. Their energy
consumption is evenly distributed among all vCPUs threads running on
the same physical CPU package.

To overcome the problem that reading the RAPL MSR requires priviliged
access, a socket communication between QEMU and the qemu-vmsr-helper is
mandatory. You can specified the socket path in the parameter.

This feature is activated with -accel kvm,rapl=true,path=/path/sock.sock

Actual limitation:
- Works only on Intel host CPU because AMD CPUs are using different MSR
  adresses.

- Only the Package Power-Plane (MSR_PKG_ENERGY_STATUS) is reported at
  the moment.

Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240522153453.1230389-4-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-22 19:19:37 +02:00
Pierrick Bouvier
94ae227e15 plugins: fix mem callback array size
data was correctly copied, but size of array was not set
(g_array_sized_new only reserves memory, but does not set size).

As a result, callbacks were not called for code path relying on
plugin_register_vcpu_mem_cb().

Found when trying to trigger mem access callbacks for atomic
instructions.

Reviewed-by: Xingtao Yao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240706191335.878142-2-pierrick.bouvier@linaro.org>
Message-Id: <20240718094523.1198645-6-alex.bennee@linaro.org>
2024-07-22 09:37:56 +01:00
Zhao Liu
44fd9cf638 accel/kvm/kvm-all: Fix superfluous trailing semicolon
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-07-17 14:04:15 +03:00
Peter Maydell
de680286b5 accel/tcg: Make cpu_exec_interrupt hook mandatory
The TCGCPUOps::cpu_exec_interrupt hook is currently not mandatory; if
it is left NULL then we treat it as if it had returned false. However
since pretty much every architecture needs to handle interrupts,
almost every target we have provides the hook. The one exception is
Tricore, which doesn't currently implement the architectural
interrupt handling.

Add a "do nothing" implementation of cpu_exec_hook for Tricore,
assert on startup that the CPU does provide the hook, and remove
the runtime NULL check before calling it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240712113949.4146855-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-16 20:04:08 +02:00
Peter Maydell
0487c63180 accel/tcg: Make TCGCPUOps::cpu_exec_halt mandatory
Now that all targets set TCGCPUOps::cpu_exec_halt, we can make it
mandatory and remove the fallback handling that calls cpu_has_work.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-11 11:41:34 +01:00
Akihiko Odaki
f64933c8ae hvf: Drop ifdef for macOS versions older than 12.0
macOS versions older than 12.0 are no longer supported.

docs/about/build-platforms.rst says:
> Support for the previous major version will be dropped 2 years after
> the new major version is released or when the vendor itself drops
> support, whichever comes first.

macOS 12.0 was released 2021:
https://www.apple.com/newsroom/2021/10/macos-monterey-is-now-available/

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240629-macos-v1-1-6e70a6b700a0@daynix.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-07-02 06:58:48 +02:00
Matheus Tavares Bernardino
2b5d12b685 cpu: fix memleak of 'halt_cond' and 'thread'
Since a4c2735f35 (cpu: move Qemu[Thread|Cond] setup into common code,
2024-05-30) these fields are now allocated at cpu_common_initfn(). So
let's make sure we also free them at cpu_common_finalize().

Furthermore, the code also frees these on round robin, but we missed
'halt_cond'.

Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-06-30 19:51:44 +03:00
Max Chou
fce3d48038 accel/tcg: Avoid unnecessary call overhead from qemu_plugin_vcpu_mem_cb
If there are not any QEMU plugin memory callback functions, checking
before calling the qemu_plugin_vcpu_mem_cb function can reduce the
function call overhead.

Signed-off-by: Max Chou <max.chou@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Message-Id: <20240613175122.1299212-2-max.chou@sifive.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240620152220.2192768-13-alex.bennee@linaro.org>
2024-06-24 10:15:23 +01:00
Pierrick Bouvier
ca7d7f4276 plugins: fix inject_mem_cb rw masking
These are not booleans, but masks.
Issue found by Richard Henderson.

Fixes: f86fd4d872 ("plugins: distinct types for callbacks")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240612195147.93121-3-pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240620152220.2192768-12-alex.bennee@linaro.org>
2024-06-24 10:15:16 +01:00
Pierrick Bouvier
d4d133a34b qtest: move qtest_{get, set}_virtual_clock to accel/qtest/qtest.c
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240530220610.1245424-5-pierrick.bouvier@linaro.org>
Message-Id: <20240620152220.2192768-8-alex.bennee@linaro.org>
2024-06-24 10:14:56 +01:00
Alex Bennée
e83e386200 qtest: use cpu interface in qtest_clock_warp
This generalises the qtest_clock_warp code to use the AccelOps
handlers for updating its own sense of time. This will make the next
patch which moves the warp code closer to pure code motion.

From: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240530220610.1245424-3-pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240620152220.2192768-6-alex.bennee@linaro.org>
2024-06-24 10:14:39 +01:00