Commit Graph

849 Commits

Author SHA1 Message Date
Peter Xu
ae4e46cd20 KVM: Dynamic sized kvm memslots array
Zhiyi reported an infinite loop issue in VFIO use case.  The cause of that
was a separate discussion, however during that I found a regression of
dirty sync slowness when profiling.

Each KVMMemoryListerner maintains an array of kvm memslots.  Currently it's
statically allocated to be the max supported by the kernel.  However after
Linux commit 4fc096a99e ("KVM: Raise the maximum number of user memslots"),
the max supported memslots reported now grows to some number large enough
so that it may not be wise to always statically allocate with the max
reported.

What's worse, QEMU kvm code still walks all the allocated memslots entries
to do any form of lookups.  It can drastically slow down all memslot
operations because each of such loop can run over 32K times on the new
kernels.

Fix this issue by making the memslots to be allocated dynamically.

Here the initial size was set to 16 because it should cover the basic VM
usages, so that the hope is the majority VM use case may not even need to
grow at all (e.g. if one starts a VM with ./qemu-system-x86_64 by default
it'll consume 9 memslots), however not too large to waste memory.

There can also be even better way to address this, but so far this is the
simplest and should be already better even than before we grow the max
supported memslots.  For example, in the case of above issue when VFIO was
attached on a 32GB system, there are only ~10 memslots used.  So it could
be good enough as of now.

In the above VFIO context, measurement shows that the precopy dirty sync
shrinked from ~86ms to ~3ms after this patch applied.  It should also apply
to any KVM enabled VM even without VFIO.

NOTE: we don't have a FIXES tag for this patch because there's no real
commit that regressed this in QEMU. Such behavior existed for a long time,
but only start to be a problem when the kernel reports very large
nr_slots_max value.  However that's pretty common now (the kernel change
was merged in 2021) so we attached cc:stable because we'll want this change
to be backported to stable branches.

Cc: qemu-stable <qemu-stable@nongnu.org>
Reported-by: Zhiyi Guo <zhguo@redhat.com>
Tested-by: Zhiyi Guo <zhguo@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240917163835.194664-2-peterx@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5504a81261)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context fixup in accel/kvm/kvm-all.c and accel/kvm/trace-events;
 also remove now-unused local variable `KVMState *s` in kvm-all.c:kvm_log_sync_global() )
2024-11-08 13:02:21 +03:00
Paolo Bonzini
b67924a048 accel/tcg: Revert mapping of PCREL translation block to multiple virtual addresses
This is causing regressions that have not been analyzed yet.  Revert the
change on stable branches.

Cc: qemu-stable@nongnu.org
Cc: Michael Tokarev <mjt@tls.msk.ru>
Related: https://gitlab.com/qemu-project/qemu/-/issues/2092
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-01-25 19:14:18 +03:00
Nicholas Piggin
e6fdfb8433 accel/tcg: mttcg remove false-negative halted assertion
mttcg asserts that an execution ending with EXCP_HALTED must have
cpu->halted. However between the event or instruction that sets
cpu->halted and requests exit and the assertion here, an
asynchronous event could clear cpu->halted.

This leads to crashes running AIX on ppc/pseries because it uses
H_CEDE/H_PROD hcalls, where H_CEDE sets self->halted = 1 and
H_PROD sets other cpu->halted = 0 and kicks it.

H_PROD could be turned into an interrupt to wake, but several other
places in ppc, sparc, and semihosting follow what looks like a similar
pattern setting halted = 0 directly. So remove this assertion.

Reported-by: Ivan Warren <ivan@vmfacility.fr>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20230829010658.8252-1-npiggin@gmail.com>
[rth: Keep the case label and adjust the comment.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 0e5903436d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-27 21:39:52 +03:00
Akihiko Odaki
bfe41c8f65 kvm: Introduce kvm_arch_get_default_type hook
kvm_arch_get_default_type() returns the default KVM type. This hook is
particularly useful to derive a KVM type that is valid for "none"
machine model, which is used by libvirt to probe the availability of
KVM.

For MIPS, the existing mips_kvm_type() is reused. This function ensures
the availability of VZ which is mandatory to use KVM on the current
QEMU.

Cc: qemu-stable@nongnu.org
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20230727073134.134102-2-akihiko.odaki@daynix.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: added doc comment for new function]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 5e0d65909c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-11 10:53:50 +03:00
Richard Henderson
0e262ee8e9 accel/tcg: Fix atomic_mmu_lookup for reads
A copy-paste bug had us looking at the victim cache for writes.

Cc: qemu-stable@nongnu.org
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fixes: 08dff435e2 ("tcg: Probe the proper permissions for atomic ops")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230505204049.352469-1-richard.henderson@linaro.org>
(cherry picked from commit 8c313254e6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18 21:09:59 +03:00
Richard Henderson
344b63b380 accel/tcg: Split out setjmp_gen_code
Isolate the code protected by setjmp.  Fixes:

translate-all.c: In function ‘tb_gen_code’:
translate-all.c:748:51: error: argument ‘cflags’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered]

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-11-09 12:29:03 +11:00
Richard Henderson
35abb009b2 tcg: Move TCG_TARGET_HAS_direct_jump init to tb_gen_code
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-11-09 12:29:03 +11:00
Claudio Fontana
5141e9a23f accel: abort if we fail to load the accelerator plugin
if QEMU is configured with modules enabled, it is possible that the
load of an accelerator module will fail.
Exit in this case, relying on module_object_class_by_name to report
the specific load error if any.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

[claudio: changed abort() to exit(1)]
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20220929093035.4231-6-cfontana@suse.de>

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-06 09:48:50 +01:00
Richard Henderson
4e4fa6c12d accel/tcg: Complete cpu initialization before registration
Delay cpu_list_add until realize is complete, so that cross-cpu
interaction does not happen with incomplete cpu state.  For this,
we must delay plugin initialization out of tcg_exec_realizefn,
because no cpu_index has been assigned.

Fixes a problem with cross-cpu jump cache flushing, when the
jump cache has not yet been allocated.

Fixes: a976a99a29 ("include/hw/core: Create struct CPUJumpCache")
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-11-01 08:31:41 +11:00
Richard Henderson
cfa29dd506 accel/tcg: Remove reset_icount argument from cpu_restore_state_from_tb
The value passed is always true.

Reviewed-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-11-01 08:31:41 +11:00
Richard Henderson
3d419a4dd2 accel/tcg: Remove will_exit argument from cpu_restore_state
The value passed is always true, and if the target's
synchronize_from_tb hook is non-trivial, not exiting
may be erroneous.

Reviewed-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-11-01 08:31:41 +11:00
Richard Henderson
6392bd6b90 accel/tcg: Introduce cpu_unwind_state_data
Add a way to examine the unwind data without actually
restoring the data back into env.

Reviewed-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-11-01 07:28:53 +11:00
Xuzhou Cheng
c9923550b4 accel/qtest: Support qtest accelerator for Windows
Currently signal SIGIPI [=SIGUSR1] is used to kick the dummy CPU
when qtest accelerator is used. However SIGUSR1 is unsupported on
Windows. To support Windows, we add a QemuSemaphore CPUState::sem
to kick the dummy CPU instead for Windows.

Signed-off-by: Xuzhou Cheng <xuzhou.cheng@windriver.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20221028045736.679903-2-bin.meng@windriver.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-10-28 11:17:12 +02:00
Stefan Hajnoczi
08a5d04606 Revert incorrect cflags initialization.
Add direct jumps for tcg/loongarch64.
 Speed up breakpoint check.
 Improve assertions for atomic.h.
 Move restore_state_to_opc to TCGCPUOps.
 Cleanups to TranslationBlock maintenance.
 -----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmNYlo4dHHJpY2hhcmQu
 aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV9y2wf9EKsCA6VtYI2Qtftf
 q/ujYFmUf8AKTb9eVcA0XX71CT1dEnFR7GQyT8B8X13x0pSbOX7tbEWHPreegTFV
 tESiejvymi6Q9devAB58GVwNoU/zPIQQGhCPxkVUKDmRztJz22MbGUzd7UKPPgU8
 2nVMkIpLTMBsKeFLxE/D3ZntmdKsgyI/1Dtkl9TxvlDGsCbMjbNcr8lM+TLaG2oX
 GZhFyJHKEVy0cobukvhhb/9rU7AWdG/BnFmZM16JxvHV/YCwJBx3Udhcy9xPePUU
 yIjkGsUAq4aB6H9RFuTWh7GmaY5u6gMbTTi2J7hDos0mzauYJtpgEB/H42LpycGE
 sOhkLQ==
 =DUb8
 -----END PGP SIGNATURE-----

Merge tag 'pull-tcg-20221026' of https://gitlab.com/rth7680/qemu into staging

Revert incorrect cflags initialization.
Add direct jumps for tcg/loongarch64.
Speed up breakpoint check.
Improve assertions for atomic.h.
Move restore_state_to_opc to TCGCPUOps.
Cleanups to TranslationBlock maintenance.

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmNYlo4dHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV9y2wf9EKsCA6VtYI2Qtftf
# q/ujYFmUf8AKTb9eVcA0XX71CT1dEnFR7GQyT8B8X13x0pSbOX7tbEWHPreegTFV
# tESiejvymi6Q9devAB58GVwNoU/zPIQQGhCPxkVUKDmRztJz22MbGUzd7UKPPgU8
# 2nVMkIpLTMBsKeFLxE/D3ZntmdKsgyI/1Dtkl9TxvlDGsCbMjbNcr8lM+TLaG2oX
# GZhFyJHKEVy0cobukvhhb/9rU7AWdG/BnFmZM16JxvHV/YCwJBx3Udhcy9xPePUU
# yIjkGsUAq4aB6H9RFuTWh7GmaY5u6gMbTTi2J7hDos0mzauYJtpgEB/H42LpycGE
# sOhkLQ==
# =DUb8
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 25 Oct 2022 22:08:14 EDT
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-tcg-20221026' of https://gitlab.com/rth7680/qemu: (47 commits)
  accel/tcg: Remove restore_state_to_opc function
  target/xtensa: Convert to tcg_ops restore_state_to_opc
  target/tricore: Convert to tcg_ops restore_state_to_opc
  target/sparc: Convert to tcg_ops restore_state_to_opc
  target/sh4: Convert to tcg_ops restore_state_to_opc
  target/s390x: Convert to tcg_ops restore_state_to_opc
  target/rx: Convert to tcg_ops restore_state_to_opc
  target/riscv: Convert to tcg_ops restore_state_to_opc
  target/ppc: Convert to tcg_ops restore_state_to_opc
  target/openrisc: Convert to tcg_ops restore_state_to_opc
  target/nios2: Convert to tcg_ops restore_state_to_opc
  target/mips: Convert to tcg_ops restore_state_to_opc
  target/microblaze: Convert to tcg_ops restore_state_to_opc
  target/m68k: Convert to tcg_ops restore_state_to_opc
  target/loongarch: Convert to tcg_ops restore_state_to_opc
  target/i386: Convert to tcg_ops restore_state_to_opc
  target/hppa: Convert to tcg_ops restore_state_to_opc
  target/hexagon: Convert to tcg_ops restore_state_to_opc
  target/cris: Convert to tcg_ops restore_state_to_opc
  target/avr: Convert to tcg_ops restore_state_to_opc
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-26 10:53:41 -04:00
Richard Henderson
04f105758b accel/tcg: Remove restore_state_to_opc function
All targets have been updated.  Use the tcg_ops target hook
exclusively, which allows the compat code to be removed.

Reviewed-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
d29256896f accel/tcg: Add restore_state_to_opc to TCGCPUOps
Add a tcg_ops hook to replace the restore_state_to_opc
function call.  Because these generic hooks cannot depend
on target-specific types, temporarily, copy the current
target_ulong data[] into uint64_t d64[].

Reviewed-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
8269c01417 accel/tcg: Simplify page_get/alloc_target_data
Since the only user, Arm MTE, always requires allocation,
merge the get and alloc functions to always produce a
non-null result.  Also assume that the user has already
checked page validity.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
0fe6108432 accel/tcg: Move TARGET_PAGE_DATA_SIZE impl to user-exec.c
Since "target data" is always user-only, move it out of
translate-all.c to user-exec.c.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
e786509f29 accel/tcg: Use tb_invalidate_phys_range in page_set_flags
Flush translation blocks in bulk, rather than page-by-page.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
43301e0566 accel/tcg: Use page_reset_target_data in page_set_flags
Use the existing function for clearing target data.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
8f39e01db9 accel/tcg: Call tb_invalidate_phys_page for PAGE_RESET
When PAGE_RESET is set, we are replacing pages with new
content, which means that we need to invalidate existing
cached data, such as TranslationBlocks.  Perform the
reset invalidate while we're doing other invalidates,
which allows us to remove the separate invalidates from
the user-only mmap/munmap/mprotect routines.

In addition, restrict invalidation to PAGE_EXEC pages.
Since cdf7130851, we have validated PAGE_EXEC is present
before translation, which means we can assume that if the
bit is not present, there are no translations to invalidate.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
24ace1ac3c accel/tcg: Use tb_invalidate_phys_page in page_set_flags
We do not require detection of overlapping TBs here,
so use the more appropriate function.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
65cd34e8c4 accel/tcg: Unify declarations of tb_invalidate_phys_range
We missed this function when we introduced tb_page_addr_t.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
d6d1fd2973 accel/tcg: Rename tb_invalidate_phys_page_range and drop end parameter
This function is is never called with a real range,
only for a single page.  Drop the second parameter
and rename to tb_invalidate_phys_page.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
67aabbb312 accel/tcg: Rename tb_invalidate_phys_page
Rename to tb_invalidate_phys_page_unwind to emphasize that
we also detect invalidating the current TB, and also to free
up that name for other usage.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
28905cfbd5 accel/tcg: Introduce tb_{set_}page_addr{0,1}
This data structure will be replaced for user-only: add accessors.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
4c88475c9f accel/tcg: Remove duplicate store to tb->page_addr[]
When we added the fast path, we initialized page_addr[] early.
These stores in and around tb_page_add() are redundant; remove them.

Fixes: 50627f1b7b ("accel/tcg: Add fast path for translator_ld*")
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
8516e2a92c accel/tcg: Drop cpu_get_tb_cpu_state from TARGET_HAS_PRECISE_SMC
The results of the calls to cpu_get_tb_cpu_state,
current_{pc,cs_base,flags}, are not used.
In tb_invalidate_phys_page, use bool for current_tb_modified.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
cc05368ad9 accel/tcg: Move assert_no_pages_locked to internal.h
There are no users outside of accel/tcg; this function
does not need to be defined in exec-all.h.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
8a14b62c3f accel/tcg: Split out tb-maint.c
Move all of the TranslationBlock flushing and page linking
code from translate-all.c to tb-maint.c.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
55098769bf accel/tcg: Split out PageDesc to internal.h
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
f5e8039974 accel/tcg: Remove disabled debug in translate-all.c
These items printf, and could be replaced with proper
tracepoints if we really cared.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Richard Henderson
50d4c8c1d4 accel/tcg: Make page_alloc_target_data allocation constant
Use a constant target data allocation size for all pages.
This will be necessary to reduce overhead of page tracking.
Since TARGET_PAGE_DATA_SIZE is now required, we can use this
to omit data tracking for targets that don't require it.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Leandro Lupori
69993c4e62 accel/tcg: Add a quicker check for breakpoints
Profiling QEMU during Fedora 35 for PPC64 boot revealed that a
considerable amount of time was being spent in
check_for_breakpoints() (0.61% of total time on PPC64 and 2.19% on
amd64), even though it was just checking that its queue was empty
and returning, when no breakpoints were set. It turns out this
function is not inlined by the compiler and it's always called by
helper_lookup_tb_ptr(), one of the most called functions.

By leaving only the check for empty queue in
check_for_breakpoints() and moving the remaining code to
check_for_breakpoints_slow(), called only when the queue is not
empty, it's possible to avoid the call overhead. An improvement of
about 3% in total time was measured on POWER9.

Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221025202424.195984-2-leandro.lupori@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Peter Maydell
0585105c80 Revert "accel/tcg: Init TCG cflags in vCPU thread handler"
Commit a82fd5a4ec was intended to be a code cleanup, but
unfortunately it has a bug. It moves the initialization of the
TCG cflags from the "start a new vcpu" function to the
thread handler; this is fine when each vcpu has its own thread,
but when we are doing round-robin of vcpus on a single thread
we end up only initializing the cflags for CPU 0, not for any
of the others.

The most obvious effect of this bug is that running in icount
mode with more than one CPU is broken; typically the guest
hangs shortly after it brings up the secondary CPUs.

This reverts commit a82fd5a4ec.

Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20221021163409.3674911-1-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-25 22:01:14 +10:00
Matheus Tavares Bernardino
046ab3b808 accel/tcg/tcg-accel-ops-rr: fix trivial typo
Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <5dc556dbe241ae03859b7890d1998de5c77b7c6c.1666373742.git.quic_mathbern@quicinc.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-10-24 13:43:42 +02:00
Chenyi Qiang
e2e69f6bb9 i386: add notify VM exit support
There are cases that malicious virtual machine can cause CPU stuck (due
to event windows don't open up), e.g., infinite loop in microcode when
nested #AC (CVE-2015-5307). No event window means no event (NMI, SMI and
IRQ) can be delivered. It leads the CPU to be unavailable to host or
other VMs. Notify VM exit is introduced to mitigate such kind of
attacks, which will generate a VM exit if no event window occurs in VM
non-root mode for a specified amount of time (notify window).

A new KVM capability KVM_CAP_X86_NOTIFY_VMEXIT is exposed to user space
so that the user can query the capability and set the expected notify
window when creating VMs. The format of the argument when enabling this
capability is as follows:
  Bit 63:32 - notify window specified in qemu command
  Bit 31:0  - some flags (e.g. KVM_X86_NOTIFY_VMEXIT_ENABLED is set to
              enable the feature.)

Users can configure the feature by a new (x86 only) accel property:
    qemu -accel kvm,notify-vmexit=run|internal-error|disable,notify-window=n

The default option of notify-vmexit is run, which will enable the
capability and do nothing if the exit happens. The internal-error option
raises a KVM internal error if it happens. The disable option does not
enable the capability. The default value of notify-window is 0. It is valid
only when notify-vmexit is not disabled. The valid range of notify-window
is non-negative. It is even safe to set it to zero since there's an
internal hardware threshold to be added to ensure no false positive.

Because a notify VM exit may happen with VM_CONTEXT_INVALID set in exit
qualification (no cases are anticipated that would set this bit), which
means VM context is corrupted. It would be reflected in the flags of
KVM_EXIT_NOTIFY exit. If KVM_NOTIFY_CONTEXT_INVALID bit is set, raise a KVM
internal error unconditionally.

Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20220929072014.20705-5-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:00 +02:00
Chenyi Qiang
5f8a6bce1f kvm: expose struct KVMState
Expose struct KVMState out of kvm-all.c so that the field of struct
KVMState can be accessed when defining target-specific accelerator
properties.

Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20220929072014.20705-4-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:00 +02:00
Paolo Bonzini
3dba0a335c kvm: allow target-specific accelerator properties
Several hypervisor capabilities in KVM are target-specific.  When exposed
to QEMU users as accelerator properties (i.e. -accel kvm,prop=value), they
should not be available for all targets.

Add a hook for targets to add their own properties to -accel kvm, for
now no such property is defined.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220929072014.20705-3-chenyi.qiang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-10 09:23:16 +02:00
Alex Bennée
c7f1c53735 accel/kvm: move kvm_update_guest_debug to inline stub
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220929114231.583801-47-alex.bennee@linaro.org>
2022-10-06 11:53:41 +01:00
Alex Bennée
a48e7d9e52 gdbstub: move guest debug support check to ops
This removes the final hard coding of kvm_enabled() in gdbstub and
moves the check to an AccelOps.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Mads Ynddal <mads@ynddal.dk>
Message-Id: <20220929114231.583801-46-alex.bennee@linaro.org>
2022-10-06 11:53:41 +01:00
Alex Bennée
ae7467b1ac gdbstub: move breakpoint logic to accel ops
As HW virtualization requires specific support to handle breakpoints
lets push out special casing out of the core gdbstub code and into
AccelOpsClass. This will make it easier to add other accelerator
support and reduces some of the stub shenanigans.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Mads Ynddal <mads@ynddal.dk>
Message-Id: <20220929114231.583801-45-alex.bennee@linaro.org>
2022-10-06 11:53:41 +01:00
Alex Bennée
3b7a93880a gdbstub: move sstep flags probing into AccelClass
The support of single-stepping is very much dependent on support from
the accelerator we are using. To avoid special casing in gdbstub move
the probing out to an AccelClass function so future accelerators can
put their code there.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Mads Ynddal <mads@ynddal.dk>
Message-Id: <20220929114231.583801-44-alex.bennee@linaro.org>
2022-10-06 11:53:41 +01:00
Richard Henderson
8ed558ec0c accel/tcg: Introduce TARGET_TB_PCREL
Prepare for targets to be able to produce TBs that can
run in more than one virtual context.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-04 12:13:16 -07:00
Richard Henderson
fbf59aad17 accel/tcg: Introduce tb_pc and log_pc
The availability of tb->pc will shortly be conditional.
Introduce accessor functions to minimize ifdefs.

Pass around a known pc to places like tcg_gen_code,
where the caller must already have the value.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-04 12:13:12 -07:00
Richard Henderson
a976a99a29 include/hw/core: Create struct CPUJumpCache
Wrap the bare TranslationBlock pointer into a structure.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-04 12:13:12 -07:00
Richard Henderson
1d41a79b3c accel/tcg: Inline tb_flush_jmp_cache
This function has two users, who use it incompatibly.
In tlb_flush_page_by_mmuidx_async_0, when flushing a
single page, we need to flush exactly two pages.
In tlb_flush_range_by_mmuidx_async_0, when flushing a
range of pages, we need to flush N+1 pages.

This avoids double-flushing of jmp cache pages in a range.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-04 12:13:12 -07:00
Richard Henderson
93b996161b accel/tcg: Do not align tb->page_addr[0]
Let tb->page_addr[0] contain the address of the first byte of the
translated block, rather than the address of the page containing the
start of the translated block.  We need to recover this value anyway
at various points, and it is easier to discard a page offset when it
is not needed, which happens naturally via the existing find_page shift.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-04 12:13:04 -07:00
Richard Henderson
b21af662c1 accel/tcg: Use DisasContextBase in plugin_gen_tb_start
Use the pc coming from db->pc_first rather than the TB.

Use the cached host_addr rather than re-computing for the
first page.  We still need a separate lookup for the second
page because it won't be computed for DisasContextBase until
the translator actually performs a read from the page.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-03 20:53:31 -07:00
Richard Henderson
f79542afa7 accel/tcg: Use bool for page_find_alloc
Bool is more appropriate type for the alloc parameter.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-03 20:53:31 -07:00