Commit Graph

2115 Commits

Author SHA1 Message Date
Ilya Leoshkevich
ac2fb86a0e target/i386/gdbstub: Expose orig_ax
Copy XML files describing orig_ax from GDB and glue them with
CPUX86State.orig_ax.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20240912093012.402366-5-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-13 10:05:51 -07:00
Ilya Leoshkevich
e7a4427aec target/i386/gdbstub: Factor out gdb_get_reg() and gdb_write_reg()
i386 gdbstub handles both i386 and x86_64. Factor out two functions
for reading and writing registers without knowing their bitness.

While at it, simplify the TARGET_LONG_BITS == 32 case.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20240912093012.402366-4-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-13 10:05:51 -07:00
Julia Suvorova
fc058618d1 target/i386/kvm: Report which action failed in kvm_arch_put/get_registers
To help debug and triage future failure reports (akin to [1,2]) that
may occur during kvm_arch_put/get_registers, the error path of each
action is accompanied by unique error message.

[1] https://issues.redhat.com/browse/RHEL-7558
[2] https://issues.redhat.com/browse/RHEL-21761

Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240927104743.218468-3-jusual@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-03 22:04:24 +02:00
Julia Suvorova
a1676bb304 kvm: Allow kvm_arch_get/put_registers to accept Error**
This is necessary to provide discernible error messages to the caller.

Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240927104743.218468-2-jusual@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-03 22:04:19 +02:00
Fabiano Rosas
0701abbf98 target/i386: Expose IBPB-BRTYPE and SBPB CPUID bits to the guest
According to AMD's Speculative Return Stack Overflow whitepaper (link
below), the hypervisor should synthesize the value of IBPB_BRTYPE and
SBPB CPUID bits to the guest.

Support for this is already present in the kernel with commit
e47d86083c66 ("KVM: x86: Add SBPB support") and commit 6f0f23ef76be
("KVM: x86: Add IBPB_BRTYPE support").

Add support in QEMU to expose the bits to the guest OS.

host:
  # cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
  Mitigation: Safe RET

before (guest):
  $ cpuid -l 0x80000021 -1 -r
  0x80000021 0x00: eax=0x00000045 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
                            ^
  $ cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
  Vulnerable: Safe RET, no microcode

after (guest):
  $ cpuid -l 0x80000021 -1 -r
  0x80000021 0x00: eax=0x18000045 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
                            ^
  $ cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
  Mitigation: Safe RET

Reported-by: Fabian Vogt <fvogt@suse.de>
Link: https://www.amd.com/content/dam/amd/en/documents/corporate/cr/speculative-return-stack-overflow-whitepaper.pdf
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240805202041.5936-1-farosas@suse.de
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-03 19:33:23 +02:00
Paolo Bonzini
dc44854978 kvm/i386: replace identity_base variable with a constant
identity_base variable is first initialzied to address 0xfffbc000 and then
kvm_vm_set_identity_map_addr() overrides this value to address 0xfeffc000.
The initial address to which the variable was initialized was never used. Clean
everything up, placing 0xfeffc000 in a preprocessor constant.

Reported-by: Ani Sinha <anisinha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-03 19:33:23 +02:00
Ani Sinha
0cc42e63bb kvm/i386: refactor kvm_arch_init and split it into smaller functions
kvm_arch_init() enables a lot of vm capabilities. Refactor them into separate
smaller functions. Energy MSR related operations also moved to its own
function. There should be no functional impact.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20240903124143.39345-2-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-03 19:33:22 +02:00
Ani Sinha
87e82951c1 kvm/i386: fix return values of is_host_cpu_intel()
is_host_cpu_intel() should return TRUE if the host cpu in Intel based, otherwise
it should return FALSE. Currently, it returns zero (FALSE) when the host CPU
is INTEL and non-zero otherwise. Fix the function so that it agrees more with
the semantics. Adjust the calling logic accordingly. RAPL needs Intel host cpus.
If the host CPU is not Intel baseed, we should report error.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20240903080004.33746-1-anisinha@redhat.com
[While touching the code remove too many spaces from the second part of the
 error. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-02 12:58:46 +02:00
Ani Sinha
ed2880f4e9 kvm/i386: make kvm_filter_msr() and related definitions private to kvm module
kvm_filer_msr() is only used from i386 kvm module. Make it static so that its
easy for developers to understand that its not used anywhere else.
Same for QEMURDMSRHandler, QEMUWRMSRHandler and KVMMSRHandlers definitions.

CC: philmd@linaro.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20240903140045.41167-1-anisinha@redhat.com
[Make struct unnamed. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-02 12:58:46 +02:00
Lei Wang
ab891454eb target/i386: Raise the highest index value used for any VMCS encoding
Because the index value of the VMCS field encoding of FRED injected-event
data (one of the newly added VMCS fields for FRED transitions), 0x52, is
larger than any existing index value, raise the highest index value used
for any VMCS encoding to 0x52.

Because the index value of the VMCS field encoding of Secondary VM-exit
controls, 0x44, is larger than any existing index value, raise the highest
index value used for any VMCS encoding to 0x44.

Co-developed-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Lei Wang <lei4.wang@intel.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Link: https://lore.kernel.org/r/20240807081813.735158-4-xin@zytor.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-02 12:58:46 +02:00
Xin Li (Intel)
7c6ec5bc5f target/i386: Add VMX control bits for nested FRED support
Add definitions of
  1) VM-exit activate secondary controls bit
  2) VM-entry load FRED bit
which are required to enable nested FRED.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Link: https://lore.kernel.org/r/20240807081813.735158-3-xin@zytor.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-02 12:58:46 +02:00
Xin Li (Intel)
a23bc65398 target/i386: Delete duplicated macro definition CR4_FRED_MASK
Macro CR4_FRED_MASK is defined twice, delete one.

Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Link: https://lore.kernel.org/r/20240807081813.735158-2-xin@zytor.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-02 12:58:46 +02:00
Peter Maydell
173c427eb5 * Convert more Avocado tests to the new functional test framework
* Clean up assert() statements, use g_assert_not_reached() when possible
 * Improve output of the gitlab CI jobs
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmbz7xgRHHRodXRoQHJl
 ZGhhdC5jb20ACgkQLtnXdP5wLbWm6A//eVn+tzyyKCX/xdXlf7XyVpezvRpTFPOS
 HyO0WMkCf2kGmu6qYKx/fDZg86opdQzPLH2gPkuVrGOMZ0Z2630DjH0jNih8lL9Q
 J1oRX5YlU92chlzNmq59WB/j9CKd91ILtOoaPBuZkDob57yGEYVzCPqetVvF7L2+
 +rbnccrNPumGJFt035fxUGiGfgsmp28MHQzDwQdyr38uGjyNlqvqidfC8Vj1qzqP
 B7HvhGB/vkF0eHaanMt2el/ZuLKf+qeCi//F/CiXGMYnuKXyShA/Db6xvMElw1jB
 aQdwphP71IO+cxjJLaNjDHKGFstArsM/E21qlaSTBi+FTmPiwVULpVTiBmWsjhOh
 /klpdgRHf0hL2MciYKyOWgjlTocx3rEKjCTe2U5tpta9fp9CrlgMQotjDZIbohGI
 ULNahrW3Zmg4EmXDApfhYMXsQsSgWas9QSkmxzJzDp0VC7tf2Oq7RxeySrlw9MCx
 OG2qQY+rNcJ3NnpATjfAJpT1kg/IahDOCNHfLEaj1u13XVQIthVADvHwy5WxbwRP
 mwp3V9e9sUoznkM2eV646lzmkMim/WdYBF0YpT7eBs80+GoXZ0thx9IqWmwzX/ox
 rndBczVN+RY6PydJP40yljdvS7ArRT73wHqL6yKHfDpvFc4/p5mxTWwLQ3yJbXbE
 T3I+wtgfBU8=
 =FH7b
 -----END PGP SIGNATURE-----

Merge tag 'pull-request-2024-09-25' of https://gitlab.com/thuth/qemu into staging

* Convert more Avocado tests to the new functional test framework
* Clean up assert() statements, use g_assert_not_reached() when possible
* Improve output of the gitlab CI jobs

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmbz7xgRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbWm6A//eVn+tzyyKCX/xdXlf7XyVpezvRpTFPOS
# HyO0WMkCf2kGmu6qYKx/fDZg86opdQzPLH2gPkuVrGOMZ0Z2630DjH0jNih8lL9Q
# J1oRX5YlU92chlzNmq59WB/j9CKd91ILtOoaPBuZkDob57yGEYVzCPqetVvF7L2+
# +rbnccrNPumGJFt035fxUGiGfgsmp28MHQzDwQdyr38uGjyNlqvqidfC8Vj1qzqP
# B7HvhGB/vkF0eHaanMt2el/ZuLKf+qeCi//F/CiXGMYnuKXyShA/Db6xvMElw1jB
# aQdwphP71IO+cxjJLaNjDHKGFstArsM/E21qlaSTBi+FTmPiwVULpVTiBmWsjhOh
# /klpdgRHf0hL2MciYKyOWgjlTocx3rEKjCTe2U5tpta9fp9CrlgMQotjDZIbohGI
# ULNahrW3Zmg4EmXDApfhYMXsQsSgWas9QSkmxzJzDp0VC7tf2Oq7RxeySrlw9MCx
# OG2qQY+rNcJ3NnpATjfAJpT1kg/IahDOCNHfLEaj1u13XVQIthVADvHwy5WxbwRP
# mwp3V9e9sUoznkM2eV646lzmkMim/WdYBF0YpT7eBs80+GoXZ0thx9IqWmwzX/ox
# rndBczVN+RY6PydJP40yljdvS7ArRT73wHqL6yKHfDpvFc4/p5mxTWwLQ3yJbXbE
# T3I+wtgfBU8=
# =FH7b
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 25 Sep 2024 12:08:08 BST
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* tag 'pull-request-2024-09-25' of https://gitlab.com/thuth/qemu: (44 commits)
  .gitlab-ci.d: Make separate collapsible log sections for build and test
  .gitlab-ci.d: Split build and test in cross build job templates
  scripts/checkpatch.pl: emit error when using assert(false)
  tests/qtest: remove return after g_assert_not_reached()
  qom: remove return after g_assert_not_reached()
  qobject: remove return after g_assert_not_reached()
  migration: remove return after g_assert_not_reached()
  hw/ppc: remove return after g_assert_not_reached()
  hw/pci: remove return after g_assert_not_reached()
  hw/net: remove return after g_assert_not_reached()
  hw/hyperv: remove return after g_assert_not_reached()
  include/qemu: remove return after g_assert_not_reached()
  tcg/loongarch64: remove break after g_assert_not_reached()
  fpu: remove break after g_assert_not_reached()
  target/riscv: remove break after g_assert_not_reached()
  target/arm: remove break after g_assert_not_reached()
  hw/tpm: remove break after g_assert_not_reached()
  hw/scsi: remove break after g_assert_not_reached()
  hw/net: remove break after g_assert_not_reached()
  hw/acpi: remove break after g_assert_not_reached()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-09-28 12:34:38 +01:00
Pierrick Bouvier
f4fa1a5350 target/i386/kvm: replace assert(false) with g_assert_not_reached()
This patch is part of a series that moves towards a consistent use of
g_assert_not_reached() rather than an ad hoc mix of different
assertion mechanisms.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240919044641.386068-15-pierrick.bouvier@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-24 13:53:35 +02:00
Philippe Mathieu-Daudé
b14d064962 license: Update deprecated SPDX tag LGPL-2.0+ to LGPL-2.0-or-later
The 'LGPL-2.0+' license identifier has been deprecated since license
list version 2.0rc2 [1] and replaced by the 'LGPL-2.0-or-later' [2]
tag.

[1] https://spdx.org/licenses/LGPL-2.0+.html
[2] https://spdx.org/licenses/LGPL-2.0-or-later.html

Mechanical patch running:

  $ sed -i -e s/LGPL-2.0+/LGPL-2.0-or-later/ \
    $(git grep -l 'SPDX-License-Identifier: LGPL-2.0+$')

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-09-20 10:11:59 +03:00
Danny Canter
2c760670af hvf: Split up hv_vm_create logic per arch
This is preliminary work to split up hv_vm_create
logic per platform so we can support creating VMs
with > 64GB of RAM on Apple Silicon machines. This
is done via ARM HVF's hv_vm_config_create() (and
other APIs that modify this config that will be
coming in future patches). This should have no
behavioral difference at all as hv_vm_config_create()
just assigns the same default values as if you just
passed NULL to the function.

Signed-off-by: Danny Canter <danny_canter@apple.com>
Message-id: 20240828111552.93482-3-danny_canter@apple.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-09-13 15:31:46 +01:00
Johannes Stoelp
6a8703aecb kvm: Use 'unsigned long' for request argument in functions wrapping ioctl()
Change the data type of the ioctl _request_ argument from 'int' to
'unsigned long' for the various accel/kvm functions which are
essentially wrappers around the ioctl() syscall.

The correct type for ioctl()'s 'request' argument is confused:
 * POSIX defines the request argument as 'int'
 * glibc uses 'unsigned long' in the prototype in sys/ioctl.h
 * the glibc info documentation uses 'int'
 * the Linux manpage uses 'unsigned long'
 * the Linux implementation of the syscall uses 'unsigned int'

If we wrap ioctl() with another function which uses 'int' as the
type for the request argument, then requests with the 0x8000_0000
bit set will be sign-extended when the 'int' is cast to
'unsigned long' for the call to ioctl().

On x86_64 one such example is the KVM_IRQ_LINE_STATUS request.
Bit requests with the _IOC_READ direction bit set, will have the high
bit set.

Fortunately the Linux Kernel truncates the upper 32bit of the request
on 64bit machines (because it uses 'unsigned int', and see also Linus
Torvalds' comments in
  https://sourceware.org/bugzilla/show_bug.cgi?id=14362 )
so this doesn't cause active problems for us.  However it is more
consistent to follow the glibc ioctl() prototype when we define
functions that are essentially wrappers around ioctl().

This resolves a Coverity issue where it points out that in
kvm_get_xsave() we assign a value (KVM_GET_XSAVE or KVM_GET_XSAVE2)
to an 'int' variable which can't hold it without overflow.

Resolves: Coverity CID 1547759
Signed-off-by: Johannes Stoelp <johannes.stoelp@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20240815122747.3053871-1-peter.maydell@linaro.org
[PMM: Rebased patch, adjusted commit message, included note about
 Coverity fix, updated the type of the local var in kvm_get_xsave,
 updated the comment in the KVMState struct definition]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-09-13 15:31:46 +01:00
Markus Armbruster
ef834aa2b2 qapi/crypto: Rename QCryptoHashAlgorithm to *Algo, and drop prefix
QAPI's 'prefix' feature can make the connection between enumeration
type and its constants less than obvious.  It's best used with
restraint.

QCryptoHashAlgorithm has a 'prefix' that overrides the generated
enumeration constants' prefix to QCRYPTO_HASH_ALG.

We could simply drop 'prefix', but then the prefix becomes
QCRYPTO_HASH_ALGORITHM, which is rather long.

We could additionally rename the type to QCryptoHashAlg, but I think
the abbreviation "alg" is less than clear.

Rename the type to QCryptoHashAlgo instead.  The prefix becomes to
QCRYPTO_HASH_ALGO.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240904111836.3273842-12-armbru@redhat.com>
[Conflicts with merge commit 7bbadc60b5 resolved]
2024-09-10 14:02:16 +02:00
Richard Henderson
ded1db48c9 target/i386: Fix tss access size in switch_tss_ra
The two limit_max variables represent size - 1, just like the
encoding in the GDT, thus the 'old' access was off by one.
Access the minimal size of the new tss: the complete tss contains
the iopb, which may be a larger block than the access api expects,
and irrelevant because the iopb is not accessed during the
switch itself.

Fixes: 8b13106508 ("target/i386/tcg: use X86Access for TSS access")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2511
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240819074052.207783-1-richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
2024-08-21 09:11:26 +10:00
Richard Henderson
83a3a20e59 target/i386: Fix carry flag for BLSI
BLSI has inverted semantics for C as compared to the other two
BMI1 instructions, BLSMSK and BLSR.  Introduce CC_OP_BLSI* for
this purpose.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2175
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240801075845.573075-3-richard.henderson@linaro.org>
2024-08-21 09:11:26 +10:00
Richard Henderson
266d6dddbd target/i386: Split out gen_prepare_val_nz
Split out the TCG_COND_TSTEQ logic from gen_prepare_eflags_z,
and use it for CC_OP_BMILG* as well.  Prepare for requiring
both zero and non-zero senses.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240801075845.573075-2-richard.henderson@linaro.org>
2024-08-21 09:11:26 +10:00
Alex Bennée
cf584a908a target/i386: allow access_ptr to force slow path on failed probe
When we are using TCG plugin memory callbacks probe_access_internal
will return TLB_MMIO to force the slow path for memory access. This
results in probe_access returning NULL but the x86 access_ptr function
happily accepts an empty haddr resulting in segfault hilarity.

Check for an empty haddr to prevent the segfault and enable plugins to
track all the memory operations for the x86 save/restore helpers. As
we also want to run the slow path when instrumenting *-user we should
also not have the short cutting test_ptr macro.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2489
Fixes: 6d03226b42 (plugins: force slow path when plugins instrument memory ops)
Reviewed-by: Alexandre Iooss <erdnaxe@crans.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240813202329.1237572-8-alex.bennee@linaro.org>
2024-08-16 14:04:19 +01:00
Anthony Harivel
a6e65975c3 target/i386: Fix arguments for vmsr_read_thread_stat()
Snapshot of the stat utime and stime for each thread, taken before and
after the pause, must be stored in separate locations

Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240807124320.1741124-2-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-14 18:42:19 +02:00
Richard Henderson
7700d2293c target/i386: Assert MMX and XMM registers in range
The mmx assert would fire without the fix for #2495.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20240812025844.58956-4-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-13 16:35:43 +02:00
Richard Henderson
45230bca85 target/i386: Use unit not type in decode_modrm
Rather that enumerating the types that can produce
MMX operands, examine the unit.  No functional change.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20240812025844.58956-3-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-13 11:33:34 +02:00
Richard Henderson
416f2b16c0 target/i386: Do not apply REX to MMX operands
Cc: qemu-stable@nongnu.org
Fixes: b3e22b2318 ("target/i386: add core of new i386 decoder")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2495
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240812025844.58956-2-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-13 11:33:34 +02:00
Richard Henderson
ac63755b20 target/i386: Fix VSIB decode
With normal SIB, index == 4 indicates no index.
With VSIB, there is no exception for VR4/VR12.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2474
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240805003130.1421051-3-richard.henderson@linaro.org
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-05 14:14:47 +02:00
Paolo Bonzini
d4392415c3 target/i386: SEV: fix mismatch in vcek-disabled property name
The vcek-disabled property of the sev-snp-guest object is misspelled
vcek-required (which I suppose would use the opposite polarity) in
the call to object_class_property_add_bool().  Fix it.

Reported-by: Zixi Chen <zixchen@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-02 12:11:44 +02:00
Anthony Harivel
6e623af301 target/i386: Clean up error cases for vmsr_read_thread_stat()
Fix leaking memory of file handle in case of error
Erase unused "pid = -1"
Add clearer error_report

Should fix Coverity CID 1558557.

Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240726102632.1324432-3-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-31 13:13:31 +02:00
Anthony Harivel
5997fbdfac target/i386: Fix typo that assign same value twice
Should fix: CID 1558553

Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240726102632.1324432-2-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-31 13:13:31 +02:00
Zhao Liu
ada1f3cab3 target/i386/cpu: Mask off SGX/SGX_LC feature words for non-PC machine
Only PC machine supports SGX, so mask off SGX related feature words for
non-PC machine (microvm).

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20240730045544.2516284-5-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-31 13:13:31 +02:00
Zhao Liu
3722a98948 target/i386/cpu: Add dependencies of CPUID 0x12 leaves
As SDM stated, CPUID 0x12 leaves depend on CPUID_7_0_EBX_SGX (SGX
feature word).

Since FEAT_SGX_12_0_EAX, FEAT_SGX_12_0_EBX and FEAT_SGX_12_1_EAX define
multiple feature words, add the dependencies of those registers to
report the warning to user if SGX is absent.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20240730045544.2516284-4-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-31 13:13:31 +02:00
Zhao Liu
4912d6990b target/i386/cpu: Explicitly express SGX_LC and SGX feature words dependency
At present, cpu_x86_cpuid() silently masks off SGX_LC if SGX is absent.

This is not proper because the user is not told about the dependency
between the two.

So explicitly define the dependency between SGX_LC and SGX feature
words, so that user could get a warning when SGX_LC is enabled but
SGX is absent.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20240730045544.2516284-3-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-31 13:13:31 +02:00
Zhao Liu
eee194dd71 target/i386/cpu: Remove unnecessary SGX feature words checks
CPUID.0x7.0.ebx and CPUID.0x7.0.ecx leaves have been expressed as the
feature word lists, and the Host capability support has been checked
in x86_cpu_filter_features().

Therefore, such checks on SGX feature "words" are redundant, and
the follow-up adjustments to those feature "words" will not actually
take effect.

Remove unnecessary SGX feature words related checks.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20240730045544.2516284-2-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-31 13:13:31 +02:00
Xiong Zhang
39635ccd0b target/i386: Change unavail from u32 to u64
The feature word 'r' is a u64, and "unavail" is a u32, the operation
'r &= ~unavail' clears the high 32 bits of 'r'. This causes many vmx cases
in kvm-unit-tests to fail. Changing 'unavail' from u32 to u64 fixes this
issue.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2442
Fixes: 0b2757412c ("target/i386: drop AMD machine check bits from Intel CPUID")
Signed-off-by: Xiong Zhang <xiong.y.zhang@linux.intel.com>
Link: https://lore.kernel.org/r/20240730082927.250180-1-xiong.y.zhang@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-31 13:13:31 +02:00
Peter Maydell
bde8adb808 target/i386: Remove dead assignment to ss in do_interrupt64()
Coverity points out that in do_interrupt64() in the "to inner
privilege" codepath we set "ss = 0", but because we also set
"new_stack = 1" there, later in the function we will always override
that value of ss with "ss = 0 | dpl".

Remove the unnecessary initialization of ss, which allows us to
reduce the scope of the variable to only where it is used.  Borrow a
comment from helper_lcall_protected() that explains what "0 | dpl"
means here.

Resolves: Coverity CID 1527395
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240723162525.1585743-1-peter.maydell@linaro.org
2024-07-29 16:59:44 +01:00
Anthony Harivel
0418f90809 Add support for RAPL MSRs in KVM/Qemu
Starting with the "Sandy Bridge" generation, Intel CPUs provide a RAPL
interface (Running Average Power Limit) for advertising the accumulated
energy consumption of various power domains (e.g. CPU packages, DRAM,
etc.).

The consumption is reported via MSRs (model specific registers) like
MSR_PKG_ENERGY_STATUS for the CPU package power domain. These MSRs are
64 bits registers that represent the accumulated energy consumption in
micro Joules. They are updated by microcode every ~1ms.

For now, KVM always returns 0 when the guest requests the value of
these MSRs. Use the KVM MSR filtering mechanism to allow QEMU handle
these MSRs dynamically in userspace.

To limit the amount of system calls for every MSR call, create a new
thread in QEMU that updates the "virtual" MSR values asynchronously.

Each vCPU has its own vMSR to reflect the independence of vCPUs. The
thread updates the vMSR values with the ratio of energy consumed of
the whole physical CPU package the vCPU thread runs on and the
thread's utime and stime values.

All other non-vCPU threads are also taken into account. Their energy
consumption is evenly distributed among all vCPUs threads running on
the same physical CPU package.

To overcome the problem that reading the RAPL MSR requires priviliged
access, a socket communication between QEMU and the qemu-vmsr-helper is
mandatory. You can specified the socket path in the parameter.

This feature is activated with -accel kvm,rapl=true,path=/path/sock.sock

Actual limitation:
- Works only on Intel host CPU because AMD CPUs are using different MSR
  adresses.

- Only the Package Power-Plane (MSR_PKG_ENERGY_STATUS) is reported at
  the moment.

Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240522153453.1230389-4-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-22 19:19:37 +02:00
Paolo Bonzini
6a079f2e68 target/i386/tcg: save current task state before loading new one
This is how the steps are ordered in the manual.  EFLAGS.NT is
overwritten after the fact in the saved image.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:25 +02:00
Paolo Bonzini
8b13106508 target/i386/tcg: use X86Access for TSS access
This takes care of probing the vaddr range in advance, and is also faster
because it avoids repeated TLB lookups.  It also matches the Intel manual
better, as it says "Checks that the current (old) TSS, new TSS, and all
segment descriptors used in the task switch are paged into system memory";
note however that it's not clear how the processor checks for segment
descriptors, and this check is not included in the AMD manual.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:25 +02:00
Paolo Bonzini
05d41bbcb3 target/i386/tcg: check for correct busy state before switching to a new task
This step is listed in the Intel manual: "Checks that the new task is available
(call, jump, exception, or interrupt) or busy (IRET return)".

The AMD manual lists the same operation under the "Preventing recursion"
paragraph of "12.3.4 Nesting Tasks", though it is not clear if the processor
checks the busy bit in the IRET case.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:24 +02:00
Paolo Bonzini
8053862af9 target/i386/tcg: Compute MMU index once
Add the MMU index to the StackAccess struct, so that it can be cached
or (in the next patch) computed from information that is not in
CPUX86State.

Co-developed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:24 +02:00
Richard Henderson
fffe424b38 target/i386/tcg: Introduce x86_mmu_index_{kernel_,}pl
Disconnect mmu index computation from the current pl
as stored in env->hflags.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240617161210.4639-2-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:24 +02:00
Richard Henderson
059368bcf5 target/i386/tcg: Reorg push/pop within seg_helper.c
Interrupts and call gates should use accesses with the DPL as
the privilege level.  While computing the applicable MMU index
is easy, the harder thing is how to plumb it in the code.

One possibility could be to add a single argument to the PUSH* macros
for the privilege level, but this is repetitive and risks confusion
between the involved privilege levels.

Another possibility is to pass both CPL and DPL, and adjusting both
PUSH* and POP* to use specific privilege levels (instead of using
cpu_{ld,st}*_data). This makes the code more symmetric.

However, a more complicated but much nicer approach is to use a structure
to contain the stack parameters, env, unwind return address, and rewrite
the macros into functions.  The struct provides an easy home for the MMU
index as well.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240617161210.4639-4-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:24 +02:00
Paolo Bonzini
312ef3243e target/i386/tcg: use PUSHL/PUSHW for error code
Do not pre-decrement esp, let the macros subtract the appropriate
operand size.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:24 +02:00
Paolo Bonzini
0bd385e7e3 target/i386/tcg: Allow IRET from user mode to user mode with SMAP
This fixes a bug wherein i386/tcg assumed an interrupt return using
the IRET instruction was always returning from kernel mode to either
kernel mode or user mode. This assumption is violated when IRET is used
as a clever way to restore thread state, as for example in the dotnet
runtime. There, IRET returns from user mode to user mode.

This bug is that stack accesses from IRET and RETF, as well as accesses
to the parameters in a call gate, are normal data accesses using the
current CPL.  This manifested itself as a page fault in the guest Linux
kernel due to SMAP preventing the access.

This bug appears to have been in QEMU since the beginning.

Analyzed-by: Robert R. Henry <rrh.henry@gmail.com>
Co-developed-by: Robert R. Henry <rrh.henry@gmail.com>
Signed-off-by: Robert R. Henry <rrh.henry@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:24 +02:00
Richard Henderson
a7cf494993 target/i386/tcg: Remove SEG_ADDL
This truncation is now handled by MMU_*32_IDX.  The introduction of
MMU_*32_IDX in fact applied correct 32-bit wraparound to 16-bit accesses
with a high segment base (e.g.  big real mode or vm86 mode), which did
not use SEG_ADDL.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240617161210.4639-3-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:24 +02:00
Paolo Bonzini
3afc6539a8 target/i386/tcg: fix POP to memory in long mode
In long mode, POP to memory will write a full 64-bit value.  However,
the call to gen_writeback() in gen_POP will use MO_32 because the
decoding table is incorrect.

The bug was latent until commit aea49fbb01 ("target/i386: use gen_writeback()
within gen_POP()", 2024-06-08), and then became visible because gen_op_st_v
now receives op->ot instead of the "ot" returned by gen_pop_T0.

Analyzed-by: Clément Chigot <chigot@adacore.com>
Fixes: 5e9e21bcc4 ("target/i386: move 60-BF opcodes to new decoder", 2024-05-07)
Tested-by: Clément Chigot <chigot@adacore.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 18:18:24 +02:00
Michael Roth
9d38d9dca2 i386/sev: Don't allow automatic fallback to legacy KVM_SEV*_INIT
Currently if the 'legacy-vm-type' property of the sev-guest object is
'on', QEMU will attempt to use the newer KVM_SEV_INIT2 kernel
interface in conjunction with the newer KVM_X86_SEV_VM and
KVM_X86_SEV_ES_VM KVM VM types.

This can lead to measurement changes if, for instance, an SEV guest was
created on a host that originally had an older kernel that didn't
support KVM_SEV_INIT2, but is booted on the same host later on after the
host kernel was upgraded.

Instead, if legacy-vm-type is 'off', QEMU should fail if the
KVM_SEV_INIT2 interface is not provided by the current host kernel.
Modify the fallback handling accordingly.

In the future, VMSA features and other flags might be added to QEMU
which will require legacy-vm-type to be 'off' because they will rely
on the newer KVM_SEV_INIT2 interface. It may be difficult to convey to
users what values of legacy-vm-type are compatible with which
features/options, so as part of this rework, switch legacy-vm-type to a
tri-state OnOffAuto option. 'auto' in this case will automatically
switch to using the newer KVM_SEV_INIT2, but only if it is required to
make use of new VMSA features or other options only available via
KVM_SEV_INIT2.

Defining 'auto' in this way would avoid inadvertantly breaking
compatibility with older kernels since it would only be used in cases
where users opt into newer features that are only available via
KVM_SEV_INIT2 and newer kernels, and provide better default behavior
than the legacy-vm-type=off behavior that was previously in place, so
make it the default for 9.1+ machine types.

Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
cc: kvm@vger.kernel.org
Signed-off-by: Michael Roth <michael.roth@amd.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Link: https://lore.kernel.org/r/20240710041005.83720-1-michael.roth@amd.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-16 10:45:06 +02:00
Richard Henderson
5915139aba * meson: Pass objects and dependencies to declare_dependency(), not static_library()
* meson: Drop the .fa library suffix
 * target/i386: drop AMD machine check bits from Intel CPUID
 * target/i386: add avx-vnni-int16 feature
 * target/i386: SEV bugfixes
 * target/i386: SEV-SNP -cpu host support
 * char: fix exit issues
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaGceoUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNcpgf/XziKojGOTvYsE7xMijOUswYjCG5m
 ZVLqxTug8Q0zO/9mGvluKBTWmh8KhRWOovX5iZL8+F0gPoYPG4ONpNhh3wpA9+S7
 H7ph4V6sDJBX4l3OrOK6htD8dO5D9kns1iKGnE0lY60PkcHl+pU8BNWfK1zYp5US
 geiyzuRFRRtDmoNx5+o+w+D+W5msPZsnlj5BnPWM+O/ykeFfSrk2ztfdwHKXUhCB
 5FJcu2sWVx+wsdVzdjgT8USi5+VTK4vabq3SfccmNRxBRnJOCU5MrR63stMDceo4
 TswSB88I0WRV1848AudcGZRkjvKaXLyHJ+QTjg2dp7itEARJ3MGsvOpS5A==
 =3kv7
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* meson: Pass objects and dependencies to declare_dependency(), not static_library()
* meson: Drop the .fa library suffix
* target/i386: drop AMD machine check bits from Intel CPUID
* target/i386: add avx-vnni-int16 feature
* target/i386: SEV bugfixes
* target/i386: SEV-SNP -cpu host support
* char: fix exit issues

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaGceoUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroNcpgf/XziKojGOTvYsE7xMijOUswYjCG5m
# ZVLqxTug8Q0zO/9mGvluKBTWmh8KhRWOovX5iZL8+F0gPoYPG4ONpNhh3wpA9+S7
# H7ph4V6sDJBX4l3OrOK6htD8dO5D9kns1iKGnE0lY60PkcHl+pU8BNWfK1zYp5US
# geiyzuRFRRtDmoNx5+o+w+D+W5msPZsnlj5BnPWM+O/ykeFfSrk2ztfdwHKXUhCB
# 5FJcu2sWVx+wsdVzdjgT8USi5+VTK4vabq3SfccmNRxBRnJOCU5MrR63stMDceo4
# TswSB88I0WRV1848AudcGZRkjvKaXLyHJ+QTjg2dp7itEARJ3MGsvOpS5A==
# =3kv7
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 04 Jul 2024 02:56:58 AM PDT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
  target/i386/SEV: implement mask_cpuid_features
  target/i386: add support for masking CPUID features in confidential guests
  char-stdio: Restore blocking mode of stdout on exit
  target/i386: add avx-vnni-int16 feature
  i386/sev: Fallback to the default SEV device if none provided in sev_get_capabilities()
  i386/sev: Fix error message in sev_get_capabilities()
  target/i386: do not include undefined bits in the AMD topoext leaf
  target/i386: SEV: fix formatting of CPUID mismatch message
  target/i386: drop AMD machine check bits from Intel CPUID
  target/i386: pass X86CPU to x86_cpu_get_supported_feature_word
  meson: Drop the .fa library suffix
  Revert "meson: Propagate gnutls dependency"
  meson: Pass objects and dependencies to declare_dependency()
  meson: merge plugin_ldflags into emulator_link_args
  meson: move block.syms dependency out of libblock
  meson: move shared_module() calls where modules are already walked

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-04 09:16:07 -07:00
Paolo Bonzini
188569c10d target/i386/SEV: implement mask_cpuid_features
Drop features that are listed as "BitMask" in the PPR and currently
not supported by AMD processors.  The only ones that may become useful
in the future are TSC deadline timer and x2APIC, everything else is
not needed for SEV-SNP guests (e.g. VIRT_SSBD) or would require
processor support (e.g. TSC_ADJUST).

This allows running SEV-SNP guests with "-cpu host".

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-04 11:56:20 +02:00