Commit Graph

14213 Commits

Author SHA1 Message Date
Peter Maydell
ad58ba13d0 target/m68k: Initialize float_status fields in gdb set/get functions
In cf_fpu_gdb_get_reg() and cf_fpu_gdb_set_reg() we use a temporary
float_status variable to pass to floatx80_to_float64() and
float64_to_floatx80(), but we don't initialize it, meaning that those
functions could access uninitialized data.  Zero-init the structs.

(We don't need to set a NaN-propagation rule here because we
don't use these with a 2-argument fpu operation.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-11-peter.maydell@linaro.org
2024-11-05 10:09:54 +00:00
Peter Maydell
0527cfd94c target/m68k: Explicitly set 2-NaN propagation rule
Explicitly set the 2-NaN propagation rule on env->fp_status
and on the temporary fp_status that we use in frem (since
we pass that to a division operation function).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-11-05 10:09:54 +00:00
Peter Maydell
5aaab56a1a target/ppc: Explicitly set 2-NaN propagation rule
Set the 2-NaN propagation rule explicitly in env->fp_status
and env->vec_status.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-9-peter.maydell@linaro.org
2024-11-05 10:09:54 +00:00
Peter Maydell
841f9d7475 target/s390x: Explicitly set 2-NaN propagation rule
Set the 2-NaN propagation rule explicitly in env->fpu_status.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-8-peter.maydell@linaro.org
2024-11-05 10:09:54 +00:00
Peter Maydell
2915876e03 target/hppa: Explicitly set 2-NaN propagation rule
Set the 2-NaN propagation rule explicitly in env->fp_status.

Really we only need to do this at CPU reset (after reset has zeroed
out most of the CPU state struct, which typically includes fp_status
fields).  However target/hppa does not currently implement CPU reset
at all, so leave a TODO comment to note that this could be moved if
we ever do implement reset.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-7-peter.maydell@linaro.org
2024-11-05 10:09:54 +00:00
Peter Maydell
1bb5257def target/loongarch: Explicitly set 2-NaN propagation rule
Set the 2-NaN propagation rule explicitly in the float_status word we
use.

(There are a couple of places in fpu_helper.c where we create a
dummy float_status word with "float_status *s = { };", but these
are only used for calling float*_is_quiet_nan() so it doesn't
matter that we don't set a 2-NaN propagation rule there.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-6-peter.maydell@linaro.org
2024-11-05 10:09:54 +00:00
Peter Maydell
0c587f1339 target/mips: Explicitly set 2-NaN propagation rule
Set the 2-NaN propagation rule explicitly in the float_status words
we use.

For active_fpu.fp_status, we do this in a new fp_reset() function
which mirrors the existing msa_reset() function in doing "first call
restore to set the fp status parts that depend on CPU state, then set
the fp status parts that are constant".

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20241025141254.2141506-5-peter.maydell@linaro.org
2024-11-05 10:09:53 +00:00
Peter Maydell
d1ff996788 target/arm: Explicitly set 2-NaN propagation rule
Set the 2-NaN propagation rule explicitly in the float_status words
we use.  We wrap this plus the pre-existing setting of the
tininess-before-rounding flag in a new function
arm_set_default_fp_behaviours() to avoid repetition, since we have a
lot of float_status words at this point.

The situation with FPA11 emulation in linux-user is a little odd, and
arguably "correct" behaviour there would be to exactly match a real
Linux kernel's FPA11 emulation.  However FPA11 emulation is
essentially dead at this point and so it seems better to continue
with QEMU's current behaviour and leave a comment describing the
situation.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-4-peter.maydell@linaro.org
2024-11-05 10:09:53 +00:00
Peter Maydell
ee057a9f29 pull-loongarch-20241102
-----BEGIN PGP SIGNATURE-----
 
 iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZyXbXgAKCRBAov/yOSY+
 37a9BADZ7vI2idWNXdH+mLNDZNSOxfdKp6ggNgKS3S48Hi2zR72MEhwvR9dGlHDL
 98agrbV7/jI9Z+0dLAxvlyl1MvXfnn2sXYgUuZp6IAaQzFBa11HBAK7UFh3sTA4A
 gD4oPwl8AdJiFvDN6vNjS+dO0ls+j/YMaoLkAKLv15dlWtg4Rw==
 =EZnr
 -----END PGP SIGNATURE-----

Merge tag 'pull-loongarch-20241102' of https://gitlab.com/gaosong/qemu into staging

pull-loongarch-20241102

# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZyXbXgAKCRBAov/yOSY+
# 37a9BADZ7vI2idWNXdH+mLNDZNSOxfdKp6ggNgKS3S48Hi2zR72MEhwvR9dGlHDL
# 98agrbV7/jI9Z+0dLAxvlyl1MvXfnn2sXYgUuZp6IAaQzFBa11HBAK7UFh3sTA4A
# gD4oPwl8AdJiFvDN6vNjS+dO0ls+j/YMaoLkAKLv15dlWtg4Rw==
# =EZnr
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 02 Nov 2024 07:57:18 GMT
# gpg:                using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C  6C2C 40A2 FFF2 3926 3EDF

* tag 'pull-loongarch-20241102' of https://gitlab.com/gaosong/qemu:
  target/loongarch: Add steal time support on migration
  hw/loongarch/boot: Use warn_report when no kernel filename
  linux-headers: Update to Linux v6.12-rc5
  linux-headers: loongarch: Add kvm_para.h
  linux-headers: Add unistd_64.h
  target/loongarch/kvm: Implement LoongArch PMU extension
  target/loongarch: Implement lbt registers save/restore function
  target/loongarch: Add loongson binary translation feature

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-11-04 16:01:10 +00:00
Peter Maydell
c94bee4cd6 * target/i386: new feature bits for AMD processors
* target/i386/tcg: improvements around flag handling
 * target/i386: add AVX10 support
 * target/i386: add GraniteRapids-v2 model
 * dockerfiles: add libcbor
 * New nitro-enclave machine type
 * qom: cleanups to object_new
 * configure: detect 64-bit MIPS for rust
 * configure: deprecate 32-bit MIPS
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmcjvkQUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPIKgf/etNpO2T+eLFtWN/Qd5eopBXqNd9k
 KmeK9EgW9lqx2IPGNen33O+uKpb/TsMmubSsSF+YxTp7pmkc8+71f3rBMaIAD02r
 /paHSMVw0+f12DAFQz1jdvGihR7Mew0wcF/UdEt737y6vEmPxLTyYG3Gfa4NSZwT
 /V5jTOIcfUN/UEjNgIp6NTuOEESKmlqt22pfMapgkwMlAJYeeJU2X9eGYE86wJbq
 ZSXNgK3jL9wGT2XKa3e+OKzHfFpSkrB0JbQbdico9pefnBokN/hTeeUJ81wBAc7u
 i00W1CEQVJ5lhBc121d4AWMp83ME6HijJUOTMmJbFIONPsITFPHK1CAkng==
 =D4nR
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream-i386' of https://gitlab.com/bonzini/qemu into staging

* target/i386: new feature bits for AMD processors
* target/i386/tcg: improvements around flag handling
* target/i386: add AVX10 support
* target/i386: add GraniteRapids-v2 model
* dockerfiles: add libcbor
* New nitro-enclave machine type
* qom: cleanups to object_new
* configure: detect 64-bit MIPS for rust
* configure: deprecate 32-bit MIPS

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmcjvkQUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroPIKgf/etNpO2T+eLFtWN/Qd5eopBXqNd9k
# KmeK9EgW9lqx2IPGNen33O+uKpb/TsMmubSsSF+YxTp7pmkc8+71f3rBMaIAD02r
# /paHSMVw0+f12DAFQz1jdvGihR7Mew0wcF/UdEt737y6vEmPxLTyYG3Gfa4NSZwT
# /V5jTOIcfUN/UEjNgIp6NTuOEESKmlqt22pfMapgkwMlAJYeeJU2X9eGYE86wJbq
# ZSXNgK3jL9wGT2XKa3e+OKzHfFpSkrB0JbQbdico9pefnBokN/hTeeUJ81wBAc7u
# i00W1CEQVJ5lhBc121d4AWMp83ME6HijJUOTMmJbFIONPsITFPHK1CAkng==
# =D4nR
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 31 Oct 2024 17:28:36 GMT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream-i386' of https://gitlab.com/bonzini/qemu: (49 commits)
  target/i386: Introduce GraniteRapids-v2 model
  target/i386: Add AVX512 state when AVX10 is supported
  target/i386: Add feature dependencies for AVX10
  target/i386: add CPUID.24 features for AVX10
  target/i386: add AVX10 feature and AVX10 version property
  target/i386: return bool from x86_cpu_filter_features
  target/i386: do not rely on ExtSaveArea for accelerator-supported XCR0 bits
  target/i386: cpu: set correct supported XCR0 features for TCG
  target/i386: use + to put flags together
  target/i386: use higher-precision arithmetic to compute CF
  target/i386: use compiler builtin to compute PF
  target/i386: make flag variables unsigned
  target/i386: add a note about gen_jcc1
  target/i386: add a few more trivial CCPrepare cases
  target/i386: optimize TEST+Jxx sequences
  target/i386: optimize computation of ZF from CC_OP_DYNAMIC
  target/i386: Wrap cc_op_live with a validity check
  target/i386: Introduce cc_op_size
  target/i386: Rearrange CCOp
  target/i386: remove CC_OP_CLR
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-11-02 16:21:38 +00:00
Bibo Mao
47b54e15bb target/loongarch: Add steal time support on migration
With pv steal time supported, VM machine needs get physical address
of each vcpu and notify new host during migration. Here two
functions kvm_get_stealtime/kvm_set_stealtime, and guest steal time
physical address is only updated on KVM_PUT_FULL_STATE stage.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240930064040.753929-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
2024-11-02 15:45:45 +08:00
Bibo Mao
6edd2a9bec target/loongarch/kvm: Implement LoongArch PMU extension
Implement PMU extension for LoongArch kvm mode. Use OnOffAuto type
variable pmu to check the PMU feature. If the PMU Feature is not supported
with KVM host, it reports error if there is pmu=on command line.

If there is no any command line about pmu parameter, it checks whether
KVM host supports the PMU Feature and set the corresponding value in cpucfg.

This patch is based on lbt patch located at
  https://lore.kernel.org/qemu-devel/20240904061859.86615-1-maobibo@loongson.cn

Co-developed-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240918082315.2345034-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
2024-11-02 15:20:41 +08:00
Bibo Mao
a45df28601 target/loongarch: Implement lbt registers save/restore function
Six registers scr0 - scr3, eflags and ftop are added in percpu vmstate.
And two functions kvm_loongarch_get_lbt/kvm_loongarch_put_lbt are added
to save/restore lbt registers.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240929070405.235200-3-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
2024-11-02 15:20:41 +08:00
Bibo Mao
c23a53d894 target/loongarch: Add loongson binary translation feature
Loongson Binary Translation (LBT) is used to accelerate binary
translation, which contains 4 scratch registers (scr0 to scr3), x86/ARM
eflags (eflags) and x87 fpu stack pointer (ftop).

Now LBT feature is added in kvm mode, not supported in TCG mode since
it is not emulated. Feature variable lbt is added with OnOffAuto type,
If lbt feature is not supported with KVM host, it reports error if there
is lbt=on command line.

If there is no any command line about lbt parameter, it checks whether
KVM host supports lbt feature and set the corresponding value in cpucfg.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240929070405.235200-2-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
2024-11-02 15:20:41 +08:00
Tao Su
1a519388a8 target/i386: Introduce GraniteRapids-v2 model
Update GraniteRapids CPU model to add AVX10 and the missing features(ss,
tsc-adjust, cldemote, movdiri, movdir64b).

Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241028024512.156724-7-tao1.su@linux.intel.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20241031085233.425388-9-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Tao Su
0d7475be3b target/i386: Add AVX512 state when AVX10 is supported
AVX10 state enumeration in CPUID leaf D and enabling in XCR0 register
are identical to AVX512 state regardless of the supported vector lengths.

Given that some E-cores will support AVX10 but not support AVX512, add
AVX512 state components to guest when AVX10 is enabled.

Based on a patch by Tao Su <tao1.su@linux.intel.com>

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241031085233.425388-8-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Tao Su
150ab84b2d target/i386: Add feature dependencies for AVX10
Since the highest supported vector length for a processor implies that
all lesser vector lengths are also supported, add the dependencies of
the supported vector lengths. If all vector lengths aren't supported,
clear AVX10 enable bit as well.

Note that the order of AVX10 related dependencies should be kept as:
        CPUID_24_0_EBX_AVX10_128     -> CPUID_24_0_EBX_AVX10_256,
        CPUID_24_0_EBX_AVX10_256     -> CPUID_24_0_EBX_AVX10_512,
        CPUID_24_0_EBX_AVX10_VL_MASK -> CPUID_7_1_EDX_AVX10,
        CPUID_7_1_EDX_AVX10          -> CPUID_24_0_EBX,
so that prevent user from setting weird CPUID combinations, e.g. 256-bits
and 512-bits are supported but 128-bits is not, no vector lengths are
supported but AVX10 enable bit is still set.

Since AVX10_128 will be reserved as 1, adding these dependencies has the
bonus that when user sets -cpu host,-avx10-128, CPUID_7_1_EDX_AVX10 and
CPUID_24_0_EBX will be disabled automatically.

Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241028024512.156724-5-tao1.su@linux.intel.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20241031085233.425388-7-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Tao Su
2d055b8fe1 target/i386: add CPUID.24 features for AVX10
Introduce features for the supported vector bit lengths.

Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241028024512.156724-3-tao1.su@linux.intel.com
Link: https://lore.kernel.org/r/20241028024512.156724-4-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Link: https://lore.kernel.org/r/20241031085233.425388-6-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Tao Su
bccfb846fd target/i386: add AVX10 feature and AVX10 version property
When AVX10 enable bit is set, the 0x24 leaf will be present as "AVX10
Converged Vector ISA leaf" containing fields for the version number and
the supported vector bit lengths.

Introduce avx10-version property so that avx10 version can be controlled
by user and cpu model. Per spec, avx10 version can never be 0, the default
value of avx10-version is set to 0 to determine whether it is specified by
user.  The default can come from the device model or, for the max model,
from KVM's reported value.

Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241028024512.156724-3-tao1.su@linux.intel.com
Link: https://lore.kernel.org/r/20241028024512.156724-4-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Link: https://lore.kernel.org/r/20241031085233.425388-5-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
3507c6f046 target/i386: return bool from x86_cpu_filter_features
Prepare for filtering non-boolean features such as AVX10 version.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241031085233.425388-4-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
b888c78070 target/i386: do not rely on ExtSaveArea for accelerator-supported XCR0 bits
Right now, QEMU is using the "feature" and "bits" fields of ExtSaveArea
to query the accelerator for the support status of extended save areas.
This is a problem for AVX10, which attaches two feature bits (AVX512F
and AVX10) to the same extended save states.

To keep the AVX10 hacks to the minimum, limit usage of esa->features
and esa->bits.  Instead, just query the accelerator for the 0xD leaf.
Do it in common code and clear esa->size if an extended save state is
unsupported.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20241031085233.425388-3-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
33098002a8 target/i386: cpu: set correct supported XCR0 features for TCG
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20241031085233.425388-2-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
6d8623b5c0 target/i386: use + to put flags together
This gives greater opportunity for reassociation on x86 targets,
since addition can use the LEA instruction.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
134ffcb276 target/i386: use higher-precision arithmetic to compute CF
If the operands of the arithmetic instruction fit within a half-register,
it's easiest to use a comparison instruction to compute the carry.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
24899cdcd2 target/i386: use compiler builtin to compute PF
This removes the 256 byte parity table from the executable.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
46c04e4bcf target/i386: make flag variables unsigned
This makes it easier for the compiler to understand which bits are set,
and it also removes "cltq" instructions to canonicalize the output value
as 32-bit signed.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
44d58e938b target/i386: add a note about gen_jcc1
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
cea677e821 target/i386: add a few more trivial CCPrepare cases
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
37df7c4d57 target/i386: optimize TEST+Jxx sequences
Mostly used for TEST+JG and TEST+JLE, but it is easy to cover
also JBE/JA and JL/JGE; shaves about 0.5% TCG ops.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
ae14b33de8 target/i386: optimize computation of ZF from CC_OP_DYNAMIC
Most uses of CC_OP_DYNAMIC are for CMP/JB/JE or similar sequences.
We can optimize many of them to avoid computation of the flags.
This eliminates both TCG ops to set up the new cc_op, and helper
instructions because evaluating just ZF is much cheaper.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Richard Henderson
1f7f72bdc4 target/i386: Wrap cc_op_live with a validity check
Assert that op is known and that cc_op_live_ is populated.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Richard Henderson
f359b2fb71 target/i386: Introduce cc_op_size
Replace arithmetic on cc_op with a helper function.
Assert that the op has a size and that it is valid
for the configuration.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240701025115.1265117-6-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Richard Henderson
ee806f9f67 target/i386: Rearrange CCOp
Give the first few enumerators explicit integer constants,
align the BWLQ enumerators.

This will be used to simplify ((op - CC_OP_*B) & 3).

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240701025115.1265117-4-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
e09447c39f target/i386: remove CC_OP_CLR
Just use CC_OP_EFLAGS; it is not that likely that the flags computed by
CC_OP_CLR survive the end of the basic block, in which case there is no
need to spill cc_op_src.

cc_op_src now does need spilling if the XOR is followed by a memory
operation, but this only costs 0.2% extra TCG ops.  They will be recouped
by simplifications in how QEMU evaluates ZF at runtime, which are even
greater with this change.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Richard Henderson
c2954745f2 target/i386: Tidy cc_op_str usage
Make const.  Use the read-only strings directly; do not copy
them into an on-stack buffer with snprintf.  Allow for holes
in the cc_op_str array, now present with CC_OP_POPCNT.

Fixes: 460231ad36 ("target/i386: give CC_OP_POPCNT low bits corresponding to MO_TL")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20240701025115.1265117-2-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
a635390f05 target/i386: use tcg_gen_ext_tl when applicable
Prefer it to gen_ext_tl in the common case where the destination is known.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Paolo Bonzini
7cac7aa704 target/i386/hvf: fix handling of XSAVE-related CPUID bits
The call to xgetbv() is passing the ecx value for cpuid function 0xD,
index 0. The xgetbv call thus returns false (OSXSAVE is bit 27, which is
well out of the range of CPUID[0xD,0].ECX) and eax is not modified. While
fixing it, cache the whole computation of supported XCR0 bits since it
will be used for more than just CPUID leaf 0xD.

Furthermore, unsupported subleafs of CPUID 0xD (including all those
corresponding to zero bits in host's XCR0) must be hidden; if OSXSAVE
is not set at all, the whole of CPUID leaf 0xD plus the XSAVE bit must
be hidden.

Finally, unconditionally drop XSTATE_BNDREGS_MASK and XSTATE_BNDCSR_MASK;
real hardware will only show them if the MPX bit is set in CPUID;
this is never the case for hvf_get_supported_cpuid() because QEMU's
Hypervisor.framework support does not handle the VMX fields related to
MPX (even in the unlikely possibility that the host has MPX enabled).
So hide those bits in the new cache_host_xcr0().

Cc: Phil Dennis-Jordan <lists@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:32 +01:00
Babu Moger
9c07a7af5d target/i386: Expose new feature bits in CPUID 8000_0021_EAX/EBX
Newer AMD CPUs support ERAPS (Enhanced Return Address Prediction Security)
feature that enables the auto-clear of RSB entries on a TLB flush, context
switches and VMEXITs. The number of default RSP entries is reflected in
RapSize.

Add the feature bit and feature word to support these features.

CPUID_Fn80000021_EAX
Bits   Feature Description
24     ERAPS:
       Indicates support for enhanced return address predictor security.

CPUID_Fn80000021_EBX
Bits   Feature Description
31-24  Reserved
23:16  RapSize:
       Return Address Predictor size. RapSize x 8 is the minimum number
       of CALL instructions software needs to execute to flush the RAP.
15-00  MicrocodePatchSize. Read-only.
       Reports the size of the Microcode patch in 16-byte multiples.
       If 0, the size of the patch is at most 5568 (15C0h) bytes.

Link: https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/programmer-references/57238.zip
Signed-off-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/7c62371fe60af1e9bbd853f5f8e949bf2d908bd0.1729807947.git.babu.moger@amd.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:32 +01:00
Babu Moger
2ec282b8ea target/i386: Expose bits related to SRSO vulnerability
Add following bits related Speculative Return Stack Overflow (SRSO).
Guests can make use of these bits if supported.

These bits are reported via CPUID Fn8000_0021_EAX.
===================================================================
Bit Feature Description
===================================================================
27  SBPB                Indicates support for the Selective Branch Predictor Barrier.
28  IBPB_BRTYPE         MSR_PRED_CMD[IBPB] flushes all branch type predictions.
29  SRSO_NO             Not vulnerable to SRSO.
30  SRSO_USER_KERNEL_NO Not vulnerable to SRSO at the user-kernel boundary.
===================================================================

Link: https://www.amd.com/content/dam/amd/en/documents/corporate/cr/speculative-return-stack-overflow-whitepaper.pdf
Link: https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/programmer-references/57238.zip
Signed-off-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/dadbd70c38f4e165418d193918a3747bd715c5f4.1729807947.git.babu.moger@amd.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:32 +01:00
Sandipan Das
209b0ac120 target/i386: Add PerfMonV2 feature bit
CPUID leaf 0x80000022, i.e. ExtPerfMonAndDbg, advertises new performance
monitoring features for AMD processors. Bit 0 of EAX indicates support
for Performance Monitoring Version 2 (PerfMonV2) features. If found to
be set during PMU initialization, the EBX bits can be used to determine
the number of available counters for different PMUs. It also denotes the
availability of global control and status registers.

Add the required CPUID feature word and feature bit to allow guests to
make use of the PerfMonV2 features.

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/a96f00ee2637674c63c61e9fc4dee343ea818053.1729807947.git.babu.moger@amd.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:32 +01:00
Babu Moger
9c882ad4dc target/i386: Fix minor typo in NO_NESTED_DATA_BP feature bit
Rename CPUID_8000_0021_EAX_No_NESTED_DATA_BP to
       CPUID_8000_0021_EAX_NO_NESTED_DATA_BP.

No functional change intended.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/a6749acd125670d3930f4ca31736a91b1d965f2f.1729807947.git.babu.moger@amd.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:32 +01:00
Xiaoyao Li
855bdb6c8a i386/cpu: Drop the check of phys_bits in host_cpu_realizefn()
The check of cpu->phys_bits to be in range between
[32, TARGET_PHYS_ADDR_SPACE_BITS] in host_cpu_realizefn()
is duplicated with check in x86_cpu_realizefn().

Since the ckeck in x86_cpu_realizefn() is called later and can cover all
the x86 cases. Remove the one in host_cpu_realizefn().

Opportunistically adjust cpu->phys_bits directly in
host_cpu_adjust_phys_bits(), which matches more with the function name.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20240929085747.2023198-1-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:32 +01:00
Peter Maydell
92ec780519 RISC-V PR for 9.2
* Fix an access to VXSAT
 * Expose RV32 cpu to RV64 QEMU
 * Don't clear PLIC pending bits on IRQ lowering
 * Make PLIC zeroth priority register read-only
 * Set vtype.vill on CPU reset
 * Check and update APLIC pending when write sourcecfg
 * Avoid dropping charecters with HTIF
 * Apply FIFO backpressure to guests using SiFive UART
 * Support for control flow integrity extensions
 * Support for the IOMMU with the virt machine
 * set 'aia_mode' to default in error path
 * clarify how 'riscv-aia' default works
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmci/tQACgkQr3yVEwxT
 gBNPAQ//dZKjjJm4Sh+UFdUslivBJYtL1rl2UUG2UqiNn/UoYh/vcHoSArljHTjt
 8riEStnaQqXziOpMIJjIMLJ4KoiIk2SMvjNfFtcmPiPZEDEpjsTxfUxBFsBee+fI
 4KNQKKFeljq4pa+VzVvXEqzCNJIzCThFXTZhZmer00M91HPA8ZQIHpv2JL1sWlgZ
 /HW24XEDFLGc/JsR55fxpPftlAqP+BfOrqMmbWy7x2Y+G8WI05hM2zTP/W8pnIz3
 z0GCRYSBlADtrp+3RqzTwQfK5pXoFc0iDktWVYlhoXaeEmOwo8IYxTjrvBGhnBq+
 ySX1DzTa23QmOIxSYYvCRuOxyOK9ziNn+EQ9FiFBt1h1o251CYMil1bwmYXMCMNJ
 rZwF1HfUx0g2GQW1ZOqh1eeyLO29JiOdV3hxlDO7X4bbISNgU6il5MXmnvf0/XVW
 Af3YhALeeDbHgHL1iVfjafzaviQc9+YrEX13eX6N2AjcgE5a3F7XNmGfFpFJ+mfQ
 CPgiwVBXat6UpBUGAt14UM+6wzp+crSgQR5IEGth+mKMKdkWoykvo7A2oHdu39zn
 2cdzsshg2qcLLUPTFy06OOTXX382kCWXuykhHOjZ4uu2SJJ7R0W3PlYV8HSde2Vu
 Rj+89ZlUSICJNXXweQB39r87hNbtRuDIO22V0B9XrApQbJj6/yE=
 =rPaa
 -----END PGP SIGNATURE-----

Merge tag 'pull-riscv-to-apply-20241031-1' of https://github.com/alistair23/qemu into staging

RISC-V PR for 9.2

* Fix an access to VXSAT
* Expose RV32 cpu to RV64 QEMU
* Don't clear PLIC pending bits on IRQ lowering
* Make PLIC zeroth priority register read-only
* Set vtype.vill on CPU reset
* Check and update APLIC pending when write sourcecfg
* Avoid dropping charecters with HTIF
* Apply FIFO backpressure to guests using SiFive UART
* Support for control flow integrity extensions
* Support for the IOMMU with the virt machine
* set 'aia_mode' to default in error path
* clarify how 'riscv-aia' default works

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmci/tQACgkQr3yVEwxT
# gBNPAQ//dZKjjJm4Sh+UFdUslivBJYtL1rl2UUG2UqiNn/UoYh/vcHoSArljHTjt
# 8riEStnaQqXziOpMIJjIMLJ4KoiIk2SMvjNfFtcmPiPZEDEpjsTxfUxBFsBee+fI
# 4KNQKKFeljq4pa+VzVvXEqzCNJIzCThFXTZhZmer00M91HPA8ZQIHpv2JL1sWlgZ
# /HW24XEDFLGc/JsR55fxpPftlAqP+BfOrqMmbWy7x2Y+G8WI05hM2zTP/W8pnIz3
# z0GCRYSBlADtrp+3RqzTwQfK5pXoFc0iDktWVYlhoXaeEmOwo8IYxTjrvBGhnBq+
# ySX1DzTa23QmOIxSYYvCRuOxyOK9ziNn+EQ9FiFBt1h1o251CYMil1bwmYXMCMNJ
# rZwF1HfUx0g2GQW1ZOqh1eeyLO29JiOdV3hxlDO7X4bbISNgU6il5MXmnvf0/XVW
# Af3YhALeeDbHgHL1iVfjafzaviQc9+YrEX13eX6N2AjcgE5a3F7XNmGfFpFJ+mfQ
# CPgiwVBXat6UpBUGAt14UM+6wzp+crSgQR5IEGth+mKMKdkWoykvo7A2oHdu39zn
# 2cdzsshg2qcLLUPTFy06OOTXX382kCWXuykhHOjZ4uu2SJJ7R0W3PlYV8HSde2Vu
# Rj+89ZlUSICJNXXweQB39r87hNbtRuDIO22V0B9XrApQbJj6/yE=
# =rPaa
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 31 Oct 2024 03:51:48 GMT
# gpg:                using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65  9296 AF7C 9513 0C53 8013

* tag 'pull-riscv-to-apply-20241031-1' of https://github.com/alistair23/qemu: (50 commits)
  target/riscv: Fix vcompress with rvv_ta_all_1s
  target/riscv/kvm: clarify how 'riscv-aia' default works
  target/riscv/kvm: set 'aia_mode' to default in error path
  docs/specs: add riscv-iommu
  qtest/riscv-iommu-test: add init queues test
  hw/riscv/riscv-iommu: add DBG support
  hw/riscv/riscv-iommu: add ATS support
  hw/riscv/riscv-iommu: add Address Translation Cache (IOATC)
  test/qtest: add riscv-iommu-pci tests
  hw/riscv/virt.c: support for RISC-V IOMMU PCIDevice hotplug
  hw/riscv: add riscv-iommu-pci reference device
  pci-ids.rst: add Red Hat pci-id for RISC-V IOMMU device
  hw/riscv: add RISC-V IOMMU base emulation
  hw/riscv: add riscv-iommu-bits.h
  exec/memtxattr: add process identifier to the transaction attributes
  target/riscv: Expose zicfiss extension as a cpu property
  disas/riscv: enable disassembly for compressed sspush/sspopchk
  disas/riscv: enable disassembly for zicfiss instructions
  target/riscv: compressed encodings for sspush and sspopchk
  target/riscv: implement zicfiss instructions
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-10-31 16:34:25 +00:00
Anton Blanchard
c128d39ede target/riscv: Fix vcompress with rvv_ta_all_1s
vcompress packs vl or less fields into vd, so the tail starts after the
last packed field. This could be more clearly expressed in the ISA,
but for now this thread helps to explain it:

https://github.com/riscv/riscv-v-spec/issues/796

Signed-off-by: Anton Blanchard <antonb@tenstorrent.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241030043538.939712-1-antonb@tenstorrent.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-10-31 13:51:24 +10:00
Daniel Henrique Barboza
fd16cfb299 target/riscv/kvm: clarify how 'riscv-aia' default works
We do not have control in the default 'riscv-aia' default value. We can
try to set it to a specific value, in this case 'auto', but there's no
guarantee that the host will accept it.

Couple with this we're always doing a 'qemu_log' to inform whether we're
ended up using the host default or if we managed to set the AIA mode to
the QEMU default we wanted to set.

Change the 'riscv-aia' description to better reflect how the option
works, and remove the two informative 'qemu_log' that are now unneeded:
if no message shows, riscv-aia was set to the default or uset-set value.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241028182037.290171-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-10-31 13:51:24 +10:00
Daniel Henrique Barboza
d201a127e1 target/riscv/kvm: set 'aia_mode' to default in error path
When failing to set the selected AIA mode, 'aia_mode' is left untouched.
This means that 'aia_mode' will not reflect the actual AIA mode,
retrieved in 'default_aia_mode',

This is benign for now, but it will impact QMP query commands that will
expose the 'aia_mode' value, retrieving the wrong value.

Set 'aia_mode' to 'default_aia_mode' if we fail to change the AIA mode
in KVM.

While we're at it, rework the log/warning messages to be a bit less
verbose. Instead of:

KVM AIA: default mode is emul
qemu-system-riscv64: warning: KVM AIA: failed to set KVM AIA mode

We can use a single warning message:

qemu-system-riscv64: warning: KVM AIA: failed to set KVM AIA mode 'auto', using default host mode 'emul'

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241028182037.290171-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-10-31 13:51:24 +10:00
Deepak Gupta
a6a47319dd target/riscv: Expose zicfiss extension as a cpu property
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241008225010.1861630-21-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-10-31 13:51:24 +10:00
Paolo Bonzini
14bde8cd76 target/i386: fix CPUID check for LFENCE and SFENCE
LFENCE and SFENCE were introduced with the original SSE instruction set;
marking them incorrectly as cpuid(SSE2) causes failures for CPU models
that lack SSE2, for example pentium3.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-30 16:30:56 +01:00
Deepak Gupta
905c032417 target/riscv: compressed encodings for sspush and sspopchk
sspush/sspopchk have compressed encodings carved out of zcmops.
compressed sspush is designated as c.mop.1 while compressed sspopchk
is designated as c.mop.5.

Note that c.sspush x1 exists while c.sspush x5 doesn't. Similarly
c.sspopchk x5 exists while c.sspopchk x1 doesn't.

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Co-developed-by: Jim Shu <jim.shu@sifive.com>
Co-developed-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241008225010.1861630-18-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-10-30 11:22:08 +10:00
Deepak Gupta
f06bfe3dc3 target/riscv: implement zicfiss instructions
zicfiss has following instructions
 - sspopchk: pops a value from shadow stack and compares with x1/x5.
   If they dont match, reports a sw check exception with tval = 3.
 - sspush: pushes value in x1/x5 on shadow stack
 - ssrdp: reads current shadow stack
 - ssamoswap: swaps contents of shadow stack atomically

sspopchk/sspush/ssrdp default to zimop if zimop implemented and SSE=0

If SSE=0, ssamoswap is illegal instruction exception.

This patch implements shadow stack operations for qemu-user and shadow
stack is not protected.

Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Co-developed-by: Jim Shu <jim.shu@sifive.com>
Co-developed-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241008225010.1861630-17-debug@rivosinc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-10-30 11:22:08 +10:00