mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Peter Maydell	efbe180ad2	target/arm: Add new MMU indexes for AArch32 Secure PL1&0 Our current usage of MMU indexes when EL3 is AArch32 is confused. Architecturally, when EL3 is AArch32, all Secure code runs under the Secure PL1&0 translation regime: * code at EL3, which might be Mon, or SVC, or any of the other privileged modes (PL1) * code at EL0 (Secure PL0) This is different from when EL3 is AArch64, in which case EL3 is its own translation regime, and EL1 and EL0 (whether AArch32 or AArch64) have their own regime. We claimed to be mapping Secure PL1 to our ARMMMUIdx_EL3, but didn't do anything special about Secure PL0, which meant it used the same ARMMMUIdx_EL10_0 that NonSecure PL0 does. This resulted in a bug where arm_sctlr() incorrectly picked the NonSecure SCTLR as the controlling register when in Secure PL0, which meant we were spuriously generating alignment faults because we were looking at the wrong SCTLR control bits. The use of ARMMMUIdx_EL3 for Secure PL1 also resulted in the bug that we wouldn't honour the PAN bit for Secure PL1, because there's no equivalent _PAN mmu index for it. Fix this by adding two new MMU indexes: * ARMMMUIdx_E30_0 is for Secure PL0 * ARMMMUIdx_E30_3_PAN is for Secure PL1 when PAN is enabled The existing ARMMMUIdx_E3 is used to mean "Secure PL1 without PAN" (and would be named ARMMMUIdx_E30_3 in an AArch32-centric scheme). These extra two indexes bring us up to the maximum of 16 that the core code can currently support. This commit: * adds the new MMU index handling to the various places where we deal in MMU index values * adds assertions that we aren't AArch32 EL3 in a couple of places that currently use the E10 indexes, to document why they don't also need to handle the E30 indexes * documents in a comment why regime_has_2_ranges() doesn't need updating Notes for backporting: this commit depends on the preceding revert of 4c2c04746932; that revert and this commit should probably be backported to everywhere that we originally backported `4c2c047469`. Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2326 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2588 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Tested-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241101142845.1712482-3-peter.maydell@linaro.org	2024-11-05 10:09:58 +00:00
Peter Maydell	056c5c90c1	Revert "target/arm: Fix usage of MMU indexes when EL3 is AArch32" This reverts commit `4c2c047469`. This commit tried to fix a problem with our usage of MMU indexes when EL3 is AArch32, using what it described as a "more complicated approach" where we share the same MMU index values for Secure PL1&0 and NonSecure PL1&0. In theory this should work, but the change didn't account for (at least) two things: (1) The design change means we need to flush the TLBs at any point where the CPU state flips from one to the other. We already flush the TLB when SCR.NS is changed, but we don't flush the TLB when we take an exception from NS PL1&0 into Mon or when we return from Mon to NS PL1&0, and the commit didn't add any code to do that. (2) The ATS12NS* address translate instructions allow Mon code (which is Secure) to do a stage 1+2 page table walk for NS. I thought this was OK because do_ats_write() does a page table walk which doesn't use the TLBs, so because it can pass both the MMU index and also an ARMSecuritySpace argument we can tell the table walk that we want NS stage1+2, not S. But that means that all the code within the ptw that needs to find e.g. the regime EL cannot do so only with an mmu_idx -- all these functions like regime_sctlr(), regime_el(), etc would need to pass both an mmu_idx and the security_space, so they can tell whether this is a translation regime controlled by EL1 or EL3 (and so whether to look at SCTLR.S or SCTLR.NS, etc). In particular, because regime_el() wasn't updated to look at the ARMSecuritySpace it would return 1 even when the CPU was in Monitor mode (and the controlling EL is 3). This meant that page table walks in Monitor mode would look at the wrong SCTLR, TCR, etc and would generally fault when they should not. Rather than trying to make the complicated changes needed to rescue the design of `4c2c047469`, we revert it in order to instead take the route that that commit describes as "the most straightforward" fix, where we add new MMU indexes EL30_0, EL30_3, EL30_3_PAN to correspond to "Secure PL1&0 at PL0", "Secure PL1&0 at PL1", and "Secure PL1&0 at PL1 with PAN". This revert will re-expose the "spurious alignment faults in Secure PL0" issue #2326; we'll fix it again in the next commit. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Tested-by: Thomas Huth <thuth@redhat.com> Message-id: 20241101142845.1712482-2-peter.maydell@linaro.org Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2024-11-05 10:09:58 +00:00
Peter Maydell	ba6558461c	target/rx: Explicitly set 2-NaN propagation rule Set the NaN propagation rule explicitly for the float_status word used in the rx target. This not the architecturally correct behaviour, but since this is a no-behaviour-change patch, we leave a TODO note to that effect. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241025141254.2141506-21-peter.maydell@linaro.org	2024-11-05 10:09:57 +00:00
Peter Maydell	355e6cfb94	target/openrisc: Explicitly set 2-NaN propagation rule Set the NaN propagation rule explicitly for the float_status word used in the openrisc target. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241025141254.2141506-20-peter.maydell@linaro.org	2024-11-05 10:09:57 +00:00
Peter Maydell	4fafdcc833	target/microblaze: Explicitly set 2-NaN propagation rule Set the NaN propagation rule explicitly for the float_status word used in the microblaze target. This is probably not the architecturally correct behaviour, but since this is a no-behaviour-change patch, we leave a TODO note to that effect. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241025141254.2141506-19-peter.maydell@linaro.org	2024-11-05 10:09:57 +00:00
Peter Maydell	c18a13edce	target/microblaze: Move setting of float rounding mode to reset Although the floating point rounding mode for Microblaze is always nearest-even, we cannot set it just once in the CPU initfn. This is because env->fp_status is in the part of the CPU state struct that is zeroed on reset. Move the call to set_float_rounding_mode() into the reset fn. (This had no guest-visible effects because it happens that the float_round_nearest_even enum value is 0, so when the struct was zeroed it didn't corrupt the setting.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241025141254.2141506-18-peter.maydell@linaro.org	2024-11-05 10:09:56 +00:00
Peter Maydell	8403a5015c	target/alpha: Explicitly set 2-NaN propagation rule Set the NaN propagation rule explicitly for the float_status word used in this target. This is a no-behaviour-change commit, so we retain the existing behaviour of x87-style pick-largest-significand NaN propagation. This is however not the architecturally correct handling, so we leave a TODO note to that effect. We also leave a TODO note pointing out that all this code in the cpu initfn (including the existing setting up of env->flags and the FPCR) should be in a currently non-existent CPU reset function. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241025141254.2141506-17-peter.maydell@linaro.org	2024-11-05 10:09:56 +00:00
Peter Maydell	62d39b28ef	target/i386: Set 2-NaN propagation rule explicitly Set the NaN propagation rule explicitly for the float_status words used in the x86 target. This is a no-behaviour-change commit, so we retain the existing behaviour of using the x87-style "prefer QNaN over SNaN, then prefer the NaN with the larger significand" for MMX and SSE. This is however not the documented hardware behaviour, so we leave a TODO note about what we should be doing instead. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241025141254.2141506-16-peter.maydell@linaro.org	2024-11-05 10:09:56 +00:00
Peter Maydell	8d988eb44c	target/xtensa: Explicitly set 2-NaN propagation rule Set the NaN propagation rule explicitly in xtensa_use_first_nan(). (When we convert the softfloat pickNaNMulAdd routine to also select a NaN propagation rule at runtime, we will be able to remove the use_first_nan flag because the propagation rules will handle everything.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Max Filippov <jcmvbkbc@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241025141254.2141506-15-peter.maydell@linaro.org	2024-11-05 10:09:55 +00:00
Peter Maydell	80de5f24e0	target/xtensa: Factor out calls to set_use_first_nan() In xtensa we currently call set_use_first_nan() in a lot of places where we want to switch the NaN-propagation handling. We're about to change the softfloat API we use to do that, so start by factoring all the calls out into a single xtensa_use_first_nan() function. The bulk of this change was done with sed -i -e 's/set_use_first_nan(\([^,]\),[^)])/xtensa_use_first_nan(env, \1)/' target/xtensa/fpu_helper.c Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Max Filippov <jcmvbkbc@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241025141254.2141506-14-peter.maydell@linaro.org	2024-11-05 10:09:55 +00:00
Peter Maydell	4482f32dcd	target/sparc: Explicitly set 2-NaN propagation rule Set the NaN propagation rule explicitly in the float_status words we use. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241025141254.2141506-13-peter.maydell@linaro.org	2024-11-05 10:09:55 +00:00
Peter Maydell	65c1c039cd	target/sparc: Move cpu_put_fsr(env, 0) call to reset Currently we call cpu_put_fsr(0) in sparc_cpu_realizefn(), which initializes various fields in the CPU struct: * fsr_cexc_ftt * fcc[] * fsr_qne * fsr It also sets the rounding mode in env->fp_status. This is largely pointless, because when we later reset the CPU this will zero out all the fields up until the "end_reset_fields" label, which includes all of these (but not fp_status!) Move the cpu_put_fsr(env, 0) call to reset, because that expresses the logical requirement: we want to reset FSR to 0 on every reset. This isn't a behaviour change because the fields are all zero anyway. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241025141254.2141506-12-peter.maydell@linaro.org	2024-11-05 10:09:55 +00:00
Peter Maydell	ad58ba13d0	target/m68k: Initialize float_status fields in gdb set/get functions In cf_fpu_gdb_get_reg() and cf_fpu_gdb_set_reg() we use a temporary float_status variable to pass to floatx80_to_float64() and float64_to_floatx80(), but we don't initialize it, meaning that those functions could access uninitialized data. Zero-init the structs. (We don't need to set a NaN-propagation rule here because we don't use these with a 2-argument fpu operation.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241025141254.2141506-11-peter.maydell@linaro.org	2024-11-05 10:09:54 +00:00
Peter Maydell	0527cfd94c	target/m68k: Explicitly set 2-NaN propagation rule Explicitly set the 2-NaN propagation rule on env->fp_status and on the temporary fp_status that we use in frem (since we pass that to a division operation function). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2024-11-05 10:09:54 +00:00
Peter Maydell	5aaab56a1a	target/ppc: Explicitly set 2-NaN propagation rule Set the 2-NaN propagation rule explicitly in env->fp_status and env->vec_status. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241025141254.2141506-9-peter.maydell@linaro.org	2024-11-05 10:09:54 +00:00
Peter Maydell	841f9d7475	target/s390x: Explicitly set 2-NaN propagation rule Set the 2-NaN propagation rule explicitly in env->fpu_status. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241025141254.2141506-8-peter.maydell@linaro.org	2024-11-05 10:09:54 +00:00
Peter Maydell	2915876e03	target/hppa: Explicitly set 2-NaN propagation rule Set the 2-NaN propagation rule explicitly in env->fp_status. Really we only need to do this at CPU reset (after reset has zeroed out most of the CPU state struct, which typically includes fp_status fields). However target/hppa does not currently implement CPU reset at all, so leave a TODO comment to note that this could be moved if we ever do implement reset. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241025141254.2141506-7-peter.maydell@linaro.org	2024-11-05 10:09:54 +00:00
Peter Maydell	1bb5257def	target/loongarch: Explicitly set 2-NaN propagation rule Set the 2-NaN propagation rule explicitly in the float_status word we use. (There are a couple of places in fpu_helper.c where we create a dummy float_status word with "float_status s = { };", but these are only used for calling float_is_quiet_nan() so it doesn't matter that we don't set a 2-NaN propagation rule there.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241025141254.2141506-6-peter.maydell@linaro.org	2024-11-05 10:09:54 +00:00
Peter Maydell	0c587f1339	target/mips: Explicitly set 2-NaN propagation rule Set the 2-NaN propagation rule explicitly in the float_status words we use. For active_fpu.fp_status, we do this in a new fp_reset() function which mirrors the existing msa_reset() function in doing "first call restore to set the fp status parts that depend on CPU state, then set the fp status parts that are constant". Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20241025141254.2141506-5-peter.maydell@linaro.org	2024-11-05 10:09:53 +00:00
Peter Maydell	d1ff996788	target/arm: Explicitly set 2-NaN propagation rule Set the 2-NaN propagation rule explicitly in the float_status words we use. We wrap this plus the pre-existing setting of the tininess-before-rounding flag in a new function arm_set_default_fp_behaviours() to avoid repetition, since we have a lot of float_status words at this point. The situation with FPA11 emulation in linux-user is a little odd, and arguably "correct" behaviour there would be to exactly match a real Linux kernel's FPA11 emulation. However FPA11 emulation is essentially dead at this point and so it seems better to continue with QEMU's current behaviour and leave a comment describing the situation. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20241025141254.2141506-4-peter.maydell@linaro.org	2024-11-05 10:09:53 +00:00
Peter Maydell	ee057a9f29	pull-loongarch-20241102 -----BEGIN PGP SIGNATURE----- iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZyXbXgAKCRBAov/yOSY+ 37a9BADZ7vI2idWNXdH+mLNDZNSOxfdKp6ggNgKS3S48Hi2zR72MEhwvR9dGlHDL 98agrbV7/jI9Z+0dLAxvlyl1MvXfnn2sXYgUuZp6IAaQzFBa11HBAK7UFh3sTA4A gD4oPwl8AdJiFvDN6vNjS+dO0ls+j/YMaoLkAKLv15dlWtg4Rw== =EZnr -----END PGP SIGNATURE----- Merge tag 'pull-loongarch-20241102' of https://gitlab.com/gaosong/qemu into staging pull-loongarch-20241102 # -----BEGIN PGP SIGNATURE----- # # iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZyXbXgAKCRBAov/yOSY+ # 37a9BADZ7vI2idWNXdH+mLNDZNSOxfdKp6ggNgKS3S48Hi2zR72MEhwvR9dGlHDL # 98agrbV7/jI9Z+0dLAxvlyl1MvXfnn2sXYgUuZp6IAaQzFBa11HBAK7UFh3sTA4A # gD4oPwl8AdJiFvDN6vNjS+dO0ls+j/YMaoLkAKLv15dlWtg4Rw== # =EZnr # -----END PGP SIGNATURE----- # gpg: Signature made Sat 02 Nov 2024 07:57:18 GMT # gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF # gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF * tag 'pull-loongarch-20241102' of https://gitlab.com/gaosong/qemu: target/loongarch: Add steal time support on migration hw/loongarch/boot: Use warn_report when no kernel filename linux-headers: Update to Linux v6.12-rc5 linux-headers: loongarch: Add kvm_para.h linux-headers: Add unistd_64.h target/loongarch/kvm: Implement LoongArch PMU extension target/loongarch: Implement lbt registers save/restore function target/loongarch: Add loongson binary translation feature Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-11-04 16:01:10 +00:00
Peter Maydell	c94bee4cd6	* target/i386: new feature bits for AMD processors * target/i386/tcg: improvements around flag handling * target/i386: add AVX10 support * target/i386: add GraniteRapids-v2 model * dockerfiles: add libcbor * New nitro-enclave machine type * qom: cleanups to object_new * configure: detect 64-bit MIPS for rust * configure: deprecate 32-bit MIPS -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmcjvkQUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroPIKgf/etNpO2T+eLFtWN/Qd5eopBXqNd9k KmeK9EgW9lqx2IPGNen33O+uKpb/TsMmubSsSF+YxTp7pmkc8+71f3rBMaIAD02r /paHSMVw0+f12DAFQz1jdvGihR7Mew0wcF/UdEt737y6vEmPxLTyYG3Gfa4NSZwT /V5jTOIcfUN/UEjNgIp6NTuOEESKmlqt22pfMapgkwMlAJYeeJU2X9eGYE86wJbq ZSXNgK3jL9wGT2XKa3e+OKzHfFpSkrB0JbQbdico9pefnBokN/hTeeUJ81wBAc7u i00W1CEQVJ5lhBc121d4AWMp83ME6HijJUOTMmJbFIONPsITFPHK1CAkng== =D4nR -----END PGP SIGNATURE----- Merge tag 'for-upstream-i386' of https://gitlab.com/bonzini/qemu into staging * target/i386: new feature bits for AMD processors * target/i386/tcg: improvements around flag handling * target/i386: add AVX10 support * target/i386: add GraniteRapids-v2 model * dockerfiles: add libcbor * New nitro-enclave machine type * qom: cleanups to object_new * configure: detect 64-bit MIPS for rust * configure: deprecate 32-bit MIPS # -----BEGIN PGP SIGNATURE----- # # iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmcjvkQUHHBib256aW5p # QHJlZGhhdC5jb20ACgkQv/vSX3jHroPIKgf/etNpO2T+eLFtWN/Qd5eopBXqNd9k # KmeK9EgW9lqx2IPGNen33O+uKpb/TsMmubSsSF+YxTp7pmkc8+71f3rBMaIAD02r # /paHSMVw0+f12DAFQz1jdvGihR7Mew0wcF/UdEt737y6vEmPxLTyYG3Gfa4NSZwT # /V5jTOIcfUN/UEjNgIp6NTuOEESKmlqt22pfMapgkwMlAJYeeJU2X9eGYE86wJbq # ZSXNgK3jL9wGT2XKa3e+OKzHfFpSkrB0JbQbdico9pefnBokN/hTeeUJ81wBAc7u # i00W1CEQVJ5lhBc121d4AWMp83ME6HijJUOTMmJbFIONPsITFPHK1CAkng== # =D4nR # -----END PGP SIGNATURE----- # gpg: Signature made Thu 31 Oct 2024 17:28:36 GMT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * tag 'for-upstream-i386' of https://gitlab.com/bonzini/qemu: (49 commits) target/i386: Introduce GraniteRapids-v2 model target/i386: Add AVX512 state when AVX10 is supported target/i386: Add feature dependencies for AVX10 target/i386: add CPUID.24 features for AVX10 target/i386: add AVX10 feature and AVX10 version property target/i386: return bool from x86_cpu_filter_features target/i386: do not rely on ExtSaveArea for accelerator-supported XCR0 bits target/i386: cpu: set correct supported XCR0 features for TCG target/i386: use + to put flags together target/i386: use higher-precision arithmetic to compute CF target/i386: use compiler builtin to compute PF target/i386: make flag variables unsigned target/i386: add a note about gen_jcc1 target/i386: add a few more trivial CCPrepare cases target/i386: optimize TEST+Jxx sequences target/i386: optimize computation of ZF from CC_OP_DYNAMIC target/i386: Wrap cc_op_live with a validity check target/i386: Introduce cc_op_size target/i386: Rearrange CCOp target/i386: remove CC_OP_CLR ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2024-11-02 16:21:38 +00:00
Bibo Mao	47b54e15bb	target/loongarch: Add steal time support on migration With pv steal time supported, VM machine needs get physical address of each vcpu and notify new host during migration. Here two functions kvm_get_stealtime/kvm_set_stealtime, and guest steal time physical address is only updated on KVM_PUT_FULL_STATE stage. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Reviewed-by: Song Gao <gaosong@loongson.cn> Message-Id: <20240930064040.753929-1-maobibo@loongson.cn> Signed-off-by: Song Gao <gaosong@loongson.cn>	2024-11-02 15:45:45 +08:00
Bibo Mao	6edd2a9bec	target/loongarch/kvm: Implement LoongArch PMU extension Implement PMU extension for LoongArch kvm mode. Use OnOffAuto type variable pmu to check the PMU feature. If the PMU Feature is not supported with KVM host, it reports error if there is pmu=on command line. If there is no any command line about pmu parameter, it checks whether KVM host supports the PMU Feature and set the corresponding value in cpucfg. This patch is based on lbt patch located at https://lore.kernel.org/qemu-devel/20240904061859.86615-1-maobibo@loongson.cn Co-developed-by: Song Gao <gaosong@loongson.cn> Signed-off-by: Bibo Mao <maobibo@loongson.cn> Reviewed-by: Song Gao <gaosong@loongson.cn> Message-Id: <20240918082315.2345034-1-maobibo@loongson.cn> Signed-off-by: Song Gao <gaosong@loongson.cn>	2024-11-02 15:20:41 +08:00
Bibo Mao	a45df28601	target/loongarch: Implement lbt registers save/restore function Six registers scr0 - scr3, eflags and ftop are added in percpu vmstate. And two functions kvm_loongarch_get_lbt/kvm_loongarch_put_lbt are added to save/restore lbt registers. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Reviewed-by: Song Gao <gaosong@loongson.cn> Message-Id: <20240929070405.235200-3-maobibo@loongson.cn> Signed-off-by: Song Gao <gaosong@loongson.cn>	2024-11-02 15:20:41 +08:00
Bibo Mao	c23a53d894	target/loongarch: Add loongson binary translation feature Loongson Binary Translation (LBT) is used to accelerate binary translation, which contains 4 scratch registers (scr0 to scr3), x86/ARM eflags (eflags) and x87 fpu stack pointer (ftop). Now LBT feature is added in kvm mode, not supported in TCG mode since it is not emulated. Feature variable lbt is added with OnOffAuto type, If lbt feature is not supported with KVM host, it reports error if there is lbt=on command line. If there is no any command line about lbt parameter, it checks whether KVM host supports lbt feature and set the corresponding value in cpucfg. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Reviewed-by: Song Gao <gaosong@loongson.cn> Message-Id: <20240929070405.235200-2-maobibo@loongson.cn> Signed-off-by: Song Gao <gaosong@loongson.cn>	2024-11-02 15:20:41 +08:00
Tao Su	1a519388a8	target/i386: Introduce GraniteRapids-v2 model Update GraniteRapids CPU model to add AVX10 and the missing features(ss, tsc-adjust, cldemote, movdiri, movdir64b). Tested-by: Xuelian Guo <xuelian.guo@intel.com> Signed-off-by: Tao Su <tao1.su@linux.intel.com> Link: https://lore.kernel.org/r/20241028024512.156724-7-tao1.su@linux.intel.com Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20241031085233.425388-9-tao1.su@linux.intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Tao Su	0d7475be3b	target/i386: Add AVX512 state when AVX10 is supported AVX10 state enumeration in CPUID leaf D and enabling in XCR0 register are identical to AVX512 state regardless of the supported vector lengths. Given that some E-cores will support AVX10 but not support AVX512, add AVX512 state components to guest when AVX10 is enabled. Based on a patch by Tao Su <tao1.su@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Xuelian Guo <xuelian.guo@intel.com> Signed-off-by: Tao Su <tao1.su@linux.intel.com> Link: https://lore.kernel.org/r/20241031085233.425388-8-tao1.su@linux.intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Tao Su	150ab84b2d	target/i386: Add feature dependencies for AVX10 Since the highest supported vector length for a processor implies that all lesser vector lengths are also supported, add the dependencies of the supported vector lengths. If all vector lengths aren't supported, clear AVX10 enable bit as well. Note that the order of AVX10 related dependencies should be kept as: CPUID_24_0_EBX_AVX10_128 -> CPUID_24_0_EBX_AVX10_256, CPUID_24_0_EBX_AVX10_256 -> CPUID_24_0_EBX_AVX10_512, CPUID_24_0_EBX_AVX10_VL_MASK -> CPUID_7_1_EDX_AVX10, CPUID_7_1_EDX_AVX10 -> CPUID_24_0_EBX, so that prevent user from setting weird CPUID combinations, e.g. 256-bits and 512-bits are supported but 128-bits is not, no vector lengths are supported but AVX10 enable bit is still set. Since AVX10_128 will be reserved as 1, adding these dependencies has the bonus that when user sets -cpu host,-avx10-128, CPUID_7_1_EDX_AVX10 and CPUID_24_0_EBX will be disabled automatically. Tested-by: Xuelian Guo <xuelian.guo@intel.com> Signed-off-by: Tao Su <tao1.su@linux.intel.com> Link: https://lore.kernel.org/r/20241028024512.156724-5-tao1.su@linux.intel.com Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20241031085233.425388-7-tao1.su@linux.intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Tao Su	2d055b8fe1	target/i386: add CPUID.24 features for AVX10 Introduce features for the supported vector bit lengths. Signed-off-by: Tao Su <tao1.su@linux.intel.com> Link: https://lore.kernel.org/r/20241028024512.156724-3-tao1.su@linux.intel.com Link: https://lore.kernel.org/r/20241028024512.156724-4-tao1.su@linux.intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Xuelian Guo <xuelian.guo@intel.com> Link: https://lore.kernel.org/r/20241031085233.425388-6-tao1.su@linux.intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Tao Su	bccfb846fd	target/i386: add AVX10 feature and AVX10 version property When AVX10 enable bit is set, the 0x24 leaf will be present as "AVX10 Converged Vector ISA leaf" containing fields for the version number and the supported vector bit lengths. Introduce avx10-version property so that avx10 version can be controlled by user and cpu model. Per spec, avx10 version can never be 0, the default value of avx10-version is set to 0 to determine whether it is specified by user. The default can come from the device model or, for the max model, from KVM's reported value. Signed-off-by: Tao Su <tao1.su@linux.intel.com> Link: https://lore.kernel.org/r/20241028024512.156724-3-tao1.su@linux.intel.com Link: https://lore.kernel.org/r/20241028024512.156724-4-tao1.su@linux.intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Xuelian Guo <xuelian.guo@intel.com> Link: https://lore.kernel.org/r/20241031085233.425388-5-tao1.su@linux.intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	3507c6f046	target/i386: return bool from x86_cpu_filter_features Prepare for filtering non-boolean features such as AVX10 version. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Tao Su <tao1.su@linux.intel.com> Link: https://lore.kernel.org/r/20241031085233.425388-4-tao1.su@linux.intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	b888c78070	target/i386: do not rely on ExtSaveArea for accelerator-supported XCR0 bits Right now, QEMU is using the "feature" and "bits" fields of ExtSaveArea to query the accelerator for the support status of extended save areas. This is a problem for AVX10, which attaches two feature bits (AVX512F and AVX10) to the same extended save states. To keep the AVX10 hacks to the minimum, limit usage of esa->features and esa->bits. Instead, just query the accelerator for the 0xD leaf. Do it in common code and clear esa->size if an extended save state is unsupported. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/r/20241031085233.425388-3-tao1.su@linux.intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	33098002a8	target/i386: cpu: set correct supported XCR0 features for TCG Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/r/20241031085233.425388-2-tao1.su@linux.intel.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	6d8623b5c0	target/i386: use + to put flags together This gives greater opportunity for reassociation on x86 targets, since addition can use the LEA instruction. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	134ffcb276	target/i386: use higher-precision arithmetic to compute CF If the operands of the arithmetic instruction fit within a half-register, it's easiest to use a comparison instruction to compute the carry. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	24899cdcd2	target/i386: use compiler builtin to compute PF This removes the 256 byte parity table from the executable. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	46c04e4bcf	target/i386: make flag variables unsigned This makes it easier for the compiler to understand which bits are set, and it also removes "cltq" instructions to canonicalize the output value as 32-bit signed. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	44d58e938b	target/i386: add a note about gen_jcc1 Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	cea677e821	target/i386: add a few more trivial CCPrepare cases Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	37df7c4d57	target/i386: optimize TEST+Jxx sequences Mostly used for TEST+JG and TEST+JLE, but it is easy to cover also JBE/JA and JL/JGE; shaves about 0.5% TCG ops. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	ae14b33de8	target/i386: optimize computation of ZF from CC_OP_DYNAMIC Most uses of CC_OP_DYNAMIC are for CMP/JB/JE or similar sequences. We can optimize many of them to avoid computation of the flags. This eliminates both TCG ops to set up the new cc_op, and helper instructions because evaluating just ZF is much cheaper. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Richard Henderson	1f7f72bdc4	target/i386: Wrap cc_op_live with a validity check Assert that op is known and that cc_op_live_ is populated. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Richard Henderson	f359b2fb71	target/i386: Introduce cc_op_size Replace arithmetic on cc_op with a helper function. Assert that the op has a size and that it is valid for the configuration. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20240701025115.1265117-6-richard.henderson@linaro.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Richard Henderson	ee806f9f67	target/i386: Rearrange CCOp Give the first few enumerators explicit integer constants, align the BWLQ enumerators. This will be used to simplify ((op - CC_OP_*B) & 3). Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20240701025115.1265117-4-richard.henderson@linaro.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	e09447c39f	target/i386: remove CC_OP_CLR Just use CC_OP_EFLAGS; it is not that likely that the flags computed by CC_OP_CLR survive the end of the basic block, in which case there is no need to spill cc_op_src. cc_op_src now does need spilling if the XOR is followed by a memory operation, but this only costs 0.2% extra TCG ops. They will be recouped by simplifications in how QEMU evaluates ZF at runtime, which are even greater with this change. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Richard Henderson	c2954745f2	target/i386: Tidy cc_op_str usage Make const. Use the read-only strings directly; do not copy them into an on-stack buffer with snprintf. Allow for holes in the cc_op_str array, now present with CC_OP_POPCNT. Fixes: `460231ad36` ("target/i386: give CC_OP_POPCNT low bits corresponding to MO_TL") Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20240701025115.1265117-2-richard.henderson@linaro.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	a635390f05	target/i386: use tcg_gen_ext_tl when applicable Prefer it to gen_ext_tl in the common case where the destination is known. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:33 +01:00
Paolo Bonzini	7cac7aa704	target/i386/hvf: fix handling of XSAVE-related CPUID bits The call to xgetbv() is passing the ecx value for cpuid function 0xD, index 0. The xgetbv call thus returns false (OSXSAVE is bit 27, which is well out of the range of CPUID[0xD,0].ECX) and eax is not modified. While fixing it, cache the whole computation of supported XCR0 bits since it will be used for more than just CPUID leaf 0xD. Furthermore, unsupported subleafs of CPUID 0xD (including all those corresponding to zero bits in host's XCR0) must be hidden; if OSXSAVE is not set at all, the whole of CPUID leaf 0xD plus the XSAVE bit must be hidden. Finally, unconditionally drop XSTATE_BNDREGS_MASK and XSTATE_BNDCSR_MASK; real hardware will only show them if the MPX bit is set in CPUID; this is never the case for hvf_get_supported_cpuid() because QEMU's Hypervisor.framework support does not handle the VMX fields related to MPX (even in the unlikely possibility that the host has MPX enabled). So hide those bits in the new cache_host_xcr0(). Cc: Phil Dennis-Jordan <lists@philjordan.eu> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:32 +01:00
Babu Moger	9c07a7af5d	target/i386: Expose new feature bits in CPUID 8000_0021_EAX/EBX Newer AMD CPUs support ERAPS (Enhanced Return Address Prediction Security) feature that enables the auto-clear of RSB entries on a TLB flush, context switches and VMEXITs. The number of default RSP entries is reflected in RapSize. Add the feature bit and feature word to support these features. CPUID_Fn80000021_EAX Bits Feature Description 24 ERAPS: Indicates support for enhanced return address predictor security. CPUID_Fn80000021_EBX Bits Feature Description 31-24 Reserved 23:16 RapSize: Return Address Predictor size. RapSize x 8 is the minimum number of CALL instructions software needs to execute to flush the RAP. 15-00 MicrocodePatchSize. Read-only. Reports the size of the Microcode patch in 16-byte multiples. If 0, the size of the patch is at most 5568 (15C0h) bytes. Link: https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/programmer-references/57238.zip Signed-off-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/7c62371fe60af1e9bbd853f5f8e949bf2d908bd0.1729807947.git.babu.moger@amd.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-31 18:28:32 +01:00

1 2 3 4 5 ...

14225 Commits