Do not bother generating inline wrappers for gen_repz and gen_repz2;
use s->prefix to separate REPZ from REPNZ in the case of SCAS and
CMPS.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Generalize gen_stack_A0() to include an initial add and to use an arbitrary
destination. This is a common pattern and it is not a huge burden to
add the extra arguments to the only caller of gen_stack_A0().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use mo_stacksize for all stack accesses, including when
a 64-bit code segment is impossible and the code is
therefore checking only for SS32(s).
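For illustration, a minimal standalone sketch of the selection logic; QEMU's
real helper takes a DisasContext and uses the CODE64()/SS32() accessors, and
the enum values here are stand-ins rather than TCG's MemOp:

    #include <stdbool.h>

    /* Stand-in for TCG's MemOp; values are illustrative only. */
    typedef enum { MO_16, MO_32, MO_64 } StackMemOp;

    /* Pick the stack access width from the code/stack segment state. */
    static StackMemOp stack_mem_op(bool code64, bool ss32)
    {
        return code64 ? MO_64 : ss32 ? MO_32 : MO_16;
    }

When a 64-bit code segment is impossible, code64 is always false and the
helper degenerates to the SS32-only check, which is why using it everywhere
is safe.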
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is only used in MONITOR, where a direct call of gen_lea_v_seg
is simpler, and in XLAT. Inline it in the latter.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The is_store argument of gen_ldst_modrm has only ever been passed
a constant. Just split the function in two.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Values other than OR_TMP0 were only ever used by MOV and MOVNTI
opcodes. Now that these have been converted to the new decoder,
remove the argument.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make gen_eob take the DISAS_* constant as an argument, so that
it is not necessary to have wrappers around it.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is an invariant now that there are no calls to gen_eob_inhibit_irq()
outside tb_stop.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
sti only has one exit, so it does not need to generate the
end-of-translation code inline. It can be deferred to tb_stop.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
syscall and sysret only have one exit, so they do not need to
generate the end-of-translation code inline. It can be
deferred to tb_stop.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Place DISAS_* constants that update cpu_eip first, and
the "jump" ones last. Add comments explaining the differences
and usage.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Mark cc_op as clean and do not spill it at the end of the translation block.
Technically this is a tiny bit less efficient, but:
* it results in translations that are a tiny bit smaller
* for most of these instructions, it is not unlikely that they are close to
the end of the basic block, in which case cc_op would not be overwritten
* anyway the cost is probably dwarfed by that of computing flags.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
No need to set it again at the end of the translation block; cc_op_dirty
can be set to false.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is already handled in gen_eob(). Before adding another DISAS_*
case, remove the double calls.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
gen_helper_rsm cannot generate an exception, and reloads the flags.
So there's no need to spill cc_op and update cpu_eip, but on the
other hand cc_op must be reset to CC_OP_EFLAGS before returning.
It all works by chance, because by spilling cc_op before the call
to the helper, it becomes non-dirty and gen_eob will not overwrite
the CC_OP_EFLAGS value that is placed there by the helper. But
let's clean it up.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Intel SDM 18.3.1.4 "If an occurrence of the MOV or POP instruction
loads the SS register executes with EFLAGS.TF = 1, no single-step debug
exception occurs following the MOV or POP instruction."
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If EFLAGS.RF is 1, special processing in gen_eob_worker() is needed and
therefore goto_tb cannot be used.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* target/i386: add control bits support for LAM
* target/i386: tweaks to new translator
* target/i386: add support for LAM in CPUID enumeration
* hw/i386/pc: Support smp.modules for x86 PC machine
* target-i386: hyper-v: Correct kvm_hv_handle_exit return value
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* hw/i386/pc_sysfw: Alias rather than copy isa-bios region
* target/i386: add control bits support for LAM
* target/i386: tweaks to new translator
* target/i386: add support for LAM in CPUID enumeration
* hw/i386/pc: Support smp.modules for x86 PC machine
* target-i386: hyper-v: Correct kvm_hv_handle_exit return value
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZOMlAUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroNTSwf8DOPgipepNcsxUQoV9nOBfNXqEWa6
# DilQGwuu/3eMSPITUCGKVrtLR5azwCwvNfYYErVBPVIhjImnk3XHwfKpH1csadgq
# 7Np8WGjAyKEIP/yC/K1VwsanFHv3hmC6jfcO3ZnsnlmbHsRINbvU9uMlFuiQkKJG
# lP/dSUcTVhwLT6eFr9DVDUnq4Nh7j3saY85pZUoDclobpeRLaEAYrawha1/0uQpc
# g7MZYsxT3sg9PIHlM+flpRvJNPz/ZDBdj4raN1xo4q0ET0KRLni6oEOVs5GpTY1R
# t4O8a/IYkxeI15K9U7i0HwYI2wVwKZbHgp9XPMYVZFJdKBGT8bnF56pV9A==
# =lp7q
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 22 May 2024 10:58:40 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (23 commits)
target-i386: hyper-v: Correct kvm_hv_handle_exit return value
i386/cpu: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]
i386/cpu: Use CPUCacheInfo.share_level to encode CPUID[4]
i386: Add cache topology info in CPUCacheInfo
hw/i386/pc: Support smp.modules for x86 PC machine
tests: Add test case of APIC ID for module level parsing
i386/cpu: Introduce module-id to X86CPU
i386: Support module_id in X86CPUTopoIDs
i386: Expose module level in CPUID[0x1F]
i386: Support modules_per_die in X86CPUTopoInfo
i386: Introduce module level cpu topology to CPUX86State
i386/cpu: Decouple CPUID[0x1F] subleaf with specific topology level
i386: Split topology types of CPUID[0x1F] from the definitions of CPUID[0xB]
i386/cpu: Introduce bitmap to cache available CPU topology levels
i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()
i386/cpu: Use APIC ID info get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]
i386/cpu: Use APIC ID info to encode cache topo in CPUID[4]
i386/cpu: Fix i/d-cache topology to core level for Intel CPU
target/i386: add control bits support for LAM
target/i386: add support for LAM in CPUID enumeration
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Currently the LSX/LASX vector properties are only added when their default
value is true. Instead, the vector properties should be added unconditionally,
independent of their default value, so that a vector extension that is
disabled by default can still be enabled from the command line.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20240521080549.434197-2-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
On the KVM side, get_fpu/set_fpu save the upper 192 bits of the vector
registers, but QEMU was missing them.
Cc: qemu-stable@nongnu.org
Signed-off-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Message-Id: <20240514110752.989572-1-gaosong@loongson.cn>
vmstate does not save kvm_state_counter, which can cause VM recovery from
disk to fail.
Cc: qemu-stable@nongnu.org
Signed-off-by: Song Gao <gaosong@loongson.cn>
Acked-by: Peter Xu <peterx@redhat.com>
Message-Id: <20240508024732.3127792-1-gaosong@loongson.cn>
This bug fix addresses the incorrect return value of kvm_hv_handle_exit for
KVM_EXIT_HYPERV_SYNIC, which should be EXCP_INTERRUPT.
Handling of KVM_EXIT_HYPERV_SYNIC in QEMU needs to be synchronous.
This means that async_synic_update should run in the current QEMU vCPU
thread before returning to KVM, returning EXCP_INTERRUPT to guarantee this.
Returning 0 can cause async_synic_update to run asynchronously.
One problem caused by this bug (kvm-unit-tests' hyperv_synic test fails with
a timeout error):
When a guest VM writes to the HV_X64_MSR_SCONTROL MSR to enable Hyper-V SynIC,
a VM exit is triggered and processed by the kvm_hv_handle_exit function of the
QEMU vCPU. This function then calls the async_synic_update function to set
synic->sctl_enabled to true. A true value of synic->sctl_enabled is required
before creating SINT routes using the hyperv_sint_route_new() function.
If kvm_hv_handle_exit returns 0 for KVM_EXIT_HYPERV_SYNIC, the current QEMU
vCPU thread may return to KVM and enter the guest VM before running
async_synic_update. In such a case, the hyperv_synic test’s subsequent call to
synic_ctl(HV_TEST_DEV_SINT_ROUTE_CREATE, ...) immediately after writing to
HV_X64_MSR_SCONTROL can cause QEMU’s hyperv_sint_route_new() function to return
prematurely (because synic->sctl_enabled is false).
If the SINT route is not created successfully, the SINT interrupt will not be
fired, resulting in a timeout error in the hyperv_synic test.
Fixes: 267e071bd6 (“hyperv: make overlay pages for SynIC”)
Suggested-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Dongsheng Zhang <dongsheng.x.zhang@intel.com>
Message-ID: <20240521200114.11588-1-dongsheng.x.zhang@intel.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID[0x8000001D].EAX[bits 25:14] NumSharingCache: number of logical
processors sharing cache.
The number of logical processors sharing this cache is
NumSharingCache + 1.
Now that cache models have topology information, we can use
CPUCacheInfo.share_level to decide which topology level is encoded
into CPUID[0x8000001D].EAX[bits 25:14].
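As a rough sketch (not QEMU's code), the encoding boils down to writing the
share count minus one into bits 25:14, with the count derived from the APIC ID
width of the topology level selected by share_level:

    #include <stdint.h>

    /* Hypothetical helper: num_sharing is the number of logical processors
     * sharing the cache at the chosen topology level (assumed to be derived
     * from the APIC ID width of that level, hence a power of two). */
    static uint32_t encode_num_sharing_cache(uint32_t eax, uint32_t num_sharing)
    {
        eax &= ~(0xfffu << 14);                    /* clear bits 25:14 */
        eax |= ((num_sharing - 1) & 0xfff) << 14;  /* NumSharingCache */
        return eax;
    }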
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-22-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID[4].EAX[bits 25:14] is used to represent the cache topology for
Intel CPUs.
Now that cache models have topology information, we can use
CPUCacheInfo.share_level to decide which topology level is encoded
into CPUID[4].EAX[bits 25:14].
Since, with the helper max_processor_ids_for_cache(), the field
CPUID[4].EAX[bits 25:14] (the original variable "num_apic_ids") is now
derived from the CPU topology levels, which are already verified when
parsing -smp, there is no need to check this value with
"assert(num_apic_ids > 0)" again, so remove this assert().
Additionally, wrap the encoding of CPUID[4].EAX[bits 31:26] into a
helper to make the code cleaner.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-21-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, by default, the cache topology is encoded as:
1. i/d cache is shared in one core.
2. L2 cache is shared in one core.
3. L3 cache is shared in one die.
This general default has caused the misunderstanding that the cache
topology is completely equated with a specific CPU topology, such as
tying the L2 cache to the core level and the L3 cache to the die level.
In fact, the settings of these topologies depend on the specific
platform and are not static. For example, on Alder Lake-P, every
four Atom cores share the same L2 cache.
Thus, we should explicitly define the corresponding cache topology for
different cache models to increase scalability.
Except for legacy_l2_cache_cpuid2 (whose default topo level is
CPU_TOPO_LEVEL_UNKNOW), explicitly set the corresponding topology level
for all other cache models. In order to be compatible with the existing
cache topology, set the CPU_TOPO_LEVEL_CORE level for the i/d cache, set
the CPU_TOPO_LEVEL_CORE level for L2 cache, and set the
CPU_TOPO_LEVEL_DIE level for L3 cache.
The field for CPUID[4].EAX[bits 25:14] or CPUID[0x8000001D].EAX[bits
25:14] will be set based on CPUCacheInfo.share_level.
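A trimmed-down, illustrative stand-in (not QEMU's real CPUCacheInfo
definition; names and values are assumptions) showing how an explicit
share_level ties each cache model to a topology level:

    typedef enum {
        TOPO_LEVEL_UNKNOWN,
        TOPO_LEVEL_CORE,
        TOPO_LEVEL_DIE,
    } TopoLevel;

    typedef struct {
        const char *type;      /* "L1D", "L1I", "L2", "L3" */
        int level;
        TopoLevel share_level; /* topology level the cache is shared at */
    } CacheModel;

    static const CacheModel example_caches[] = {
        { "L1D", 1, TOPO_LEVEL_CORE },
        { "L1I", 1, TOPO_LEVEL_CORE },
        { "L2",  2, TOPO_LEVEL_CORE },
        { "L3",  3, TOPO_LEVEL_DIE  },
    };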
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20240424154929.1487382-20-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce module-id to be consistent with the module-id field in
CpuInstanceProperties.
Following the legacy smp check rules, also add the module_id validity check
to x86_cpu_pre_plug().
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-17-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The Linux kernel (from v6.4, with commit edc0a2b595765 ("x86/topology: Fix
erroneous smp_num_siblings on Intel Hybrid platforms")) is able to
handle platforms with the module level enumerated via CPUID.1F.
Expose the module level in CPUID[0x1F] if the machine has more than 1
module.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-15-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Support module level in i386 cpu topology structure "X86CPUTopoInfo".
Since x86 does not yet support the "modules" parameter in "-smp",
X86CPUTopoInfo.modules_per_die is currently always 1.
Therefore, the module level width in APIC ID, which can be calculated by
"apicid_bitwidth_for_count(topo_info->modules_per_die)", is always 0 for
now, so we can directly add APIC ID related helpers to support module
level parsing.
In addition, update topology structure in test-x86-topo.c.
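For reference, a standalone sketch of the bit-width calculation mentioned
above (QEMU's apicid_bitwidth_for_count uses its own clz32 helper; count is
assumed to be at least 1):

    /* Number of APIC ID bits needed to enumerate `count` entities at a
     * topology level; returns 0 for count == 1, which is why the module
     * width stays 0 while modules_per_die is always 1. */
    static unsigned apicid_width_for_count(unsigned count)
    {
        count -= 1;
        return count ? 32 - __builtin_clz(count) : 0;
    }

For example, a count of 1 gives width 0, a count of 2 gives 1, and a count
of 3 or 4 gives 2.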
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-14-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Intel CPUs implement module level on hybrid client products (e.g.,
ADL-N, MTL, etc) and E-core server products.
A module contains a set of cores that share certain resources (in
current products, the resource usually includes L2 cache, as well as
module scoped features and MSRs).
Module level support is the prerequisite for L2 cache topology at the
module level. With the module level, we can make the guest's CPU
topology, and future cache topology, consistent with the host's on
Intel hybrid client/E-core server platforms.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-13-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
At present, the subleaf 0x02 of CPUID[0x1F] is bound to the "die" level.
In fact, the specific topology level exposed in 0x1F depends on the
platform's support for extension levels (module, tile and die).
To help expose "module" level in 0x1F, decouple CPUID[0x1F] subleaf
with specific topology level.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240424154929.1487382-12-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CPUID[0xB] defines SMT, Core and Invalid types, and this leaf is shared
by Intel and AMD CPUs.
But for extended topology levels, Intel CPUs (in CPUID[0x1F]) and AMD CPUs
(in CPUID[0x80000026]) have different definitions with different
enumeration values.
Though CPUID[0x80000026] hasn't been implemented in QEMU, to avoid
possible misunderstanding, split topology types of CPUID[0x1F] from the
definitions of CPUID[0xB] and introduce CPUID[0x1F]-specific topology
types.
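For reference, the level-type values reported in CPUID[0x1F].ECX[bits 15:8]
per the Intel SDM (the enum name below is illustrative, not QEMU's
identifier):

    enum Cpuid1fLevelType {
        CPUID_1F_LEVEL_TYPE_INVALID = 0,
        CPUID_1F_LEVEL_TYPE_SMT     = 1,
        CPUID_1F_LEVEL_TYPE_CORE    = 2,
        CPUID_1F_LEVEL_TYPE_MODULE  = 3,
        CPUID_1F_LEVEL_TYPE_TILE    = 4,
        CPUID_1F_LEVEL_TYPE_DIE     = 5,
    };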
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-11-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, QEMU checks the specific number of topology domains to detect
whether there are extended topology levels (e.g., checking nr_dies).
With this bitmap, the extended CPU topology (the levels other than SMT,
core and package) is easier to detect without touching the topology
details.
This is also in preparation for the follow-up that decouples the
CPUID[0x1F] subleaves from specific topology levels.
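A standalone sketch of the bitmap idea (the level numbering is an assumption,
not QEMU's enum): each available topology level sets one bit, so "is there a
module level?" becomes a bit test instead of a check of nr_modules or nr_dies:

    #include <stdbool.h>
    #include <stdint.h>

    enum { LEVEL_SMT, LEVEL_CORE, LEVEL_MODULE, LEVEL_DIE, LEVEL_PKG };

    static inline void set_topo_level(uint32_t *avail, int level)
    {
        *avail |= 1u << level;
    }

    static inline bool has_topo_level(uint32_t avail, int level)
    {
        return avail & (1u << level);
    }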
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240424154929.1487382-10-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In cpu_x86_cpuid(), there are many variables representing the CPU
topology, e.g., topo_info, cs->nr_cores and cs->nr_threads.
Since the names of cs->nr_cores and cs->nr_threads do not accurately
reflect their meaning, using them is prone to confusion and mistakes.
The structure X86CPUTopoInfo, on the other hand, names its members
clearly, so the variable "topo_info" should be preferred.
In addition, to use the topology variable uniformly in cpu_x86_cpuid(),
replace env->dies with topo_info.dies_per_pkg as well.
Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-9-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The commit 8f4202fb10 ("i386: Populate AMD Processor Cache Information
for cpuid 0x8000001D") adds the cache topology for AMD CPU by encoding
the number of sharing threads directly.
From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
means [1]:
The number of logical processors sharing this cache is the value of
this field incremented by 1. To determine which logical processors are
sharing a cache, determine a Share Id for each processor as follows:
ShareId = LocalApicId >> log2(NumSharingCache+1)
Logical processors with the same ShareId then share a cache. If
NumSharingCache+1 is not a power of two, round it up to the next power
of two.
From the description above, the calculation of this field should be the
same as that of CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the
APIC ID offsets to calculate this field.
[1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
Information
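A worked, standalone sketch of the APM formula quoted above (not QEMU's
code): derive the ShareId from a local APIC ID and NumSharingCache, rounding
NumSharingCache+1 up to a power of two first:

    #include <stdint.h>

    static uint32_t cache_share_id(uint32_t local_apic_id,
                                   uint32_t num_sharing_cache)
    {
        uint32_t n = num_sharing_cache + 1;
        unsigned shift = 0;

        while ((1u << shift) < n) {   /* ceil(log2(n)) */
            shift++;
        }
        return local_apic_id >> shift;
    }

Logical processors whose APIC IDs produce the same ShareId share the cache.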
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240424154929.1487382-8-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Referring to the cache_info_passthrough fixes ([1], [2]) and the SDM,
CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26] should use the
nearest power-of-2 integer.
The nearest power-of-2 integer can be calculated by pow2ceil() or by
using APIC ID offset/width (like L3 topology using 1 << die_offset [3]).
But in fact, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26]
are associated with the APIC ID. For example, in the Linux kernel, the
field "num_threads_sharing" (bits 25 - 14) is parsed with the APIC ID.
As another example, on Alder Lake-P, CPUID.04H:EAX[bits 31:26] does not
match the actual number of cores; it is calculated as
"(1 << (pkg_offset - core_offset)) - 1".
Therefore, the APIC ID topology information should be preferred when
calculating the nearest power-of-2 integer for CPUID.04H:EAX[bits 25:14]
and CPUID.04H:EAX[bits 31:26]:
1. The d/i cache is shared in a core, so 1 << core_offset should be used
instead of "cs->nr_threads" in encode_cache_cpuid4() for
CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14].
2. The L2 cache is supposed to be shared in a core for now, so
1 << core_offset should also be used instead of "cs->nr_threads" in
encode_cache_cpuid4() for CPUID.04H.02H:EAX[bits 25:14].
3. Similarly, the value for CPUID.04H:EAX[bits 31:26] should also be
calculated from the bit width between the package and SMT levels in
the APIC ID ((1 << (pkg_offset - core_offset)) - 1).
In addition, use APIC ID bit calculations to replace pow2ceil() for the
cache_info_passthrough case.
[1]: efb3934adf ("x86: cpu: make sure number of addressable IDs for processor cores meets the spec")
[2]: d7caf13b5f ("x86: cpu: fixup number of addressable IDs for logical processors sharing cache")
[3]: d65af288a8 ("i386: Update new x86_apicid parsing rules with die_offset support")
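A standalone sketch of the encoding described above (not QEMU's
encode_cache_cpuid4; names are assumptions), assuming the APIC ID offsets
have already been derived from the -smp topology:

    #include <stdint.h>

    /* Bits 25:14 hold the maximum number of addressable IDs sharing the
     * cache minus one; bits 31:26 hold the maximum number of addressable
     * core IDs in the package minus one. */
    static uint32_t encode_cpuid4_eax_topo(uint32_t eax,
                                           unsigned core_offset,
                                           unsigned pkg_offset)
    {
        uint32_t ids_sharing_cache = (1u << core_offset) - 1;
        uint32_t core_ids_in_pkg = (1u << (pkg_offset - core_offset)) - 1;

        eax &= ~((0xfffu << 14) | (0x3fu << 26));
        eax |= (ids_sharing_cache & 0xfff) << 14;
        eax |= (core_ids_in_pkg & 0x3f) << 26;
        return eax;
    }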
Fixes: 7e3482f824 ("i386: Helpers to encode cache information consistently")
Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-7-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For the i-cache and d-cache, QEMU currently hardcodes the maximum IDs
for CPUs sharing the cache (CPUID.04H.00H:EAX[bits 25:14] and
CPUID.04H.01H:EAX[bits 25:14]) to 0, which means the i-cache and d-cache
are shared at the SMT level.
This is correct if there is a single thread per core, but is wrong for
the hyper-threading case (one core contains multiple threads), since the
i-cache and d-cache are shared at the core level rather than the SMT
level.
For AMD CPUs, commit 8f4202fb10 ("i386: Populate AMD Processor Cache
Information for cpuid 0x8000001D") has already introduced the i/d cache
topology at the core level by default.
Therefore, in order to be compatible with both multi-threaded and
single-threaded situations, the i-cache and d-cache should be shared at
the core level by default.
This fix changes the default i/d cache topology from per-thread to
per-core. Potentially, this change in L1 cache topology may affect the
performance of the VM if the user does not explicitly specify the
topology or bind the vCPUs. However, the way to achieve optimal
performance should be to create a reasonable topology and set the
appropriate vCPU affinity without relying on QEMU's default topology
structure.
Fixes: 7e3482f824 ("i386: Helpers to encode cache information consistently")
Suggested-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20240424154929.1487382-6-zhao1.liu@intel.com>
[Add compat property. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
LAM uses CR3[61] and CR3[62] to configure/enable LAM on user pointers.
LAM uses CR4[28] to configure/enable LAM on supervisor pointers.
For CR3 LAM bits, no additional handling needed:
- TCG
LAM is not supported for TCG of target-i386. helper_write_crN() and
helper_vmrun() check max physical address bits before calling
cpu_x86_update_cr3(), no change needed, i.e. CR3 LAM bits are not allowed
to be set in TCG.
- gdbstub
x86_cpu_gdb_write_register() will call cpu_x86_update_cr3() to update cr3.
Allow gdb to set the LAM bit(s) in CR3; if the vCPU doesn't support LAM,
KVM_SET_SREGS will fail as it does for other reserved bits.
For the CR4 LAM bit, its reservation depends on whether the vCPU supports
the LAM feature.
- TCG
LAM is not supported for TCG of target-i386. helper_write_crN() and
helper_vmrun() check CR4 reserved bit before calling cpu_x86_update_cr4(),
i.e. CR4 LAM bit is not allowed to be set in TCG.
- gdbstub
x86_cpu_gdb_write_register() will call cpu_x86_update_cr4() to update cr4.
Mask out LAM bit on CR4 if vcpu doesn't support LAM.
- x86_cpu_reset_hold() doesn't need special handling.
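For illustration, a standalone sketch of the control bits described above;
the macro and function names are assumptions, not QEMU's definitions:

    #include <stdbool.h>
    #include <stdint.h>

    #define CR3_LAM_U57_BIT  (1ULL << 61)   /* LAM on user pointers, 57-bit */
    #define CR3_LAM_U48_BIT  (1ULL << 62)   /* LAM on user pointers, 48-bit */
    #define CR4_LAM_SUP_BIT  (1ULL << 28)   /* LAM on supervisor pointers */

    /* gdbstub case: drop the CR4 LAM bit when the vCPU lacks the feature,
     * mirroring the "mask out LAM bit on CR4" behaviour described above. */
    static uint64_t sanitize_cr4(uint64_t cr4, bool cpu_has_lam)
    {
        if (!cpu_has_lam) {
            cr4 &= ~CR4_LAM_SUP_BIT;
        }
        return cr4;
    }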
Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240112060042.19925-3-binbin.wu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linear Address Masking (LAM) is a new Intel CPU feature, which allows
software to use the untranslated address bits for metadata.
The bit definition:
CPUID.(EAX=7,ECX=1):EAX[26]
Add CPUID definition for LAM.
Note that the LAM feature is not supported by TCG for target-i386, so the
LAM CPUID bit will not be added to TCG_7_1_EAX_FEATURES.
More info can be found in Intel ISE Chapter "LINEAR ADDRESS MASKING(LAM)"
https://cdrdv2.intel.com/v1/dl/getContent/671368
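As a small sketch, testing the enumeration bit given the EAX value of
CPUID.(EAX=7,ECX=1), however that value was obtained (the function name is
illustrative):

    #include <stdbool.h>
    #include <stdint.h>

    static bool cpuid_7_1_eax_has_lam(uint32_t eax)
    {
        return (eax >> 26) & 1;   /* CPUID.(EAX=7,ECX=1):EAX[26] */
    }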
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Co-developed-by: Binbin Wu <binbin.wu@linux.intel.com>
Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240112060042.19925-2-binbin.wu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The 32-bit AAM/AAD opcodes are using helpers that read and write flags and
env->regs[R_EAX]. Clean them up so that the table correctly includes AX
as a 16-bit input and output.
No real reason to do it to be honest, but they are nice one-output helpers
and it removes the masking of env->regs[R_EAX] that generic load/writeback
code already does.
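For reference, the architectural semantics of the two opcodes (flag updates
omitted), which show why AX is both the 16-bit input and output; this is a
standalone sketch, not QEMU's helper code:

    #include <stdint.h>

    /* AAM imm8: AH = AL / imm8, AL = AL % imm8 (imm8 == 0 raises #DE). */
    static uint16_t aam(uint16_t ax, uint8_t imm8)
    {
        uint8_t al = ax & 0xff;
        return (uint16_t)(((al / imm8) << 8) | (al % imm8));
    }

    /* AAD imm8: AL = (AL + AH * imm8) & 0xff, AH = 0. */
    static uint16_t aad(uint16_t ax, uint8_t imm8)
    {
        uint8_t al = ax & 0xff, ah = ax >> 8;
        return (uint8_t)(al + ah * imm8);
    }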
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240522123912.608497-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
gen_rot_carry and gen_rot_overflow are meant to be called with count == NULL
if the count cannot be zero. However this is not done in gen_ROL and gen_ROR,
and writing everywhere "can_be_zero ? count : NULL" is burdensome and less
readable. Just pass can_be_zero as a separate argument.
gen_RCL and gen_RCR use a conditional branch to skip the computation
if count is zero, so they can pass false unconditionally to gen_rot_overflow.
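To illustrate the idea behind the argument, a standalone sketch of an 8-bit
rotate left (not QEMU's TCG code): when the count may be zero, CF/OF must be
left untouched, otherwise they are updated unconditionally:

    #include <stdbool.h>
    #include <stdint.h>

    static uint8_t rol8(uint8_t val, unsigned count, bool can_be_zero,
                        bool *cf, bool *of)
    {
        unsigned c = count & 7;
        uint8_t res = (uint8_t)((val << c) | (val >> ((8 - c) & 7)));

        if (!can_be_zero || count != 0) {
            *cf = res & 1;                       /* bit rotated into the LSB */
            *of = ((res >> 7) & 1) ^ (res & 1);  /* defined for count == 1 */
        }
        return res;
    }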
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20240522123914.608516-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- Use TCG_COND_TST where applicable.
- Use CF_BP_PAGE instead of a local breakpoint search.
- Clean up IAOQ handling during translation.
- Implement CF_PCREL.
- Implement PSW.B.
- Implement PSW.X.
- Log cpu state on interrupt and rfi.
Merge tag 'pull-hppa-20240515' of https://gitlab.com/rth7680/qemu into staging
target/hppa:
- Use TCG_COND_TST where applicable.
- Use CF_BP_PAGE instead of a local breakpoint search.
- Clean up IAOQ handling during translation.
- Implement CF_PCREL.
- Implement PSW.B.
- Implement PSW.X.
- Log cpu state on interrupt and rfi.
# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmZEgnwdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+43gf8CakQdMSqfGV2nGP+
# 7wWZOAV04IyfkJ38F/CH0ihUkblEOzXJ1shTFkrHEw257j0D10MctSSbjrqz5BwU
# obQcwoVlxzTGXqzhkZ6wagkcqjv3TtlPtznZIk6JssdlrtwIKDmE2/3t1dzHnyBD
# WTrS0SK3YvVRovq/ai51raUbiBsNq7XG3skHEsMKsFxp4EaDP5JTbputdQWdffjh
# TBmXImhHC3gm09KWIUZwfEBHlaa7YXk2orzB8kBE8S2kQj9vrGXEaC4jYnBcQLPw
# NDDkBYRqxHYQr0vIAHee+5cUgt1jDBr5rXnAnJwzK0wyEEc4Mi4OTPhNE604iu2y
# SDxS8Q==
# =A4Qf
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 15 May 2024 11:38:04 AM CEST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [ultimate]
* tag 'pull-hppa-20240515' of https://gitlab.com/rth7680/qemu: (43 commits)
target/hppa: Log cpu state on return-from-interrupt
target/hppa: Log cpu state at interrupt
target/hppa: Implement CF_PCREL
target/hppa: Adjust priv for B,GATE at runtime
target/hppa: Drop tlb_entry return from hppa_get_physical_address
target/hppa: Implement PSW_X
target/hppa: Implement PSW_B
target/hppa: Manage PSW_X and PSW_B in translator
target/hppa: Split PSW X and B into their own field
target/hppa: Improve hppa_cpu_dump_state
target/hppa: Do not mask in copy_iaoq_entry
target/hppa: Store full iaoq_f and page offset of iaoq_b in TB
linux-user/hppa: Force all code addresses to PRIV_USER
target/hppa: Use delay_excp for conditional trap on overflow
target/hppa: Use delay_excp for conditional traps
target/hppa: Introduce DisasDelayException
target/hppa: Remove cond_free
target/hppa: Use TCG_COND_TST* in trans_ftest
target/hppa: Use registerfields.h for FPSR
target/hppa: Use TCG_COND_TST* in trans_bb_imm
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that the groundwork has been laid, enabling CF_PCREL within the
translator proper is a simple matter of updating copy_iaoq_entry
and install_iaq_entries.
We also need to modify the unwind info, since we no longer have
absolute addresses to install.
As expected, this reduces the runtime overhead of compilation when
running a Linux kernel with address space randomization enabled.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Do not compile in the priv change based on the first translation;
look up the PTE at execution time. This is required for CF_PCREL,
where a page may be mapped multiple times with different attributes.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The return-by-reference is never used.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use PAGE_WRITE_INV to temporarily enable write permission on a given
page, driven by PSW_X being set.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
PSW_B causes B,GATE to trap as an illegal instruction, removing our
previous sequential execution test that was merely an approximation.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
PSW_X is cleared after every instruction, and only set by RFI.
PSW_B is cleared after every non-branch, or branch not taken,
and only set by taken branches. We can clear both bits with a
single store, at most once per TB. Taken branches set PSW_B,
at most once per TB.
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Generally, both of these bits are cleared at the end of each
instruction. By separating these, we will be able to clear
both with a single insn, instead of 2 or 3.
Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>