mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Richard Henderson	76fd66c9b2	target/i386: Use probe_access_full_mmu in ptw_translate The probe_access_full_mmu function was designed for this purpose, and does not report the memory operation event to plugins. Cc: qemu-stable@nongnu.org Fixes: `6d03226b42` ("plugins: force slow path when plugins instrument memory ops") Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20241013184733.1423747-3-richard.henderson@linaro.org> (cherry picked from commit `115ade42d5`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-11-08 13:02:41 +03:00
Alexander Graf	3c137af782	target/i386: Walk NPT in guest real mode When translating virtual to physical address with a guest CPU that supports nested paging (NPT), we need to perform every page table walk access indirectly through the NPT, which we correctly do. However, we treat real mode (no page table walk) special: In that case, we currently just skip any walks and translate VA -> PA. With NPT enabled, we also need to then perform NPT walk to do GVA -> GPA -> HPA which we fail to do so far. The net result of that is that TCG VMs with NPT enabled that execute real mode code (like SeaBIOS) end up with GPA==HPA mappings which means the guest accesses host code and data. This typically shows as failure to boot guests. This patch changes the page walk logic for NPT enabled guests so that we always perform a GVA -> GPA translation and then skip any logic that requires an actual PTE. That way, all remaining logic to walk the NPT stays and we successfully walk the NPT in real mode. Cc: qemu-stable@nongnu.org Fixes: `fe441054bb` ("target-i386: Add NPT support") Signed-off-by: Alexander Graf <graf@amazon.com> Reported-by: Eduard Vlad <evlad@amazon.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20240921085712.28902-1-graf@amazon.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> (cherry picked from commit `b56617bbcb`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-11-08 13:02:41 +03:00
Richard Henderson	e0f505a3de	target/i386: Do not apply REX to MMX operands Cc: qemu-stable@nongnu.org Fixes: `b3e22b2318` ("target/i386: add core of new i386 decoder") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2495 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20240812025844.58956-2-richard.henderson@linaro.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `416f2b16c0`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-08-28 08:37:15 +03:00
Richard Henderson	3854679d58	target/i386: Fix VSIB decode With normal SIB, index == 4 indicates no index. With VSIB, there is no exception for VR4/VR12. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2474 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20240805003130.1421051-3-richard.henderson@linaro.org Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `ac63755b20`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: modify the change to pre-new-decoder introduced past qemu 9.0)	2024-08-28 08:37:14 +03:00
Chuang Xu	40e04161b3	i386/cpu: fixup number of addressable IDs for processor cores in the physical package When QEMU is started with: -cpu host,host-cache-info=on,l3-cache=off \ -smp 2,sockets=1,dies=1,cores=1,threads=2 Guest can't acquire maximum number of addressable IDs for processor cores in the physical package from CPUID[04H]. When creating a CPU topology of 1 core per package, host-cache-info only uses the Host's addressable core IDs field (CPUID.04H.EAX[bits 31-26]), resulting in a conflict (on the multicore Host) between the Guest core topology information in this field and the Guest's actual cores number. Fix it by removing the unnecessary condition to cover 1 core per package case. This is safe because cores_per_pkg will not be 0 and will be at least 1. Fixes: `d7caf13b5f` ("x86: cpu: fixup number of addressable IDs for logical processors sharing cache") Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com> Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com> Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Message-ID: <20240611032314.64076-1-xuchuangxclwt@bytedance.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `903916f0a0`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: fixup for 8.2 due to other changes in this area past 9.0)	2024-07-01 09:49:12 +03:00
Mark Cave-Ayland	3e09472893	target/i386: fix size of EBP writeback in gen_enter() The calculation of FrameTemp is done using the size indicated by mo_pushpop() before being written back to EBP, but the final writeback to EBP is done using the size indicated by mo_stacksize(). In the case where mo_pushpop() is MO_32 and mo_stacksize() is MO_16 then the final writeback to EBP is done using MO_16 which can leave junk in the top 16-bits of EBP after executing ENTER. Change the writeback of EBP to use the same size indicated by mo_pushpop() to ensure that the full value is written back. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198 Message-ID: <20240606095319.229650-5-mark.cave-ayland@ilande.co.uk> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `3973615e7f`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-06-11 10:02:19 +03:00
Xinyu Li	276ec925a7	target/i386: fix SSE and SSE2 feature check Features check of CPUID_SSE and CPUID_SSE2 should use cpuid_features, rather than cpuid_ext_features. Signed-off-by: Xinyu Li <lixinyu20s@ict.ac.cn> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Message-ID: <20240602100904.2137939-1-lixinyu20s@ict.ac.cn> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `da7c95920d`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-06-06 14:17:00 +03:00
Paolo Bonzini	d84afebcee	target/i386: fix xsave.flat from kvm-unit-tests xsave.flat checks that "executing the XSETBV instruction causes a general- protection fault (#GP) if ECX = 0 and EAX[2:1] has the value 10b". QEMU allows that option, so the test fails. Add the condition. Cc: qemu-stable@nongnu.org Fixes: `892544317f` ("target/i386: implement XSAVE and XRSTOR of AVX registers", 2022-10-18) Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `7604bbc2d8`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-06-06 14:14:19 +03:00
Paolo Bonzini	52031d6be5	target/i386: no single-step exception after MOV or POP SS Intel SDM 18.3.1.4 "If an occurrence of the MOV or POP instruction loads the SS register executes with EFLAGS.TF = 1, no single-step debug exception occurs following the MOV or POP instruction." Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `f0f0136abb`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-05-27 14:27:37 +03:00
Paolo Bonzini	c6171d524d	target/i386: disable jmp_opt if EFLAGS.RF is 1 If EFLAGS.RF is 1, special processing in gen_eob_worker() is needed and therefore goto_tb cannot be used. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `8225bff7c5`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-05-27 14:27:37 +03:00
donsheng	9b98ab7d3d	target-i386: hyper-v: Correct kvm_hv_handle_exit return value This bug fix addresses the incorrect return value of kvm_hv_handle_exit for KVM_EXIT_HYPERV_SYNIC, which should be EXCP_INTERRUPT. Handling of KVM_EXIT_HYPERV_SYNIC in QEMU needs to be synchronous. This means that async_synic_update should run in the current QEMU vCPU thread before returning to KVM, returning EXCP_INTERRUPT to guarantee this. Returning 0 can cause async_synic_update to run asynchronously. One problem (kvm-unit-tests's hyperv_synic test fails with timeout error) caused by this bug: When a guest VM writes to the HV_X64_MSR_SCONTROL MSR to enable Hyper-V SynIC, a VM exit is triggered and processed by the kvm_hv_handle_exit function of the QEMU vCPU. This function then calls the async_synic_update function to set synic->sctl_enabled to true. A true value of synic->sctl_enabled is required before creating SINT routes using the hyperv_sint_route_new() function. If kvm_hv_handle_exit returns 0 for KVM_EXIT_HYPERV_SYNIC, the current QEMU vCPU thread may return to KVM and enter the guest VM before running async_synic_update. In such case, the hyperv_synic test’s subsequent call to synic_ctl(HV_TEST_DEV_SINT_ROUTE_CREATE, ...) immediately after writing to HV_X64_MSR_SCONTROL can cause QEMU’s hyperv_sint_route_new() function to return prematurely (because synic->sctl_enabled is false). If the SINT route is not created successfully, the SINT interrupt will not be fired, resulting in a timeout error in the hyperv_synic test. Fixes: `267e071bd6` (“hyperv: make overlay pages for SynIC”) Suggested-by: Chao Gao <chao.gao@intel.com> Signed-off-by: Dongsheng Zhang <dongsheng.x.zhang@intel.com> Message-ID: <20240521200114.11588-1-dongsheng.x.zhang@intel.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `84d4b72854`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-05-27 07:32:40 +03:00
Paolo Bonzini	2b95625643	target/i386: fix feature dependency for WAITPKG The VMX feature bit depends on general availability of WAITPKG, not the other way round. Fixes: `33cc88261c` ("target/i386: add support for VMX_SECONDARY_EXEC_ENABLE_USER_WAIT_PAUSE", 2023-08-28) Cc: qemu-stable@nongnu.org Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `fe01af5d47`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-05-27 07:32:40 +03:00
Paolo Bonzini	1cc3cb96b8	target/i386: rdpkru/wrpkru are no-prefix instructions Reject 0x66/0xf3/0xf2 in front of them. Cc: qemu-stable@nongnu.org Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `40a3ec7b5f`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-05-27 07:32:40 +03:00
Paolo Bonzini	eb761b4ee5	target/i386: fix operand size for DATA16 REX.W POPCNT According to the manual, 32-bit vs 64-bit is governed by REX.W and REX ignores the 0x66 prefix. This can be confirmed with this program: #include <stdio.h> int main() { int x = 0x12340000; int y; asm("popcntl %1, %0" : "=r" (y) : "r" (x)); printf("%x\n", y); asm("mov $-1, %0; .byte 0x66; popcntl %1, %0" : "+r" (y) : "r" (x)); printf("%x\n", y); asm("mov $-1, %0; .byte 0x66; popcntq %q1, %q0" : "+r" (y) : "r" (x)); printf("%x\n", y); } which prints 5/ffff0000/5 on real hardware and 5/ffff0000/ffff0000 on QEMU. Cc: qemu-stable@nongnu.org Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `41c685dc59`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: drop removal of mo_64_32() helper function in target/i386/tcg/translate.c due to missing-in-8.2 v9.0.0-542-gaef4f4affde2 "target/i386: remove now-converted opcodes from old decoder" which removed other user of it)	2024-05-27 07:32:26 +03:00
Ruihan Li	50ed4f856a	target/i386: Give IRQs a chance when resetting HF_INHIBIT_IRQ_MASK When emulated with QEMU, interrupts will never come in the following loop. However, if the NOP instruction is uncommented, interrupts will fire as normal. loop: cli call do_sti jmp loop do_sti: sti # nop ret This behavior is different from that of a real processor. For example, if KVM is enabled, interrupts will always fire regardless of whether the NOP instruction is commented or not. Also, the Intel Software Developer Manual states that after the STI instruction is executed, the interrupt inhibit should end as soon as the next instruction (e.g., the RET instruction if the NOP instruction is commented) is executed. This problem is caused because the previous code may choose not to end the TB even if the HF_INHIBIT_IRQ_MASK has just been reset (e.g., in the case where the STI instruction is immediately followed by the RET instruction), so that IRQs may not have a change to trigger. This commit fixes the problem by always terminating the current TB to give IRQs a chance to trigger when HF_INHIBIT_IRQ_MASK is reset. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn> Message-ID: <20240415064518.4951-4-lrh2000@pku.edu.cn> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `6a5a63f74b`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-05-13 09:03:37 +03:00
Gregory Price	78bebf8692	target/i386/tcg: Enable page walking from MMIO memory CXL emulation of interleave requires read and write hooks due to requirement for subpage granularity. The Linux kernel stack now enables using this memory as conventional memory in a separate NUMA node. If a process is deliberately forced to run from that node $ numactl --membind=1 ls the page table walk on i386 fails. Useful part of backtrace: (cpu=cpu@entry=0x555556fd9000, fmt=fmt@entry=0x555555fe3378 "cpu_io_recompile: could not find TB for pc=%p") at ../../cpu-target.c:359 (retaddr=0, addr=19595792376, attrs=..., xlat=<optimized out>, cpu=0x555556fd9000, out_offset=<synthetic pointer>) at ../../accel/tcg/cputlb.c:1339 (cpu=0x555556fd9000, full=0x7fffee0d96e0, ret_be=ret_be@entry=0, addr=19595792376, size=size@entry=8, mmu_idx=4, type=MMU_DATA_LOAD, ra=0) at ../../accel/tcg/cputlb.c:2030 (cpu=cpu@entry=0x555556fd9000, p=p@entry=0x7ffff56fddc0, mmu_idx=<optimized out>, type=type@entry=MMU_DATA_LOAD, memop=<optimized out>, ra=ra@entry=0) at ../../accel/tcg/cputlb.c:2356 (cpu=cpu@entry=0x555556fd9000, addr=addr@entry=19595792376, oi=oi@entry=52, ra=ra@entry=0, access_type=access_type@entry=MMU_DATA_LOAD) at ../../accel/tcg/cputlb.c:2439 at ../../accel/tcg/ldst_common.c.inc:301 at ../../target/i386/tcg/sysemu/excp_helper.c:173 (err=0x7ffff56fdf80, out=0x7ffff56fdf70, mmu_idx=0, access_type=MMU_INST_FETCH, addr=18446744072116178925, env=0x555556fdb7c0) at ../../target/i386/tcg/sysemu/excp_helper.c:578 (cs=0x555556fd9000, addr=18446744072116178925, size=<optimized out>, access_type=MMU_INST_FETCH, mmu_idx=0, probe=<optimized out>, retaddr=0) at ../../target/i386/tcg/sysemu/excp_helper.c:604 Avoid this by plumbing the address all the way down from x86_cpu_tlb_fill() where is available as retaddr to the actual accessors which provide it to probe_access_full() which already handles MMIO accesses. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2180 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2220 Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Suggested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Gregory Price <gregory.price@memverge.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-ID: <20240307155304.31241-2-Jonathan.Cameron@huawei.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> (cherry picked from commit `9dab7bbb01`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-03-27 09:47:02 +03:00
Tao Su	cf794e4774	target/i386: Revert monitor_puts() in do_inject_x86_mce() monitor_puts() doesn't check the monitor pointer, but do_inject_x86_mce() may have a parameter with NULL monitor pointer. Revert monitor_puts() in do_inject_x86_mce() to fix, then the fact that we send the same message to monitor and log is again more obvious. Fixes: `bf0c50d4aa` (monitor: expose monitor_puts to rest of code) Reviwed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Tao Su <tao1.su@linux.intel.com> Message-ID: <20240320083640.523287-1-tao1.su@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `7fd226b047`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-03-21 20:16:22 +03:00
Paolo Bonzini	509525dc67	target/i386: fix direction of "32-bit MMU" test The low bit of MMU indices for x86 TCG indicates whether the processor is in 32-bit mode and therefore linear addresses have to be masked to 32 bits. However, the index was computed incorrectly, leading to possible conflicts in the TLB for any address above 4G. Analyzed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Fixes: `b1661801c1` ("target/i386: Fix physical address truncation", 2024-02-28) Fixes: `a28b6b4e74` ("target/i386: Fix physical address truncation" in stable-8.2) Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2206 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `2cc68629a6`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: move changes for x86_cpu_mmu_index() to cpu_mmu_index() due to missing v8.2.0-1030-gace0c5fe59 "target/i386: Populate CPUClass.mmu_index")	2024-03-21 19:57:05 +03:00
Paolo Bonzini	1ced8cd541	target/i386: use separate MMU indexes for 32-bit accesses Accesses from a 32-bit environment (32-bit code segment for instruction accesses, EFER.LMA==0 for processor accesses) have to mask away the upper 32 bits of the address. While a bit wasteful, the easiest way to do so is to use separate MMU indexes. These days, QEMU anyway is compiled with a fixed value for NB_MMU_MODES. Split MMU_USER_IDX, MMU_KSMAP_IDX and MMU_KNOSMAP_IDX in two. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `90f641531c`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: move changes for x86_cpu_mmu_index() to cpu_mmu_index() due to missing v8.2.0-1030-gace0c5fe59 "target/i386: Populate CPUClass.mmu_index")	2024-03-21 19:50:47 +03:00
Paolo Bonzini	a85b8ec872	target/i386: introduce function to query MMU indices Remove knowledge of specific MMU indexes (other than MMU_NESTED_IDX and MMU_PHYS_IDX) from mmu_translate(). This will make it possible to split 32-bit and 64-bit MMU indexes. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `5f97afe254`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: context fixup in target/i386/cpu.h due to other changes in that area)	2024-03-21 19:43:42 +03:00
Paolo Bonzini	decafac46b	target/i386: leave the A20 bit set in the final NPT walk The A20 mask is only applied to the final memory access. Nested page tables are always walked with the raw guest-physical address. Unlike the previous patch, in this one the masking must be kept, but it was done too early. Cc: qemu-stable@nongnu.org Fixes: `4a1e9d4d11` ("target/i386: Use atomic operations for pte updates", 2022-10-18) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `b5a9de3259`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-02-28 21:19:03 +03:00
Paolo Bonzini	6801a20ebd	target/i386: remove unnecessary/wrong application of the A20 mask If ptw_translate() does a MMU_PHYS_IDX access, the A20 mask is already applied in get_physical_address(), which is called via probe_access_full() and x86_cpu_tlb_fill(). If ptw_translate() on the other hand does a MMU_NESTED_IDX access, the A20 mask must not be applied to the address that is looked up in the nested page tables; it must be applied only to the addresses that hold the NPT entries (which is achieved via MMU_PHYS_IDX, per the previous paragraph). Therefore, we can remove A20 masking from the computation of the page table entry's address, and let get_physical_address() or mmu_translate() apply it when they know they are returning a host-physical address. Cc: qemu-stable@nongnu.org Fixes: `4a1e9d4d11` ("target/i386: Use atomic operations for pte updates", 2022-10-18) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `a28fe7dc19`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-02-28 21:18:45 +03:00
Paolo Bonzini	a28b6b4e74	target/i386: Fix physical address truncation The address translation logic in get_physical_address() will currently truncate physical addresses to 32 bits unless long mode is enabled. This is incorrect when using physical address extensions (PAE) outside of long mode, with the result that a 32-bit operating system using PAE to access memory above 4G will experience undefined behaviour. The truncation code was originally introduced in commit `33dfdb5` ("x86: only allow real mode to access 32bit without LMA"), where it applied only to translations performed while paging is disabled (and so cannot affect guests using PAE). Commit `9828198` ("target/i386: Add MMU_PHYS_IDX and MMU_NESTED_IDX") rearranged the code such that the truncation also applied to the use of MMU_PHYS_IDX and MMU_NESTED_IDX. Commit `4a1e9d4` ("target/i386: Use atomic operations for pte updates") brought this truncation into scope for page table entry accesses, and is the first commit for which a Windows 10 32-bit guest will reliably fail to boot if memory above 4G is present. The truncation code however is not completely redundant. Even though the maximum address size for any executed instruction is 32 bits, helpers for operations such as BOUND, FSAVE or XSAVE may ask get_physical_address() to translate an address outside of the 32-bit range, if invoked with an argument that is close to the 4G boundary. Likewise for processor accesses, for example TSS or IDT accesses, when EFER.LMA==0. So, move the address truncation in get_physical_address() so that it applies to 32-bit MMU indexes, but not to MMU_PHYS_IDX and MMU_NESTED_IDX. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2040 Fixes: `4a1e9d4d11` ("target/i386: Use atomic operations for pte updates", 2022-10-18) Cc: qemu-stable@nongnu.org Co-developed-by: Michael Brown <mcb30@ipxe.org> Signed-off-by: Michael Brown <mcb30@ipxe.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `b1661801c1`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: drop unrelated change in target/i386/cpu.c)	2024-02-28 21:15:46 +03:00
Paolo Bonzini	5c4091fe07	target/i386: check validity of VMCB addresses MSR_VM_HSAVE_PA bits 0-11 are reserved, as are the bits above the maximum physical address width of the processor. Setting them to 1 causes a #GP (see "15.30.4 VM_HSAVE_PA MSR" in the AMD manual). The same is true of VMCB addresses passed to VMRUN/VMLOAD/VMSAVE, even though the manual is not clear on that. Cc: qemu-stable@nongnu.org Fixes: `4a1e9d4d11` ("target/i386: Use atomic operations for pte updates", 2022-10-18) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `d09c79010f`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-02-28 21:03:19 +03:00
Paolo Bonzini	6ed8211379	target/i386: mask high bits of CR3 in 32-bit mode CR3 bits 63:32 are ignored in 32-bit mode (either legacy 2-level paging or PAE paging). Do this in mmu_translate() to remove the last where get_physical_address() meaningfully drops the high bits of the address. Cc: qemu-stable@nongnu.org Suggested-by: Richard Henderson <richard.henderson@linaro.org> Fixes: `4a1e9d4d11` ("target/i386: Use atomic operations for pte updates", 2022-10-18) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `68fb78d7d5`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-02-28 21:02:56 +03:00
Ziqiao Kong	0b30735d38	target/i386: Generate an illegal opcode exception on cmp instructions with lock prefix target/i386: As specified by Intel Manual Vol2 3-180, cmp instructions are not allowed to have lock prefix and a `UD` should be raised. Without this patch, s1->T0 will be uninitialized and used in the case OP_CMPL. Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com> Message-ID: <20240215095015.570748-2-ziqiaokong@gmail.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `99d0dcd7f1`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-02-20 18:44:21 +03:00
Xiaoyao Li	f5dddb856c	i386/cpuid: Move leaf 7 to correct group CPUID leaf 7 was grouped together with SGX leaf 0x12 by commit `b9edbadefb` ("i386: Propagate SGX CPUID sub-leafs to KVM") by mistake. SGX leaf 0x12 has its specific logic to check if subleaf (starting from 2) is valid or not by checking the bit 0:3 of corresponding EAX is 1 or not. Leaf 7 follows the logic that EAX of subleaf 0 enumerates the maximum valid subleaf. Fixes: `b9edbadefb` ("i386: Propagate SGX CPUID sub-leafs to KVM") Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20240125024016.2521244-4-xiaoyao.li@intel.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `0729857c70`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-02-20 18:43:03 +03:00
Xiaoyao Li	e8d27721cb	i386/cpuid: Decrease cpuid_i when skipping CPUID leaf 1F Existing code misses a decrement of cpuid_i when skip leaf 0x1F. There's a blank CPUID entry(with leaf, subleaf as 0, and all fields stuffed 0s) left in the CPUID array. It conflicts with correct CPUID leaf 0. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by:Yang Weijiang <weijiang.yang@intel.com> Message-ID: <20240125024016.2521244-2-xiaoyao.li@intel.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `10f92799af`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-02-20 18:42:24 +03:00
Xiaoyao Li	72c4ef9da0	i386/cpu: Mask with XCR0/XSS mask for FEAT_XSAVE_XCR0_HI and FEAT_XSAVE_XSS_HI leafs The value of FEAT_XSAVE_XCR0_HI leaf and FEAT_XSAVE_XSS_HI leaf also need to be masked by XCR0 and XSS mask respectively, to make it logically correct. Fixes: `301e90675c` ("target/i386: Enable support for XSAVES based features") Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Yang Weijiang <weijiang.yang@intel.com> Message-ID: <20240115091325.1904229-3-xiaoyao.li@intel.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `a11a365159`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-02-20 18:41:43 +03:00
Xiaoyao Li	0766f137f5	i386/cpu: Clear FEAT_XSAVE_XSS_LO/HI leafs when CPUID_EXT_XSAVE is not available Leaf FEAT_XSAVE_XSS_LO and FEAT_XSAVE_XSS_HI also need to be cleared when CPUID_EXT_XSAVE is not set. Fixes: `301e90675c` ("target/i386: Enable support for XSAVES based features") Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Yang Weijiang <weijiang.yang@intel.com> Message-ID: <20240115091325.1904229-2-xiaoyao.li@intel.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `81f5cad385`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-02-20 18:41:25 +03:00
Paolo Bonzini	c46f68bd7d	target/i386: pcrel: store low bits of physical address in data[0] For PC-relative translation blocks, env->eip changes during the execution of a translation block, Therefore, QEMU must be able to recover an instruction's PC just from the TranslationBlock struct and the instruction data with. Because a TB will not span two pages, QEMU stores all the low bits of EIP in the instruction data and replaces them in x86_restore_state_to_opc. Bits 12 and higher (which may vary between executions of a PCREL TB, since these only use the physical address in the hash key) are kept unmodified from env->eip. The assumption is that these bits of EIP, unlike bits 0-11, will not change as the translation block executes. Unfortunately, this is incorrect when the CS base is not aligned to a page. Then the linear address of the instructions (i.e. the one with the CS base addred) indeed will never span two pages, but bits 12+ of EIP can actually change. For example, if CS base is 0x80262200 and EIP = 0x6FF4, the first instruction in the translation block will be at linear address 0x802691F4. Even a very small TB will cross to EIP = 0x7xxx, while the linear addresses will remain comfortably within a single page. The fix is simply to use the low bits of the linear address for data[0], since those don't change. Then x86_restore_state_to_opc uses tb->cs_base to compute a temporary linear address (referring to some unknown instruction in the TB, but with the correct values of bits 12 and higher); the low bits are replaced with data[0], and EIP is obtained by subtracting again the CS base. Huge thanks to Mark Cave-Ayland for the image and initial debugging, and to Gitlab user @kjliew for help with bisecting another occurrence of (hopefully!) the same bug. It should be relatively easy to write a testcase that performs MMIO on an EIP with different bits 12+ than the first instruction of the translation block; any help is welcome. Fixes: `e3a79e0e87` ("target/i386: Enable TARGET_TB_PCREL", 2022-10-11) Cc: qemu-stable@nongnu.org Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Cc: Richard Henderson <richard.henderson@linaro.org> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1759 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1964 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2012 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `729ba8e933`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-01-20 12:24:50 +03:00
guoguangyao	652c34cbb2	target/i386: fix incorrect EIP in PC-relative translation blocks The PCREL patches introduced a bug when updating EIP in the !CF_PCREL case. Using s->pc in func gen_update_eip_next() solves the problem. Cc: qemu-stable@nongnu.org Fixes: `b5e0d5d22f` ("target/i386: Fix 32-bit wrapping of pc/eip computation") Signed-off-by: guoguangyao <guoguangyao18@mails.ucas.ac.cn> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20240115020804.30272-1-guoguangyao18@mails.ucas.ac.cn> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `2926eab896`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-01-20 12:21:08 +03:00
Richard Henderson	6e8e580e39	target/i386: Do not re-compute new pc with CF_PCREL With PCREL, we have a page-relative view of EIP, and an approximation of PC = EIP+CSBASE that is good enough to detect page crossings. If we try to recompute PC after masking EIP, we will mess up that approximation and write a corrupt value to EIP. We already handled masking properly for PCREL, so the fix in `b5e0d5d2` was only needed for the !PCREL path. Cc: qemu-stable@nongnu.org Fixes: `b5e0d5d22f` ("target/i386: Fix 32-bit wrapping of pc/eip computation") Reported-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-ID: <20240101230617.129349-1-richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `a58506b748`) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2024-01-20 12:20:44 +03:00
Richard Henderson	b5e0d5d22f	target/i386: Fix 32-bit wrapping of pc/eip computation In 32-bit mode, pc = eip + cs_base is also 32-bit, and must wrap. Failure to do so results in incorrect memory exceptions to the guest. Before `732d548732`, this was implicitly done via truncation to target_ulong but only in qemu-system-i386, not qemu-system-x86_64. To fix this, we must add conditional zero-extensions. Since we have to test for 32 vs 64-bit anyway, note that cs_base is always zero in 64-bit mode. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2022 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20231212172510.103305-1-richard.henderson@linaro.org>	2023-12-12 13:35:08 -08:00
Michael Roth	5746f70d68	i386/sev: Avoid SEV-ES crash due to missing MSR_EFER_LMA bit Commit `7191f24c7f` ("accel/kvm/kvm-all: Handle register access errors") added error checking for KVM_SET_SREGS/KVM_SET_SREGS2. In doing so, it exposed a long-running bug in current KVM support for SEV-ES where the kernel assumes that MSR_EFER_LMA will be set explicitly by the guest kernel, in which case EFER write traps would result in KVM eventually seeing MSR_EFER_LMA get set and recording it in such a way that it would be subsequently visible when accessing it via KVM_GET_SREGS/etc. However, guest kernels currently rely on MSR_EFER_LMA getting set automatically when MSR_EFER_LME is set and paging is enabled via CR0_PG_MASK. As a result, the EFER write traps don't actually expose the MSR_EFER_LMA bit, even though it is set internally, and when QEMU subsequently tries to pass this EFER value back to KVM via KVM_SET_SREGS* it will fail various sanity checks and return -EINVAL, which is now considered fatal due to the aforementioned QEMU commit. This can be addressed by inferring the MSR_EFER_LMA bit being set when paging is enabled and MSR_EFER_LME is set, and synthesizing it to ensure the expected bits are all present in subsequent handling on the host side. Ultimately, this handling will be implemented in the host kernel, but to avoid breaking QEMU's SEV-ES support when using older host kernels, the same handling can be done in QEMU just after fetching the register values via KVM_GET_SREGS*. Implement that here. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Akihiko Odaki <akihiko.odaki@daynix.com> Cc: Philippe Mathieu-Daudé <philmd@linaro.org> Cc: Lara Lazier <laramglazier@gmail.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Maxim Levitsky <mlevitsk@redhat.com> Cc: <kvm@vger.kernel.org> Fixes: `7191f24c7f` ("accel/kvm/kvm-all: Handle register access errors") Signed-off-by: Michael Roth <michael.roth@amd.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20231206155821.1194551-1-michael.roth@amd.com>	2023-12-06 14:34:11 -05:00
Markus Armbruster	298d8b1220	target/i386/cpu: Improve error message for property "vendor" Improve $ qemu-system-x86_64 -device max-x86_64-cpu,vendor=me qemu-system-x86_64: -device max-x86_64-cpu,vendor=me: Property '.vendor' doesn't take value 'me' to qemu-system-x86_64: -device max-x86_64-cpu,vendor=0123456789abc: value of property 'vendor' must consist of exactly 12 characters Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20231031111059.3407803-8-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> [Typo corrected]	2023-11-17 10:07:52 +01:00
Stefan Hajnoczi	ed1d873caa	Misc hardware patch queue HW emulation: - PMBus fixes and tests (Titus) - IDE fixes and tests (Fiona) - New ADM1266 sensor (Titus) - Better error propagation in PCI-ISA i82378 (Philippe) - Declare SD model QOM types using DEFINE_TYPES macro (Philippe) Topology: - Fix CPUState::nr_cores calculation (Zhuocheng Ding and Zhao Liu) Monitor: - Synchronize CPU state in 'info lapic' (Dongli Zhang) QOM: - Have 'cpu-qom.h' target-agnostic (Philippe) - Move ArchCPUClass definition to each target's cpu.h (Philippe) - Call object_class_is_abstract once in cpu_class_by_name (Philippe) UI: - Use correct key names in titles on MacOS / SDL2 (Adrian) MIPS: - Fix MSA BZ/BNZ and TX79 LQ/SQ opcodes (Philippe) Nios2: - Create IRQs after vCPU is realized (Philippe) PPC: - Restrict KVM objects to system emulation (Philippe) - Move target-specific definitions out of 'cpu-qom.h' (Philippe) S390X: - Make hw/s390x/css.h and hw/s390x/sclp.h headers target agnostic (Philippe) X86: - HVF & KVM cleanups (Philippe) Various targets: - Use env_archcpu() to optimize (Philippe) Misc: - Few global variable shadowing removed (Philippe) - Introduce cpu_exec_reset_hold and factor tcg_cpu_reset_hold out (Philippe) - Remove few more 'softmmu' mentions (Philippe) - Fix and cleanup in vl.c (Akihiko & Marc-André) - Resource leak fix in dump (Zongmin Zhou) - MAINTAINERS updates (Thomas, Daniel) -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmVKKmEACgkQ4+MsLN6t wN4xHQ//X/enH4C7K3VP/tSinDiwmXN2o61L9rjqSDQkBaCtktZx4c8qKSDL7V4S vwzmvvBn3biMXQwZNVJo9d0oz2qoaF9tI6Ao0XDHAan9ziagfG9YMqWhkCfj077Q jLdCqkUuMJBvQgXGB1a6UgCme8PQx7h0oqjbCNfB0ZBls24b5DiEjO87LE4OTbTi zKRhYEpZpGwIVcy+1dAsbaBpGFP06sr1doB9Wz4c06eSx7t0kFSPk6U4CyOPrGXh ynyCxPwngxIXmarY8gqPs3SBs7oXsH8Q/ZOHr1LbuXhwSuw/0zBQU9aF7Ir8RPan DB79JjPrtxTAhICKredWT79v9M18D2/1MpONgg4vtx5K2FzGYoAJULCHyfkHMRSM L6/H0ZQPHvf7w72k9EcSQIhd0wPlMqRmfy37/8xcLiw1h4l/USx48QeKaeFWeSEu DgwSk+R61HbrKvQz/U0tF98zUEyBaQXNrKmyzht0YE4peAtpbPNBeRHkd0GMae/Z HOmkt8QlFQ0T14qSK7mSHaSJTUzRvFGD01cbuCDxVsyCWWsesEikXBACZLG5RCRY Rn1WeX1H9eE3kKi9iueLnhzcF9yM5XqFE3f6RnDzY8nkg91lsTMSQgFcIpv6uGyp 3WOTNSC9SoFyI3x8pCWiKOGytPUb8xk+PnOA85wYvVmT+7j6wus= =OVdQ -----END PGP SIGNATURE----- Merge tag 'misc-cpus-20231107' of https://github.com/philmd/qemu into staging Misc hardware patch queue HW emulation: - PMBus fixes and tests (Titus) - IDE fixes and tests (Fiona) - New ADM1266 sensor (Titus) - Better error propagation in PCI-ISA i82378 (Philippe) - Declare SD model QOM types using DEFINE_TYPES macro (Philippe) Topology: - Fix CPUState::nr_cores calculation (Zhuocheng Ding and Zhao Liu) Monitor: - Synchronize CPU state in 'info lapic' (Dongli Zhang) QOM: - Have 'cpu-qom.h' target-agnostic (Philippe) - Move ArchCPUClass definition to each target's cpu.h (Philippe) - Call object_class_is_abstract once in cpu_class_by_name (Philippe) UI: - Use correct key names in titles on MacOS / SDL2 (Adrian) MIPS: - Fix MSA BZ/BNZ and TX79 LQ/SQ opcodes (Philippe) Nios2: - Create IRQs after vCPU is realized (Philippe) PPC: - Restrict KVM objects to system emulation (Philippe) - Move target-specific definitions out of 'cpu-qom.h' (Philippe) S390X: - Make hw/s390x/css.h and hw/s390x/sclp.h headers target agnostic (Philippe) X86: - HVF & KVM cleanups (Philippe) Various targets: - Use env_archcpu() to optimize (Philippe) Misc: - Few global variable shadowing removed (Philippe) - Introduce cpu_exec_reset_hold and factor tcg_cpu_reset_hold out (Philippe) - Remove few more 'softmmu' mentions (Philippe) - Fix and cleanup in vl.c (Akihiko & Marc-André) - Resource leak fix in dump (Zongmin Zhou) - MAINTAINERS updates (Thomas, Daniel) # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmVKKmEACgkQ4+MsLN6t # wN4xHQ//X/enH4C7K3VP/tSinDiwmXN2o61L9rjqSDQkBaCtktZx4c8qKSDL7V4S # vwzmvvBn3biMXQwZNVJo9d0oz2qoaF9tI6Ao0XDHAan9ziagfG9YMqWhkCfj077Q # jLdCqkUuMJBvQgXGB1a6UgCme8PQx7h0oqjbCNfB0ZBls24b5DiEjO87LE4OTbTi # zKRhYEpZpGwIVcy+1dAsbaBpGFP06sr1doB9Wz4c06eSx7t0kFSPk6U4CyOPrGXh # ynyCxPwngxIXmarY8gqPs3SBs7oXsH8Q/ZOHr1LbuXhwSuw/0zBQU9aF7Ir8RPan # DB79JjPrtxTAhICKredWT79v9M18D2/1MpONgg4vtx5K2FzGYoAJULCHyfkHMRSM # L6/H0ZQPHvf7w72k9EcSQIhd0wPlMqRmfy37/8xcLiw1h4l/USx48QeKaeFWeSEu # DgwSk+R61HbrKvQz/U0tF98zUEyBaQXNrKmyzht0YE4peAtpbPNBeRHkd0GMae/Z # HOmkt8QlFQ0T14qSK7mSHaSJTUzRvFGD01cbuCDxVsyCWWsesEikXBACZLG5RCRY # Rn1WeX1H9eE3kKi9iueLnhzcF9yM5XqFE3f6RnDzY8nkg91lsTMSQgFcIpv6uGyp # 3WOTNSC9SoFyI3x8pCWiKOGytPUb8xk+PnOA85wYvVmT+7j6wus= # =OVdQ # -----END PGP SIGNATURE----- # gpg: Signature made Tue 07 Nov 2023 20:15:29 HKT # gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE # gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full] # Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE * tag 'misc-cpus-20231107' of https://github.com/philmd/qemu: (75 commits) dump: Add close fd on error return to avoid resource leak ui/sdl2: use correct key names in win title on mac MAINTAINERS: Add more guest-agent related files to the corresponding section MAINTAINERS: Add include/hw/xtensa/mx_pic.h to the XTFPGA machine section MAINTAINERS: update libvirt devel mailing list address MAINTAINERS: Add the CAN documentation file to the CAN section MAINTAINERS: Add include/hw/timer/tmu012.h to the SH4 R2D section hw/sd: Declare QOM types using DEFINE_TYPES() macro hw/i2c: pmbus: reset page register for out of range reads hw/i2c: pmbus: immediately clear faults on request tests/qtest: add tests for ADM1266 hw/sensor: add ADM1266 device model hw/i2c: pmbus: add VCAP register hw/i2c: pmbus: add fan support hw/i2c: pmbus: add vout mode bitfields hw/i2c: pmbus add support for block receive tests/qtest: ahci-test: add test exposing reset issue with pending callback hw/ide: reset: cancel async DMA operation before resetting state hw/cpu: Update the comments of nr_cores and nr_dies system/cpus: Fix CPUState.nr_cores' calculation ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-11-08 08:32:26 +08:00
Zhao Liu	aa1878fbc9	hw/cpu: Update the comments of nr_cores and nr_dies In the nr_threads' comment, specify it represents the number of threads in the "core" to avoid confusion. Also add comment for nr_dies in CPUX86State. Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20231024090323.1859210-5-zhao1.liu@linux.intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-11-07 13:08:48 +01:00
Zhuocheng Ding	958ac3c42b	system/cpus: Fix CPUState.nr_cores' calculation From CPUState.nr_cores' comment, it represents "number of cores within this CPU package". After `003f230e37` ("machine: Tweak the order of topology members in struct CpuTopology"), the meaning of smp.cores changed to "the number of cores in one die", but this commit missed to change CPUState.nr_cores' calculation, so that CPUState.nr_cores became wrong and now it misses to consider numbers of clusters and dies. At present, only i386 is using CPUState.nr_cores. But as for i386, which supports die level, the uses of CPUState.nr_cores are very confusing: Early uses are based on the meaning of "cores per package" (before die is introduced into i386), and later uses are based on "cores per die" (after die's introduction). This difference is due to that commit `a94e142899` ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine") misunderstood that CPUState.nr_cores means "cores per die" when calculated CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this wrong understanding. With the influence of `003f230e37` and `a94e142899`, for i386 currently the result of CPUState.nr_cores is "cores per die", thus the original uses of CPUState.cores based on the meaning of "cores per package" are wrong when multiple dies exist: 1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is incorrect because it expects "cpus per package" but now the result is "cpus per die". 2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H: EAX[bits 31:26] is incorrect because they expect "cpus per package" but now the result is "cpus per die". The error not only impacts the EAX calculation in cache_info_passthrough case, but also impacts other cases of setting cache topology for Intel CPU according to cpu topology (specifically, the incoming parameter "num_cores" expects "cores per package" in encode_cache_cpuid4()). 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits 15:00] is incorrect because the EBX of 0BH.01H (core level) expects "cpus per package", which may be different with 1FH.01H (The reason is 1FH can support more levels. For QEMU, 1FH also supports die, 1FH.01H:EBX[bits 15:00] expects "cpus per die"). 4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.80000001H is calculated, here "cpus per package" is expected to be checked, but in fact, now it checks "cpus per die". Though "cpus per die" also works for this code logic, this isn't consistent with AMD's APM. 5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.80000008H:ECX expects "cpus per package" but it obtains "cpus per die". 6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c, MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per package", but in these functions, it obtains "cpus per die" and "cores per die". On the other hand, these uses are correct now (they are added in/after `a94e142899`): 1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die meets the actual meaning of CPUState.nr_cores ("cores per die"). 2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID. 04H's calculation) considers number of dies, so it's correct. 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits 15:00] needs "cpus per die" and it gets the correct result, and CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package". When CPUState.nr_cores is correctly changed to "cores per package" again , the above errors will be fixed without extra work, but the "currently" correct cases will go wrong and need special handling to pass correct "cpus/cores per die" they want. Fix CPUState.nr_cores' calculation to fit the original meaning "cores per package", as well as changing calculation of topo_info.cores_per_die, vcpus_per_socket and CPUID.1FH. Fixes: `a94e142899` ("target/i386: Add CPUID.1F generation support for multi-dies PCMachine") Fixes: `003f230e37` ("machine: Tweak the order of topology members in struct CpuTopology") Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com> Co-developed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20231024090323.1859210-4-zhao1.liu@linux.intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-11-07 13:08:48 +01:00
Philippe Mathieu-Daudé	9348028e7e	target: Move ArchCPUClass definition to 'cpu.h' The OBJECT_DECLARE_CPU_TYPE() macro forward-declares each ArchCPUClass type. These forward declarations are sufficient for code in hw/ to use the QOM definitions. No need to expose these structure definitions. Keep each local to their target/ by moving them to the corresponding "cpu.h" header. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20231013140116.255-13-philmd@linaro.org>	2023-11-07 13:08:48 +01:00
Dongli Zhang	2b34d7382b	target/i386/monitor: synchronize cpu state for lapic info While the default "info lapic" always synchronizes cpu state ... mon_get_cpu() -> mon_get_cpu_sync(mon, true) -> cpu_synchronize_state(cpu) -> ioctl KVM_GET_LAPIC (taking KVM as example) ... the cpu state is not synchronized when the apic-id is available as argument. The cpu state should be synchronized when apic-id is available. Otherwise the "info lapic <apic-id>" always returns stale data. Reference: https://lore.kernel.org/all/20211028155457.967291-19-berrange@redhat.com/ Cc: Joe Jin <joe.jin@oracle.com> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Message-ID: <20231030085336.2681386-1-armbru@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-ID: <20231026211938.162815-1-dongli.zhang@oracle.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-11-07 12:13:27 +01:00
Philippe Mathieu-Daudé	de910c496a	target/i386/kvm: Correct comment in kvm_cpu_realize() Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230918160257.30127-4-philmd@linaro.org>	2023-11-07 12:13:27 +01:00
Philippe Mathieu-Daudé	3152e95413	target/i386/hvf: Rename 'X86CPU *x86_cpu' variable as 'cpu' Follow the naming used by other files in target/i386/. No functional changes. Suggested-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Message-Id: <20231020111136.44401-4-philmd@linaro.org>	2023-11-07 12:13:27 +01:00
Philippe Mathieu-Daudé	5366a0644f	target/i386/hvf: Rename 'CPUState *cpu' variable as 'cs' Follow the naming used by other files in target/i386/. No functional changes. Suggested-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Message-Id: <20231020111136.44401-3-philmd@linaro.org>	2023-11-07 12:13:27 +01:00
Philippe Mathieu-Daudé	89c02195c9	target/i386/hvf: Use CPUState typedef Follow C style guidelines and use CPUState forward declaration from "qemu/typedefs.h". No functional changes. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Message-Id: <20231020111136.44401-2-philmd@linaro.org>	2023-11-07 12:13:27 +01:00
Philippe Mathieu-Daudé	a9e445df54	target/i386/hvf: Use env_archcpu() in simulate_[rdmsr/wrmsr]() When CPUArchState* is available (here CPUX86State), we can use the fast env_archcpu() macro to get ArchCPU (here X86CPU). The QOM cast X86_CPU() macro will be slower when building with --enable-qom-cast-debug. Pass CPUX86State as argument to simulate_rdmsr / simulate_wrmsr instead of a CPUState* to avoid an extra cast. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Roman Bolshakov <roman@roolebo.dev> Tested-by: Roman Bolshakov <roman@roolebo.dev> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Message-Id: <20231009110239.66778-7-philmd@linaro.org>	2023-11-07 12:13:27 +01:00
Philippe Mathieu-Daudé	82b641d626	target/i386/hvf: Use x86_cpu in simulate_[rdmsr\|wrmsr]() We already have 'x86_cpu = X86_CPU(cpu)'. Use the variable instead of doing another QOM cast with X86_CPU(). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Roman Bolshakov <roman@roolebo.dev> Tested-by: Roman Bolshakov <roman@roolebo.dev> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Message-Id: <20231009110239.66778-6-philmd@linaro.org>	2023-11-07 12:13:27 +01:00
Philippe Mathieu-Daudé	2d56be5a29	target: Declare FOO_CPU_TYPE_NAME/SUFFIX in 'cpu-qom.h' Hegerogeneous code needs access to the FOO_CPU_TYPE_NAME() macro to resolve target CPU types. Move the declaration (along with the required FOO_CPU_TYPE_SUFFIX) to "cpu-qom.h". "target/foo/cpu-qom.h" is supposed to be target agnostic (include-able by any target). Add such mention in the header. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Acked-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20231013140116.255-7-philmd@linaro.org>	2023-11-07 12:13:27 +01:00
Philippe Mathieu-Daudé	6ee45fac56	target: Unify QOM style Enforce the style described by commit `067109a11c` ("docs/devel: mention the spacing requirement for QOM"): The first declaration of a storage or class structure should always be the parent and leave a visual space between that declaration and the new code. It is also useful to separate backing for properties (options driven by the user) and internal state to make navigation easier. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Message-Id: <20231013140116.255-2-philmd@linaro.org>	2023-11-07 12:13:27 +01:00
David Woodhouse	a72ccc7fc4	hw/xen: add support for Xen primary console in emulated mode The primary console is special because the toolstack maps a page into the guest for its ring, and also allocates the guest-side event channel. The guest's grant table is even primed to export that page using a known grant ref#. Add support for all that in emulated mode, so that we can have a primary console. For reasons unclear, the backends running under real Xen don't just use a mapping of the well-known GNTTAB_RESERVED_CONSOLE grant ref (which would also be in the ring-ref node in XenStore). Instead, the toolstack sets the ring-ref node of the primary console to the GFN of the guest page. The backend is expected to handle that special case and map it with foreignmem operations instead. We don't have an implementation of foreignmem ops for emulated Xen mode, so just make it map GNTTAB_RESERVED_CONSOLE instead. This would probably work for real Xen too, but we can't work out how to make real Xen create a primary console of type "ioemu" to make QEMU drive it, so we can't test that; might as well leave it as it is for now under Xen. Now at last we can boot the Xen PV shim and run PV kernels in QEMU. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>	2023-11-07 08:54:20 +00:00

1 2 3 4 5 ...

1779 Commits