This is just meant to serve as an example on how we can implement
hypercalls. xen_version specifically since Qemu does all kind of
feature controllability. So handling that here seems appropriate.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Implement kvm_gva_rw() safely]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This means handling the new exit reason for Xen but still
crashing on purpose. As we implement each of the hypercalls
we will then return the right return code.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Add CPL to hypercall tracing, disallow hypercalls from CPL > 0]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
There are (at least) three different vCPU ID number spaces. One is the
internal KVM vCPU index, based purely on which vCPU was chronologically
created in the kernel first. If userspace threads are all spawned and
create their KVM vCPUs in essentially random order, then the KVM indices
are basically random too.
The second number space is the APIC ID space, which is consistent and
useful for referencing vCPUs. MSIs will specify the target vCPU using
the APIC ID, for example, and the KVM Xen APIs also take an APIC ID
from userspace whenever a vCPU needs to be specified (as opposed to
just using the appropriate vCPU fd).
The third number space is not normally relevant to the kernel, and is
the ACPI/MADT/Xen CPU number which corresponds to cs->cpu_index. But
Xen timer hypercalls use it, and Xen timer hypercalls *really* want
to be accelerated in the kernel rather than handled in userspace, so
the kernel needs to be told.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Introduce support for emulating CPUID for Xen HVM guests. It doesn't make
sense to advertise the KVM leaves to a Xen guest, so do Xen unconditionally
when the xen-version machine property is set.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[dwmw2: Obtain xen_version from KVM property, make it automatic]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
This just initializes the basic Xen support in KVM for now. Only permitted
on TYPE_PC_MACHINE because that's where the sysbus devices for Xen heap
overlay, event channel, grant tables and other stuff will exist. There's
no point having the basic hypercall support if nothing else works.
Provide sysemu/kvm_xen.h and a kvm_xen_get_caps() which will be used
later by support devices.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Change to match the recent change to probe_access_flags.
All existing callers updated to supply 0, so no change in behaviour.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
- Various header cleaned up (removing pointless headers)
- Mark various files/code user/system specific
- Make various objects target-independent
- Remove tswapN() calls from dump.o
- Suggest g_assert_not_reached() instead of assert(0)
- qdev / qom
- Replace various container_of() by QOM cast macros
- Declare some QOM macros using OBJECT_DECLARE_TYPE()
- Embed OHCI QOM child in SM501 chipset
- hw (ISA & IDE)
- add some documentation, improve function names
- un-inline, open-code few functions
- have ISA API accessing IRQ/DMA prefer ISABus over ISADevice
- Demote IDE subsystem maintenance to "Odd Fixes"
- ui: Improve Ctrl+Alt hint on Darwin Cocoa
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmP9IeAACgkQ4+MsLN6t
wN7bdQ//SxJYJuQvqTT6s+O0LmP6NbqvhxCXX7YAwK2jCTM+zTgcqqRZCcisLQol
3ENu2UhnZmiLKHSOxatOVozbws08/u8Vl+WkW4UTMUb1yo5KPaPtq808Y95RdAJB
7D7B5juDGnFRAHXZz38zVk9uIuEkm+Po/pD0JQa+upBtAAgOJTqGavDNSR5+T0Yl
VjGdwK0b10skPqiF6OABYoy/4IFHVJJFIbARZh+a7hrF0llsbzUts5JiYsOxEEHQ
t3woUItdMnS1m0+Ty4AQ8m0Yv9y4HZOIzixvsZ+vChj5ariwUhL9/7wC/s/UCYEg
gKVA5X8R6n/ME6DScK99a+CyR/MXkz70b/rOUZxoutXhV3xdh4X1stL4WN9W/m3z
D4i4ZrUsDUcKCGWlj49of/dKbOPwk1+e/mT0oDZD6JzG0ODjfdVxvJ/JEV2iHgS3
WqHuSKzX/20H9j7/MgfbQ0HjBFOQ8tl781vQzhD+y+cF/IiTsHhrE6esIWho4bob
kfSdVydUWWRnBsnyGoRZXoEMX9tn+pu0nKxEDm2Bo2+jajsa0aZZPokgjxaz4MnD
Hx+/p1E+8IuOn05JgzQSgTJmKFdSbya203tXIsTo1kL2aJTJ6QfMvgEPP/fkn+lS
oQyVBFZmb1JDdTM1MxOncnlWLg74rp/CWEc+u5pSdbxMO/M/uac=
=AV/+
-----END PGP SIGNATURE-----
Merge tag 'buildsys-qom-qdev-ui-20230227' of https://github.com/philmd/qemu into staging
- buildsys
- Various header cleaned up (removing pointless headers)
- Mark various files/code user/system specific
- Make various objects target-independent
- Remove tswapN() calls from dump.o
- Suggest g_assert_not_reached() instead of assert(0)
- qdev / qom
- Replace various container_of() by QOM cast macros
- Declare some QOM macros using OBJECT_DECLARE_TYPE()
- Embed OHCI QOM child in SM501 chipset
- hw (ISA & IDE)
- add some documentation, improve function names
- un-inline, open-code few functions
- have ISA API accessing IRQ/DMA prefer ISABus over ISADevice
- Demote IDE subsystem maintenance to "Odd Fixes"
- ui: Improve Ctrl+Alt hint on Darwin Cocoa
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmP9IeAACgkQ4+MsLN6t
# wN7bdQ//SxJYJuQvqTT6s+O0LmP6NbqvhxCXX7YAwK2jCTM+zTgcqqRZCcisLQol
# 3ENu2UhnZmiLKHSOxatOVozbws08/u8Vl+WkW4UTMUb1yo5KPaPtq808Y95RdAJB
# 7D7B5juDGnFRAHXZz38zVk9uIuEkm+Po/pD0JQa+upBtAAgOJTqGavDNSR5+T0Yl
# VjGdwK0b10skPqiF6OABYoy/4IFHVJJFIbARZh+a7hrF0llsbzUts5JiYsOxEEHQ
# t3woUItdMnS1m0+Ty4AQ8m0Yv9y4HZOIzixvsZ+vChj5ariwUhL9/7wC/s/UCYEg
# gKVA5X8R6n/ME6DScK99a+CyR/MXkz70b/rOUZxoutXhV3xdh4X1stL4WN9W/m3z
# D4i4ZrUsDUcKCGWlj49of/dKbOPwk1+e/mT0oDZD6JzG0ODjfdVxvJ/JEV2iHgS3
# WqHuSKzX/20H9j7/MgfbQ0HjBFOQ8tl781vQzhD+y+cF/IiTsHhrE6esIWho4bob
# kfSdVydUWWRnBsnyGoRZXoEMX9tn+pu0nKxEDm2Bo2+jajsa0aZZPokgjxaz4MnD
# Hx+/p1E+8IuOn05JgzQSgTJmKFdSbya203tXIsTo1kL2aJTJ6QfMvgEPP/fkn+lS
# oQyVBFZmb1JDdTM1MxOncnlWLg74rp/CWEc+u5pSdbxMO/M/uac=
# =AV/+
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 27 Feb 2023 21:34:24 GMT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* tag 'buildsys-qom-qdev-ui-20230227' of https://github.com/philmd/qemu: (125 commits)
ui/cocoa: user friendly characters for release mouse
dump: Add create_win_dump() stub for non-x86 targets
dump: Simplify compiling win_dump.o by introducing win_dump_available()
dump: Clean included headers
dump: Replace TARGET_PAGE_SIZE -> qemu_target_page_size()
dump: Replace tswapN() -> cpu_to_dumpN()
hw/ide/pci: Add PCIIDEState::isa_irq[]
hw/ide/via: Replace magic 2 value by ARRAY_SIZE / MAX_IDE_DEVS
hw/ide/piix: Refactor pci_piix_init_ports as pci_piix_init_bus per bus
hw/ide/piix: Pass Error* to pci_piix_init_ports() for better error msg
hw/ide/piix: Remove unused includes
hw/ide/pci: Unexport bmdma_active_if()
hw/ide/ioport: Remove unnecessary includes
hw/ide: Declare ide_get_[geometry/bios_chs_trans] in 'hw/ide/internal.h'
hw/ide: Rename idebus_active_if() -> ide_bus_active_if()
hw/ide: Rename ide_init2() -> ide_bus_init_output_irq()
hw/ide: Rename ide_exec_cmd() -> ide_bus_exec_cmd()
hw/ide: Rename ide_register_restart_cb -> ide_bus_register_restart_cb
hw/ide: Rename ide_create_drive() -> ide_bus_create_drive()
hw/ide: Rename ide_set_irq() -> ide_bus_set_irq()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The ioapic sources reside in hw/intc already. Move the headers there
as well.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230213173033.98762-11-shentey@gmail.com>
[PMD: Keep ioapic_internal.h in hw/intc/, not under include/]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
x86_cpu_dump_local_apic_state() is called from monitor.c which
is only compiled for system emulation since commit bf95728400
("monitor: remove target-specific code from monitor.c").
Interestingly this stub was added few weeks later in commit
1f871d49e3 ("hmp: added local apic dump state") and was not
necessary by that time.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221216220158.6317-5-philmd@linaro.org>
NEED_CPU_H is always defined for these target-specific headers.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221216220158.6317-2-philmd@linaro.org>
The 'hwaddr' type is only available / meaningful on system emulation.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20221216215519.5522-5-philmd@linaro.org>
The new CPU model mostly inherits features from Icelake-Server, while
adding new features:
- AMX (Advance Matrix eXtensions)
- Bus Lock Debug Exception
and new instructions:
- AVX VNNI (Vector Neural Network Instruction):
- VPDPBUS: Multiply and Add Unsigned and Signed Bytes
- VPDPBUSDS: Multiply and Add Unsigned and Signed Bytes with Saturation
- VPDPWSSD: Multiply and Add Signed Word Integers
- VPDPWSSDS: Multiply and Add Signed Integers with Saturation
- FP16: Replicates existing AVX512 computational SP (FP32) instructions
using FP16 instead of FP32 for ~2X performance gain
- SERIALIZE: Provide software with a simple way to force the processor to
complete all modifications, faster, allowed in all privilege levels and
not causing an unconditional VM exit
- TSX Suspend Load Address Tracking: Allows programmers to choose which
memory accesses do not need to be tracked in the TSX read set
- AVX512_BF16: Vector Neural Network Instructions supporting BFLOAT16
inputs and conversion instructions from IEEE single precision
- fast zero-length MOVSB (KVM doesn't support yet)
- fast short STOSB (KVM doesn't support yet)
- fast short CMPSB, SCASB (KVM doesn't support yet)
Features that may be added in future versions:
- CET (virtualization support hasn't been merged)
Signed-off-by: Wang, Lei <lei4.wang@intel.com>
Reviewed-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <20220812055751.14553-1-lei4.wang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are just a flag that documents the performance characteristic of
an instruction; it needs no hypervisor support. So include them even
if KVM does not show them. In particular, FZRM/FSRS/FSRC have only
been added very recently, but they are available on Sapphire Rapids
processors.
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are three more markers for string operation optimizations.
They can all be added to TCG, whose string operations are more or
less as fast as they can be for short lengths.
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fast short REP MOVS can be added to TCG, since a trivial translation
of string operation is a good option for short lengths.
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We did not correctly handle N >= operand size.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1374
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230114233206.3118472-1-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230207075115.1525-2-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Before this commit, when GDB attached an OS working on QEMU, order of FPU
stack registers printed by GDB command 'info float' was wrong. There was a
bug causing the problem in 'g' packets sent by QEMU to GDB. The packets have
values of registers of machine emulated by QEMU containing FPU stack
registers. There are 2 ways to specify a x87 FPU stack register. The first
is specifying by absolute indexed register names (R0, ..., R7). The second
is specifying by stack top relative indexed register names (ST0, ..., ST7).
Values of the FPU stack registers should be located in 'g' packet and be
ordered by the relative index. But QEMU had located these registers ordered
by the absolute index. After this commit, when QEMU reads registers to make
a 'g' packet, QEMU specifies FPU stack registers by the relative index.
Then, the registers are ordered correctly in the packet. As a result, GDB,
the packet receiver, can print FPU stack registers in the correct order.
Signed-off-by: TaiseiIto <taisei1212@outlook.jp>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <TY0PR0101MB4285923FBE9AD97CE832D95BA4E59@TY0PR0101MB4285.apcprd01.prod.exchangelabs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Failure to truncate the inputs results in garbage for the carry-out.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1373
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230115012103.3131796-1-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When ADCX is followed by ADOX or vice versa, the second instruction's
carry comes from EFLAGS and the condition codes use the CC_OP_ADCOX
operation. Retrieving the carry from EFLAGS is handled by this bit
of gen_ADCOX:
tcg_gen_extract_tl(carry_in, cpu_cc_src,
ctz32(cc_op == CC_OP_ADCX ? CC_C : CC_O), 1);
Unfortunately, in this case cc_op has been overwritten by the previous
"if" statement to CC_OP_ADCOX. This works by chance when the first
instruction is ADCX; however, if the first instruction is ADOX,
ADCX will incorrectly take its carry from OF instead of CF.
Fix by moving the computation of the new cc_op at the end of the function.
The included exhaustive test case fails without this patch and passes
afterwards.
Because ADCX/ADOX need not be invoked through the VEX prefix, this
regression bisects to commit 16fc5726a6 ("target/i386: reimplement
0x0f 0x38, add AVX", 2022-10-18). However, the mistake happened a
little earlier, when BMI instructions were rewritten using the new
decoder framework.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1471
Reported-by: Paul Jolly <https://gitlab.com/myitcv>
Fixes: 1d0b926150 ("target/i386: move scalar 0F 38 and 0F 3A instruction to new decoder", 2022-10-18)
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We forgot to set cc_src, which is used for computing C.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1370
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230114180601.2993644-1-richard.henderson@linaro.org>
Cc: qemu-stable@nongnu.org
Fixes: 1d0b926150 ("target/i386: move scalar 0F 38 and 0F 3A instruction to new decoder", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There were two problems here: not limiting the input to operand bits,
and not correctly handling large extraction length.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1372
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230114230542.3116013-3-richard.henderson@linaro.org>
Cc: qemu-stable@nongnu.org
Fixes: 1d0b926150 ("target/i386: move scalar 0F 38 and 0F 3A instruction to new decoder", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use tcg_gen_atomic_cmpxchg_i128 for the atomic case,
and tcg_gen_qemu_ld/st_i128 otherwise.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use tcg_gen_atomic_cmpxchg_i64 for the atomic case,
and tcg_gen_nonatomic_cmpxchg_i64 otherwise.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The HAXM project has been retired (see https://github.com/intel/haxm#status),
so we should mark the code in QEMU as deprecated (and finally remove it
unless somebody else picks the project up again - which is quite unlikely
since there are now whpx and hvf on these operating systems, too).
Message-Id: <20230126121034.1035138-1-thuth@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We have two inclusion loops:
block/block.h
-> block/block-global-state.h
-> block/block-common.h
-> block/blockjob.h
-> block/block.h
block/block.h
-> block/block-io.h
-> block/block-common.h
-> block/blockjob.h
-> block/block.h
I believe these go back to Emanuele's reorganization of the block API,
merged a few months ago in commit d7e2fe4aac.
Fortunately, breaking them is merely a matter of deleting unnecessary
includes from headers, and adding them back in places where they are
now missing.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221221133551.3967339-2-armbru@redhat.com>
The 'hwaddr' type is defined in "exec/hwaddr.h" as:
hwaddr is the type of a physical address
(its size can be different from 'target_ulong').
All definitions use the 'HWADDR_' prefix, except TARGET_FMT_plx:
$ fgrep define include/exec/hwaddr.h
#define HWADDR_H
#define HWADDR_BITS 64
#define HWADDR_MAX UINT64_MAX
#define TARGET_FMT_plx "%016" PRIx64
^^^^^^
#define HWADDR_PRId PRId64
#define HWADDR_PRIi PRIi64
#define HWADDR_PRIo PRIo64
#define HWADDR_PRIu PRIu64
#define HWADDR_PRIx PRIx64
#define HWADDR_PRIX PRIX64
Since hwaddr's size can be *different* from target_ulong, it is
very confusing to read one of its format using the 'TARGET_FMT_'
prefix, normally used for the target_long / target_ulong types:
$ fgrep TARGET_FMT_ include/exec/cpu-defs.h
#define TARGET_FMT_lx "%08x"
#define TARGET_FMT_ld "%d"
#define TARGET_FMT_lu "%u"
#define TARGET_FMT_lx "%016" PRIx64
#define TARGET_FMT_ld "%" PRId64
#define TARGET_FMT_lu "%" PRIu64
Apparently this format was missed during commit a8170e5e97
("Rename target_phys_addr_t to hwaddr"), so complete it by
doing a bulk-rename with:
$ sed -i -e s/TARGET_FMT_plx/HWADDR_FMT_plx/g $(git grep -l TARGET_FMT_plx)
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230110212947.34557-1-philmd@linaro.org>
[thuth: Fix some warnings from checkpatch.pl along the way]
Signed-off-by: Thomas Huth <thuth@redhat.com>
VRCPSS, VRSQRTSS and VCVTSx2Sx have a 32-bit or 64-bit memory operand,
which is represented in the decoding tables by X86_VEX_REPScalar. Add it
to the tables, and make validate_vex() handle the case of an instruction
that is in exception type 4 without the REP prefix and exception type 5
with it; this is the cas of VRCP and VRSQRT.
Reported-by: yongwoo <https://gitlab.com/yongwoo36>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1377
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
To avoid compilation errors when -Werror=maybe-uninitialized is used,
add a default case with g_assert_not_reached().
Otherwise with GCC 11.3.1 "cc (GCC) 11.3.1 20220421 (Red Hat 11.3.1-2)"
we get:
../target/i386/ops_sse.h: In function ‘helper_vpermdq_ymm’:
../target/i386/ops_sse.h:2495:13: error: ‘r3’ may be used
uninitialized in this function [-Werror=maybe-uninitialized]
2495 | d->Q(3) = r3;
| ~~~~~~~~^~~~
../target/i386/ops_sse.h:2494:13: error: ‘r2’ may be used
uninitialized in this function [-Werror=maybe-uninitialized]
2494 | d->Q(2) = r2;
| ~~~~~~~~^~~~
../target/i386/ops_sse.h:2493:13: error: ‘r1’ may be used
uninitialized in this function [-Werror=maybe-uninitialized]
2493 | d->Q(1) = r1;
| ~~~~~~~~^~~~
../target/i386/ops_sse.h:2492:13: error: ‘r0’ may be used
uninitialized in this function [-Werror=maybe-uninitialized]
2492 | d->Q(0) = r0;
| ~~~~~~~~^~~~
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20221222140158.1260748-1-eric.auger@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When in 64-bit mode, IDT entiries are 16 bytes, so `intno * 16` is used
for base/limit/offset calculations. However, even in 64-bit mode, the
exception error code still uses bits [3,16) for the invlaid interrupt
index.
This means the error code should still be `intno * 8 + 2` even in 64-bit
mode.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1382
Signed-off-by: Joe Richey <joerichey@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new SGX Asynchronous Exit (AEX) notification mechanism (AEX-notify)
allows one enclave to receive a notification in the ERESUME after the
enclave exit due to an AEX. EDECCSSA is a new SGX user leaf function
(ENCLU[EDECCSSA]) to facilitate the AEX notification handling.
Whether the hardware supports to create enclave with AEX-notify support
is enumerated via CPUID.(EAX=0x12,ECX=0x1):EAX[10]. The new EDECCSSA
user leaf function is enumerated via CPUID.(EAX=0x12,ECX=0x0):EAX[11].
Add support to allow to expose the new SGX AEX-notify feature and the
new EDECCSSA user leaf function to KVM guest.
Link: https://lore.kernel.org/lkml/166760360549.4906.809756297092548496.tip-bot2@tip-bot2/
Link: https://lore.kernel.org/lkml/166760360934.4906.2427175408052308969.tip-bot2@tip-bot2/
Reviewed-by: Yang Zhong <yang.zhong@linux.intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Message-Id: <20221109024834.172705-1-kai.huang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-machine kernel-irqchip=off is broken for many guest OSes; kernel-irqchip=split
is the replacement that works, so remove the deprecated support for the former.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Convert the i386 CPU class to use 3-phase reset, so it doesn't
need to use device_class_set_parent_reset() any more.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Edgar E. Iglesias <edgar@zeroasic.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Message-id: 20221124115023.2437291-7-peter.maydell@linaro.org
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmOZ6lYSHGFybWJydUBy
ZWRoYXQuY29tAAoJEDhwtADrkYZT6VEQAKynjWh3AIZ4/qOgrVqsP0oRspevLmfH
BbuGoldjYpEE7RbwuCaZalZ7iy7TcSySxnPfUDVsFHd7NWffJVjwKHifGC0D/Ez0
+Ggyb1CBebN+mS7t+BNFUHdMM+wxFIlHwg4f4aTFbn2o0HKgj2a8tcNzNRonZbfa
xURnvbD4G4u0VZEc3Jak+x193xbOJFsuuWq0BZnDuNk+XqjyW2RwfpXLPJVk+82a
4uy/YgYuqXUqBeULwcJj+shBL4SXR9GyajTFMS64przSUle0ADUmXkPtaS2agV7e
Pym/UQuAcxvNyw34fJsiMZxx6rZI9YU30jQUMRLoYcPRR/Q/aiPeiiHtiD6Kaid7
IfOeH/EArXaQRFpD89xj4YcaTnRLQOEj0NXgXvAbQf6eD8JYyao/S/0lCsPZEoA2
nibLqEQ25ncDNXoSomuwtfjVff3w68lODFbhwqfA0gf3cPtCgVZ6xQ8P/McNY6K6
wqFHXMWTDHk1LOCTucjYz1z2TGzTnSG4iWi5Yt6FSxAc958AO+v5ALn/1pcYun+E
azM/MF0AInKj2aJCT530zT0tpCs/Jo07YKC8k6ubi77S0ZdmGS1XLeXkRXfk1+yI
OhuUgiVlSTHxD69DagT2vbnx1mDMM9X+OBIMvEi5nwvD9A/ghaCgkDeGFvbA1ud0
t0mxPBZJ+tiZ
=JJjG
-----END PGP SIGNATURE-----
Merge tag 'pull-misc-2022-12-14' of https://repo.or.cz/qemu/armbru into staging
Miscellaneous patches for 2022-12-14
# gpg: Signature made Wed 14 Dec 2022 15:23:02 GMT
# gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg: issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* tag 'pull-misc-2022-12-14' of https://repo.or.cz/qemu/armbru:
ppc4xx_sdram: Simplify sdram_ddr_size() to return
block/vmdk: Simplify vmdk_co_create() to return directly
cleanup: Tweak and re-run return_directly.cocci
io: Tidy up fat-fingered parameter name
qapi: Use returned bool to check for failure (again)
sockets: Use ERRP_GUARD() where obviously appropriate
qemu-config: Use ERRP_GUARD() where obviously appropriate
qemu-config: Make config_parse_qdict() return bool
monitor: Use ERRP_GUARD() in monitor_init()
monitor: Simplify monitor_fd_param()'s error handling
error: Move ERRP_GUARD() to the beginning of the function
error: Drop a few superfluous ERRP_GUARD()
error: Drop some obviously superfluous error_propagate()
Drop more useless casts from void * to pointer
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The has_FOO for pointer-valued FOO are redundant, except for arrays.
They are also a nuisance to work with. Recent commit "qapi: Start to
elide redundant has_FOO in generated C" provided the means to elide
them step by step. This is the step for qapi/machine*.json.
Said commit explains the transformation in more detail. The invariant
violations mentioned there do not occur here.
Cc: Eduardo Habkost <eduardo@habkost.net>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221104160712.3005652-16-armbru@redhat.com>
Commit 012d4c96e2 changed the visitor functions taking Error ** to
return bool instead of void, and the commits following it used the new
return value to simplify error checking. Since then a few more uses
in need of the same treatment crept in. Do that. All pretty
mechanical except for
* balloon_stats_get_all()
This is basically the same transformation commit 012d4c96e2 applied
to the virtual walk example in include/qapi/visitor.h.
* set_max_queue_size()
Additionally replace "goto end of function" by return.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221121085054.683122-10-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In get_physical_address, the canonical address check failed to
set TranslateFault.stage2, which resulted in an uninitialized
read from the struct when reporting the fault in x86_cpu_tlb_fill.
Adjust all error paths to use structure assignment so that the
entire struct is always initialized.
Reported-by: Daniel Hoffman <dhoff749@gmail.com>
Fixes: 9bbcf37219 ("target/i386: Reorg GET_HPHYS")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221201074522.178498-1-richard.henderson@linaro.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1324
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
MMX state is saved/restored by FSAVE/FRSTOR so the instructions are
not illegal opcodes even if CR4.OSFXSR=0. Make sure that validate_vex
takes into account the prefix and only checks HF_OSFXSR_MASK in the
presence of an SSE instruction.
Fixes: 20581aadec ("target/i386: validate VEX prefixes via the instructions' exception classes", 2022-10-18)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1350
Reported-by: Helge Konetzka (@hejko on gitlab.com)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When translating code that is using LAHF and SAHF in combination with the
REX prefix, the instructions should not use any other register than AH;
however, QEMU selects SPL (SP being register 4, just like AH) if the
REX prefix is present. To fix this, use deposit directly without
going through gen_op_mov_v_reg and gen_op_mov_reg_v.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/130
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Unlike the memory case, where "the destination operand receives a write
cycle without regard to the result of the comparison", rm must not be
touched altogether if the write fails, including not zero-extending
it on 64-bit processors. This is not how the movcond currently works,
because it is always followed by a gen_op_mov_reg_v to rm.
To fix it, introduce a new function that is similar to gen_op_mov_reg_v
but writes to a TCG temporary.
Considering that gen_extu(ot, oldv) is not needed in the memory case
either, the two cases for register and memory destinations are different
enough that one might as well fuse the two "if (mod == 3)" into one.
So do that too.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/508
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[rth: Add a test case ]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
If CR0.PG is unset, pg_mode will be zero, but it will also be zero
for non-PAE/non-PSE page tables with CR0.WP=0. Restore the
correct test for paging enabled.
Fixes: 98281984a3 ("target/i386: Add MMU_PHYS_IDX and MMU_NESTED_IDX")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1269
Reported-by: Andreas Gustafsson <gson@gson.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221102091232.1092552-1-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The helpers for reset_rf, cli, sti, clac, stac are
completely trivial; implement them inline.
Drop some nearby #if 0 code.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The value passed is always true, and if the target's
synchronize_from_tb hook is non-trivial, not exiting
may be erroneous.
Reviewed-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Avoid cpu_restore_state, and modifying env->eip out from
underneath the translator with TARGET_TB_PCREL. There is
some slight duplication from x86_restore_state_to_opc,
but it's just a few lines.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1269
Reviewed-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Specify maximum possible APIC ID assigned for current VM session to KVM
prior to the creation of vCPUs. By this setting, KVM can set up VM-scoped
data structure indexed by the APIC ID, e.g. Posted-Interrupt Descriptor
pointer table to support Intel IPI virtualization, with the most optimal
memory footprint.
It can be achieved by calling KVM_ENABLE_CAP for KVM_CAP_MAX_VCPU_ID
capability once KVM has enabled it. Ignoring the return error if KVM
doesn't support this capability yet.
Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Acked-by: Peter Xu <peterx@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20220825025246.26618-1-guang.zeng@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These memory allocation functions return void *, and casting to
another pointer type is useless clutter. Drop these casts.
If you really want another pointer type, consider g_new().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220923120025.448759-3-armbru@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The only issue with FMA instructions is that there are _a lot_ of them (30
opcodes, each of which comes in up to 4 versions depending on VEX.W and
VEX.L; a total of 96 possibilities). However, they can be implement with
only 6 helpers, two for scalar operations and four for packed operations.
(Scalar versions do not do any merging; they only affect the bottom 32
or 64 bits of the output operand. Therefore, there is no separate XMM
and YMM of the scalar helpers).
First, we can reduce the number of helpers to one third by passing four
operands (one output and three inputs); the reordering of which operands
go to the multiply and which go to the add is done in emit.c.
Second, the different instructions also dispatch to the same softfloat
function, so the flags for float32_muladd and float64_muladd are passed
in the helper as int arguments, with a little extra complication to
handle FMADDSUB and FMSUBADD.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
F16C only consists of two instructions, which are a bit peculiar
nevertheless.
First, they access only the low half of an YMM or XMM register for the
packed-half operand; the exact size still depends on the VEX.L flag.
This is similar to the existing avx_movx flag, but not exactly because
avx_movx is hardcoded to affect operand 2. To this end I added a "ph"
format name; it's possible to reuse this approach for the VPMOVSX and
VPMOVZX instructions, though that would also require adding two more
formats for the low-quarter and low-eighth of an operand.
Second, VCVTPS2PH is somewhat weird because it *stores* the result of
the instruction into memory rather than loading it.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
VROUND, FSTCW and STMXCSR all have to perform the same conversion from
x86 rounding modes to softfloat constants. Since the ISA is consistent
on the meaning of the two-bit rounding modes, extract the common code
into a wrapper for set_float_rounding_mode.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If the destination is a memory register, op->n is -1. Going through
tcg_gen_gvec_dup_imm path is both useless (the value has been stored
by the gen_* function already) and wrong because of the out-of-bounds
access.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With all SSE (and AVX!) instructions now implemented in disas_insn_new,
it's possible to remove gen_sse, as well as the helpers for instructions
that now use gvec.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This adds another kind of weirdness when you thought you had seen it all:
an opcode byte that comes _after_ the address, not before. It's not
worth adding a new X86_SPECIAL_* constant for it, but it's actually
not unlike VCMP; so, forgive me for exploiting the similarity and just
deciding to dispatch to the right gen_helper_* call in a single code
generation function.
In fact, the old decoder had a bug where s->rip_offset should have
been set to 1 for 3DNow! instructions, and it's fixed now.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Include AVX, AVX2 and VAES in the guest cpuid features supported by TCG.
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-40-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are exactly the same as the non-VEX version, but one has to be careful
that only VEX.L=0 is allowed.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Here the code is a bit uglier due to the truncation and extension
of registers to and from 32-bit. There is also a mistake in the
manual with respect to the size of the memory operand of CVTPS2PI
and CVTTPS2PI, reported by Ricky Zhou.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are mostly moves, and yet are a total pain. The main issue
is that:
1) some instructions are selected by mod==11 (register operand)
vs. mod=00/01/10 (memory operand)
2) stores to memory are two-operand operations, while the 3-register
and load-from-memory versions operate on the entire contents of the
destination; this makes it easier to separate the gen_* function for
the store case
3) it's inefficient to load into xmm_T0 only to move the value out
again, so the gen_* function for the load case is separated too
The manual also has various mistakes in the operands here, for example
the store case of MOVHPS operates on a 128-bit source (albeit discarding
the bottom 64 bits) and therefore should be Mq,Vdq rather than Mq,Vq.
Likewise for the destination and source of MOVHLPS.
VUNPCK?PS and VUNPCK?PD are the same as VUNPCK?DQ and VUNPCK?QDQ,
but encoded as prefixes rather than separate operands. The helpers
can be reused however.
For MOVSLDUP, MOVSHDUP and MOVDDUP I chose to reimplement them as
helpers. I named the helper for MOVDDUP "movdldup" in preparation
for possible future introduction of MOVDHDUP and to clarify the
similarity with MOVSLDUP.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Nothing special going on here, for once.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are several special cases here:
1) extending moves have different widths for the helpers vs. for the
memory loads, and the width for memory loads depends on VEX.L too.
This is represented by X86_SPECIAL_AVXExtMov.
2) some instructions, such as variable-width shifts, select the vector element
size via REX.W.
3) VSIB instructions (VGATHERxPy, VPGATHERxy) are also part of this group,
and they have (among other things) two output operands.
3) the macros for 4-operand blends (which are under 0x0f 0x3a) have to be
extended to support 2-operand blends. The 2-operand variant actually
came a few years earlier, but it is clearer to implement them in the
opposite order.
X86_TYPE_WM, introduced earlier for unaligned loads, is reused for helpers
that accept a Reg* but have a M argument.
These three-byte opcodes also include AVX new instructions, for which
the helpers were originally implemented by Paul Brook <paul@nowt.org>.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As pmovmskb is used by strlen et al, this is the third
highest overhead sse operation at %0.8.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
[Reorganize to generate code for any vector size. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The more complicated operations here are insertions and extractions.
Otherwise, there are just more entries than usual because the PS/PD/SS/SD
variations are encoded in the opcode rater than in the prefixes.
These three-byte opcodes also include AVX new instructions, whose
implementation in the helpers was originally done by Paul Brook
<paul@nowt.org>.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Three-byte opcodes from the 0F3Ah area all have an immediate byte which
is usually unsigned. Clarify in the helper code that it is unsigned;
the new decoder treats immediates as signed by default, and seeing
an intN_t in the prototype might give the wrong impression that one
can use decode->immediate directly.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The more complicated ones here are d6-d7, e6-e7, f7. The others
are trivial.
For LDDQU, using gen_load_sse directly might corrupt the register if
the second part of the load fails. Therefore, add a custom X86_TYPE_WM
value; like X86_TYPE_W it does call gen_load(), but it also rejects a
value of 11 in the ModRM field like X86_TYPE_M.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This includes shifts by immediate, which use bits 3-5 of the ModRM byte
as an opcode extension. With the exception of 128-bit shifts, they are
implemented using gvec.
This also covers VZEROALL and VZEROUPPER, which use the same opcode
as EMMS. If we were wanting to optimize out gen_clear_ymmh then this
would be one of the starting points. The implementation of the VZEROALL
and VZEROUPPER helpers is by Paul Brook.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are a mixed batch, including the first two horizontal
(66 and F2 only) operations, more moves, and SSE4a extract/insert.
Because SSE4a is pretty rare, I chose to leave the helper as they are,
but it is possible to unify them by loading index and length from the
source XMM register and generating deposit or extract TCG ops.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are mostly floating-point SSE operations. The odd ones out
are MOVMSK and CVTxx2yy, the others are straightforward.
Unary operations are a bit special in AVX because they have 2 operands
for PD/PS operands (VEX.vvvv must be 1111b), and 3 operands for SD/SS.
They are handled using X86_OP_GROUP3 for compactness.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are more simple integer instructions present in both MMX and SSE/AVX,
with no holes that were later occupied by newer instructions.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These are both MMX and SSE/AVX instructions, except for vmovdqu. In both
cases the inputs and output is in s->ptr{0,1,2}, so the only difference
between MMX, SSE, and AVX is which helper to call.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new implementation of SSE will cover AVX from the get go, because
all the work for the helper functions is already done. We just need to
build them.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new implementation of SSE will cover AVX from the get go, so include
the 24 extra comparison operators that are only available with the VEX
prefix.
Based on a patch by Paul Brook <paul@nowt.org>.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Compared to Paul's implementation, the new decoder will use a different approach
to implement AVX's merging of dst with src1 on scalar operations. Adjust the
old SSE decoder to be compatible with new-style helpers.
The affected instructions are CVTSx2Sx, ROUNDSx, RSQRTSx, SQRTSx, RCPSx.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Compared to Paul's implementation, the new decoder will use a different approach
to implement AVX's merging of dst with src1 on scalar operations. Adjust the
helpers to provide this functionality.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add to the helpers all the operands that are needed to implement AVX.
Extracted from a patch by Paul Brook <paul@nowt.org>.
Message-Id: <20220424220204.2493824-26-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Adjust all #ifdefs to match the ones in ops_sse.h.
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-23-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Because these are the only VEX instructions that QEMU supports, the
new decoder is entered on the first byte of a valid VEX prefix, and VEX
decoding only needs to be done in decode-new.c.inc.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Many SSE and AVX instructions are only valid with specific prefixes
(none, 66, F3, F2). Introduce a direct way to encode this in the
decoding table to avoid using decode groups too much.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a new hflag bit to determine whether AVX instructions are allowed
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-4-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
TCG will shortly implement VAES instructions, so add the relevant feature
word to the DisasContext.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add generic code generation that takes care of preparing operands
around calls to decode.e.gen in a table-driven manner, so that ALU
operations need not take care of that.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new decoder is based on three principles:
- use mostly table-driven decoding, using tables derived as much as possible
from the Intel manual. Centralizing the decode the operands makes it
more homogeneous, for example all immediates are signed. All modrm
handling is in one function, and can be shared between SSE and ALU
instructions (including XMM<->GPR instructions). The SSE/AVX decoder
will also not have duplicated code between the 0F, 0F38 and 0F3A tables.
- keep the code as "non-branchy" as possible. Generally, the code for
the new decoder is more verbose, but the control flow is simpler.
Conditionals are not nested and have small bodies. All instruction
groups are resolved even before operands are decoded, and code
generation is separated as much as possible within small functions
that only handle one instruction each.
- keep address generation and (for ALU operands) memory loads and writeback
as much in common code as possible. All ALU operations for example
are implemented as T0=f(T0,T1). For non-ALU instructions,
read-modify-write memory operations are rare, but registers do not
have TCGv equivalents: therefore, the common logic sets up pointer
temporaries with the operands, while load and writeback are handled
by gvec or by helpers.
These principles make future code review and extensibility simpler, at
the cost of having a relatively large amount of code in the form of this
patch. Even EVEX should not be _too_ hard to implement (it's just a crazy
large amount of possibilities).
This patch introduces the main decoder flow, and integrates the old
decoder with the new one. The old decoder takes care of parsing
prefixes and then optionally drops to the new one. The changes to the
old decoder are minimal and allow it to be replaced incrementally with
the new one.
There is a debugging mechanism through a "LIMIT" environment variable.
In user-mode emulation, the variable is the number of instructions
decoded by the new decoder before permanently switching to the old one.
In system emulation, the variable is the highest opcode that is decoded
by the new decoder (this is less friendly, but it's the best that can
be done without requiring deterministic execution).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
REX.W can be used even in 32-bit mode by AVX instructions, where it is retroactively
renamed to VEX.W. Make the field available even in 32-bit mode but keep the REX_W()
macro as it was; this way, that the handling of dflag does not use it by mistake and
the AVX code more clearly points at the special VEX behavior of the bit.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
ldq takes a pointer to the first byte to load the 64-bit word in;
ldo takes a pointer to the first byte of the ZMMReg. Make them
consistent, which will be useful in the new SSE decoder's
load/writeback routines.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will be used for emission and endian adjustments of gvec operations.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220822223722.1697758-2-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rather than recurse directly on mmu_translate, go through the
same softmmu lookup that we did for the page table walk.
This centralizes all knowledge of MMU_NESTED_IDX, with respect
to setup of TranslationParams, to get_physical_address.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221002172956.265735-10-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use probe_access_full in order to resolve to a host address,
which then lets us use a host cmpxchg to update the pte.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/279
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221002172956.265735-9-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We don't need one variable set per translation level,
which requires copying into pte/pte_addr for huge pages.
Standardize on pte/pte_addr for all levels.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221002172956.265735-8-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use MMU_NESTED_IDX for each memory access, rather than
just a single translation to physical. Adjust svm_save_seg
and svm_load_seg to pass in mmu_idx.
This removes the last use of get_hphys so remove it.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221002172956.265735-7-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These new mmu indexes will be helpful for improving
paging and code throughout the target.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221002172956.265735-6-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>