Make the AES vector helpers AVX ready
No functional changes to existing helpers
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-22-paul@nowt.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make the pclmulqdq helper AVX ready
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-21-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rewrite the blendv helpers so that they can easily be extended to support
the AVX encodings, which make all 4 arguments explicit.
No functional changes to the existing helpers
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-20-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fixup various vector helpers that either trivially exten to 256 bit,
or don't have 256 bit variants.
No functional changes to existing helpers
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-19-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Perpare the horizontal atithmetic vector helpers for AVX
These currently use a dummy Reg typed variable to store the result then
assign the whole register. This will cause 128 bit operations to corrupt
the upper half of the register, so replace it with explicit temporaries
and element assignments.
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-18-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make the dpps and dppd helpers AVX-ready
I can't see any obvious reason why dppd shouldn't work on 256 bit ymm
registers, but both AMD and Intel agree that it's xmm only.
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-17-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
AVX includes an additional set of comparison predicates, some of which
our softfloat implementation does not expose as separate functions.
Rewrite the helpers in terms of floatN_compare for future extensibility.
Signed-off-by: Paul Brook <paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220424220204.2493824-24-paul@nowt.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Prepare the "easy" floating point vector helpers for AVX
No functional changes to existing helpers.
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-16-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These helpers need to take special care to avoid overwriting source values
before the wole result has been calculated. Currently they use a dummy
Reg typed variable to store the result then assign the whole register.
This will cause 128 bit operations to corrupt the upper half of the register,
so replace it with explicit temporaries and element assignments.
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-14-paul@nowt.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
More preparatory work for AVX support in various integer vector helpers
No functional changes to existing helpers.
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-13-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rewrite the "simple" vector integer helpers in preperation for AVX support.
While the current code is able to use the same prototype for unary
(a = F(b)) and binary (a = F(b, c)) operations, future changes will cause
them to diverge.
No functional changes to existing helpers
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-12-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rewrite the vector shift helpers in preperation for AVX support (3 operand
form and 256 bit vectors).
For now keep the existing two operand interface.
No functional changes to existing helpers.
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-11-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use a union to store the various possible kinds of function pointers, and
access the correct one based on the flags.
SSEOpHelper_table6 and SSEOpHelper_table7 right now only have one case,
but this would change with AVX's 3- and 4-argument operations. Use
unions there too, to keep the code more similar for the three tables.
Extracted from a patch by Paul Brook <paul@nowt.org>.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For AVX we're going to need both 128 bit (xmm) and 256 bit (ymm) variants of
floating point helpers. Add the register type suffix to the existing
*PS and *PD helpers (SS and SD variants are only valid on 128 bit vectors)
No functional changes.
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-15-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Extracted from a patch by Paul Brook <paul@nowt.org>.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Put more flags to work to avoid hardcoding lists of opcodes. The op7 case
for SSE_OPF_CMP is included for homogeneity and because AVX needs it, but
it is never used by SSE or MMX.
Extracted from a patch by Paul Brook <paul@nowt.org>.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Handle 3DNOW instructions early to avoid complicating the MMX/SSE logic.
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-25-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a flags field each row in sse_op_table6 and sse_op_table7.
Initially this is only used as a replacement for the magic SSE41_SPECIAL
pointer. The other flags are mostly relevant for the AVX implementation
but can be applied to SSE as well.
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-6-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a flags field to each row in sse_op_table1.
Initially this is only used as a replacement for the magic
SSE_SPECIAL and SSE_DUMMY pointers, the other flags are mostly
relevant for the AVX implementation but can be applied to SSE as well.
Signed-off-by: Paul Brook <paul@nowt.org>
Message-Id: <20220424220204.2493824-5-paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a convenience macro to get the address of an xmm_regs element within
CPUX86State.
This was originally going to be the basis of an implementation that broke
operations into 128 bit chunks. I scrapped that idea, so this is now a purely
cosmetic change. But I think a worthwhile one - it reduces the number of
function calls that need to be split over multiple lines.
No functional changes.
Signed-off-by: Paul Brook <paul@nowt.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220424220204.2493824-9-paul@nowt.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Extracted from a patch by Paul Brook <paul@nowt.org>.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Write down explicitly the load/store sequence.
Extracted from a patch by Paul Brook <paul@nowt.org>.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In the first 7.2 queue we have changes in the powernv pnv-phb handling,
the start of the QOMification of the ppc405 model, the removal of the
taihu machine, a new SLOF image and others.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQQX6/+ZI9AYAK8oOBk82cqW3gMxZAUCYw/AFgAKCRA82cqW3gMx
ZI6XAP0d8m6r1JqKXPSfCwVYy+AfrwY7oZWYbeTqdamK6xHcUQD+JyCcFcogY4Vz
YwvHLd9W2cqvoWiZ4tmkK4Mb0Xt0Xg4=
=0uL/
-----END PGP SIGNATURE-----
Merge tag 'pull-ppc-20220831' of https://gitlab.com/danielhb/qemu into staging
ppc patch queue for 2022-08-31:
In the first 7.2 queue we have changes in the powernv pnv-phb handling,
the start of the QOMification of the ppc405 model, the removal of the
taihu machine, a new SLOF image and others.
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQQX6/+ZI9AYAK8oOBk82cqW3gMxZAUCYw/AFgAKCRA82cqW3gMx
# ZI6XAP0d8m6r1JqKXPSfCwVYy+AfrwY7oZWYbeTqdamK6xHcUQD+JyCcFcogY4Vz
# YwvHLd9W2cqvoWiZ4tmkK4Mb0Xt0Xg4=
# =0uL/
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 31 Aug 2022 16:09:58 EDT
# gpg: using EDDSA key 17EBFF9923D01800AF2838193CD9CA96DE033164
# gpg: Good signature from "Daniel Henrique Barboza <danielhb413@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 17EB FF99 23D0 1800 AF28 3819 3CD9 CA96 DE03 3164
* tag 'pull-ppc-20220831' of https://gitlab.com/danielhb/qemu: (60 commits)
ppc4xx: Fix code style problems reported by checkpatch
ppc/ppc4xx: Fix sdram trace events
hw/ppc/Kconfig: Move imply before select
hw/ppc/sam460ex: Remove PPC405 dependency from sam460ex
ppc405: Move machine specific code to ppc405_boards.c
ppc/ppc405: QOM'ify FPGA
ppc/ppc405: Use an explicit I2C object
hw/intc/ppc-uic: Convert ppc-uic to a PPC4xx DCR device
ppc/ppc405: Use an embedded PPCUIC model in SoC state
ppc4xx: Rename ppc405-ebc to ppc4xx-ebc
ppc4xx: Move EBC model to ppc4xx_devs.c
ppc4xx: Rename ppc405-plb to ppc4xx-plb
ppc4xx: Move PLB model to ppc4xx_devs.c
ppc/ppc405: QOM'ify MAL
ppc/ppc405: QOM'ify PLB
ppc/ppc405: QOM'ify POB
ppc/ppc405: QOM'ify OPBA
ppc/ppc405: QOM'ify EBC
ppc/ppc405: QOM'ify DMA
ppc/ppc405: QOM'ify GPIO
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The DPPS (Dot Product) instruction is defined to first sum pairs of
intermediate results, then sum those values to get the final result.
i.e. (A+B)+(C+D)
We incrementally sum the results, i.e. ((A+B)+C)+D, which can result
in incorrect rouding.
For consistency, also change the variable names to the ones used
in the Intel SDM and implement DPPD following the manual.
Based on a patch by Paul Brook <paul@nowt.org>.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The computation must not overwrite neither the destination
nor the source before the last element has been computed.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_put_sregs2() fails to reset 'locked' CR4/CR0 bits upon vCPU reset when
it is in VMX root operation. Do kvm_put_msr_feature_control() before
kvm_put_sregs2() to (possibly) kick vCPU out of VMX root operation. It also
seems logical to do kvm_put_msr_feature_control() before
kvm_put_nested_state() and not after it, especially when 'real' nested
state is set.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20220818150113.479917-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make sure env->nested_state is cleaned up when a vCPU is reset, it may
be stale after an incoming migration, kvm_arch_put_registers() may
end up failing or putting vCPU in a weird state.
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20220818150113.479917-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This bit is not saved across interrupts, so we must
delay delivering the interrupt until the skip has
been processed.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1118
Reviewed-by: Michael Rolnik <mrolnik@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We cannot deliver two interrupts simultaneously;
the first interrupt handler must execute first.
Reviewed-by: Michael Rolnik <mrolnik@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There is no need to go through cc->tcg_ops when
we know what value that must have.
Reviewed-by: Michael Rolnik <mrolnik@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
While there are no target-specific nonfaulting probes,
generic code may grow some uses at some point.
Note that the attrs argument was incorrect -- it should have
been MEMTXATTRS_UNSPECIFIED. Just use the simpler interface.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When an overflow exception occurs and OE is set the intermediate result
should be adjusted (by subtracting from the exponent) to avoid rounding
to inf. The same applies to an underflow exceptionion and UE (but adding
to the exponent). To do this set the fp_status.rebias_overflow when OE
is set and fp_status.rebias_underflow when UE is set as the FPU will
recalculate in case of a overflow/underflow if the according rebias* is
set.
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220805141522.412864-3-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
ppc_cpu_compare_class_pvr_mask() should match the best CPU class in the
family, because it is used by the KVM subsystem to find the host CPU
class. Since commit 03ae4133ab ("target-ppc: Add pvr_match()
callback"), it matches any class in the family (the first one in the
comparison list).
Since commit f30c843ced ("ppc/pnv: Introduce PowerNV machines with
fixed CPU models"), pnv has relied on pnv_match having these new
semantics to check machine compatibility with a CPU family.
Resolve this by adding a parameter to the pvr_match function to select
the best or any match, and restore the old behaviour for the KVM case.
Prior to this fix, e.g., a POWER9 DD2.3 KVM host matches to the
power9_v1.0 class (because that happens to be the first POWER9 family
CPU compared). After the patch, it matches the power9_v2.0 class.
This approach requires pnv_match contain knowledge of the CPU classes
implemented in the same family, which feels ugly. But pushing the 'best'
match down to the class would still require they know about one another
which is not obviously much better. For now this gets things working.
Fixes: 03ae4133ab ("target-ppc: Add pvr_match() callback")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20220731013358.170187-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
I2 is 16 bits, not 32.
Found by running valgrind's none/tests/s390x/traps.
Fixes: 1c26875182 ("target-s390: Implement COMPARE AND TRAP")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20220817161529.597414-1-iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Add stfle 197 (processor-activity-instrumentation extension 1) to the
gen16 default model and fence it off for 7.1 and older.
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20220727135120.12784-1-borntraeger@linux.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The proberi assembler instruction checks the read/write access rights
for the page of a given address and shall return a value of 1 if the
test succeeds and a value of 0 on failure in the target register.
But when run in linux-user mode, qemu currently simply returns the
return code of page_check_range() which returns 0 on success and -1 on
failure, which is the opposite of what proberi should return.
Fix it by checking the return code of page_check_range() and return the
expected return value.
The easiest way to reproduce the issue is by running
"/lib/ld.so.1 --version" in a chroot which fails without this patch.
At startup of ld.so the __canonicalize_funcptr_for_compare() function is
used to resolve the function address out of a function descriptor, which
fails because proberi (due to the wrong return code) seems to indicate
that the given address isn't accessible.
Signed-off-by: Helge Deller <deller@gmx.de>
1. Add some information about how to boot the LoongArch virt
machine by uefi bios and linux kernel and how to access the
source code or binary file.
2. Move the explanation of LoongArch system emulation in the
target/loongarch/README to docs/system/loongarch/loongson3.rst
Signed-off-by: Xiaojuan Yang <yangxiaojuan@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20220812091957.3338126-1-yangxiaojuan@loongson.cn>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The newly added neoverse-n1 CPU has ID register values which indicate
the presence of the Statistical Profiling Extension, because the real
hardware has this feature. QEMU's TCG emulation does not yet
implement SPE, though (not even as a minimal stub implementation), so
guests will crash if they try to use it because the SPE system
registers don't exist.
Force ID_AA64DFR0_EL1.PMSVer to 0 in CPU realize for TCG, so that
we don't advertise to the guest a feature that doesn't exist.
(We could alternatively do this by editing the value that
aarch64_neoverse_n1_initfn() sets for this ID register, but
suppressing the field in realize means we won't re-introduce this bug
when we add other CPUs that have SPE in hardware, such as the
Neoverse-V1.)
An example of a non-booting guest is current mainline Linux (5.19),
when booting in EL2 on the virt board (ie with -machine
virtualization=on).
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Message-id: 20220811131127.947334-1-peter.maydell@linaro.org
All of the fpu operations are defined with TCG_CALL_NO_WG, but they
all modify FCSR0. The most efficient way to fix this is to remove
cpu_fcsr0, and instead use explicit load and store operations for the
two instructions that manipulate that value.
Acked-by: Qi Hu <huqi@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Reported-by: Feiyang Chen <chenfeiyang@loongson.cn>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Coverity notes that we forgot to check the error return from
lock_user() in one place in the handling of the UHI_plog semihosting
call. Add the missing error handling.
report_fault() is rather brutal in that it will call abort(), but
this is the same error-handling used in the rest of this file.
Resolves: Coverity CID 1490684
Fixes: ea4210600d ("target/mips: Avoid qemu_semihosting_log_out for UHI_plog")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220719191737.384744-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
GDB LoongArch fpu use fcc register, update gdb_set_fpu()
and gdb_get_fpu() to match it.
Signed-off-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220805033523.1416837-6-gaosong@loongson.cn>
Rename loongarch-fpu64.xml to loongarch-fpu.xml and update
loongarch-fpu.xml to match upstream GDB [1]
[1]:https://github.com/bminor/binutils-gdb/blob/master/gdb/features/loongarch/fpu.xml
Signed-off-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220805033523.1416837-5-gaosong@loongson.cn>
GDB LoongArch add a register orig_a0, see the base64.xml [1].
We should add the orig_a0 to match the upstream GDB.
[1]: https://github.com/bminor/binutils-gdb/blob/master/gdb/features/loongarch/base64.xml
Signed-off-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220805033523.1416837-2-gaosong@loongson.cn>