Commit Graph

51 Commits

Author SHA1 Message Date
Paolo Bonzini
87b8bde55d Revert "host/i386: assume presence of SSE2"
This reverts commit b18236897c.
The x86-64 instruction set can now be tuned down to x86-64 v1
or i386 Pentium Pro.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-28 14:44:51 +02:00
Paolo Bonzini
fe721c1948 Revert "host/i386: assume presence of POPCNT"
This reverts commit 45ccdbcb24.
The x86-64 instruction set can now be tuned down to x86-64 v1
or i386 Pentium Pro.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-28 14:44:51 +02:00
Richard Henderson
a96a498738 util/bufferiszero: Add loongarch64 vector acceleration
Use inline assembly because no release compiler allows
per-function selection of the ISA.

Tested-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-06-19 12:47:11 -07:00
Richard Henderson
2d32a5d2a0 util/bufferiszero: Split out host include files
Split out host/bufferiszero.h.inc for x86, aarch64 and generic
in order to avoid an overlong ifdef ladder.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-06-19 12:47:11 -07:00
Richard Henderson
9d779187b8 util/loongarch64: Detect LASX vector support
Reviewed-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-06-19 10:55:12 -07:00
Paolo Bonzini
45ccdbcb24 host/i386: assume presence of POPCNT
QEMU now requires an x86-64-v2 host, which has the POPCNT instruction.
Use it freely in TCG-generated code.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-05 11:01:05 +02:00
Paolo Bonzini
b18236897c host/i386: assume presence of SSE2
QEMU now requires an x86-64-v2 host, which has SSE2.
Use it freely in buffer_is_zero.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-05 11:01:05 +02:00
Paolo Bonzini
e68e97ce55 host/i386: assume presence of CMOV
QEMU now requires an x86-64-v2 host, which always has CMOV.
Use it freely in TCG generated code.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-05 11:01:05 +02:00
Paolo Bonzini
72baef13b9 host/i386: nothing looks at CPUINFO_SSE4
The only user was the SSE4.1 variant of buffer_is_zero, which has
been removed; code to compute CPUINFO_SSE4 is dead.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-05 11:01:05 +02:00
Philippe Mathieu-Daudé
e4751d340a accel/tcg: Rename load-extract/store-insert headers using .h.inc suffix
Since commit 139c1837db ("meson: rename included C source files
to .c.inc"), QEMU standard procedure for included C files is to
use *.c.inc.

Besides, since commit 6a0057aa22 ("docs/devel: make a statement
about includes") this is documented in the Coding Style:

  If you do use template header files they should be named with
  the ``.c.inc`` or ``.h.inc`` suffix to make it clear they are
  being included for expansion.

Therefore rename 'store-insert-al16.h' as 'store-insert-al16.h.inc'
and 'load-extract-al16-al8.h' as 'load-extract-al16-al8.h.inc'.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240424173333.96148-3-philmd@linaro.org>
2024-04-26 15:31:37 +02:00
Thomas Huth
34aee9c946 host/include/generic/host/atomic128: Fix compilation problem with Clang 17
When compiling QEMU with Clang 17 on a s390x, the compilation fails:

In file included from ../accel/tcg/cputlb.c:32:
In file included from /root/qemu/include/exec/helper-proto-common.h:10:
In file included from /root/qemu/include/qemu/atomic128.h:62:
/root/qemu/host/include/generic/host/atomic128-ldst.h:68:15: error:
 __sync builtin operation MUST have natural alignment (consider using __
atomic). [-Werror,-Wsync-alignment]
   68 |     } while (!__sync_bool_compare_and_swap_16(ptr_align, old, new.i));
      |               ^
In file included from ../accel/tcg/cputlb.c:32:
In file included from /root/qemu/include/exec/helper-proto-common.h:10:
In file included from /root/qemu/include/qemu/atomic128.h:61:
/root/qemu/host/include/generic/host/atomic128-cas.h:36:11: error:
 __sync builtin operation MUST have natural alignment (consider using __a
tomic). [-Werror,-Wsync-alignment]
   36 |     r.i = __sync_val_compare_and_swap_16(ptr_align, c.i, n.i);
      |           ^
2 errors generated.

It's arguably a bug in Clang since we already use __builtin_assume_aligned()
to tell the compiler that the pointer is properly aligned. But according to
https://github.com/llvm/llvm-project/issues/69146 it seems like the Clang
folks don't see an easy fix on their side and recommend to use a type
declared with __attribute__((aligned(16))) to work around this problem.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1934
Message-ID: <20231108085954.313071-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-11-13 11:35:47 +01:00
Richard Henderson
adc8467e69 host/include/loongarch64: Add atomic16 load and store
While loongarch64 does not have a 128-bit cmpxchg, it does
have 128-bit atomic load and store via the vector unit.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230916220151.526140-6-richard.henderson@linaro.org>
2023-11-06 08:27:21 -08:00
Richard Henderson
0885f1221e util: Add cpuinfo for loongarch64
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Jiajie Chen <c@jia.je>
Message-Id: <20230916220151.526140-4-richard.henderson@linaro.org>
2023-11-06 08:27:21 -08:00
Michael Tokarev
bad5cfcd60 i386: spelling fixes
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2023-09-20 07:54:34 +03:00
Michael Tokarev
e6a19a6477 ppc: spelling fixes
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2023-09-20 07:54:34 +03:00
Stefan Hajnoczi
d7754940d7 *: Delete checks for old host definitions
tcg/loongarch64: Generate LSX instructions
 fpu: Add conversions between bfloat16 and [u]int8
 fpu: Handle m68k extended precision denormals properly
 accel/tcg: Improve cputlb i/o organization
 accel/tcg: Simplify tlb_plugin_lookup
 accel/tcg: Remove false-negative halted assertion
 tcg: Add gvec compare with immediate and scalar operand
 tcg/aarch64: Emit BTI insns at jump landing pads
 -----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmUF4VIdHHJpY2hhcmQu
 aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV8wOwf+I9qNus2kV3yQxpuU
 2hqYuLXvH96l9vbqaoyx7hyyJTtrqytLGCMPmQKUdtBGtO6z7PnLNDiooGcbO+gw
 2gdfw3Q//JZUTdx+ZSujUksV0F96Tqu0zi4TdJUPNIwhCrh0K8VjiftfPfbynRtz
 KhQ1lNeO/QzcAgzKiun2NyqdPiYDmNuEIS/jYedQwQweRp/xQJ4/x8DmhGf/OiD4
 rGAcdslN+RenqgFACcJ2A1vxUGMeQv5g/Cn82FgTk0cmgcfAODMnC+WnOm8ruQdT
 snluvnh/2/r8jIhx3frKDKGtaKHCPhoCS7GNK48qejxaybvv3CJQ4qsjRIBKVrVM
 cIrsSw==
 =cTgD
 -----END PGP SIGNATURE-----

Merge tag 'pull-tcg-20230915-2' of https://gitlab.com/rth7680/qemu into staging

*: Delete checks for old host definitions
tcg/loongarch64: Generate LSX instructions
fpu: Add conversions between bfloat16 and [u]int8
fpu: Handle m68k extended precision denormals properly
accel/tcg: Improve cputlb i/o organization
accel/tcg: Simplify tlb_plugin_lookup
accel/tcg: Remove false-negative halted assertion
tcg: Add gvec compare with immediate and scalar operand
tcg/aarch64: Emit BTI insns at jump landing pads

[Resolved conflict between CPUINFO_PMULL and CPUINFO_BTI.
--Stefan]

* tag 'pull-tcg-20230915-2' of https://gitlab.com/rth7680/qemu: (39 commits)
  tcg: Map code_gen_buffer with PROT_BTI
  tcg/aarch64: Emit BTI insns at jump landing pads
  util/cpuinfo-aarch64: Add CPUINFO_BTI
  tcg: Add tcg_out_tb_start backend hook
  fpu: Handle m68k extended precision denormals properly
  fpu: Add conversions between bfloat16 and [u]int8
  accel/tcg: Introduce do_st16_mmio_leN
  accel/tcg: Introduce do_ld16_mmio_beN
  accel/tcg: Merge io_writex into do_st_mmio_leN
  accel/tcg: Merge io_readx into do_ld_mmio_beN
  accel/tcg: Replace direct use of io_readx/io_writex in do_{ld,st}_1
  accel/tcg: Merge cpu_transaction_failed into io_failed
  plugin: Simplify struct qemu_plugin_hwaddr
  accel/tcg: Use CPUTLBEntryFull.phys_addr in io_failed
  accel/tcg: Split out io_prepare and io_failed
  accel/tcg: Simplify tlb_plugin_lookup
  target/arm: Use tcg_gen_gvec_cmpi for compare vs 0
  tcg: Add gvec compare with immediate and scalar operand
  tcg/loongarch64: Implement 128-bit load & store
  tcg/loongarch64: Lower rotli_vec to vrotri
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-09-19 13:20:54 -04:00
Richard Henderson
095859e5d9 util/cpuinfo-aarch64: Add CPUINFO_BTI
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-16 14:57:16 +00:00
Richard Henderson
055c99015a host/include/aarch64: Implement clmul.h
Detect PMULL in cpuinfo; implement the accel hook.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-15 13:57:00 +00:00
Richard Henderson
d6493dbb46 host/include/i386: Implement clmul.h
Detect PCLMUL in cpuinfo; implement the accel hook.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-15 13:57:00 +00:00
Richard Henderson
00f463b38a crypto: Add generic 64-bit carry-less multiply routine
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-09-15 13:57:00 +00:00
Michael Tokarev
8b81968c1c other architectures: spelling fixes
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2023-07-25 17:14:07 +03:00
Michael Tokarev
673d821541 arm: spelling fixes
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2023-07-25 17:13:53 +03:00
Richard Henderson
57357322e4 host/include/ppc: Implement aes-round.h
Detect CRYPTO in cpuinfo; implement the accel hooks.

Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-08 07:30:17 +01:00
Richard Henderson
8d97f28e36 host/include/aarch64: Implement aes-round.h
Detect AES in cpuinfo; implement the accel hooks.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-08 07:30:17 +01:00
Richard Henderson
d6a2443696 host/include/i386: Implement aes-round.h
Detect AES in cpuinfo; implement the accel hooks.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-08 07:30:17 +01:00
Richard Henderson
28e91474ce crypto: Add aesdec_ISB_ISR_AK_IMC
Add a primitive for InvSubBytes + InvShiftRows +
AddRoundKey + InvMixColumns.

Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-08 07:30:17 +01:00
Richard Henderson
15ff15982a crypto: Add aesdec_ISB_ISR_IMC_AK
Add a primitive for InvSubBytes + InvShiftRows +
InvMixColumns + AddRoundKey.

Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-08 07:30:17 +01:00
Richard Henderson
7c58cb972e crypto: Add aesenc_SB_SR_MC_AK
Add a primitive for SubBytes + ShiftRows + MixColumns + AddRoundKey.

Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-08 07:30:17 +01:00
Richard Henderson
5b41deb310 crypto: Add aesdec_IMC
Add a primitive for InvMixColumns.

Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-08 07:30:17 +01:00
Richard Henderson
04e1f30eed crypto: Add aesenc_MC
Add a primitive for MixColumns.

Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-08 07:30:17 +01:00
Richard Henderson
192fa84986 crypto: Add aesdec_ISB_ISR_AK
Add a primitive for InvSubBytes + InvShiftRows + AddRoundKey.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-08 07:30:17 +01:00
Richard Henderson
6b0a96ce3a crypto: Add aesenc_SB_SR_AK
Start adding infrastructure for accelerating guest AES.
Begin with a SubBytes + ShiftRows + AddRoundKey primitive.

Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-08 07:30:17 +01:00
Richard Henderson
623d7e3551 util: Add cpuinfo-ppc.c
Move the code from tcg/.  Fix a bug in that PPC_FEATURE2_ARCH_3_10
is actually spelled PPC_FEATURE2_ARCH_3_1.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-08 07:30:17 +01:00
Richard Henderson
8a2bd74787 host/include/x86_64: Use __m128i for "x" constraints
The macOS catalina compiler produces an error for __int128_t
as the type for allocation with SSE inline asm constraint.
Create a new X86Int128Union type and use the vector type for
all SSE register inputs and outputs.

Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-20 10:01:30 +02:00
Richard Henderson
dd906a37c7 accel/tcg: Add aarch64 store_atom_insert_al16
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-30 09:51:12 -07:00
Richard Henderson
129fe7cddb accel/tcg: Add aarch64 lse2 load_atom_extract_al16_or_al8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-30 09:51:12 -07:00
Richard Henderson
a8bde8da47 accel/tcg: Add x86_64 load_atom_extract_al16_or_al8
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-30 09:51:11 -07:00
Richard Henderson
b3f4144fa9 accel/tcg: Extract store_atom_insert_al16 to host header
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-30 09:51:11 -07:00
Richard Henderson
af844a1149 accel/tcg: Extract load_atom_extract_al16_or_al8 to host header
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-30 09:51:11 -07:00
Richard Henderson
480dfba2c9 qemu/atomic128: Add x86_64 atomic128-ldst.h
With CPUINFO_ATOMIC_VMOVDQA, we can perform proper atomic
load/store without cmpxchg16b.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-30 09:51:11 -07:00
Richard Henderson
e45fb74ddb qemu/atomic128: Add runtime test for FEAT_LSE2
With FEAT_LSE2, load and store of int128 is directly supported.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 18:54:55 -07:00
Richard Henderson
b35b812567 qemu/atomic128: Improve cmpxchg fallback for atomic16_set
Use __sync_bool_compare_and_swap_16 to control the loop,
rather than a separate comparison.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 18:54:55 -07:00
Richard Henderson
21c38f31c0 qemu/atomic128: Split atomic16_read
Create both atomic16_read_ro and atomic16_read_rw.
Previously we pretended that we had atomic16_read in system mode,
because we "know" that all ram is always writable to the host.
Now, expose read-only and read-write versions all of the time.

For aarch64, do not fall back to __atomic_read_16 even if
supported by the compiler, to work around a clang bug.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 18:54:55 -07:00
Richard Henderson
4deb39ebb3 accel/tcg: Eliminate #if on HAVE_ATOMIC128 and HAVE_CMPXCHG128
These symbols will shortly become dynamic runtime tests and
therefore not appropriate for the preprocessor.  Use the
matching CONFIG_* symbols for that purpose.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 18:54:55 -07:00
Richard Henderson
333c813b06 include/qemu: Move CONFIG_ATOMIC128_OPT handling to atomic128.h
Not only the routines in ldst_atomicity.c.inc need markup,
but also the ones in the headers.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 16:51:18 -07:00
Richard Henderson
154691335e include/host: Split out atomic128-ldst.h
Separates the aarch64-specific portion into its own file.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 16:51:18 -07:00
Richard Henderson
412db3d5b5 include/host: Split out atomic128-cas.h
Separates the aarch64-specific portion into its own file.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 16:51:18 -07:00
Richard Henderson
0dd0c7fa20 util: Add cpuinfo-aarch64.c
Move the code from tcg/.  The only use of these bits so far
is with respect to the atomicity of tcg operations.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 16:51:18 -07:00
Richard Henderson
a48b6afea4 util: Add i386 CPUINFO_ATOMIC_VMOVDQU
Add a bit to indicate when VMOVDQU is also atomic if aligned.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 16:49:33 -07:00
Richard Henderson
6bc12fd042 util: Add cpuinfo-i386.c
Add cpuinfo.h for i386 and x86_64, and the initialization
for that in util/.  Populate that with a slightly altered
copy of the tcg host probing code.  Other uses of cpuid.h
will be adjusted one patch at a time.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 16:49:28 -07:00