Commit Graph

1017 Commits

Author SHA1 Message Date
Richard Henderson
f5e6786de4 target-arm queue:
* Support gdbstub (guest debug) in HVF
  * xnlx-versal: Support CANFD controller
  * bpim2u: New board model: Banana Pi BPI-M2 Ultra
  * Emulate FEAT_LSE2
  * allow DC CVA[D]P in user mode emulation
  * trap DCC access in user mode emulation
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmR/AKUZHHBldGVyLm1h
 eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3jzIEACNepQGY44yPhrEG+wD4WAB
 fH670KI33HcsFd2rGsC369gcssQbRIW/29reOzNhRMuol+kHI6OFaONpuKSdO0Rz
 TLVIsnT2Uq8KwbYfLtDQt5knj027amPy75d4re8wIK1eZB4dOIHysqAvQrJYeync
 9obKku8xXGLwZh/mYHoVgHcZU0cPJO9nri39n1tV3JUBsgmqEURjzbZrMcF+yMX7
 bUzOYQvC1Iedmo+aWfx43u82AlNQFz1lsqmnQj7Z5rvv0HT+BRF5WzVMP0qRh5+Z
 njkqmBH9xb9kkgeHmeMvHpWox+J+obeSmVg/4gDNlJpThmpuU0Vr7EXUN3MBQlV9
 lhyy6zrTwC/BToiQqdT2dnpao9FzXy5exfnqi/py5IuqfjAzSO+p61LlPPZ4cJri
 pCK4yq2gzQXYfrlZkUJipvRMH8Xa4IdQx+w7lXrQoJdduF4/+6aJW/GAWSu0e7eC
 zgBwaJjI7ENce8ixJnuEFUxUnaBo8dl72a0PGA1UU8PL+cJNOIpyhPk4goWQprdn
 iFF4ZnjhBRZ2gk/4HGD9u5Vo2lNqP93YS5QhkGkF+HJsBmcOZgidIUpfHhPQvvHO
 Np196T2cAETCWGV1xG4CaTpxN2ndRReq3C0/mzfhIbwhXEACtvAiSlO4KB8t6pJj
 MzinCABXHcovJbGbxZ9j6w==
 =8SdN
 -----END PGP SIGNATURE-----

Merge tag 'pull-target-arm-20230606' of https://git.linaro.org/people/pmaydell/qemu-arm into staging

target-arm queue:
 * Support gdbstub (guest debug) in HVF
 * xnlx-versal: Support CANFD controller
 * bpim2u: New board model: Banana Pi BPI-M2 Ultra
 * Emulate FEAT_LSE2
 * allow DC CVA[D]P in user mode emulation
 * trap DCC access in user mode emulation

# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmR/AKUZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3jzIEACNepQGY44yPhrEG+wD4WAB
# fH670KI33HcsFd2rGsC369gcssQbRIW/29reOzNhRMuol+kHI6OFaONpuKSdO0Rz
# TLVIsnT2Uq8KwbYfLtDQt5knj027amPy75d4re8wIK1eZB4dOIHysqAvQrJYeync
# 9obKku8xXGLwZh/mYHoVgHcZU0cPJO9nri39n1tV3JUBsgmqEURjzbZrMcF+yMX7
# bUzOYQvC1Iedmo+aWfx43u82AlNQFz1lsqmnQj7Z5rvv0HT+BRF5WzVMP0qRh5+Z
# njkqmBH9xb9kkgeHmeMvHpWox+J+obeSmVg/4gDNlJpThmpuU0Vr7EXUN3MBQlV9
# lhyy6zrTwC/BToiQqdT2dnpao9FzXy5exfnqi/py5IuqfjAzSO+p61LlPPZ4cJri
# pCK4yq2gzQXYfrlZkUJipvRMH8Xa4IdQx+w7lXrQoJdduF4/+6aJW/GAWSu0e7eC
# zgBwaJjI7ENce8ixJnuEFUxUnaBo8dl72a0PGA1UU8PL+cJNOIpyhPk4goWQprdn
# iFF4ZnjhBRZ2gk/4HGD9u5Vo2lNqP93YS5QhkGkF+HJsBmcOZgidIUpfHhPQvvHO
# Np196T2cAETCWGV1xG4CaTpxN2ndRReq3C0/mzfhIbwhXEACtvAiSlO4KB8t6pJj
# MzinCABXHcovJbGbxZ9j6w==
# =8SdN
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 06 Jun 2023 02:47:17 AM PDT
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg:                 aka "Peter Maydell <peter@archaic.org.uk>" [unknown]

* tag 'pull-target-arm-20230606' of https://git.linaro.org/people/pmaydell/qemu-arm: (42 commits)
  target/arm: trap DCC access in user mode emulation
  tests/tcg/aarch64: add DC CVA[D]P tests
  target/arm: allow DC CVA[D]P in user mode emulation
  target/arm: Enable FEAT_LSE2 for -cpu max
  tests/tcg/multiarch: Adjust sigbus.c
  tests/tcg/aarch64: Use stz2g in mte-7.c
  target/arm: Move mte check for store-exclusive
  target/arm: Relax ordered/atomic alignment checks for LSE2
  target/arm: Add SCTLR.nAA to TBFLAG_A64
  target/arm: Check alignment in helper_mte_check
  target/arm: Pass single_memop to gen_mte_checkN
  target/arm: Pass memop to gen_mte_check1*
  target/arm: Hoist finalize_memop out of do_fp_{ld, st}
  target/arm: Hoist finalize_memop out of do_gpr_{ld, st}
  target/arm: Load/store integer pair with one tcg operation
  target/arm: Sink gen_mte_check1 into load/store_exclusive
  target/arm: Use tcg_gen_qemu_{ld, st}_i128 in gen_sve_{ld, st}r
  target/arm: Use tcg_gen_qemu_st_i128 for STZG, STZ2G
  target/arm: Use tcg_gen_qemu_{st, ld}_i128 for do_fp_{st, ld}
  target/arm: Use tcg_gen_qemu_ld_i128 for LDXP
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-06 12:11:34 -07:00
Richard Henderson
c0dde5fc5c accel/tcg: Fix undefined shift in store_whole_le16
The computation is documented as unused in this case,
but triggers an ubsan error:

../accel/tcg/ldst_atomicity.c.inc:837:33: runtime error: shift exponent -32 is negative
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../accel/tcg/ldst_atomicity.c.inc:837:33 in

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230606171629.98157-1-richard.henderson@linaro.org>
2023-06-06 12:11:02 -07:00
Francesco Cagnin
eb2edc42b1 hvf: add guest debugging handlers for Apple Silicon hosts
Guests can now be debugged through the gdbstub. Support is added for
single-stepping, software breakpoints, hardware breakpoints and
watchpoints. The code has been structured like the KVM counterpart.

While guest debugging is enabled, the guest can still read and write the
DBG*_EL1 registers but they don't have any effect.

Signed-off-by: Francesco Cagnin <fcagnin@quarkslab.com>
Message-id: 20230601153107.81955-5-fcagnin@quarkslab.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-06-06 10:19:30 +01:00
Francesco Cagnin
f41520402c hvf: add breakpoint handlers
Required for guest debugging. The code has been structured like the KVM
counterpart.

Signed-off-by: Francesco Cagnin <fcagnin@quarkslab.com>
Message-id: 20230601153107.81955-4-fcagnin@quarkslab.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-06-06 10:19:30 +01:00
Paolo Bonzini
06831001ac atomics: eliminate mb_read/mb_set
qatomic_mb_read and qatomic_mb_set were the very first atomic primitives
introduced for QEMU; their semantics are unclear and they provide a false
sense of safety.

The last use of qatomic_mb_read() has been removed, so delete it.
qatomic_mb_set() instead can survive as an optimized
qatomic_set()+smp_mb(), similar to Linux's smp_store_mb(), but
rename it to qatomic_set_mb() to match the order of the two
operations.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-06 09:42:14 +02:00
Ilya Leoshkevich
e7cd7a3916 accel/tcg: Unmap perf_marker
Coverity complains that perf_marker is never unmapped.
Fix by unmapping it in perf_exit().

Fixes: Coverity CID 1507929
Fixes: 5584e2dbe8 ("tcg: add perfmap and jitdump")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230605114134.1169974-1-iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:06:49 -07:00
Richard Henderson
bc54ef8c6a plugins: Move plugin_insn_append to translator.c
This function is only used in translator.c, and uses a
target-specific typedef: abi_ptr.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson
cac9b0fd08 tcg: Remove target-specific headers from tcg.[ch]
This finally paves the way for tcg/ to be built once per mode.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson
653c46daf2 accel/tcg: Tidy includes for translator.[ch]
Reduce the header to only bswap.h and cpu_ldst.h.
Move exec/translate-all.h to translator.c.
Reduce tcg.h and tcg-op.h to tcg-op-common.h.
Remove otherwise unused headers.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson
309e014dd1 accel/tcg: Move translator_fake_ldb out of line
This is used by exactly one host in extraordinary circumstances.
This means that translator.h need not include plugin-gen.h;
translator.c already includes plugin-gen.h.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson
dfd1b81274 accel/tcg: Introduce translator_io_start
New wrapper around gen_io_start which takes care of the USE_ICOUNT
check, as well as marking the DisasContext to end the TB.
Remove exec/gen-icount.h.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson
5623423359 accel/tcg: Move most of gen-icount.h into translator.c
The only usage of gen_tb_start and gen_tb_end are here.
Move the static icount_start_insn variable into a local
within translator_loop.  Simplify the two subroutines
by passing in the existing local cflags variable.

Leave only the declaration of gen_io_start in gen-icount.h.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson
85314e13ad exec-all: Widen TranslationBlock pc and cs_base to 64-bits
This makes TranslationBlock agnostic to the address size of the guest.
Use vaddr for pc, since that's always a virtual address.
Use uint64_t for cs_base, since usage varies between guests.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson
0a18945d03 tcg: Remove NO_CPU_IO_DEFS
From this remove, it's no longer clear what this is attempting
to protect.  The last time a use of this define was added to
the source tree, as opposed to merely moved around, was 2008.
There have been many cleanups since that time and this is
no longer required for the build to succeed.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson
28ea568a03 tcg: Add guest_mo to TCGContext
This replaces of TCG_GUEST_DEFAULT_MO in tcg-op-ldst.c.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson
747bd69d0f tcg: Add insn_start_words to TCGContext
This will enable replacement of TARGET_INSN_START_WORDS in tcg.c.
Split out "tcg/insn-start-words.h" and use it in target/.

Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson
c213ee2dfc tcg: Split helper-proto.h
Create helper-proto-common.h without the target specific portion.
Use that in tcg-op-common.h.  Include helper-proto.h in target/arm
and target/hexagon before helper-info.c.inc; all other targets are
already correct in this regard.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson
d53106c997 tcg: Pass TCGHelperInfo to tcg_gen_callN
In preparation for compiling tcg/ only once, eliminate
the all_helpers array.  Instantiate the info structs for
the generic helpers in accel/tcg/, and the structs for
the target-specific helpers in each translate.c.

Since we don't see all of the info structs at startup,
initialize at first use, using g_once_init_* to make
sure we don't race while doing so.

Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson
70f168f88c tcg: Split out tcg/oversized-guest.h
Move a use of TARGET_LONG_BITS out of tcg/tcg.h.
Include the new file only where required.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:28 -07:00
Richard Henderson
e5b4906377 *: Add missing includes of tcg/tcg.h
This had been pulled in from exec/cpu_ldst.h, via exec/exec-all.h,
but the include of tcg.h will be removed.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:28 -07:00
Richard Henderson
d0a9bb5ecb tcg: Add tlb_fast_offset to TCGContext
Disconnect the layout of ArchCPU from TCG compilation.
Pass the relative offset of 'env' and 'neg.tlb.f' as a parameter.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:28 -07:00
Richard Henderson
238f43809a tcg: Widen CPUTLBEntry comparators to 64-bits
This makes CPUTLBEntry agnostic to the address size of the guest.
When 32-bit addresses are in effect, we can simply read the low
32 bits of the 64-bit field.  Similarly when we need to update
the field for setting TLB_NOTDIRTY.

For TCG backends that could in theory be big-endian, but in
practice are not (arm, loongarch, riscv), use QEMU_BUILD_BUG_ON
to document and ensure this is not accidentally missed.

For s390x, which is always big-endian, use HOST_BIG_ENDIAN anyway,
to document the reason for the adjustment.

For sparc64 and ppc64, always perform a 64-bit load, and rely on
the following 32-bit comparison to ignore the high bits.

Rearrange mips and ppc if ladders for clarity.

Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:28 -07:00
Richard Henderson
ff0c61bf35 tcg: Move TCG_TYPE_TL from tcg.h to tcg-op.h
Removes the only use of TARGET_LONG_BITS from tcg.h, which is to be
target independent.  Move the symbol to a define in tcg-op.h, which
will continue to be target dependent.  Rather than complicate matters
for the use in tb_gen_code(), expand the definition there.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:28 -07:00
Alex Bennée
367189efae accel/tcg: include cs_base in our hash calculations
We weren't using cs_base in the hash calculations before. Since the
arm front end moved a chunk of flags in a378206a20 (target/arm: Move
mode specific TB flags to tb->cs_base) they comprise of an important
part of the execution state.

Widen the tb_hash_func to include cs_base and expand to qemu_xxhash8()
to accommodate it.

My initial benchmark shows very little difference in the
runtime.

Before:

armhf

➜  hyperfine -w 2 -m 20 "./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot"
Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot
  Time (mean ± σ):     24.627 s ±  2.708 s    [User: 34.309 s, System: 1.797 s]
  Range (min … max):   22.345 s … 29.864 s    20 runs

arm64

➜  hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot"
Benchmark 1: 20
  Time (mean ± σ):     62.559 s ±  2.917 s    [User: 189.115 s, System: 4.089 s]
  Range (min … max):   59.997 s … 70.153 s    10 runs

After:

armhf

Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot
  Time (mean ± σ):     24.223 s ±  2.151 s    [User: 34.284 s, System: 1.906 s]
  Range (min … max):   22.000 s … 28.476 s    20 runs

arm64

hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot"
Benchmark 1: 20
  Time (mean ± σ):     62.769 s ±  1.978 s    [User: 188.431 s, System: 5.269 s]
  Range (min … max):   60.285 s … 66.868 s    10 runs

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230526165401.574474-12-alex.bennee@linaro.org
Message-Id: <20230524133952.3971948-11-alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-06-01 11:05:05 -04:00
Alex Bennée
d0aaf08bb9 tcg: remove the final vestiges of dstate
Now we no longer have dynamic state affecting things we can remove the
additional fields in cpu.h and simplify the TB hash calculation.

For the benchmark:

    hyperfine -w 2 -m 20 \
      "./arm-softmmu/qemu-system-arm -cpu cortex-a15 \
        -machine type=virt,highmem=off \
        -display none -m 2048 \
        -serial mon:stdio \
        -netdev user,id=unet,hostfwd=tcp::2222-:22 \
        -device virtio-net-pci,netdev=unet \
        -device virtio-scsi-pci \
        -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf \
        -device scsi-hd,drive=hd -smp 4 \
        -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage \
        -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' \
        -snapshot"

It has a marginal effect on runtime, before:

  Time (mean ± σ):     26.279 s ±  2.438 s    [User: 41.113 s, System: 1.843 s]
  Range (min … max):   24.420 s … 32.565 s    20 runs

after:

  Time (mean ± σ):     24.440 s ±  2.885 s    [User: 34.474 s, System: 2.028 s]
  Range (min … max):   21.663 s … 29.937 s    20 runs

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1358
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230526165401.574474-10-alex.bennee@linaro.org
Message-Id: <20230524133952.3971948-9-alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-06-01 11:05:05 -04:00
Richard Henderson
b3f4144fa9 accel/tcg: Extract store_atom_insert_al16 to host header
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-30 09:51:11 -07:00
Richard Henderson
af844a1149 accel/tcg: Extract load_atom_extract_al16_or_al8 to host header
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-30 09:51:11 -07:00
Richard Henderson
9e0e6a7e8e accel/tcg: Fix check for page writeability in load_atomic16_or_exit
PAGE_WRITE is current writability, as modified by TB protection;
PAGE_WRITE_ORG is the original page writability.

Fixes: cdfac37be0 ("accel/tcg: Honor atomicity of loads")
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-30 09:51:11 -07:00
Richard Henderson
645e3a812a tcg: Remove DEBUG_DISAS
This had been set since the beginning, is never undefined,
and it would seem to be harmful to debugging to do so.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 18:54:55 -07:00
Richard Henderson
8dc24ff467 accel/tcg: Correctly use atomic128.h in ldst_atomicity.c.inc
Remove the locally defined load_atomic16 and store_atomic16,
along with HAVE_al16 and HAVE_al16_fast in favor of the
routines defined in atomic128.h.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 18:54:55 -07:00
Richard Henderson
4deb39ebb3 accel/tcg: Eliminate #if on HAVE_ATOMIC128 and HAVE_CMPXCHG128
These symbols will shortly become dynamic runtime tests and
therefore not appropriate for the preprocessor.  Use the
matching CONFIG_* symbols for that purpose.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 18:54:55 -07:00
Richard Henderson
7bedee3243 accel/tcg: Remove prot argument to atomic_mmu_lookup
Now that load/store are gone, we're always passing
PAGE_READ | PAGE_WRITE for RMW atomic operations.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 18:54:55 -07:00
Richard Henderson
ec4a9629a1 accel/tcg: Remove cpu_atomic_{ld,st}o_*_mmu
Atomic load/store of 128-byte quantities is now handled
by cpu_{ld,st}16_mmu.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 18:54:55 -07:00
Richard Henderson
fbea7a4084 accel/tcg: Unify cpu_{ld,st}*_{be,le}_mmu
With the current structure of cputlb.c, there is no difference
between the little-endian and big-endian entry points, aside
from the assert.  Unify the pairs of functions.

The only use of the functions with explicit endianness was in
target/sparc64, and that was only to satisfy the assert: the
correct endianness is already built into memop.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 18:54:28 -07:00
Richard Henderson
333c813b06 include/qemu: Move CONFIG_ATOMIC128_OPT handling to atomic128.h
Not only the routines in ldst_atomicity.c.inc need markup,
but also the ones in the headers.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 16:51:18 -07:00
Richard Henderson
297e818219 accel/tcg: Fix append_mem_cb
In fcdab382c8 we removed a tcg_gen_extu_tl_i64 from gen_empty_mem_cb,
and failed to adjust the associated copy, leading to a failed assert.

Fixes: fcdab382c8 ("accel/tcg: Widen plugin_gen_empty_mem_callback to i64")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230518145813.2940745-1-richard.henderson@linaro.org>
2023-05-18 09:28:44 -07:00
Paolo Bonzini
2e73952926 tcg: round-robin: do not use mb_read for rr_current_cpu
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-05-18 08:53:51 +02:00
Gavin Shan
856e23a0fb kvm: Enable dirty ring for arm64
arm64 has different capability from x86 to enable the dirty ring, which
is KVM_CAP_DIRTY_LOG_RING_ACQ_REL. Besides, arm64 also needs the backup
bitmap extension (KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP) when 'kvm-arm-gicv3'
or 'arm-its-kvm' device is enabled. Here the extension is always enabled
and the unnecessary overhead to do the last stage of dirty log synchronization
when those two devices aren't used is introduced, but the overhead should
be very small and acceptable. The benefit is cover future cases where those
two devices are used without modifying the code.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Tested-by: Zhenyu Zhang <zhenyzha@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230509022122.20888-5-gshan@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-05-18 08:53:51 +02:00
Gavin Shan
3794cb9485 kvm: Add helper kvm_dirty_ring_init()
Due to multiple capabilities associated with the dirty ring for different
architectures: KVM_CAP_DIRTY_{LOG_RING, LOG_RING_ACQ_REL} for x86 and
arm64 separately. There will be more to be done in order to support the
dirty ring for arm64.

Lets add helper kvm_dirty_ring_init() to enable the dirty ring. With this,
the code looks a bit clean.

No functional change intended.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Zhenyu Zhang <zhenyzha@redhat.com>
Message-Id: <20230509022122.20888-4-gshan@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-05-18 08:53:50 +02:00
Gavin Shan
b20cc77692 kvm: Synchronize the backup bitmap in the last stage
In the last stage of live migration or memory slot removal, the
backup bitmap needs to be synchronized when it has been enabled.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Zhenyu Zhang <zhenyzha@redhat.com>
Message-Id: <20230509022122.20888-3-gshan@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-05-18 08:53:50 +02:00
Gavin Shan
1e493be587 migration: Add last stage indicator to global dirty log
The global dirty log synchronization is used when KVM and dirty ring
are enabled. There is a particularity for ARM64 where the backup
bitmap is used to track dirty pages in non-running-vcpu situations.
It means the dirty ring works with the combination of ring buffer
and backup bitmap. The dirty bits in the backup bitmap needs to
collected in the last stage of live migration.

In order to identify the last stage of live migration and pass it
down, an extra parameter is added to the relevant functions and
callbacks. This last stage indicator isn't used until the dirty
ring is enabled in the subsequent patches.

No functional change intended.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Zhenyu Zhang <zhenyzha@redhat.com>
Message-Id: <20230509022122.20888-2-gshan@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-05-18 08:53:50 +02:00
Richard Henderson
a66efde188 tcg: Add tlb_dyn_max_bits to TCGContext
Disconnect guest tlb parameters from TCG compilation.

Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16 20:13:51 -07:00
Richard Henderson
aece72b76b tcg: Add page_bits and page_mask to TCGContext
Disconnect guest page size from TCG compilation.
While this could be done via exec/target_page.h, we want to cache
the value across multiple memory access operations, so we might
as well initialize this early.

The changes within tcg/ are entirely mechanical:

    sed -i s/TARGET_PAGE_BITS/s->page_bits/g
    sed -i s/TARGET_PAGE_MASK/s->page_mask/g

Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16 20:13:51 -07:00
Richard Henderson
4baf3978c0 tcg: Add addr_type to TCGContext
This will enable replacement of TARGET_LONG_BITS within tcg/.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16 16:30:29 -07:00
Richard Henderson
fcdab382c8 accel/tcg: Widen plugin_gen_empty_mem_callback to i64
Since we do this inside gen_empty_mem_cb anyway, let's
do this earlier inside tcg expansion.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16 16:30:29 -07:00
Richard Henderson
b6d9164518 accel/tcg: Merge do_gen_mem_cb into caller
As do_gen_mem_cb is called once, merge it into gen_empty_mem_cb.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16 16:30:29 -07:00
Richard Henderson
f5c346ac41 accel/tcg: Merge gen_mem_wrapped with plugin_gen_empty_mem_callback
As gen_mem_wrapped is only used in plugin_gen_empty_mem_callback,
we can avoid the curiosity of union mem_gen_fn by inlining it.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16 16:30:29 -07:00
Richard Henderson
ddfdd4178b tcg: Widen helper_atomic_* addresses to uint64_t
Always pass the target address as uint64_t.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16 16:30:29 -07:00
Richard Henderson
e570597a8a tcg: Widen helper_{ld,st}_i128 addresses to uint64_t
Always pass the target address as uint64_t.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16 16:30:29 -07:00
Richard Henderson
24e46e6c9d accel/tcg: Widen tcg-ldst.h addresses to uint64_t
Always pass the target address as uint64_t.
Adjust tcg_out_{ld,st}_helper_args to match.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-16 16:30:29 -07:00