qemu/accel/tcg
Alex Bennée 367189efae accel/tcg: include cs_base in our hash calculations
We weren't using cs_base in the hash calculations before. Since the
arm front end moved a chunk of flags in a378206a20 (target/arm: Move
mode specific TB flags to tb->cs_base) they form an important
part of the execution state.

Widen the tb_hash_func to include cs_base and expand to qemu_xxhash8()
to accommodate it.
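
To illustrate why leaving cs_base out matters, here is a minimal,
self-contained sketch. It is not the QEMU implementation: the tb_key
struct, mix64() and both hash functions are hypothetical names, and an
FNV-1a style mixer stands in for the xxhash-based helper. It only shows
that two translation states which differ solely in cs_base collide
unless cs_base is fed into the hash.

```c
#include <stdint.h>

/* Hypothetical lookup key for one translation block (not QEMU's type). */
typedef struct {
    uint64_t phys_pc;  /* physical address of the first instruction */
    uint64_t pc;       /* guest virtual PC */
    uint64_t cs_base;  /* extra per-target state, e.g. mode-specific flags */
    uint32_t flags;    /* target TB flags */
    uint32_t cflags;   /* compile flags */
} tb_key;

/* FNV-1a style mixing of one 64-bit word, byte by byte (stand-in mixer). */
static uint64_t mix64(uint64_t h, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        h ^= (v >> (8 * i)) & 0xff;
        h *= 0x100000001b3ULL;  /* 64-bit FNV prime */
    }
    return h;
}

/* Old behaviour in spirit: cs_base is ignored by the hash. */
uint64_t hash_without_cs_base(const tb_key *k)
{
    uint64_t h = 0xcbf29ce484222325ULL;  /* 64-bit FNV offset basis */
    h = mix64(h, k->phys_pc);
    h = mix64(h, k->pc);
    h = mix64(h, k->flags);
    h = mix64(h, k->cflags);
    return h;
}

/* New behaviour in spirit: one more input word, cs_base, is mixed in. */
uint64_t hash_with_cs_base(const tb_key *k)
{
    uint64_t h = 0xcbf29ce484222325ULL;
    h = mix64(h, k->phys_pc);
    h = mix64(h, k->pc);
    h = mix64(h, k->cs_base);
    h = mix64(h, k->flags);
    h = mix64(h, k->cflags);
    return h;
}
```

Given two keys that are identical except for cs_base,
hash_without_cs_base() returns the same value for both, so they land in
the same bucket and can only be told apart by the full comparison.
hash_with_cs_base() returns different 64-bit values, because every
mixing step is a bijection on the 64-bit state. A real hash table would
still reduce the result to a bucket index.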

My initial benchmark shows very little difference in the
runtime: on both armhf and arm64 the mean changes by less than
one standard deviation.

Before:

armhf

➜  hyperfine -w 2 -m 20 "./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot"
Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot
  Time (mean ± σ):     24.627 s ±  2.708 s    [User: 34.309 s, System: 1.797 s]
  Range (min … max):   22.345 s … 29.864 s    20 runs

arm64

➜  hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot"
Benchmark 1: 20
  Time (mean ± σ):     62.559 s ±  2.917 s    [User: 189.115 s, System: 4.089 s]
  Range (min … max):   59.997 s … 70.153 s    10 runs

After:

armhf

Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot
  Time (mean ± σ):     24.223 s ±  2.151 s    [User: 34.284 s, System: 1.906 s]
  Range (min … max):   22.000 s … 28.476 s    20 runs

arm64

hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot"
Benchmark 1: 20
  Time (mean ± σ):     62.769 s ±  1.978 s    [User: 188.431 s, System: 5.269 s]
  Range (min … max):   60.285 s … 66.868 s    10 runs

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230526165401.574474-12-alex.bennee@linaro.org
Message-Id: <20230524133952.3971948-11-alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-06-01 11:05:05 -04:00
atomic_common.c.inc accel/tcg: Remove cpu_atomic_{ld,st}o_*_mmu 2023-05-23 18:54:55 -07:00
atomic_template.h accel/tcg: Remove prot argument to atomic_mmu_lookup 2023-05-23 18:54:55 -07:00
cpu-exec-common.c accel/tcg: Add cpu_in_serial_context 2023-05-11 09:53:41 +01:00
cpu-exec.c accel/tcg: include cs_base in our hash calculations 2023-06-01 11:05:05 -04:00
cputlb.c accel/tcg: Correctly use atomic128.h in ldst_atomicity.c.inc 2023-05-23 18:54:55 -07:00
debuginfo.c accel/tcg: Add debuginfo support 2023-01-16 10:14:12 -10:00
debuginfo.h coroutine: Split qemu/coroutine-core.h off qemu/coroutine.h 2023-01-20 07:21:46 +01:00
internal.h accel/tcg: Add cpu_in_serial_context 2023-05-11 09:53:41 +01:00
ldst_atomicity.c.inc accel/tcg: Extract store_atom_insert_al16 to host header 2023-05-30 09:51:11 -07:00
ldst_common.c.inc accel/tcg: Unify cpu_{ld,st}*_{be,le}_mmu 2023-05-23 18:54:28 -07:00
meson.build accel/tcg: Restrict 'qapi-commands-machine.h' to system emulation 2023-02-27 22:29:01 +01:00
monitor.c accel/tcg: Report one-insn-per-tb in 'info jit', not 'info status' 2023-05-02 15:47:40 +01:00
perf.c accel/tcg: Replace TARGET_TB_PCREL with CF_PCREL 2023-03-01 07:31:44 -10:00
perf.h tcg: add perfmap and jitdump 2023-01-16 10:14:12 -10:00
plugin-gen.c accel/tcg: Fix append_mem_cb 2023-05-18 09:28:44 -07:00
plugin-helpers.h tcg: exclude non-memory effecting helpers from instrumentation 2023-02-02 11:48:20 +00:00
tb-context.h tcg: Move tb_phys_invalidate_count to tb_ctx 2021-07-09 09:38:33 -07:00
tb-hash.h accel/tcg: include cs_base in our hash calculations 2023-06-01 11:05:05 -04:00
tb-jmp-cache.h accel/tcg: Move jmp-cache CF_PCREL checks to caller 2023-03-01 07:33:03 -10:00
tb-maint.c accel/tcg: include cs_base in our hash calculations 2023-06-01 11:05:05 -04:00
tcg-accel-ops-icount.c accel/tcg/tcg-accel-ops-rr: ensure fairness with icount 2023-05-11 09:53:41 +01:00
tcg-accel-ops-icount.h accel/tcg/tcg-accel-ops-rr: ensure fairness with icount 2023-05-11 09:53:41 +01:00
tcg-accel-ops-mttcg.c Revert "accel/tcg: Init TCG cflags in vCPU thread handler" 2022-10-25 22:01:14 +10:00
tcg-accel-ops-mttcg.h Clean up header guards that don't match their file name 2022-05-11 16:49:06 +02:00
tcg-accel-ops-rr.c tcg: round-robin: do not use mb_read for rr_current_cpu 2023-05-18 08:53:51 +02:00
tcg-accel-ops-rr.h Clean up header guards that don't match their file name 2022-05-11 16:49:06 +02:00
tcg-accel-ops.c accel/tcg: Fix overwrite problems of tcg_cflags 2023-04-04 08:19:49 -07:00
tcg-accel-ops.h Clean up header guards that don't match their file name 2022-05-11 16:49:06 +02:00
tcg-all.c accel/tcg: Use one_insn_per_tb global instead of old singlestep global 2023-05-02 15:47:40 +01:00
tcg-runtime-gvec.c tcg: Add tcg_gen_gvec_andcs 2023-05-02 13:05:45 -07:00
tcg-runtime.c accel/tcg: Move helper_lookup_tb_ptr to cpu-exec.c 2021-07-09 20:23:06 -07:00
tcg-runtime.h accel/tcg: Unify cpu_{ld,st}*_{be,le}_mmu 2023-05-23 18:54:28 -07:00
trace-events accel/tcg: Remove trace events from trace-root.h 2022-12-20 17:11:12 -08:00
trace.h trace: switch position of headers to what Meson requires 2020-08-21 06:18:24 -04:00
translate-all.c tcg: remove the final vestiges of dstate 2023-06-01 11:05:05 -04:00
translator.c tcg: Remove DEBUG_DISAS 2023-05-23 18:54:55 -07:00
user-exec-stub.c replay: Extract core API to 'exec/replay-core.h' 2023-02-27 22:29:01 +01:00
user-exec.c accel/tcg: Eliminate #if on HAVE_ATOMIC128 and HAVE_CMPXCHG128 2023-05-23 18:54:55 -07:00