qemu/target/arm
Richard Henderson e979972a6a target/arm: Rely on hflags correct in cpu_get_tb_cpu_state
This is the payoff.

From perf record -g data of ubuntu 18 boot and shutdown:

BEFORE:

-   23.02%     2.82%  qemu-system-aar  [.] helper_lookup_tb_ptr
   - 20.22% helper_lookup_tb_ptr
      + 10.05% tb_htable_lookup
      - 9.13% cpu_get_tb_cpu_state
           3.20% aa64_va_parameters_both
           0.55% fp_exception_el

-   11.66%     4.74%  qemu-system-aar  [.] cpu_get_tb_cpu_state
   - 6.96% cpu_get_tb_cpu_state
        3.63% aa64_va_parameters_both
        0.60% fp_exception_el
        0.53% sve_exception_el

AFTER:

-   16.40%     3.40%  qemu-system-aar  [.] helper_lookup_tb_ptr
   - 13.03% helper_lookup_tb_ptr
      + 11.19% tb_htable_lookup
        0.55% cpu_get_tb_cpu_state

     0.98%     0.71%  qemu-system-aar  [.] cpu_get_tb_cpu_state

     0.87%     0.24%  qemu-system-aar  [.] rebuild_hflags_a64

Before, helper_lookup_tb_ptr is the second hottest function in the
application, consuming almost a quarter of the runtime.  Within the
entire execution, cpu_get_tb_cpu_state consumes about 12%.

After, helper_lookup_tb_ptr has dropped to the fourth hottest function,
with consumption dropping to a sixth of the runtime.  Within the
entire execution, cpu_get_tb_cpu_state has dropped below 1%, and the
supporting function to rebuild hflags also consumes about 1%.

Assertions are retained for --enable-debug-tcg.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20191023150057.25731-25-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 17:16:28 +01:00
..
a32-uncond.decode target/arm: Convert Unallocated memory hint 2019-09-05 13:23:03 +01:00
a32.decode target/arm: Convert SVC 2019-09-05 13:23:03 +01:00
arch_dump.c target/arm: Add aa{32, 64}_vfp_{dreg, qreg} helpers 2018-01-25 11:45:29 +00:00
arm_ldst.h target: Do not include "exec/exec-all.h" if it is not necessary 2018-06-01 14:15:10 +02:00
arm-powerctl.c target/arm/arm-powerctl: Add new arm_set_cpu_on_and_reset() 2019-02-28 11:03:04 +00:00
arm-powerctl.h target/arm/arm-powerctl: Add new arm_set_cpu_on_and_reset() 2019-02-28 11:03:04 +00:00
arm-semi.c target/arm/arm-semi: Implement SH_EXT_STDOUT_STDERR extension 2019-10-15 18:09:04 +01:00
cpu64.c target-arm queue: 2019-08-16 17:21:40 +01:00
cpu-param.h tcg: Split out target/arch/cpu-param.h 2019-06-10 07:03:34 -07:00
cpu-qom.h hw/core: Move cpu.c, cpu.h from qom/ to hw/core/ 2019-08-21 13:24:01 +02:00
cpu.c target/arm: Rebuild hflags at EL changes 2019-10-24 17:16:28 +01:00
cpu.h target/arm: Add arm_rebuild_hflags 2019-10-24 17:16:28 +01:00
crypto_helper.c target: Do not include "exec/exec-all.h" if it is not necessary 2018-06-01 14:15:10 +02:00
debug_helper.c target/arm: Move debug routines to debug_helper.c 2019-07-04 17:14:43 +01:00
gdbstub64.c Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
gdbstub.c Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
helper-a64.c target/arm: Rebuild hflags at EL changes 2019-10-24 17:16:28 +01:00
helper-a64.h target/arm: Split helper_msr_i_pstate into 3 2019-03-05 15:55:08 +00:00
helper-sve.h target/arm: Rewrite vector gather first-fault loads 2018-10-08 14:55:03 +01:00
helper.c target/arm: Rely on hflags correct in cpu_get_tb_cpu_state 2019-10-24 17:16:28 +01:00
helper.h target/arm: Add HELPER(rebuild_hflags_{a32, a64, m32}) 2019-10-24 17:16:28 +01:00
idau.h qom: make interface types abstract 2018-12-11 15:45:22 -02:00
internals.h target/arm: Split out arm_mmu_idx_el 2019-10-24 17:16:28 +01:00
iwmmxt_helper.c target/arm: Untabify iwmmxt_helper.c 2018-08-24 13:17:48 +01:00
kvm32.c Clean up inclusion of sysemu/sysemu.h 2019-08-16 13:31:53 +02:00
kvm64.c target-arm queue: 2019-08-16 17:21:40 +01:00
kvm_arm.h intc/arm_gic: Support IRQ injection for more than 256 vpus 2019-10-15 18:09:02 +01:00
kvm-consts.h arm: add trailing ; after MISMATCH_CHECK 2017-02-01 03:37:18 +02:00
kvm-stub.c Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
kvm.c ARM: KVM: Check KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 for smp_cpus > 256 2019-10-15 18:09:02 +01:00
m_helper.c target/arm: Rebuild hflags for M-profile 2019-10-24 17:16:28 +01:00
machine.c target/arm: Rebuild hflags at EL changes 2019-10-24 17:16:28 +01:00
Makefile.objs target/arm: Add skeleton for T16 decodetree 2019-09-05 13:23:03 +01:00
monitor.c Include hw/boards.h a bit less 2019-08-16 13:31:53 +02:00
neon_helper.c target/arm: Use tcg_gen_abs_i64 and tcg_gen_gvec_abs 2019-05-13 22:52:08 +00:00
op_addsub.h
op_helper.c target/arm: Rebuild hflags at CPSR writes 2019-10-24 17:16:28 +01:00
pauth_helper.c target/arm: Fix output of PAuth Auth 2019-06-13 15:14:03 +01:00
psci.c sysemu: Split sysemu/runstate.h off sysemu/sysemu.h 2019-08-16 13:37:36 +02:00
sve_helper.c tcg: Introduce set/clear_helper_retaddr 2019-07-14 12:19:00 +02:00
sve.decode target/arm: SVE brk[ab] merging does not have s bit 2019-01-07 15:23:45 +00:00
t16.decode target/arm: Convert T16, long branches 2019-09-05 13:23:04 +01:00
t32.decode target/arm: Convert TT 2019-09-05 13:23:03 +01:00
tlb_helper.c target/arm: Move TLB related routines to tlb_helper.c 2019-07-01 17:29:01 +01:00
trace-events trace-events: Shorten file names in comments 2019-03-22 16:18:07 +00:00
translate-a64.c target/arm: Rebuild hflags at MSR writes 2019-10-24 17:16:28 +01:00
translate-a64.h Allow page table bit to swap endianness. 2019-09-04 16:29:18 +01:00
translate-sve.c tcg: TCGMemOp is now accelerator independent MemOp 2019-09-03 08:30:38 -07:00
translate-vfp.inc.c target/arm: Free TCG temps in trans_VMOV_64_sp() 2019-09-03 16:20:35 +01:00
translate.c target/arm: Rebuild hflags for M-profile 2019-10-24 17:16:28 +01:00
translate.h Allow page table bit to swap endianness. 2019-09-04 16:29:18 +01:00
vec_helper.c target/arm: Add helpers for FMLAL 2019-02-28 11:03:05 +00:00
vfp_helper.c target/arm/vfp_helper: Call set_fpscr_to_host before updating to FPSCR 2019-07-08 14:11:31 +01:00
vfp-uncond.decode target/arm: Convert VCVTA/VCVTN/VCVTP/VCVTM to decodetree 2019-06-13 15:14:04 +01:00
vfp.decode target/arm: Use vfp_expand_imm() for AArch32 VFP VMOV_imm 2019-06-17 15:14:19 +01:00