The following improvements are made for predicated HVX instructions
During gen_commit_hvx, unconditionally move the "new" value into
the dest
Don't set slot_cancelled
Remove runtime bookkeeping of which registers were updated
Reduce the cases where gen_log_vreg_write[_pair] is called
It's only needed for special operands VxxV and VyV
Remove gen_log_qreg_write
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20230307025828.1612809-15-tsimpson@quicinc.com>
Performance improvement
Add pkt and insn to DisasContext
Many functions need information from all 3 structures, so merge
them together.
2)
Bug fix
Fix predicated assignment to .tmp and .cur
3)
Performance improvement
Add overrides for S2_asr_r_r_sat/S2_asl_r_r_sat
These functions will not be handled by idef-parser
4-11)
The final 8 patches improve change-of-flow handling.
Currently, we set the PC to a new address before exiting a TB. The
ultimate goal is to use direct block chaining. However, several steps
are needed along the way.
4)
When a packet has more than one change-of-flow (COF) instruction, only
the first one taken is considered. The runtime bookkeeping is only
needed when there is more than one COF instruction in a packet.
5, 6)
Remove PC and next_PC from the runtime state and always use a
translation-time constant. Note that next_PC is used by call instructions
to set LR and by conditional COF instructions to set the fall-through
address.
7, 8, 9)
Add helper overrides for COF instructions. In particular, we must
distinguish those that use a PC-relative address for the destination.
These are candidates for direct block chaining later.
10)
Use direct block chaining for packets that have a single PC-relative
COF instruction. Instead of generating the code while processing the
instruction, we record the effect in DisasContext and generate the code
during gen_end_tb.
11)
Use direct block chaining for tight loops. We look for TBs that end
with an endloop0 that will branch back to the TB start address.
12-21)
Instruction definition parser (idef-parser) from rev.ng
Parses the instruction semantics and generates TCG
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEENjXHiM5iuR/UxZq0ewJE+xLeRCIFAmOc2BEACgkQewJE+xLe
RCKqFwf/U/uWaQiF59OXyLHj9PR/bTf7PmZL12g8MTrntzmtIpRiTQb7ajJaLwyn
TcCG9j9Ss6kWBq+LH5TBvstnSN9/3qEgnj2b26y6EAn85mSh6fai4foUPjXFUy7m
2Of0kuc2WKmwxN9C2iw6Hm6pbL3FSnYzKtBuSFzYyAIS0doLFT97zE97XnBtTQ4C
49JdNgQW9CKt7cCpKTcQA4N3ZO8LdARdvOtTShX1++qd4Trm0haTGRdaygSrTlS7
Eeqs4nbakKEE6VH2iltPGKX+KHbMCf2ZW7lefxHi+EuzE0DBIVoM64UnalyFfcSU
hVMGF15HgAIAjecim0Y4AbPB/zVlEw==
=PC9+
-----END PGP SIGNATURE-----
Merge tag 'pull-hex-20221216-1' of https://github.com/quic/qemu into staging
1)
Performance improvement
Add pkt and insn to DisasContext
Many functions need information from all 3 structures, so merge
them together.
2)
Bug fix
Fix predicated assignment to .tmp and .cur
3)
Performance improvement
Add overrides for S2_asr_r_r_sat/S2_asl_r_r_sat
These functions will not be handled by idef-parser
4-11)
The final 8 patches improve change-of-flow handling.
Currently, we set the PC to a new address before exiting a TB. The
ultimate goal is to use direct block chaining. However, several steps
are needed along the way.
4)
When a packet has more than one change-of-flow (COF) instruction, only
the first one taken is considered. The runtime bookkeeping is only
needed when there is more than one COF instruction in a packet.
5, 6)
Remove PC and next_PC from the runtime state and always use a
translation-time constant. Note that next_PC is used by call instructions
to set LR and by conditional COF instructions to set the fall-through
address.
7, 8, 9)
Add helper overrides for COF instructions. In particular, we must
distinguish those that use a PC-relative address for the destination.
These are candidates for direct block chaining later.
10)
Use direct block chaining for packets that have a single PC-relative
COF instruction. Instead of generating the code while processing the
instruction, we record the effect in DisasContext and generate the code
during gen_end_tb.
11)
Use direct block chaining for tight loops. We look for TBs that end
with an endloop0 that will branch back to the TB start address.
12-21)
Instruction definition parser (idef-parser) from rev.ng
Parses the instruction semantics and generates TCG
# gpg: Signature made Fri 16 Dec 2022 20:41:53 GMT
# gpg: using RSA key 3635C788CE62B91FD4C59AB47B0244FB12DE4422
# gpg: Good signature from "Taylor Simpson (Rock on) <tsimpson@quicinc.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 3635 C788 CE62 B91F D4C5 9AB4 7B02 44FB 12DE 4422
* tag 'pull-hex-20221216-1' of https://github.com/quic/qemu: (21 commits)
target/hexagon: import additional tests
target/hexagon: call idef-parser functions
target/hexagon: import parser for idef-parser
target/hexagon: import lexer for idef-parser
target/hexagon: prepare input for the idef-parser
target/hexagon: introduce new helper functions
target/hexagon: make helper functions non-static
target/hexagon: make slot number an unsigned
target/hexagon: import README for idef-parser
target/hexagon: update MAINTAINERS for idef-parser
Hexagon (target/hexagon) Use direct block chaining for tight loops
Hexagon (target/hexagon) Use direct block chaining for direct jump/branch
Hexagon (target/hexagon) Add overrides for various forms of jump
Hexagon (target/hexagon) Add overrides for compound compare and jump
Hexagon (target/hexagon) Add overrides for direct call instructions
Hexagon (target/hexagon) Remove next_PC from runtime state
Hexagon (target/hexagon) Remove PC from the runtime state
Hexagon (target/hexagon) Only use branch_taken when packet has multi cof
Hexagon (target/hexagon) Add overrides for S2_asr_r_r_sat/S2_asl_r_r_sat
Hexagon (target/hexagon) Fix predicated assignment to .tmp and .cur
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Direct block chaining is documented here
https://qemu.readthedocs.io/en/latest/devel/tcg.html#direct-block-chaining
Hexagon inner loops end with the endloop0 instruction
To go back to the beginning of the loop, this instructions writes to PC
from register SA0 (start address 0). To use direct block chaining, we
have to assign PC with a constant value. So, we specialize the code
generation when the start of the translation block is equal to SA0.
When this is the case, we defer the compare/branch from endloop0 to
gen_end_tb. When this is done, we can assign the start address of the TB
to PC.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <20221108162906.3166-12-tsimpson@quicinc.com>
The imported files don't properly mark all CONDEXEC instructions, so
we add some logic to hex_common.py to add the attribute.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <20221108162906.3166-7-tsimpson@quicinc.com>
Convert the hexagon CPU class to use 3-phase reset, so it doesn't
need to use device_class_set_parent_reset() any more.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Edgar E. Iglesias <edgar@zeroasic.com>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Message-id: 20221124115023.2437291-6-peter.maydell@linaro.org
ArchCPU is our interface with target-specific code. Use it as
a forward-declared opaque pointer (abstract type), having its
structure defined by each target.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220214183144.27402-15-f4bug@amsat.org>
Replace the boilerplate code to declare CPU QOM types
and macros, and forward-declare the CPU instance type.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220214183144.27402-14-f4bug@amsat.org>
While CPUState is our interface with generic code, CPUArchState is
our interface with target-specific code. Use CPUArchState as an
abstract type, defined by each target.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220214183144.27402-13-f4bug@amsat.org>
HexagonCPU field parent_class is of type CPUClass, which
is declared in "hw/core/cpu.h".
Reviewed-by: Damien Hedde <damien.hedde@greensocs.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220214183144.27402-11-f4bug@amsat.org>
The qemu-common.h header is not supposed to be included from any
other header files, only from .c files (as documented in a comment at
the start of it).
Move the include to linux-user/hexagon/cpu_loop.c, which needs it for
the declaration of cpu_exec_step_atomic().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Message-id: 20211129200510.1233037-3-peter.maydell@linaro.org
HVX is a set of wide vector instructions. Machine state includes
vector registers (VRegs)
vector predicate registers (QRegs)
temporary registers for intermediate values
store buffer (masked stores and scatter/gather)
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
The function is trivial for user-only, but still must be present.
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There is nothing target specific about this. The implementation
is host specific, but the declaration is 100% common.
Reviewed-By: Warner Losh <imp@bsdimp.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Remove hexagon_env_get_cpu and replace with env_archcpu
Replace CPU(hexagon_env_get_cpu(env)) with env_cpu(env)
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1617930474-31979-5-git-send-email-tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add target state header, target definitions and initialization routines
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
Message-Id: <1612763186-18161-5-git-send-email-tsimpson@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>