Hexagon HVX (target/hexagon) README
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>

parent 91e8394415
commit 375bcf389f
@ -1,9 +1,13 @@
Hexagon is Qualcomm's very long instruction word (VLIW) digital signal
processor (DSP).  We also support Hexagon Vector eXtensions (HVX).  HVX
is a wide vector coprocessor designed for high performance computer vision,
image processing, machine learning, and other workloads.

The following versions of the Hexagon core are supported:
    Scalar core: v67
    https://developer.qualcomm.com/downloads/qualcomm-hexagon-v67-programmer-s-reference-manual
    HVX extension: v66
    https://developer.qualcomm.com/downloads/qualcomm-hexagon-v66-hvx-programmer-s-reference-manual

We presented an overview of the project at the 2019 KVM Forum.
    https://kvmforum2019.sched.com/event/Tmwc/qemu-hexagon-automatic-translation-of-the-isa-manual-pseudcode-to-tiny-code-instructions-of-a-vliw-architecture-niccolo-izzo-revng-taylor-simpson-qualcomm-innovation-center
@ -124,6 +128,71 @@ There are also cases where we brute force the TCG code generation.
Instructions with multiple definitions are examples.  These require special
handling because qemu helpers can only return a single value.

For HVX vectors, the generator behaves slightly differently.  The wide vectors
won't fit in a TCGv or TCGv_i64, so we use TCGv_ptr variables to pass their
addresses to helper functions.  Here's an example for an HVX vector-add-word
instruction.
    static void generate_V6_vaddw(
                    CPUHexagonState *env,
                    DisasContext *ctx,
                    Insn *insn,
                    Packet *pkt)
    {
        const int VdN = insn->regno[0];
        const intptr_t VdV_off =
            ctx_future_vreg_off(ctx, VdN, 1, true);
        TCGv_ptr VdV = tcg_temp_local_new_ptr();
        tcg_gen_addi_ptr(VdV, cpu_env, VdV_off);
        const int VuN = insn->regno[1];
        const intptr_t VuV_off =
            vreg_src_off(ctx, VuN);
        TCGv_ptr VuV = tcg_temp_local_new_ptr();
        const int VvN = insn->regno[2];
        const intptr_t VvV_off =
            vreg_src_off(ctx, VvN);
        TCGv_ptr VvV = tcg_temp_local_new_ptr();
        tcg_gen_addi_ptr(VuV, cpu_env, VuV_off);
        tcg_gen_addi_ptr(VvV, cpu_env, VvV_off);
        TCGv slot = tcg_constant_tl(insn->slot);
        gen_helper_V6_vaddw(cpu_env, VdV, VuV, VvV, slot);
        tcg_temp_free(slot);
        gen_log_vreg_write(ctx, VdV_off, VdN, EXT_DFL, insn->slot, false);
        ctx_log_vreg_write(ctx, VdN, EXT_DFL, false);
        tcg_temp_free_ptr(VdV);
        tcg_temp_free_ptr(VuV);
        tcg_temp_free_ptr(VvV);
    }

Notice that we also generate a variable named <operand>_off for each operand of
the instruction.  This makes it easy to override the instruction semantics with
functions from tcg-op-gvec.h.  Here's the override for this instruction.
    #define fGEN_TCG_V6_vaddw(SHORTCODE) \
        tcg_gen_gvec_add(MO_32, VdV_off, VuV_off, VvV_off, \
                         sizeof(MMVector), sizeof(MMVector))
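
To illustrate why the <operand>_off values are enough for such an override,
here is a minimal, self-contained sketch of offset-based addressing.  It is
not QEMU code: ToyVector, ToyState, and all toy_* functions are hypothetical
stand-ins, and the 128-byte vector length is an assumption made for the
example.  Each operand is named only by its byte offset from the start of the
state structure, which is the same style of interface the override above uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for MMVector and CPUHexagonState (not QEMU's). */
#define TOY_VEC_BYTES 128   /* assumed vector length in bytes */

typedef union {
    uint8_t ub[TOY_VEC_BYTES];
    int32_t w[TOY_VEC_BYTES / 4];
} ToyVector;

typedef struct {
    uint32_t gpr[32];
    ToyVector vregs[32];         /* committed vector registers */
    ToyVector future_vregs[32];  /* values written by the current packet */
} ToyState;

/* Byte offset of a source vector register within the state structure. */
static size_t toy_vreg_src_off(int num)
{
    return offsetof(ToyState, vregs) + (size_t)num * sizeof(ToyVector);
}

/* Byte offset of the "future" copy of a destination vector register. */
static size_t toy_future_vreg_off(int num)
{
    return offsetof(ToyState, future_vregs) + (size_t)num * sizeof(ToyVector);
}

/*
 * Toy analogue of an offset-based 32-bit element-wise add.  No pointers to
 * individual registers are needed, only the base of the state structure.
 */
static void toy_gvec_add32(ToyState *env, size_t dofs, size_t aofs,
                           size_t bofs, size_t oprsz)
{
    uint8_t *base = (uint8_t *)env;

    assert(oprsz % sizeof(uint32_t) == 0);
    for (size_t i = 0; i < oprsz; i += sizeof(uint32_t)) {
        uint32_t a, b, d;
        memcpy(&a, base + aofs + i, sizeof(a));
        memcpy(&b, base + bofs + i, sizeof(b));
        d = a + b;               /* unsigned arithmetic wraps on overflow */
        memcpy(base + dofs + i, &d, sizeof(d));
    }
}

/* Model of a vector-add-word that writes the future copy of register vd. */
static void toy_vaddw(ToyState *env, int vd, int vu, int vv)
{
    toy_gvec_add32(env, toy_future_vreg_off(vd), toy_vreg_src_off(vu),
                   toy_vreg_src_off(vv), sizeof(ToyVector));
}
```

Calling toy_vaddw(&env, 2, 0, 1) adds registers 0 and 1 element by element and
leaves the result in the future copy of register 2; the committed copy of
register 2 is untouched.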

Finally, the override doesn't use the TCGv_ptr variables, so we don't generate
them when an override is present.  Here is what we generate in that case.
    static void generate_V6_vaddw(
                    CPUHexagonState *env,
                    DisasContext *ctx,
                    Insn *insn,
                    Packet *pkt)
    {
        const int VdN = insn->regno[0];
        const intptr_t VdV_off =
            ctx_future_vreg_off(ctx, VdN, 1, true);
        const int VuN = insn->regno[1];
        const intptr_t VuV_off =
            vreg_src_off(ctx, VuN);
        const int VvN = insn->regno[2];
        const intptr_t VvV_off =
            vreg_src_off(ctx, VvN);
        fGEN_TCG_V6_vaddw({ fHIDE(int i;) fVFOREACH(32, i) { VdV.w[i] = VuV.w[i] + VvV.w[i] ; } });
        gen_log_vreg_write(ctx, VdV_off, VdN, EXT_DFL, insn->slot, false);
        ctx_log_vreg_write(ctx, VdN, EXT_DFL, false);
    }
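
The SHORTCODE argument in the generated function is the instruction semantics
written with macros.  As a rough illustration of how it reads as ordinary C,
the sketch below supplies simplified stand-in definitions of fHIDE, fVELEM,
and fVFOREACH and runs the same shortcode inside a helper-like function.  The
definitions, the ToyVector type, the 128-byte vector length, and
toy_helper_vaddw are assumptions made for this example, not the contents of
macros.h or mmvec/macros.h.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_VEC_BYTES 128   /* assumed vector length in bytes */

/*
 * Hypothetical vector type.  The word view is unsigned here so that the
 * addition in the shortcode wraps without undefined behavior.
 */
typedef union {
    uint32_t w[TOY_VEC_BYTES / 4];
    uint16_t h[TOY_VEC_BYTES / 2];
    uint8_t  b[TOY_VEC_BYTES];
} ToyVector;

/* Simplified stand-ins for the macros used by the shortcode. */
#define fHIDE(A) A
#define fVELEM(WIDTH) ((TOY_VEC_BYTES * 8) / (WIDTH))
#define fVFOREACH(WIDTH, VAR) for (VAR = 0; VAR < fVELEM(WIDTH); VAR++)

/* Plays the role of a helper: executes the shortcode as plain C. */
static void toy_helper_vaddw(ToyVector *VdV_p, const ToyVector *VuV_p,
                             const ToyVector *VvV_p)
{
    ToyVector VdV;
    ToyVector VuV = *VuV_p;
    ToyVector VvV = *VvV_p;

    /* Shortcode copied from the generated function above. */
    { fHIDE(int i;) fVFOREACH(32, i) { VdV.w[i] = VuV.w[i] + VvV.w[i] ; } }

    /* The loop wrote all 32 words, so VdV is fully initialized. */
    *VdV_p = VdV;
}
```

With these definitions, fVFOREACH(32, i) expands to a loop over the 32 word
elements of a 128-byte vector, so the helper-like function performs the same
element-wise add that the override expresses with a single gvec call.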

In addition to instruction semantics, we use a generator to create the decode
tree.  This generation is also a two step process.  The first step is to run
target/hexagon/gen_dectree_import.c to produce
@ -140,6 +209,7 @@ runtime information for each thread and contains stuff like the GPR and
predicate registers.

macros.h
mmvec/macros.h

The Hexagon arch lib relies heavily on macros for the instruction semantics.
This is a great advantage for qemu because we can override them for different
@ -203,6 +273,15 @@ During runtime, the following fields in CPUHexagonState (see cpu.h) are used
    pred_written        boolean indicating if predicate was written
    mem_log_stores      record of the stores (indexed by slot)

For Hexagon Vector eXtensions (HVX), the following fields are used
    VRegs               Vector registers
    future_VRegs        Registers to be stored during packet commit
    tmp_VRegs           Temporary registers *not* stored during commit
    VRegs_updated       Mask of predicated vector writes
    QRegs               Q (vector predicate) registers
    future_QRegs        Registers to be stored during packet commit
    QRegs_updated       Mask of predicated vector writes
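
The interplay of VRegs, future_VRegs, and VRegs_updated can be pictured with
the simplified model below.  It is inferred from the field descriptions above
and is only a sketch: the real commit logic is emitted as TCG and differs in
detail.  ToyVector, ToyHvxState, and the toy_* functions are hypothetical
names, the 128-byte vector length is an assumption, and tmp_VRegs and the Q
registers are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_NUM_VREGS 32
#define TOY_VEC_BYTES 128   /* assumed vector length in bytes */

typedef struct {
    int32_t w[TOY_VEC_BYTES / 4];
} ToyVector;

/* Hypothetical subset of the per-thread state described above. */
typedef struct {
    ToyVector VRegs[TOY_NUM_VREGS];         /* committed values */
    ToyVector future_VRegs[TOY_NUM_VREGS];  /* values pending commit */
    uint32_t VRegs_updated;                 /* one bit per written register */
} ToyHvxState;

/*
 * Model a predicated vector write inside a packet.  The value goes to the
 * future copy and the register's bit is set only when the predicate holds.
 */
static void toy_write_vreg(ToyHvxState *env, int num, const ToyVector *val,
                           bool pred)
{
    if (!pred) {
        return;
    }
    memcpy(&env->future_VRegs[num], val, sizeof(*val));
    env->VRegs_updated |= UINT32_C(1) << num;
}

/* Model packet commit: copy every flagged register, then clear the mask. */
static void toy_commit_vregs(ToyHvxState *env)
{
    for (int i = 0; i < TOY_NUM_VREGS; i++) {
        if (env->VRegs_updated & (UINT32_C(1) << i)) {
            memcpy(&env->VRegs[i], &env->future_VRegs[i], sizeof(ToyVector));
        }
    }
    env->VRegs_updated = 0;
}
```

In this model, a write made during a packet is not visible in VRegs until
toy_commit_vregs runs, and a write whose predicate is false leaves the
committed register unchanged.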

*** Debugging ***

You can turn on a lot of debugging by changing the HEX_DEBUG macro to 1 in