target/hexagon: import README for idef-parser
Signed-off-by: Alessandro Di Federico <ale@rev.ng> Signed-off-by: Anton Johansson <anjo@rev.ng> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com> Reviewed-by: Taylor Simpson <tsimpson@quicinc.com> Message-Id: <20220923173831.227551-3-anjo@rev.ng>
This commit is contained in:
parent
647357d69e
commit
95e1150500
@ -27,6 +27,10 @@ Hexagon-specific code are
|
|||||||
encode*.def Encoding patterns for each instruction
|
encode*.def Encoding patterns for each instruction
|
||||||
iclass.def Instruction class definitions used to determine
|
iclass.def Instruction class definitions used to determine
|
||||||
legal VLIW slots for each instruction
|
legal VLIW slots for each instruction
|
||||||
|
qemu/target/hexagon/idef-parser
|
||||||
|
Parser that, given the high-level definitions of an instruction,
|
||||||
|
produces a C function generating equivalent tiny code instructions.
|
||||||
|
See README.rst.
|
||||||
qemu/linux-user/hexagon
|
qemu/linux-user/hexagon
|
||||||
Helpers for loading the ELF file and making Linux system calls,
|
Helpers for loading the ELF file and making Linux system calls,
|
||||||
signals, etc
|
signals, etc
|
||||||
@ -47,6 +51,7 @@ header files in <BUILD_DIR>/target/hexagon
|
|||||||
gen_tcg_funcs.py -> tcg_funcs_generated.c.inc
|
gen_tcg_funcs.py -> tcg_funcs_generated.c.inc
|
||||||
gen_tcg_func_table.py -> tcg_func_table_generated.c.inc
|
gen_tcg_func_table.py -> tcg_func_table_generated.c.inc
|
||||||
gen_helper_funcs.py -> helper_funcs_generated.c.inc
|
gen_helper_funcs.py -> helper_funcs_generated.c.inc
|
||||||
|
gen_idef_parser_funcs.py -> idef_parser_input.h
|
||||||
|
|
||||||
Qemu helper functions have 3 parts
|
Qemu helper functions have 3 parts
|
||||||
DEF_HELPER declaration indicates the signature of the helper
|
DEF_HELPER declaration indicates the signature of the helper
|
||||||
|
722
target/hexagon/idef-parser/README.rst
Normal file
722
target/hexagon/idef-parser/README.rst
Normal file
@ -0,0 +1,722 @@
|
|||||||
|
Hexagon ISA instruction definitions to tinycode generator compiler
|
||||||
|
------------------------------------------------------------------
|
||||||
|
|
||||||
|
idef-parser is a small compiler able to translate the Hexagon ISA description
|
||||||
|
language into tinycode generator code, that can be easily integrated into QEMU.
|
||||||
|
|
||||||
|
Compilation Example
|
||||||
|
-------------------
|
||||||
|
|
||||||
|
To better understand the scope of the idef-parser, we'll explore an applicative
|
||||||
|
example. Let's start by one of the simplest Hexagon instruction: the ``add``.
|
||||||
|
|
||||||
|
The ISA description language represents the ``add`` instruction as
|
||||||
|
follows:
|
||||||
|
|
||||||
|
.. code:: c
|
||||||
|
|
||||||
|
A2_add(RdV, in RsV, in RtV) {
|
||||||
|
{ RdV=RsV+RtV;}
|
||||||
|
}
|
||||||
|
|
||||||
|
idef-parser will compile the above code into the following code:
|
||||||
|
|
||||||
|
.. code:: c
|
||||||
|
|
||||||
|
/* A2_add */
|
||||||
|
void emit_A2_add(DisasContext *ctx, Insn *insn, Packet *pkt, TCGv_i32 RdV,
|
||||||
|
TCGv_i32 RsV, TCGv_i32 RtV)
|
||||||
|
/* { RdV=RsV+RtV;} */
|
||||||
|
{
|
||||||
|
TCGv_i32 tmp_0 = tcg_temp_new_i32();
|
||||||
|
tcg_gen_add_i32(tmp_0, RsV, RtV);
|
||||||
|
tcg_gen_mov_i32(RdV, tmp_0);
|
||||||
|
tcg_temp_free_i32(tmp_0);
|
||||||
|
}
|
||||||
|
|
||||||
|
The output of the compilation process will be a function, containing the
|
||||||
|
tinycode generator code, implementing the correct semantics. That function will
|
||||||
|
not access any global variable, because all the accessed data structures will be
|
||||||
|
passed explicitly as function parameters. Among the passed parameters we will
|
||||||
|
have TCGv (tinycode variables) representing the input and output registers of
|
||||||
|
the architecture, integers representing the immediates that come from the code,
|
||||||
|
and other data structures which hold information about the disassemblation
|
||||||
|
context (``DisasContext`` struct).
|
||||||
|
|
||||||
|
Let's begin by describing the input code. The ``add`` instruction is associated
|
||||||
|
with a unique identifier, in this case ``A2_add``, which allows to distinguish
|
||||||
|
variants of the same instruction, and expresses the class to which the
|
||||||
|
instruction belongs, in this case ``A2`` corresponds to the Hexagon
|
||||||
|
``ALU32/ALU`` instruction subclass.
|
||||||
|
|
||||||
|
After the instruction identifier, we have a series of parameters that represents
|
||||||
|
TCG variables that will be passed to the generated function. Parameters marked
|
||||||
|
with ``in`` are already initialized, while the others are output parameters.
|
||||||
|
|
||||||
|
We will leverage this information to infer several information:
|
||||||
|
|
||||||
|
- Fill in the output function signature with the correct TCGv registers
|
||||||
|
- Fill in the output function signature with the immediate integers
|
||||||
|
- Keep track of which registers, among the declared one, have been
|
||||||
|
initialized
|
||||||
|
|
||||||
|
Let's now observe the actual instruction description code, in this case:
|
||||||
|
|
||||||
|
.. code:: c
|
||||||
|
|
||||||
|
{ RdV=RsV+RtV;}
|
||||||
|
|
||||||
|
This code is composed by a subset of the C syntax, and is the result of the
|
||||||
|
application of some macro definitions contained in the ``macros.h`` file.
|
||||||
|
|
||||||
|
This file is used to reduce the complexity of the input language where complex
|
||||||
|
variants of similar constructs can be mapped to a unique primitive, so that the
|
||||||
|
idef-parser has to handle a lower number of computation primitives.
|
||||||
|
|
||||||
|
As you may notice, the description code modifies the registers which have been
|
||||||
|
declared by the declaration statements. In this case all the three registers
|
||||||
|
will be declared, ``RsV`` and ``RtV`` will also be read and ``RdV`` will be
|
||||||
|
written.
|
||||||
|
|
||||||
|
Now let's have a quick look at the generated code, line by line.
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
TCGv_i32 tmp_0 = tcg_temp_new_i32();
|
||||||
|
|
||||||
|
This code starts by declaring a temporary TCGv to hold the result from the sum
|
||||||
|
operation.
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
tcg_gen_add_i32(tmp_0, RsV, RtV);
|
||||||
|
|
||||||
|
Then, we are generating the sum tinycode operator between the selected
|
||||||
|
registers, storing the result in the just declared temporary.
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
tcg_gen_mov_i32(RdV, tmp_0);
|
||||||
|
|
||||||
|
The result of the addition is now stored in the temporary, we move it into the
|
||||||
|
correct destination register. This code may seem inefficient, but QEMU will
|
||||||
|
perform some optimizations on the tinycode, reducing the unnecessary copy.
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
tcg_temp_free_i32(tmp_0);
|
||||||
|
|
||||||
|
Finally, we free the temporary we used to hold the addition result.
|
||||||
|
|
||||||
|
Parser Input
|
||||||
|
------------
|
||||||
|
|
||||||
|
Before moving on to the structure of idef-parser itself, let us spend some words
|
||||||
|
on its' input. There are two preprocessing steps applied to the generated
|
||||||
|
instruction semantics in ``semantics_generated.pyinc`` that we need to consider.
|
||||||
|
Firstly,
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
gen_idef_parser_funcs.py
|
||||||
|
|
||||||
|
which takes instruction semantics in ``semantics_generated.pyinc`` to C-like
|
||||||
|
pseudo code, output into ``idef_parser_input.h.inc``. For instance, the
|
||||||
|
``J2_jumpr`` instruction which jumps to an address stored in a register
|
||||||
|
argument. This is instruction is defined as
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
SEMANTICS( \
|
||||||
|
"J2_jumpr", \
|
||||||
|
"jumpr Rs32", \
|
||||||
|
"""{fJUMPR(RsN,RsV,COF_TYPE_JUMPR);}""" \
|
||||||
|
)
|
||||||
|
|
||||||
|
in ``semantics_generated.pyinc``. Running ``gen_idef_parser_funcs.py``
|
||||||
|
we obtain the pseudo code
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
J2_jumpr(in RsV) {
|
||||||
|
{fJUMPR(RsN,RsV,COF_TYPE_JUMPR);}
|
||||||
|
}
|
||||||
|
|
||||||
|
with macros such as ``fJUMPR`` intact.
|
||||||
|
|
||||||
|
The second step is to expand macros into a form suitable for our parser.
|
||||||
|
These macros are defined in ``idef-parser/macros.inc`` and the step is
|
||||||
|
carried out by the ``prepare`` script which runs the C preprocessor on
|
||||||
|
``idef_parser_input.h.inc`` to produce
|
||||||
|
``idef_parser_input.preprocessed.h.inc``.
|
||||||
|
|
||||||
|
To finish the above example, after preprocessing ``J2_jumpr`` we obtain
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
J2_jumpr(in RsV) {
|
||||||
|
{(PC = RsV);}
|
||||||
|
}
|
||||||
|
|
||||||
|
where ``fJUMPR(RsN,RsV,COF_TYPE_JUMPR);`` was expanded to ``(PC = RsV)``,
|
||||||
|
signifying a write to the Program Counter ``PC``. Note, that ``PC`` in
|
||||||
|
this expression is not a variable in the strict C sense since it is not
|
||||||
|
declared anywhere, but rather a symbol which is easy to match in
|
||||||
|
idef-parser later on.
|
||||||
|
|
||||||
|
Parser Structure
|
||||||
|
----------------
|
||||||
|
|
||||||
|
The idef-parser is built using the ``flex`` and ``bison``.
|
||||||
|
|
||||||
|
``flex`` is used to split the input string into tokens, each described using a
|
||||||
|
regular expression. The token description is contained in the
|
||||||
|
``idef-parser.lex`` source file. The flex-generated scanner takes care also to
|
||||||
|
extract from the input text other meaningful information, e.g., the numerical
|
||||||
|
value in case of an immediate constant, and decorates the token with the
|
||||||
|
extracted information.
|
||||||
|
|
||||||
|
``bison`` is used to generate the actual parser, starting from the parsing
|
||||||
|
description contained in the ``idef-parser.y`` file. The generated parser
|
||||||
|
executes the ``main`` function at the end of the ``idef-parser.y`` file, which
|
||||||
|
opens input and output files, creates the parsing context, and eventually calls
|
||||||
|
the ``yyparse()`` function, which starts the execution of the LALR(1) parser
|
||||||
|
(see `Wikipedia <https://en.wikipedia.org/wiki/LALR_parser>`__ for more
|
||||||
|
information about LALR parsing techniques). The LALR(1) parser, whenever it has
|
||||||
|
to shift a token, calls the ``yylex()`` function, which is defined by the
|
||||||
|
flex-generated code, and reads the input file returning the next scanned token.
|
||||||
|
|
||||||
|
The tokens are mapped on the source language grammar, defined in the
|
||||||
|
``idef-parser.y`` file to build a unique syntactic tree, according to the
|
||||||
|
specified operator precedences and associativity rules.
|
||||||
|
|
||||||
|
The grammar describes the whole file which contains the Hexagon instruction
|
||||||
|
descriptions, therefore it starts from the ``input`` nonterminal, which is a
|
||||||
|
list of instructions, each instruction is represented by the following grammar
|
||||||
|
rule, representing the structure of the input file shown above:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
instruction : INAME arguments code
|
||||||
|
| error
|
||||||
|
|
||||||
|
arguments : '(' ')'
|
||||||
|
| '(' argument_list ')';
|
||||||
|
|
||||||
|
argument_list : argument_decl ',' argument_list
|
||||||
|
| argument_decl
|
||||||
|
|
||||||
|
argument_decl : REG
|
||||||
|
| PRED
|
||||||
|
| IN REG
|
||||||
|
| IN PRED
|
||||||
|
| IMM
|
||||||
|
| var
|
||||||
|
;
|
||||||
|
|
||||||
|
code : '{' statements '}'
|
||||||
|
|
||||||
|
statements : statements statement
|
||||||
|
| statement
|
||||||
|
|
||||||
|
statement : control_statement
|
||||||
|
| var_decl ';'
|
||||||
|
| rvalue ';'
|
||||||
|
| code_block
|
||||||
|
| ';'
|
||||||
|
|
||||||
|
code_block : '{' statements '}'
|
||||||
|
| '{' '}'
|
||||||
|
|
||||||
|
With this initial portion of the grammar we are defining the instruction, its'
|
||||||
|
arguments, and its' statements. Each argument is defined by the
|
||||||
|
``argument_decl`` rule, and can be either
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
Description Example
|
||||||
|
----------------------------------------
|
||||||
|
output register RsV
|
||||||
|
output predicate register P0
|
||||||
|
input register in RsV
|
||||||
|
input predicate register in P0
|
||||||
|
immediate value 1234
|
||||||
|
local variable EA
|
||||||
|
|
||||||
|
Note, the only local variable allowed to be used as an argument is the effective
|
||||||
|
address ``EA``. Similarly, each statement can be a ``control_statement``, a
|
||||||
|
variable declaration such as ``int a;``, a code block, which is just a
|
||||||
|
bracket-enclosed list of statements, a ``';'``, which is a ``nop`` instruction,
|
||||||
|
and an ``rvalue ';'``.
|
||||||
|
|
||||||
|
Expressions
|
||||||
|
~~~~~~~~~~~
|
||||||
|
|
||||||
|
Allowed in the input code are C language expressions with a few exceptions
|
||||||
|
to simplify parsing. For instance, variable names such as ``RdV``, ``RssV``,
|
||||||
|
``PdV``, ``CsV``, and other idiomatic register names from Hexagon, are
|
||||||
|
reserved specifically for register arguments. These arguments then map to
|
||||||
|
``TCGv_i32`` or ``TCGv_i64`` depending on the register size. Similarly, ``UiV``,
|
||||||
|
``riV``, etc. refer to immediate arguments and will map to C integers.
|
||||||
|
|
||||||
|
Also, as mentioned earlier, the names ``PC``, ``SP``, ``FP``, etc. are used to
|
||||||
|
refer to Hexagon registers such as the program counter, stack pointer, and frame
|
||||||
|
pointer seen here. Writes to these registers then correspond to assignments
|
||||||
|
``PC = ...``, and reads correspond to uses of the variable ``PC``.
|
||||||
|
|
||||||
|
Moreover, another example of one such exception is the selective expansion of
|
||||||
|
macros present in ``macros.h``. As an example, consider the ``fABS`` macro which
|
||||||
|
in plain C is defined as
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
#define fABS(A) (((A) < 0) ? (-(A)) : (A))
|
||||||
|
|
||||||
|
and returns the absolute value of the argument ``A``. This macro is not included
|
||||||
|
in ``idef-parser/macros.inc`` and as such is not expanded and kept as a "call"
|
||||||
|
``fABS(...)``. Reason being, that ``fABS`` is easier to match and map to
|
||||||
|
``tcg_gen_abs_<width>``, compared to the full ternary expression above. Loads of
|
||||||
|
macros in ``macros.h`` are kept unexpanded to aid in parsing, as seen in the
|
||||||
|
example above, for more information see ``idef-parser/idef-parser.lex``.
|
||||||
|
|
||||||
|
Finally, in mapping these input expressions to tinycode generators, idef-parser
|
||||||
|
tries to perform as much as possible in plain C. Such as, performing binary
|
||||||
|
operations in C instead of tinycode generators, thus effectively constant
|
||||||
|
folding the expression.
|
||||||
|
|
||||||
|
Variables and Variable Declarations
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
Similarly to C, variables in the input code must be explicitly declared, such as
|
||||||
|
``int var1;`` which declares an uninitialized variable ``var1``. Initialization
|
||||||
|
``int var2 = 0;`` is also allowed and behaves as expected. In tinycode
|
||||||
|
generators the previous declarations are mapped to
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
int var1; -> TCGv_i32 var1 = tcg_temp_local_new_i32();
|
||||||
|
|
||||||
|
int var2 = 0; -> TCGv_i32 var1 = tcg_temp_local_new_i32();
|
||||||
|
tcg_gen_movi_i32(j, ((int64_t) 0ULL));
|
||||||
|
|
||||||
|
which are later automatically freed at the end of the function they're declared
|
||||||
|
in. Contrary to C, we only allow variables to be declared with an integer type
|
||||||
|
specified in the following table (without permutation of keywords)
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
type bit-width signedness
|
||||||
|
----------------------------------------------------------
|
||||||
|
int 32 signed
|
||||||
|
signed
|
||||||
|
signed int
|
||||||
|
|
||||||
|
unsigned 32 unsigned
|
||||||
|
unsigned int
|
||||||
|
|
||||||
|
long 64 signed
|
||||||
|
long int
|
||||||
|
signed long
|
||||||
|
signed long int
|
||||||
|
|
||||||
|
unsigned long 64 unsigned
|
||||||
|
unsigned long int
|
||||||
|
|
||||||
|
long long 64 signed
|
||||||
|
long long int
|
||||||
|
signed long long
|
||||||
|
signed long long int
|
||||||
|
|
||||||
|
unsigned long long 64 unsigned
|
||||||
|
unsigned long long int
|
||||||
|
|
||||||
|
size[1,2,4,8][s,u]_t 8-64 signed or unsigned
|
||||||
|
|
||||||
|
In idef-parser, variable names are matched by a generic ``VARID`` token,
|
||||||
|
which will feature the variable name as a decoration. For a variable declaration
|
||||||
|
idef-parser calls ``gen_varid_allocate`` with the ``VARID`` token to save the
|
||||||
|
name, size, and bit width of the newly declared variable. In addition, this
|
||||||
|
function also ensures that variables aren't declared multiple times, and prints
|
||||||
|
and error message if that is the case. Upon use of a variable, the ``VARID``
|
||||||
|
token is used to lookup the size and bit width of the variable.
|
||||||
|
|
||||||
|
Type System
|
||||||
|
~~~~~~~~~~~
|
||||||
|
|
||||||
|
idef-parser features a simple type system which is used to correctly implement
|
||||||
|
the signedness and bit width of the operations.
|
||||||
|
|
||||||
|
The type of each ``rvalue`` is determined by two attributes: its bit width
|
||||||
|
(``unsigned bit_width``) and its signedness (``HexSignedness signedness``).
|
||||||
|
|
||||||
|
For each operation, the type of ``rvalue``\ s influence the way in which the
|
||||||
|
operands are handled and emitted. For example a right shift between signed
|
||||||
|
operators will be an arithmetic shift, while one between unsigned operators
|
||||||
|
will be a logical shift. If one of the two operands is signed, and the other
|
||||||
|
is unsigned, the operation will be signed.
|
||||||
|
|
||||||
|
The bit width also influences the outcome of the operations, in particular while
|
||||||
|
the input languages features a fine granularity type system, with types of 8,
|
||||||
|
16, 32, 64 (and more for vectorial instructions) bits, the tinycode only
|
||||||
|
features 32 and 64 bit widths. We propagate as much as possible the fine
|
||||||
|
granularity type, until the value has to be used inside an operation between
|
||||||
|
``rvalue``\ s; in that case if one of the two operands is greater than 32 bits
|
||||||
|
we promote the whole operation to 64 bit, taking care of properly extending the
|
||||||
|
two operands. Fortunately, the most critical instructions already feature
|
||||||
|
explicit casts and zero/sign extensions which are properly propagated down to
|
||||||
|
our parser.
|
||||||
|
|
||||||
|
The combination of ``rvalue``\ s are handled through the use of the
|
||||||
|
``gen_bin_op`` and ``gen_bin_cmp`` helper functions. These two functions handle
|
||||||
|
the appropriate compile-time or run-time emission of operations to perform the
|
||||||
|
required computation.
|
||||||
|
|
||||||
|
Control Statements
|
||||||
|
~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
``control_statement``\ s are all the statements which modify the order of
|
||||||
|
execution of the generated code according to input parameters. They are expanded
|
||||||
|
by the following grammar rule:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
control_statement : frame_check
|
||||||
|
| cancel_statement
|
||||||
|
| if_statement
|
||||||
|
| for_statement
|
||||||
|
| fpart1_statement
|
||||||
|
|
||||||
|
``if_statement``\ s require the emission of labels and branch instructions which
|
||||||
|
effectively perform conditional jumps (``tcg_gen_brcondi``) according to the
|
||||||
|
value of an expression. Note, the tinycode generators we produce for conditional
|
||||||
|
statements do not perfectly mirror what would be expected in C, for instance we
|
||||||
|
do not reproduce short-circuiting of the ``&&`` operator, and use of the ``||``
|
||||||
|
operator is disallowed. All the predicated instructions, and in general all the
|
||||||
|
instructions where there could be alternative values assigned to an ``lvalue``,
|
||||||
|
like C-style ternary expressions:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
rvalue : rvalue QMARK rvalue COLON rvalue
|
||||||
|
|
||||||
|
are handled using the conditional move tinycode instruction
|
||||||
|
(``tcg_gen_movcond``), which avoids the additional complexity of managing labels
|
||||||
|
and jumps.
|
||||||
|
|
||||||
|
Instead, regarding the ``for`` loops, exploiting the fact that they always
|
||||||
|
iterate on immediate values, therefore their iteration ranges are always known
|
||||||
|
at compile time, we implemented those emitting plain C ``for`` loops. This is
|
||||||
|
possible because the loops will be executed in the QEMU code, leading to the
|
||||||
|
consequential unrolling of the for loop, since the tinycode generator
|
||||||
|
instructions will be executed multiple times, and the respective generated
|
||||||
|
tinycode will represent the unrolled execution of the loop.
|
||||||
|
|
||||||
|
Parsing Context
|
||||||
|
~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
All the helper functions in ``idef-parser.y`` carry two fixed parameters, which
|
||||||
|
are the parsing context ``c`` and the ``YYLLOC`` location information. The
|
||||||
|
context is explicitly passed to all the functions because the parser we generate
|
||||||
|
is a reentrant one, meaning that it does not have any global variable, and
|
||||||
|
therefore the instruction compilation could easily be parallelized in the
|
||||||
|
future. Finally for each rule we propagate information about the location of the
|
||||||
|
involved tokens to generate pretty error reporting, able to highlight the
|
||||||
|
portion of the input code which generated each error.
|
||||||
|
|
||||||
|
Debugging
|
||||||
|
---------
|
||||||
|
|
||||||
|
Developing the idef-parser can lead to two types of errors: compile-time errors
|
||||||
|
and parsing errors.
|
||||||
|
|
||||||
|
Compile-time errors in Bison-generated parsers are usually due to conflicts in
|
||||||
|
the described grammar. Conflicts forbid the grammar to produce a unique
|
||||||
|
derivation tree, thus must be solved (except for the dangling else problem,
|
||||||
|
which is marked as expected through the ``%expect 1`` Bison option).
|
||||||
|
|
||||||
|
For solving conflicts you need a basic understanding of `shift-reduce conflicts
|
||||||
|
<https://www.gnu.org/software/Bison/manual/html_node/Shift_002fReduce.html>`__
|
||||||
|
and `reduce-reduce conflicts
|
||||||
|
<https://www.gnu.org/software/Bison/manual/html_node/Reduce_002fReduce.html>`__,
|
||||||
|
then, if you are using a Bison version > 3.7.1 you can ask Bison to generate
|
||||||
|
some counterexamples which highlight ambiguous derivations, passing the
|
||||||
|
``-Wcex`` option to Bison. In general shift/reduce conflicts are solved by
|
||||||
|
redesigning the grammar in an unambiguous way or by setting the token priority
|
||||||
|
correctly, while reduce/reduce conflicts are solved by redesigning the
|
||||||
|
interested part of the grammar.
|
||||||
|
|
||||||
|
Run-time errors can be divided between lexing and parsing errors, lexing errors
|
||||||
|
are hard to detect, since the ``var`` token will catch everything which is not
|
||||||
|
catched by other tokens, but easy to fix, because most of the time a simple
|
||||||
|
regex editing will be enough.
|
||||||
|
|
||||||
|
idef-parser features a fancy parsing error reporting scheme, which for each
|
||||||
|
parsing error reports the fragment of the input text which was involved in the
|
||||||
|
parsing rule that generated an error.
|
||||||
|
|
||||||
|
Implementing an instruction goes through several sequential steps, here are some
|
||||||
|
suggestions to make each instruction proceed to the next step.
|
||||||
|
|
||||||
|
- not-emitted
|
||||||
|
|
||||||
|
Means that the parsing of the input code relative to that instruction failed,
|
||||||
|
this could be due to a lexical error or to some mismatch between the order of
|
||||||
|
valid tokens and a parser rule. You should check that tokens are correctly
|
||||||
|
identified and mapped, and that there is a rule matching the token sequence
|
||||||
|
that you need to parse.
|
||||||
|
|
||||||
|
- emitted
|
||||||
|
|
||||||
|
This instruction class contains all the instructions which are emitted but
|
||||||
|
fail to compile when included in QEMU. The compilation errors are shown by
|
||||||
|
the QEMU building process and will lead to fixing the bug. Most common
|
||||||
|
errors regard the mismatch of parameters for tinycode generator functions,
|
||||||
|
which boil down to errors in the idef-parser type system.
|
||||||
|
|
||||||
|
- compiled
|
||||||
|
|
||||||
|
These instruction generate valid tinycode generator code, which however fail
|
||||||
|
the QEMU or the harness tests, these cases must be handled manually by
|
||||||
|
looking into the failing tests and looking at the generated tinycode
|
||||||
|
generator instruction and at the generated tinycode itself. Tip: handle the
|
||||||
|
failing harness tests first, because they usually feature only a single
|
||||||
|
instruction, thus will require less execution trace navigation. If a
|
||||||
|
multi-threaded test fail, fixing all the other tests will be the easier
|
||||||
|
option, hoping that the multi-threaded one will be indirectly fixed.
|
||||||
|
|
||||||
|
An example of debugging this type of failure is provided in the following
|
||||||
|
section.
|
||||||
|
|
||||||
|
- tests-passed
|
||||||
|
|
||||||
|
This is the final goal for each instruction, meaning that the instruction
|
||||||
|
passes the test suite.
|
||||||
|
|
||||||
|
Another approach to fix QEMU system test, where many instructions might fail, is
|
||||||
|
to compare the execution trace of your implementation with the reference
|
||||||
|
implementations already present in QEMU. To do so you should obtain a QEMU build
|
||||||
|
where the instruction pass the test, and run it with the following command:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
sudo unshare -p sudo -u <USER> bash -c \
|
||||||
|
'env -i <qemu-hexagon full path> -d cpu <TEST>'
|
||||||
|
|
||||||
|
And do the same for your implementation, the generated execution traces will be
|
||||||
|
inherently aligned and can be inspected for behavioral differences using the
|
||||||
|
``diff`` tool.
|
||||||
|
|
||||||
|
Example of debugging erroneous tinycode generator code
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
The goal of this section is to provide a complete example of debugging
|
||||||
|
incorrectly emitted tinycode generator for a single instruction.
|
||||||
|
|
||||||
|
Let's first introduce a bug in the tinycode generator of the ``A2_add``
|
||||||
|
instruction,
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
void emit_A2_add(DisasContext *ctx, Insn *insn, Packet *pkt, TCGv_i32 RdV,
|
||||||
|
TCGv_i32 RsV, TCGv_i32 RtV)
|
||||||
|
/* RdV=RsV+RtV;} */
|
||||||
|
{
|
||||||
|
TCGv_i32 tmp_0 = tcg_temp_new_i32();
|
||||||
|
tcg_gen_add_i32(tmp_0, RsV, RsV);
|
||||||
|
tcg_gen_mov_i32(RdV, tmp_0);
|
||||||
|
tcg_temp_free_i32(tmp_0);
|
||||||
|
}
|
||||||
|
|
||||||
|
Here the bug, albeit hard to spot, is in ``tcg_gen_add_i32(tmp_0, RsV, RsV);``
|
||||||
|
where we compute ``RsV + RsV`` instead of ``RsV + RtV``, as would be expected.
|
||||||
|
This particular bug is a bit tricky to pinpoint when debugging, since the
|
||||||
|
``A2_add`` instruction is so ubiquitous. As a result, pretty much all tests will
|
||||||
|
fail and therefore not provide a lot of information about the bug.
|
||||||
|
|
||||||
|
For example, let's run the ``check-tcg`` tests
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
make check-tcg TIMEOUT=1200 \
|
||||||
|
DOCKER_IMAGE=debian-hexagon-cross \
|
||||||
|
ENGINE=podman V=1 \
|
||||||
|
DOCKER_CROSS_CC_GUEST=hexagon-unknown-linux-musl-clang
|
||||||
|
|
||||||
|
In the output, we find a failure in the very first test case ``float_convs``
|
||||||
|
due to a segmentation fault. Similarly, all harness and libc tests will fail as
|
||||||
|
well. At this point we have no clue where the actual bug lies, and need to start
|
||||||
|
ruling out instructions. As such a good starting point is to utilize the debug
|
||||||
|
options ``-d in_asm,cpu`` of QEMU to inspect the Hexagon instructions being run,
|
||||||
|
alongside the CPU state. We additionally need a working version of the emulator
|
||||||
|
to compare our buggy CPU state against, running
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
meson configure -Dhexagon_idef_parser_enabled=false
|
||||||
|
|
||||||
|
will disable the idef-parser for all instructions and fallback on manual
|
||||||
|
tinycode generator overrides, or on helper function implementations. Recompiling
|
||||||
|
gives us ``qemu-hexagon`` which passes all tests. If ``qemu-heaxgon-buggy`` is
|
||||||
|
our binary with the incorrect tinycode generators, we can compare the CPU state
|
||||||
|
between the two versions
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
./qemu-hexagon-buggy -d in_asm,cpu float_convs &> out_buggy
|
||||||
|
./qemu-hexagon -d in_asm,cpu float_convs &> out_working
|
||||||
|
|
||||||
|
Looking at ``diff -u out_buggy out_working`` shows us that the CPU state begins
|
||||||
|
to diverge on line 141, with an incorrect value in the ``R1`` register
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
@@ -138,7 +138,7 @@
|
||||||
|
|
||||||
|
General Purpose Registers = {
|
||||||
|
r0 = 0x4100f9c0
|
||||||
|
- r1 = 0x00042108
|
||||||
|
+ r1 = 0x00000000
|
||||||
|
r2 = 0x00021084
|
||||||
|
r3 = 0x00000000
|
||||||
|
r4 = 0x00000000
|
||||||
|
|
||||||
|
If we also look into ``out_buggy`` directly we can inspect the input assembly
|
||||||
|
which the caused the incorrect CPU state, around line 141 we find
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
116 | ----------------
|
||||||
|
117 | IN: _start_c
|
||||||
|
118 | 0x000210b0: 0xa09dc002 { allocframe(R29,#0x10):raw }
|
||||||
|
... | ...
|
||||||
|
137 | 0x000210fc: 0x5a00c4aa { call PC+2388 }
|
||||||
|
138 |
|
||||||
|
139 | General Purpose Registers = {
|
||||||
|
140 | r0 = 0x4100fa70
|
||||||
|
141 | r1 = 0x00042108
|
||||||
|
142 | r2 = 0x00021084
|
||||||
|
143 | r3 = 0x00000000
|
||||||
|
|
||||||
|
Importantly, we see some Hexagon assembly followed by a dump of the CPU state,
|
||||||
|
now the CPU state is actually dumped before the input assembly above is ran.
|
||||||
|
As such, we are actually interested in the instructions ran before this.
|
||||||
|
|
||||||
|
Scrolling up a bit, we find
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
54 | ----------------
|
||||||
|
55 | IN: _start
|
||||||
|
56 | 0x00021088: 0x6a09c002 { R2 = C9/pc }
|
||||||
|
57 | 0x0002108c: 0xbfe2ff82 { R2 = add(R2,#0xfffffffc) }
|
||||||
|
58 | 0x00021090: 0x9182c001 { R1 = memw(R2+#0x0) }
|
||||||
|
59 | 0x00021094: 0xf302c101 { R1 = add(R2,R1) }
|
||||||
|
60 | 0x00021098: 0x7800c01e { R30 = #0x0 }
|
||||||
|
61 | 0x0002109c: 0x707dc000 { R0 = R29 }
|
||||||
|
62 | 0x000210a0: 0x763dfe1d { R29 = and(R29,#0xfffffff0) }
|
||||||
|
63 | 0x000210a4: 0xa79dfdfe { memw(R29+#0xfffffff8) = R29 }
|
||||||
|
64 | 0x000210a8: 0xbffdff1d { R29 = add(R29,#0xfffffff8) }
|
||||||
|
65 | 0x000210ac: 0x5a00c002 { call PC+4 }
|
||||||
|
66 |
|
||||||
|
67 | General Purpose Registers = {
|
||||||
|
68 | r0 = 0x00000000
|
||||||
|
69 | r1 = 0x00000000
|
||||||
|
70 | r2 = 0x00000000
|
||||||
|
71 | r3 = 0x00000000
|
||||||
|
|
||||||
|
Remember, the instructions on lines 56-65 are ran on the CPU state shown below
|
||||||
|
instructions, and as the CPU state has not diverged at this point, we know the
|
||||||
|
starting state is accurate. The bug must then lie within the instructions shown
|
||||||
|
here. Next we may notice that ``R1`` is only touched by lines 57 and 58, that is
|
||||||
|
by
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
58 | 0x00021090: 0x9182c001 { R1 = memw(R2+#0x0) }
|
||||||
|
59 | 0x00021094: 0xf302c101 { R1 = add(R2,R1) }
|
||||||
|
|
||||||
|
Therefore, we are either dealing with an correct load instruction
|
||||||
|
``R1 = memw(R2+#0x0)`` or with an incorrect add ``R1 = add(R2,R1)``. At this
|
||||||
|
point it might be easy enough to go directly to the emitted code for the
|
||||||
|
instructions mentioned and look for bugs, but we could also run
|
||||||
|
``./qemu-heaxgon -d op,in_asm float_conv`` where we find for the following
|
||||||
|
tinycode for the Hexagon ``add`` instruction
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
---- 00021094
|
||||||
|
mov_i32 pkt_has_store_s1,$0x0
|
||||||
|
add_i32 tmp0,r2,r2
|
||||||
|
mov_i32 loc2,tmp0
|
||||||
|
mov_i32 new_r1,loc2
|
||||||
|
mov_i32 r1,new_r1
|
||||||
|
|
||||||
|
Here we have finally located our bug ``add_i32 tmp0,r2,r2``.
|
||||||
|
|
||||||
|
Limitations and Future Development
|
||||||
|
----------------------------------
|
||||||
|
|
||||||
|
The main limitation of the current parser is given by the syntax-driven nature
|
||||||
|
of the Bison-generated parsers. This has the severe implication of only being
|
||||||
|
able to generate code in the order of evaluation of the various rules, without,
|
||||||
|
in any case, being able to backtrack and alter the generated code.
|
||||||
|
|
||||||
|
An example limitation is highlighted by this statement of the input language:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
{ (PsV==0xff) ? (PdV=0xff) : (PdV=0x00); }
|
||||||
|
|
||||||
|
This ternary assignment, when written in this form requires us to emit some
|
||||||
|
proper control flow statements, which emit a jump to the first or to the second
|
||||||
|
code block, whose implementation is extremely convoluted, because when matching
|
||||||
|
the ternary assignment, the code evaluating the two assignments will be already
|
||||||
|
generated.
|
||||||
|
|
||||||
|
Instead we pre-process that statement, making it become:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
{ PdV = ((PsV==0xff)) ? 0xff : 0x00; }
|
||||||
|
|
||||||
|
Which can be easily matched by the following parser rules:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
statement | rvalue ';'
|
||||||
|
|
||||||
|
rvalue : rvalue QMARK rvalue COLON rvalue
|
||||||
|
| rvalue EQ rvalue
|
||||||
|
| LPAR rvalue RPAR
|
||||||
|
| assign_statement
|
||||||
|
| IMM
|
||||||
|
|
||||||
|
assign_statement : pred ASSIGN rvalue
|
||||||
|
|
||||||
|
Another example that highlight the limitation of the flex/bison parser can be
|
||||||
|
found even in the add operation we already saw:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
TCGv_i32 tmp_0 = tcg_temp_new_i32();
|
||||||
|
tcg_gen_add_i32(tmp_0, RsV, RtV);
|
||||||
|
tcg_gen_mov_i32(RdV, tmp_0);
|
||||||
|
|
||||||
|
The fact that we cannot directly use ``RdV`` as the destination of the sum is a
|
||||||
|
consequence of the syntax-driven nature of the parser. In fact when we parse the
|
||||||
|
assignment, the ``rvalue`` token, representing the sum has already been reduced,
|
||||||
|
and thus its code emitted and unchangeable. We rely on the fact that QEMU will
|
||||||
|
optimize our code reducing the useless move operations and the relative
|
||||||
|
temporaries.
|
||||||
|
|
||||||
|
A possible improvement of the parser regards the support for vectorial
|
||||||
|
instructions and floating point instructions, which will require the extension
|
||||||
|
of the scanner, the parser, and a partial re-design of the type system, allowing
|
||||||
|
to build the vectorial semantics over the available vectorial tinycode generator
|
||||||
|
primitives.
|
||||||
|
|
||||||
|
A more radical improvement will use the parser, not to generate directly the
|
||||||
|
tinycode generator code, but to generate an intermediate representation like the
|
||||||
|
LLVM IR, which in turn could be compiled using the clang TCG backend. That code
|
||||||
|
could be furtherly optimized, overcoming the limitations of the syntax-driven
|
||||||
|
parsing and could lead to a more optimized generated code.
|
Loading…
Reference in New Issue
Block a user