diff options
Diffstat (limited to 'target/hexagon/idef-parser')
| -rw-r--r-- | target/hexagon/idef-parser/README.rst | 722 | ||||
| -rw-r--r-- | target/hexagon/idef-parser/idef-parser.h | 253 | ||||
| -rw-r--r-- | target/hexagon/idef-parser/idef-parser.lex | 471 | ||||
| -rw-r--r-- | target/hexagon/idef-parser/idef-parser.y | 965 | ||||
| -rw-r--r-- | target/hexagon/idef-parser/macros.inc | 140 | ||||
| -rw-r--r-- | target/hexagon/idef-parser/parser-helpers.c | 2360 | ||||
| -rw-r--r-- | target/hexagon/idef-parser/parser-helpers.h | 376 | ||||
| -rwxr-xr-x | target/hexagon/idef-parser/prepare | 24 |
8 files changed, 5311 insertions, 0 deletions
diff --git a/target/hexagon/idef-parser/README.rst b/target/hexagon/idef-parser/README.rst new file mode 100644 index 0000000000..65e6bf4ee5 --- /dev/null +++ b/target/hexagon/idef-parser/README.rst @@ -0,0 +1,722 @@ +Hexagon ISA instruction definitions to tinycode generator compiler +------------------------------------------------------------------ + +idef-parser is a small compiler able to translate the Hexagon ISA description +language into tinycode generator code, that can be easily integrated into QEMU. + +Compilation Example +------------------- + +To better understand the scope of the idef-parser, we'll explore an applicative +example. Let's start by one of the simplest Hexagon instruction: the ``add``. + +The ISA description language represents the ``add`` instruction as +follows: + +.. code:: c + + A2_add(RdV, in RsV, in RtV) { + { RdV=RsV+RtV;} + } + +idef-parser will compile the above code into the following code: + +.. code:: c + + /* A2_add */ + void emit_A2_add(DisasContext *ctx, Insn *insn, Packet *pkt, TCGv_i32 RdV, + TCGv_i32 RsV, TCGv_i32 RtV) + /* { RdV=RsV+RtV;} */ + { + TCGv_i32 tmp_0 = tcg_temp_new_i32(); + tcg_gen_add_i32(tmp_0, RsV, RtV); + tcg_gen_mov_i32(RdV, tmp_0); + tcg_temp_free_i32(tmp_0); + } + +The output of the compilation process will be a function, containing the +tinycode generator code, implementing the correct semantics. That function will +not access any global variable, because all the accessed data structures will be +passed explicitly as function parameters. Among the passed parameters we will +have TCGv (tinycode variables) representing the input and output registers of +the architecture, integers representing the immediates that come from the code, +and other data structures which hold information about the disassemblation +context (``DisasContext`` struct). + +Let's begin by describing the input code. The ``add`` instruction is associated +with a unique identifier, in this case ``A2_add``, which allows to distinguish +variants of the same instruction, and expresses the class to which the +instruction belongs, in this case ``A2`` corresponds to the Hexagon +``ALU32/ALU`` instruction subclass. + +After the instruction identifier, we have a series of parameters that represents +TCG variables that will be passed to the generated function. Parameters marked +with ``in`` are already initialized, while the others are output parameters. + +We will leverage this information to infer several information: + +- Fill in the output function signature with the correct TCGv registers +- Fill in the output function signature with the immediate integers +- Keep track of which registers, among the declared one, have been + initialized + +Let's now observe the actual instruction description code, in this case: + +.. code:: c + + { RdV=RsV+RtV;} + +This code is composed by a subset of the C syntax, and is the result of the +application of some macro definitions contained in the ``macros.h`` file. + +This file is used to reduce the complexity of the input language where complex +variants of similar constructs can be mapped to a unique primitive, so that the +idef-parser has to handle a lower number of computation primitives. + +As you may notice, the description code modifies the registers which have been +declared by the declaration statements. In this case all the three registers +will be declared, ``RsV`` and ``RtV`` will also be read and ``RdV`` will be +written. + +Now let's have a quick look at the generated code, line by line. + +:: + + TCGv_i32 tmp_0 = tcg_temp_new_i32(); + +This code starts by declaring a temporary TCGv to hold the result from the sum +operation. + +:: + + tcg_gen_add_i32(tmp_0, RsV, RtV); + +Then, we are generating the sum tinycode operator between the selected +registers, storing the result in the just declared temporary. + +:: + + tcg_gen_mov_i32(RdV, tmp_0); + +The result of the addition is now stored in the temporary, we move it into the +correct destination register. This code may seem inefficient, but QEMU will +perform some optimizations on the tinycode, reducing the unnecessary copy. + +:: + + tcg_temp_free_i32(tmp_0); + +Finally, we free the temporary we used to hold the addition result. + +Parser Input +------------ + +Before moving on to the structure of idef-parser itself, let us spend some words +on its' input. There are two preprocessing steps applied to the generated +instruction semantics in ``semantics_generated.pyinc`` that we need to consider. +Firstly, + +:: + + gen_idef_parser_funcs.py + +which takes instruction semantics in ``semantics_generated.pyinc`` to C-like +pseudo code, output into ``idef_parser_input.h.inc``. For instance, the +``J2_jumpr`` instruction which jumps to an address stored in a register +argument. This is instruction is defined as + +:: + + SEMANTICS( \ + "J2_jumpr", \ + "jumpr Rs32", \ + """{fJUMPR(RsN,RsV,COF_TYPE_JUMPR);}""" \ + ) + +in ``semantics_generated.pyinc``. Running ``gen_idef_parser_funcs.py`` +we obtain the pseudo code + +:: + + J2_jumpr(in RsV) { + {fJUMPR(RsN,RsV,COF_TYPE_JUMPR);} + } + +with macros such as ``fJUMPR`` intact. + +The second step is to expand macros into a form suitable for our parser. +These macros are defined in ``idef-parser/macros.inc`` and the step is +carried out by the ``prepare`` script which runs the C preprocessor on +``idef_parser_input.h.inc`` to produce +``idef_parser_input.preprocessed.h.inc``. + +To finish the above example, after preprocessing ``J2_jumpr`` we obtain + +:: + + J2_jumpr(in RsV) { + {(PC = RsV);} + } + +where ``fJUMPR(RsN,RsV,COF_TYPE_JUMPR);`` was expanded to ``(PC = RsV)``, +signifying a write to the Program Counter ``PC``. Note, that ``PC`` in +this expression is not a variable in the strict C sense since it is not +declared anywhere, but rather a symbol which is easy to match in +idef-parser later on. + +Parser Structure +---------------- + +The idef-parser is built using the ``flex`` and ``bison``. + +``flex`` is used to split the input string into tokens, each described using a +regular expression. The token description is contained in the +``idef-parser.lex`` source file. The flex-generated scanner takes care also to +extract from the input text other meaningful information, e.g., the numerical +value in case of an immediate constant, and decorates the token with the +extracted information. + +``bison`` is used to generate the actual parser, starting from the parsing +description contained in the ``idef-parser.y`` file. The generated parser +executes the ``main`` function at the end of the ``idef-parser.y`` file, which +opens input and output files, creates the parsing context, and eventually calls +the ``yyparse()`` function, which starts the execution of the LALR(1) parser +(see `Wikipedia <https://en.wikipedia.org/wiki/LALR_parser>`__ for more +information about LALR parsing techniques). The LALR(1) parser, whenever it has +to shift a token, calls the ``yylex()`` function, which is defined by the +flex-generated code, and reads the input file returning the next scanned token. + +The tokens are mapped on the source language grammar, defined in the +``idef-parser.y`` file to build a unique syntactic tree, according to the +specified operator precedences and associativity rules. + +The grammar describes the whole file which contains the Hexagon instruction +descriptions, therefore it starts from the ``input`` nonterminal, which is a +list of instructions, each instruction is represented by the following grammar +rule, representing the structure of the input file shown above: + +:: + + instruction : INAME arguments code + | error + + arguments : '(' ')' + | '(' argument_list ')'; + + argument_list : argument_decl ',' argument_list + | argument_decl + + argument_decl : REG + | PRED + | IN REG + | IN PRED + | IMM + | var + ; + + code : '{' statements '}' + + statements : statements statement + | statement + + statement : control_statement + | var_decl ';' + | rvalue ';' + | code_block + | ';' + + code_block : '{' statements '}' + | '{' '}' + +With this initial portion of the grammar we are defining the instruction, its' +arguments, and its' statements. Each argument is defined by the +``argument_decl`` rule, and can be either + +:: + + Description Example + ---------------------------------------- + output register RsV + output predicate register P0 + input register in RsV + input predicate register in P0 + immediate value 1234 + local variable EA + +Note, the only local variable allowed to be used as an argument is the effective +address ``EA``. Similarly, each statement can be a ``control_statement``, a +variable declaration such as ``int a;``, a code block, which is just a +bracket-enclosed list of statements, a ``';'``, which is a ``nop`` instruction, +and an ``rvalue ';'``. + +Expressions +~~~~~~~~~~~ + +Allowed in the input code are C language expressions with a few exceptions +to simplify parsing. For instance, variable names such as ``RdV``, ``RssV``, +``PdV``, ``CsV``, and other idiomatic register names from Hexagon, are +reserved specifically for register arguments. These arguments then map to +``TCGv_i32`` or ``TCGv_i64`` depending on the register size. Similarly, ``UiV``, +``riV``, etc. refer to immediate arguments and will map to C integers. + +Also, as mentioned earlier, the names ``PC``, ``SP``, ``FP``, etc. are used to +refer to Hexagon registers such as the program counter, stack pointer, and frame +pointer seen here. Writes to these registers then correspond to assignments +``PC = ...``, and reads correspond to uses of the variable ``PC``. + +Moreover, another example of one such exception is the selective expansion of +macros present in ``macros.h``. As an example, consider the ``fABS`` macro which +in plain C is defined as + +:: + + #define fABS(A) (((A) < 0) ? (-(A)) : (A)) + +and returns the absolute value of the argument ``A``. This macro is not included +in ``idef-parser/macros.inc`` and as such is not expanded and kept as a "call" +``fABS(...)``. Reason being, that ``fABS`` is easier to match and map to +``tcg_gen_abs_<width>``, compared to the full ternary expression above. Loads of +macros in ``macros.h`` are kept unexpanded to aid in parsing, as seen in the +example above, for more information see ``idef-parser/idef-parser.lex``. + +Finally, in mapping these input expressions to tinycode generators, idef-parser +tries to perform as much as possible in plain C. Such as, performing binary +operations in C instead of tinycode generators, thus effectively constant +folding the expression. + +Variables and Variable Declarations +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Similarly to C, variables in the input code must be explicitly declared, such as +``int var1;`` which declares an uninitialized variable ``var1``. Initialization +``int var2 = 0;`` is also allowed and behaves as expected. In tinycode +generators the previous declarations are mapped to + +:: + + int var1; -> TCGv_i32 var1 = tcg_temp_local_new_i32(); + + int var2 = 0; -> TCGv_i32 var1 = tcg_temp_local_new_i32(); + tcg_gen_movi_i32(j, ((int64_t) 0ULL)); + +which are later automatically freed at the end of the function they're declared +in. Contrary to C, we only allow variables to be declared with an integer type +specified in the following table (without permutation of keywords) + +:: + + type bit-width signedness + ---------------------------------------------------------- + int 32 signed + signed + signed int + + unsigned 32 unsigned + unsigned int + + long 64 signed + long int + signed long + signed long int + + unsigned long 64 unsigned + unsigned long int + + long long 64 signed + long long int + signed long long + signed long long int + + unsigned long long 64 unsigned + unsigned long long int + + size[1,2,4,8][s,u]_t 8-64 signed or unsigned + +In idef-parser, variable names are matched by a generic ``VARID`` token, +which will feature the variable name as a decoration. For a variable declaration +idef-parser calls ``gen_varid_allocate`` with the ``VARID`` token to save the +name, size, and bit width of the newly declared variable. In addition, this +function also ensures that variables aren't declared multiple times, and prints +and error message if that is the case. Upon use of a variable, the ``VARID`` +token is used to lookup the size and bit width of the variable. + +Type System +~~~~~~~~~~~ + +idef-parser features a simple type system which is used to correctly implement +the signedness and bit width of the operations. + +The type of each ``rvalue`` is determined by two attributes: its bit width +(``unsigned bit_width``) and its signedness (``HexSignedness signedness``). + +For each operation, the type of ``rvalue``\ s influence the way in which the +operands are handled and emitted. For example a right shift between signed +operators will be an arithmetic shift, while one between unsigned operators +will be a logical shift. If one of the two operands is signed, and the other +is unsigned, the operation will be signed. + +The bit width also influences the outcome of the operations, in particular while +the input languages features a fine granularity type system, with types of 8, +16, 32, 64 (and more for vectorial instructions) bits, the tinycode only +features 32 and 64 bit widths. We propagate as much as possible the fine +granularity type, until the value has to be used inside an operation between +``rvalue``\ s; in that case if one of the two operands is greater than 32 bits +we promote the whole operation to 64 bit, taking care of properly extending the +two operands. Fortunately, the most critical instructions already feature +explicit casts and zero/sign extensions which are properly propagated down to +our parser. + +The combination of ``rvalue``\ s are handled through the use of the +``gen_bin_op`` and ``gen_bin_cmp`` helper functions. These two functions handle +the appropriate compile-time or run-time emission of operations to perform the +required computation. + +Control Statements +~~~~~~~~~~~~~~~~~~ + +``control_statement``\ s are all the statements which modify the order of +execution of the generated code according to input parameters. They are expanded +by the following grammar rule: + +:: + + control_statement : frame_check + | cancel_statement + | if_statement + | for_statement + | fpart1_statement + +``if_statement``\ s require the emission of labels and branch instructions which +effectively perform conditional jumps (``tcg_gen_brcondi``) according to the +value of an expression. Note, the tinycode generators we produce for conditional +statements do not perfectly mirror what would be expected in C, for instance we +do not reproduce short-circuiting of the ``&&`` operator, and use of the ``||`` +operator is disallowed. All the predicated instructions, and in general all the +instructions where there could be alternative values assigned to an ``lvalue``, +like C-style ternary expressions: + +:: + + rvalue : rvalue QMARK rvalue COLON rvalue + +are handled using the conditional move tinycode instruction +(``tcg_gen_movcond``), which avoids the additional complexity of managing labels +and jumps. + +Instead, regarding the ``for`` loops, exploiting the fact that they always +iterate on immediate values, therefore their iteration ranges are always known +at compile time, we implemented those emitting plain C ``for`` loops. This is +possible because the loops will be executed in the QEMU code, leading to the +consequential unrolling of the for loop, since the tinycode generator +instructions will be executed multiple times, and the respective generated +tinycode will represent the unrolled execution of the loop. + +Parsing Context +~~~~~~~~~~~~~~~ + +All the helper functions in ``idef-parser.y`` carry two fixed parameters, which +are the parsing context ``c`` and the ``YYLLOC`` location information. The +context is explicitly passed to all the functions because the parser we generate +is a reentrant one, meaning that it does not have any global variable, and +therefore the instruction compilation could easily be parallelized in the +future. Finally for each rule we propagate information about the location of the +involved tokens to generate pretty error reporting, able to highlight the +portion of the input code which generated each error. + +Debugging +--------- + +Developing the idef-parser can lead to two types of errors: compile-time errors +and parsing errors. + +Compile-time errors in Bison-generated parsers are usually due to conflicts in +the described grammar. Conflicts forbid the grammar to produce a unique +derivation tree, thus must be solved (except for the dangling else problem, +which is marked as expected through the ``%expect 1`` Bison option). + +For solving conflicts you need a basic understanding of `shift-reduce conflicts +<https://www.gnu.org/software/Bison/manual/html_node/Shift_002fReduce.html>`__ +and `reduce-reduce conflicts +<https://www.gnu.org/software/Bison/manual/html_node/Reduce_002fReduce.html>`__, +then, if you are using a Bison version > 3.7.1 you can ask Bison to generate +some counterexamples which highlight ambiguous derivations, passing the +``-Wcex`` option to Bison. In general shift/reduce conflicts are solved by +redesigning the grammar in an unambiguous way or by setting the token priority +correctly, while reduce/reduce conflicts are solved by redesigning the +interested part of the grammar. + +Run-time errors can be divided between lexing and parsing errors, lexing errors +are hard to detect, since the ``var`` token will catch everything which is not +catched by other tokens, but easy to fix, because most of the time a simple +regex editing will be enough. + +idef-parser features a fancy parsing error reporting scheme, which for each +parsing error reports the fragment of the input text which was involved in the +parsing rule that generated an error. + +Implementing an instruction goes through several sequential steps, here are some +suggestions to make each instruction proceed to the next step. + +- not-emitted + + Means that the parsing of the input code relative to that instruction failed, + this could be due to a lexical error or to some mismatch between the order of + valid tokens and a parser rule. You should check that tokens are correctly + identified and mapped, and that there is a rule matching the token sequence + that you need to parse. + +- emitted + + This instruction class contains all the instructions which are emitted but + fail to compile when included in QEMU. The compilation errors are shown by + the QEMU building process and will lead to fixing the bug. Most common + errors regard the mismatch of parameters for tinycode generator functions, + which boil down to errors in the idef-parser type system. + +- compiled + + These instruction generate valid tinycode generator code, which however fail + the QEMU or the harness tests, these cases must be handled manually by + looking into the failing tests and looking at the generated tinycode + generator instruction and at the generated tinycode itself. Tip: handle the + failing harness tests first, because they usually feature only a single + instruction, thus will require less execution trace navigation. If a + multi-threaded test fail, fixing all the other tests will be the easier + option, hoping that the multi-threaded one will be indirectly fixed. + + An example of debugging this type of failure is provided in the following + section. + +- tests-passed + + This is the final goal for each instruction, meaning that the instruction + passes the test suite. + +Another approach to fix QEMU system test, where many instructions might fail, is +to compare the execution trace of your implementation with the reference +implementations already present in QEMU. To do so you should obtain a QEMU build +where the instruction pass the test, and run it with the following command: + +:: + + sudo unshare -p sudo -u <USER> bash -c \ + 'env -i <qemu-hexagon full path> -d cpu <TEST>' + +And do the same for your implementation, the generated execution traces will be +inherently aligned and can be inspected for behavioral differences using the +``diff`` tool. + +Example of debugging erroneous tinycode generator code +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The goal of this section is to provide a complete example of debugging +incorrectly emitted tinycode generator for a single instruction. + +Let's first introduce a bug in the tinycode generator of the ``A2_add`` +instruction, + +:: + + void emit_A2_add(DisasContext *ctx, Insn *insn, Packet *pkt, TCGv_i32 RdV, + TCGv_i32 RsV, TCGv_i32 RtV) + /* RdV=RsV+RtV;} */ + { + TCGv_i32 tmp_0 = tcg_temp_new_i32(); + tcg_gen_add_i32(tmp_0, RsV, RsV); + tcg_gen_mov_i32(RdV, tmp_0); + tcg_temp_free_i32(tmp_0); + } + +Here the bug, albeit hard to spot, is in ``tcg_gen_add_i32(tmp_0, RsV, RsV);`` +where we compute ``RsV + RsV`` instead of ``RsV + RtV``, as would be expected. +This particular bug is a bit tricky to pinpoint when debugging, since the +``A2_add`` instruction is so ubiquitous. As a result, pretty much all tests will +fail and therefore not provide a lot of information about the bug. + +For example, let's run the ``check-tcg`` tests + +:: + + make check-tcg TIMEOUT=1200 \ + DOCKER_IMAGE=debian-hexagon-cross \ + ENGINE=podman V=1 \ + DOCKER_CROSS_CC_GUEST=hexagon-unknown-linux-musl-clang + +In the output, we find a failure in the very first test case ``float_convs`` +due to a segmentation fault. Similarly, all harness and libc tests will fail as +well. At this point we have no clue where the actual bug lies, and need to start +ruling out instructions. As such a good starting point is to utilize the debug +options ``-d in_asm,cpu`` of QEMU to inspect the Hexagon instructions being run, +alongside the CPU state. We additionally need a working version of the emulator +to compare our buggy CPU state against, running + +:: + + meson configure -Dhexagon_idef_parser_enabled=false + +will disable the idef-parser for all instructions and fallback on manual +tinycode generator overrides, or on helper function implementations. Recompiling +gives us ``qemu-hexagon`` which passes all tests. If ``qemu-heaxgon-buggy`` is +our binary with the incorrect tinycode generators, we can compare the CPU state +between the two versions + +:: + + ./qemu-hexagon-buggy -d in_asm,cpu float_convs &> out_buggy + ./qemu-hexagon -d in_asm,cpu float_convs &> out_working + +Looking at ``diff -u out_buggy out_working`` shows us that the CPU state begins +to diverge on line 141, with an incorrect value in the ``R1`` register + +:: + + @@ -138,7 +138,7 @@ + + General Purpose Registers = { + r0 = 0x4100f9c0 + - r1 = 0x00042108 + + r1 = 0x00000000 + r2 = 0x00021084 + r3 = 0x00000000 + r4 = 0x00000000 + +If we also look into ``out_buggy`` directly we can inspect the input assembly +which the caused the incorrect CPU state, around line 141 we find + +:: + + 116 | ---------------- + 117 | IN: _start_c + 118 | 0x000210b0: 0xa09dc002 { allocframe(R29,#0x10):raw } + ... | ... + 137 | 0x000210fc: 0x5a00c4aa { call PC+2388 } + 138 | + 139 | General Purpose Registers = { + 140 | r0 = 0x4100fa70 + 141 | r1 = 0x00042108 + 142 | r2 = 0x00021084 + 143 | r3 = 0x00000000 + +Importantly, we see some Hexagon assembly followed by a dump of the CPU state, +now the CPU state is actually dumped before the input assembly above is ran. +As such, we are actually interested in the instructions ran before this. + +Scrolling up a bit, we find + +:: + + 54 | ---------------- + 55 | IN: _start + 56 | 0x00021088: 0x6a09c002 { R2 = C9/pc } + 57 | 0x0002108c: 0xbfe2ff82 { R2 = add(R2,#0xfffffffc) } + 58 | 0x00021090: 0x9182c001 { R1 = memw(R2+#0x0) } + 59 | 0x00021094: 0xf302c101 { R1 = add(R2,R1) } + 60 | 0x00021098: 0x7800c01e { R30 = #0x0 } + 61 | 0x0002109c: 0x707dc000 { R0 = R29 } + 62 | 0x000210a0: 0x763dfe1d { R29 = and(R29,#0xfffffff0) } + 63 | 0x000210a4: 0xa79dfdfe { memw(R29+#0xfffffff8) = R29 } + 64 | 0x000210a8: 0xbffdff1d { R29 = add(R29,#0xfffffff8) } + 65 | 0x000210ac: 0x5a00c002 { call PC+4 } + 66 | + 67 | General Purpose Registers = { + 68 | r0 = 0x00000000 + 69 | r1 = 0x00000000 + 70 | r2 = 0x00000000 + 71 | r3 = 0x00000000 + +Remember, the instructions on lines 56-65 are ran on the CPU state shown below +instructions, and as the CPU state has not diverged at this point, we know the +starting state is accurate. The bug must then lie within the instructions shown +here. Next we may notice that ``R1`` is only touched by lines 57 and 58, that is +by + +:: + + 58 | 0x00021090: 0x9182c001 { R1 = memw(R2+#0x0) } + 59 | 0x00021094: 0xf302c101 { R1 = add(R2,R1) } + +Therefore, we are either dealing with an correct load instruction +``R1 = memw(R2+#0x0)`` or with an incorrect add ``R1 = add(R2,R1)``. At this +point it might be easy enough to go directly to the emitted code for the +instructions mentioned and look for bugs, but we could also run +``./qemu-heaxgon -d op,in_asm float_conv`` where we find for the following +tinycode for the Hexagon ``add`` instruction + +:: + + ---- 00021094 + mov_i32 pkt_has_store_s1,$0x0 + add_i32 tmp0,r2,r2 + mov_i32 loc2,tmp0 + mov_i32 new_r1,loc2 + mov_i32 r1,new_r1 + +Here we have finally located our bug ``add_i32 tmp0,r2,r2``. + +Limitations and Future Development +---------------------------------- + +The main limitation of the current parser is given by the syntax-driven nature +of the Bison-generated parsers. This has the severe implication of only being +able to generate code in the order of evaluation of the various rules, without, +in any case, being able to backtrack and alter the generated code. + +An example limitation is highlighted by this statement of the input language: + +:: + + { (PsV==0xff) ? (PdV=0xff) : (PdV=0x00); } + +This ternary assignment, when written in this form requires us to emit some +proper control flow statements, which emit a jump to the first or to the second +code block, whose implementation is extremely convoluted, because when matching +the ternary assignment, the code evaluating the two assignments will be already +generated. + +Instead we pre-process that statement, making it become: + +:: + + { PdV = ((PsV==0xff)) ? 0xff : 0x00; } + +Which can be easily matched by the following parser rules: + +:: + + statement | rvalue ';' + + rvalue : rvalue QMARK rvalue COLON rvalue + | rvalue EQ rvalue + | LPAR rvalue RPAR + | assign_statement + | IMM + + assign_statement : pred ASSIGN rvalue + +Another example that highlight the limitation of the flex/bison parser can be +found even in the add operation we already saw: + +:: + + TCGv_i32 tmp_0 = tcg_temp_new_i32(); + tcg_gen_add_i32(tmp_0, RsV, RtV); + tcg_gen_mov_i32(RdV, tmp_0); + +The fact that we cannot directly use ``RdV`` as the destination of the sum is a +consequence of the syntax-driven nature of the parser. In fact when we parse the +assignment, the ``rvalue`` token, representing the sum has already been reduced, +and thus its code emitted and unchangeable. We rely on the fact that QEMU will +optimize our code reducing the useless move operations and the relative +temporaries. + +A possible improvement of the parser regards the support for vectorial +instructions and floating point instructions, which will require the extension +of the scanner, the parser, and a partial re-design of the type system, allowing +to build the vectorial semantics over the available vectorial tinycode generator +primitives. + +A more radical improvement will use the parser, not to generate directly the +tinycode generator code, but to generate an intermediate representation like the +LLVM IR, which in turn could be compiled using the clang TCG backend. That code +could be furtherly optimized, overcoming the limitations of the syntax-driven +parsing and could lead to a more optimized generated code. diff --git a/target/hexagon/idef-parser/idef-parser.h b/target/hexagon/idef-parser/idef-parser.h new file mode 100644 index 0000000000..5c49d4da3e --- /dev/null +++ b/target/hexagon/idef-parser/idef-parser.h @@ -0,0 +1,253 @@ +/* + * Copyright(c) 2019-2022 rev.ng Labs Srl. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. + */ + +#ifndef IDEF_PARSER_H +#define IDEF_PARSER_H + +#include <inttypes.h> +#include <stdio.h> +#include <stdbool.h> +#include <glib.h> + +#define TCGV_NAME_SIZE 7 +#define MAX_WRITTEN_REGS 32 +#define OFFSET_STR_LEN 32 +#define ALLOC_LIST_LEN 32 +#define ALLOC_NAME_SIZE 32 +#define INIT_LIST_LEN 32 +#define OUT_BUF_LEN (1024 * 1024) +#define SIGNATURE_BUF_LEN (128 * 1024) +#define HEADER_BUF_LEN (128 * 1024) + +/* Variadic macros to wrap the buffer printing functions */ +#define EMIT(c, ...) \ + do { \ + g_string_append_printf((c)->out_str, __VA_ARGS__); \ + } while (0) + +#define EMIT_SIG(c, ...) \ + do { \ + g_string_append_printf((c)->signature_str, __VA_ARGS__); \ + } while (0) + +#define EMIT_HEAD(c, ...) \ + do { \ + g_string_append_printf((c)->header_str, __VA_ARGS__); \ + } while (0) + +/** + * Type of register, assigned to the HexReg.type field + */ +typedef enum { GENERAL_PURPOSE, CONTROL, MODIFIER, DOTNEW } HexRegType; + +typedef enum { UNKNOWN_SIGNEDNESS, SIGNED, UNSIGNED } HexSignedness; + +/** + * Semantic record of the REG tokens, identifying registers + */ +typedef struct HexReg { + uint8_t id; /**< Identifier of the register */ + HexRegType type; /**< Type of the register */ + unsigned bit_width; /**< Bit width of the reg, 32 or 64 bits */ +} HexReg; + +/** + * Data structure, identifying a TCGv temporary value + */ +typedef struct HexTmp { + unsigned index; /**< Index of the TCGv temporary value */ +} HexTmp; + +/** + * Enum of the possible immediated, an immediate is a value which is known + * at tinycode generation time, e.g. an integer value, not a TCGv + */ +enum ImmUnionTag { + I, + VARIABLE, + VALUE, + QEMU_TMP, + IMM_PC, + IMM_NPC, + IMM_CONSTEXT, +}; + +/** + * Semantic record of the IMM token, identifying an immediate constant + */ +typedef struct HexImm { + union { + char id; /**< Identifier, used when type is VARIABLE */ + uint64_t value; /**< Immediate value, used when type is VALUE */ + uint64_t index; /**< Index, used when type is QEMU_TMP */ + }; + enum ImmUnionTag type; /**< Type of the immediate */ +} HexImm; + +/** + * Semantic record of the PRED token, identifying a predicate + */ +typedef struct HexPred { + char id; /**< Identifier of the predicate */ +} HexPred; + +/** + * Semantic record of the SAT token, identifying the saturate operator + * Note: All saturates are assumed to implicitly set overflow. + */ +typedef struct HexSat { + HexSignedness signedness; /**< Signedness of the sat. op. */ +} HexSat; + +/** + * Semantic record of the CAST token, identifying the cast operator + */ +typedef struct HexCast { + unsigned bit_width; /**< Bit width of the cast operator */ + HexSignedness signedness; /**< Unsigned flag for the cast operator */ +} HexCast; + +/** + * Semantic record of the EXTRACT token, identifying the cast operator + */ +typedef struct HexExtract { + unsigned bit_width; /**< Bit width of the extract operator */ + unsigned storage_bit_width; /**< Actual bit width of the extract operator */ + HexSignedness signedness; /**< Unsigned flag for the extract operator */ +} HexExtract; + +/** + * Semantic record of the MPY token, identifying the fMPY multiplication + * operator + */ +typedef struct HexMpy { + unsigned first_bit_width; /**< Bit width of 1st operand of fMPY */ + unsigned second_bit_width; /**< Bit width of 2nd operand of fMPY */ + HexSignedness first_signedness; /**< Signedness of 1st operand of fMPY */ + HexSignedness second_signedness; /**< Signedness of 2nd operand of fMPY */ +} HexMpy; + +/** + * Semantic record of the VARID token, identifying declared variables + * of the input language + */ +typedef struct HexVar { + GString *name; /**< Name of the VARID variable */ +} HexVar; + +/** + * Data structure uniquely identifying a declared VARID variable, used for + * keeping track of declared variable, so that any variable is declared only + * once, and its properties are propagated through all the subsequent instances + * of that variable + */ +typedef struct Var { + GString *name; /**< Name of the VARID variable */ + uint8_t bit_width; /**< Bit width of the VARID variable */ + HexSignedness signedness; /**< Unsigned flag for the VARID var */ +} Var; + +/** + * Enum of the possible rvalue types, used in the HexValue.type field + */ +typedef enum RvalueUnionTag { + REGISTER, REGISTER_ARG, TEMP, IMMEDIATE, PREDICATE, VARID +} RvalueUnionTag; + +/** + * Semantic record of the rvalue token, identifying any numeric value, + * immediate or register based. The rvalue tokens are combined together + * through the use of several operators, to encode expressions + */ +typedef struct HexValue { + union { + HexReg reg; /**< rvalue of register type */ + HexTmp tmp; /**< rvalue of temporary type */ + HexImm imm; /**< rvalue of immediate type */ + HexPred pred; /**< rvalue of predicate type */ + HexVar var; /**< rvalue of declared variable type */ + }; + RvalueUnionTag type; /**< Type of the rvalue */ + unsigned bit_width; /**< Bit width of the rvalue */ + HexSignedness signedness; /**< Unsigned flag for the rvalue */ + bool is_dotnew; /**< rvalue of predicate type is dotnew? */ + bool is_manual; /**< Opt out of automatic freeing of params */ +} HexValue; + +/** + * State of ternary operator + */ +typedef enum TernaryState { IN_LEFT, IN_RIGHT } TernaryState; + +/** + * Data structure used to handle side effects inside ternary operators + */ +typedef struct Ternary { + TernaryState state; + HexValue cond; +} Ternary; + +/** + * Operator type, used for referencing the correct operator when calling the + * gen_bin_op() function, which in turn will generate the correct code to + * execute the operation between the two rvalues + */ +typedef enum OpType { + ADD_OP, SUB_OP, MUL_OP, ASL_OP, ASR_OP, LSR_OP, ANDB_OP, ORB_OP, + XORB_OP, ANDL_OP, MINI_OP, MAXI_OP +} OpType; + +/** + * Data structure including instruction specific information, to be cleared + * out after the compilation of each instruction + */ +typedef struct Inst { + GString *name; /**< Name of the compiled instruction */ + char *code_begin; /**< Beginning of instruction input code */ + char *code_end; /**< End of instruction input code */ + unsigned tmp_count; /**< Index of the last declared TCGv temp */ + unsigned qemu_tmp_count; /**< Index of the last declared int temp */ + unsigned if_count; /**< Index of the last declared if label */ + unsigned error_count; /**< Number of generated errors */ + GArray *allocated; /**< Allocated declaredVARID vars */ + GArray *init_list; /**< List of initialized registers */ + GArray *strings; /**< Strings allocated by the instruction */ +} Inst; + +/** + * Data structure representing the whole translation context, which in a + * reentrant flex/bison parser just like ours is passed between the scanner + * and the parser, holding all the necessary information to perform the + * parsing, this data structure survives between the compilation of different + * instructions + */ +typedef struct Context { + void *scanner; /**< Reentrant parser state pointer */ + char *input_buffer; /**< Buffer containing the input code */ + GString *out_str; /**< String containing the output code */ + GString *signature_str; /**< String containing the signatures code */ + GString *header_str; /**< String containing the header code */ + FILE *defines_file; /**< FILE * of the generated header */ + FILE *output_file; /**< FILE * of the C output file */ + FILE *enabled_file; /**< FILE * of the list of enabled inst */ + GArray *ternary; /**< Array to track nesting of ternary ops */ + unsigned total_insn; /**< Number of instructions in input file */ + unsigned implemented_insn; /**< Instruction compiled without errors */ + Inst inst; /**< Parsing data of the current inst */ +} Context; + +#endif /* IDEF_PARSER_H */ diff --git a/target/hexagon/idef-parser/idef-parser.lex b/target/hexagon/idef-parser/idef-parser.lex new file mode 100644 index 0000000000..ff87a02c3a --- /dev/null +++ b/target/hexagon/idef-parser/idef-parser.lex @@ -0,0 +1,471 @@ +%option noyywrap noinput nounput +%option 8bit reentrant bison-bridge +%option warn nodefault +%option bison-locations + +%{ +/* + * Copyright(c) 2019-2022 rev.ng Labs Srl. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. + */ + +#include <string.h> +#include <stdbool.h> + +#include "hex_regs.h" + +#include "idef-parser.h" +#include "idef-parser.tab.h" + +/* Keep track of scanner position for error message printout */ +#define YY_USER_ACTION yylloc->first_column = yylloc->last_column; \ + for (int i = 0; yytext[i] != '\0'; i++) { \ + yylloc->last_column++; \ + } + +/* Global Error Counter */ +int error_count; + +%} + +/* Definitions */ +DIGIT [0-9] +LOWER_ID [a-z] +UPPER_ID [A-Z] +ID LOWER_ID|UPPER_ID +INST_NAME [A-Z]+[0-9]_([A-Za-z]|[0-9]|_)+ +HEX_DIGIT [0-9a-fA-F] +REG_ID_32 e|s|d|t|u|v|x|y +REG_ID_64 ee|ss|dd|tt|uu|vv|xx|yy +SYS_ID_32 s|d +SYS_ID_64 ss|dd +PRED_ID d|s|t|u|v|e|x|x +IMM_ID r|s|S|u|U +VAR_ID [a-zA-Z_][a-zA-Z0-9_]* +SIGN_ID s|u +STRING_LIT \"(\\.|[^"\\])*\" + +/* Tokens */ +%% + +[ \t\f\v]+ { /* Ignore whitespaces. */ } +[\n\r]+ { /* Ignore newlines. */ } +^#.*$ { /* Ignore linemarkers. */ } + +{INST_NAME} { yylval->string = g_string_new(yytext); + return INAME; } +"fFLOAT" | +"fUNFLOAT" | +"fDOUBLE" | +"fUNDOUBLE" | +"0.0" | +"0x1.0p52" | +"0x1.0p-52" { return FAIL; } +"in" { return IN; } +"R"{REG_ID_32}"V" { + yylval->rvalue.type = REGISTER_ARG; + yylval->rvalue.reg.type = GENERAL_PURPOSE; + yylval->rvalue.reg.id = yytext[1]; + yylval->rvalue.reg.bit_width = 32; + yylval->rvalue.bit_width = 32; + yylval->rvalue.is_dotnew = false; + yylval->rvalue.signedness = SIGNED; + return REG; } +"R"{REG_ID_64}"V" { + yylval->rvalue.type = REGISTER_ARG; + yylval->rvalue.reg.type = GENERAL_PURPOSE; + yylval->rvalue.reg.id = yytext[1]; + yylval->rvalue.reg.bit_width = 64; + yylval->rvalue.bit_width = 64; + yylval->rvalue.is_dotnew = false; + yylval->rvalue.signedness = SIGNED; + return REG; } +"MuV" { + yylval->rvalue.type = REGISTER_ARG; + yylval->rvalue.reg.type = MODIFIER; + yylval->rvalue.reg.id = 'u'; + yylval->rvalue.reg.bit_width = 32; + yylval->rvalue.bit_width = 32; + yylval->rvalue.signedness = SIGNED; + return REG; } +"C"{REG_ID_32}"V" { + yylval->rvalue.type = REGISTER_ARG; + yylval->rvalue.reg.type = CONTROL; + yylval->rvalue.reg.id = yytext[1]; + yylval->rvalue.reg.bit_width = 32; + yylval->rvalue.bit_width = 32; + yylval->rvalue.is_dotnew = false; + yylval->rvalue.signedness = SIGNED; + return REG; } +"C"{REG_ID_64}"V" { + yylval->rvalue.type = REGISTER_ARG; + yylval->rvalue.reg.type = CONTROL; + yylval->rvalue.reg.id = yytext[1]; + yylval->rvalue.reg.bit_width = 64; + yylval->rvalue.bit_width = 64; + yylval->rvalue.is_dotnew = false; + yylval->rvalue.signedness = SIGNED; + return REG; } +{IMM_ID}"iV" { + yylval->rvalue.type = IMMEDIATE; + yylval->rvalue.signedness = SIGNED; + yylval->rvalue.imm.type = VARIABLE; + yylval->rvalue.imm.id = yytext[0]; + yylval->rvalue.bit_width = 32; + yylval->rvalue.is_dotnew = false; + return IMM; } +"P"{PRED_ID}"V" { + yylval->rvalue.type = PREDICATE; + yylval->rvalue.pred.id = yytext[1]; + yylval->rvalue.bit_width = 32; + yylval->rvalue.is_dotnew = false; + yylval->rvalue.signedness = SIGNED; + return PRED; } +"P"{PRED_ID}"N" { + yylval->rvalue.type = PREDICATE; + yylval->rvalue.pred.id = yytext[1]; + yylval->rvalue.bit_width = 32; + yylval->rvalue.is_dotnew = true; + yylval->rvalue.signedness = SIGNED; + return PRED; } +"IV1DEAD()" | +"fPAUSE(uiV);" { return ';'; } +"+=" { return INC; } +"-=" { return DEC; } +"++" { return PLUSPLUS; } +"&=" { return ANDA; } +"|=" { return ORA; } +"^=" { return XORA; } +"<<" { return ASL; } +">>" { return ASR; } +">>>" { return LSR; } +"==" { return EQ; } +"!=" { return NEQ; } +"<=" { return LTE; } +">=" { return GTE; } +"&&" { return ANDL; } +"else" { return ELSE; } +"for" { return FOR; } +"fREAD_IREG" { return ICIRC; } +"fPART1" { return PART1; } +"if" { return IF; } +"fFRAME_SCRAMBLE" { return FSCR; } +"fFRAME_UNSCRAMBLE" { return FSCR; } +"fFRAMECHECK" { return FCHK; } +"Constant_extended" { return CONSTEXT; } +"fCL1_"{DIGIT} { return LOCNT; } +"fbrev" { return BREV; } +"fSXTN" { return SXT; } +"fZXTN" { return ZXT; } +"fDF_MAX" | +"fSF_MAX" | +"fMAX" { return MAX; } +"fDF_MIN" | +"fSF_MIN" | +"fMIN" { return MIN; } +"fABS" { return ABS; } +"fRNDN" { return ROUND; } +"fCRND" { return CROUND; } +"fCRNDN" { return CROUND; } +"fPM_CIRI" { return CIRCADD; } +"fPM_CIRR" { return CIRCADD; } +"fCOUNTONES_"{DIGIT} { return COUNTONES; } +"fSATN" { yylval->sat.signedness = SIGNED; + return SAT; } +"fSATUN" { yylval->sat.signedness = UNSIGNED; + return SAT; } +"fCONSTLL" { yylval->cast.bit_width = 64; + yylval->cast.signedness = SIGNED; + return CAST; } +"fSE32_64" { yylval->cast.bit_width = 64; + yylval->cast.signedness = SIGNED; + return CAST; } +"fCAST4_4u" { yylval->cast.bit_width = 32; + yylval->cast.signedness = UNSIGNED; + return CAST; } +"fCAST4_8s" { yylval->cast.bit_width = 64; + yylval->cast.signedness = SIGNED; + return CAST; } +"fCAST4_8u" { return CAST4_8U; } +"fCAST4u" { yylval->cast.bit_width = 32; + yylval->cast.signedness = UNSIGNED; + return CAST; } +"fNEWREG" | +"fCAST4_4s" | +"fCAST4s" { yylval->cast.bit_width = 32; + yylval->cast.signedness = SIGNED; + return CAST; } +"fCAST8_8u" { yylval->cast.bit_width = 64; + yylval->cast.signedness = UNSIGNED; + return CAST; } +"fCAST8u" { yylval->cast.bit_width = 64; + yylval->cast.signedness = UNSIGNED; + return CAST; } +"fCAST8_8s" | +"fCAST8s" { yylval->cast.bit_width = 64; + yylval->cast.signedness = SIGNED; + return CAST; } +"fGETBIT" { yylval->extract.bit_width = 1; + yylval->extract.storage_bit_width = 1; + yylval->extract.signedness = UNSIGNED; + return EXTRACT; } +"fGETBYTE" { yylval->extract.bit_width = 8; + yylval->extract.storage_bit_width = 8; + yylval->extract.signedness = SIGNED; + return EXTRACT; } +"fGETUBYTE" { yylval->extract.bit_width = 8; + yylval->extract.storage_bit_width = 8; + yylval->extract.signedness = UNSIGNED; + return EXTRACT; } +"fGETHALF" { yylval->extract.bit_width = 16; + yylval->extract.storage_bit_width = 16; + yylval->extract.signedness = SIGNED; + return EXTRACT; } +"fGETUHALF" { yylval->extract.bit_width = 16; + yylval->extract.storage_bit_width = 16; + yylval->extract.signedness = UNSIGNED; + return EXTRACT; } +"fGETWORD" { yylval->extract.bit_width = 32; + yylval->extract.storage_bit_width = 64; + yylval->extract.signedness = SIGNED; + return EXTRACT; } +"fGETUWORD" { yylval->extract.bit_width = 32; + yylval->extract.storage_bit_width = 64; + yylval->extract.signedness = UNSIGNED; + return EXTRACT; } +"fEXTRACTU_RANGE" { return EXTRANGE; } +"fSETBIT" { yylval->cast.bit_width = 1; + yylval->cast.signedness = SIGNED; + return DEPOSIT; } +"fSETBYTE" { yylval->cast.bit_width = 8; + yylval->cast.signedness = SIGNED; + return DEPOSIT; } +"fSETHALF" { yylval->cast.bit_width = 16; + yylval->cast.signedness = SIGNED; + return SETHALF; } +"fSETWORD" { yylval->cast.bit_width = 32; + yylval->cast.signedness = SIGNED; + return DEPOSIT; } +"fINSERT_BITS" { return INSBITS; } +"fSETBITS" { return SETBITS; } +"fMPY16UU" { yylval->mpy.first_bit_width = 16; + yylval->mpy.second_bit_width = 16; + yylval->mpy.first_signedness = UNSIGNED; + yylval->mpy.second_signedness = UNSIGNED; + return MPY; } +"fMPY16SU" { yylval->mpy.first_bit_width = 16; + yylval->mpy.second_bit_width = 16; + yylval->mpy.first_signedness = SIGNED; + yylval->mpy.second_signedness = UNSIGNED; + return MPY; } +"fMPY16SS" { yylval->mpy.first_bit_width = 16; + yylval->mpy.second_bit_width = 16; + yylval->mpy.first_signedness = SIGNED; + yylval->mpy.second_signedness = SIGNED; + return MPY; } +"fMPY32UU" { yylval->mpy.first_bit_width = 32; + yylval->mpy.second_bit_width = 32; + yylval->mpy.first_signedness = UNSIGNED; + yylval->mpy.second_signedness = UNSIGNED; + return MPY; } +"fMPY32SU" { yylval->mpy.first_bit_width = 32; + yylval->mpy.second_bit_width = 32; + yylval->mpy.first_signedness = SIGNED; + yylval->mpy.second_signedness = UNSIGNED; + return MPY; } +"fSFMPY" | +"fMPY32SS" { yylval->mpy.first_bit_width = 32; + yylval->mpy.second_bit_width = 32; + yylval->mpy.first_signedness = SIGNED; + yylval->mpy.second_signedness = SIGNED; + return MPY; } +"fMPY3216SS" { yylval->mpy.first_bit_width = 32; + yylval->mpy.second_bit_width = 16; + yylval->mpy.first_signedness = SIGNED; + yylval->mpy.second_signedness = SIGNED; + return MPY; } +"fMPY3216SU" { yylval->mpy.first_bit_width = 32; + yylval->mpy.second_bit_width = 16; + yylval->mpy.first_signedness = SIGNED; + yylval->mpy.second_signedness = UNSIGNED; + return MPY; } +"fNEWREG_ST" | +"fIMMEXT" | +"fMUST_IMMEXT" | +"fPASS" | +"fECHO" { return IDENTITY; } +"(size8u_t)" { yylval->cast.bit_width = 64; + yylval->cast.signedness = UNSIGNED; + return CAST; } +"(unsigned int)" { yylval->cast.bit_width = 32; + yylval->cast.signedness = UNSIGNED; + return CAST; } +"fREAD_PC()" | +"PC" { return PC; } +"fREAD_NPC()" | +"NPC" { return NPC; } +"fGET_LPCFG" | +"USR.LPCFG" { return LPCFG; } +"LOAD_CANCEL(EA)" { return LOAD_CANCEL; } +"STORE_CANCEL(EA)" | +"CANCEL" { return CANCEL; } +"N"{LOWER_ID}"N" { yylval->rvalue.type = REGISTER_ARG; + yylval->rvalue.reg.type = DOTNEW; + yylval->rvalue.reg.id = yytext[1]; + yylval->rvalue.reg.bit_width = 32; + yylval->rvalue.bit_width = 32; + yylval->rvalue.signedness = UNSIGNED; + return REG; } +"fREAD_SP()" | +"SP" { yylval->rvalue.type = REGISTER; + yylval->rvalue.reg.type = GENERAL_PURPOSE; + yylval->rvalue.reg.id = HEX_REG_SP; + yylval->rvalue.reg.bit_width = 32; + yylval->rvalue.bit_width = 32; + yylval->rvalue.signedness = UNSIGNED; + return REG; } +"fREAD_FP()" | +"FP" { yylval->rvalue.type = REGISTER; + yylval->rvalue.reg.type = GENERAL_PURPOSE; + yylval->rvalue.reg.id = HEX_REG_FP; + yylval->rvalue.reg.bit_width = 32; + yylval->rvalue.bit_width = 32; + yylval->rvalue.signedness = UNSIGNED; + return REG; } +"fREAD_LR()" | +"LR" { yylval->rvalue.type = REGISTER; + yylval->rvalue.reg.type = GENERAL_PURPOSE; + yylval->rvalue.reg.id = HEX_REG_LR; + yylval->rvalue.reg.bit_width = 32; + yylval->rvalue.bit_width = 32; + yylval->rvalue.signedness = UNSIGNED; + return REG; } +"fREAD_GP()" | +"GP" { yylval->rvalue.type = REGISTER; + yylval->rvalue.reg.type = CONTROL; + yylval->rvalue.reg.id = HEX_REG_GP; + yylval->rvalue.reg.bit_width = 32; + yylval->rvalue.bit_width = 32; + yylval->rvalue.signedness = UNSIGNED; + return REG; } +"fREAD_LC"[01] { yylval->rvalue.type = REGISTER; + yylval->rvalue.reg.type = CONTROL; + yylval->rvalue.reg.id = HEX_REG_LC0 + + (yytext[8] - '0') * 2; + yylval->rvalue.reg.bit_width = 32; + yylval->rvalue.bit_width = 32; + yylval->rvalue.signedness = UNSIGNED; + return REG; } +"LC"[01] { yylval->rvalue.type = REGISTER; + yylval->rvalue.reg.type = CONTROL; + yylval->rvalue.reg.id = HEX_REG_LC0 + + (yytext[2] - '0') * 2; + yylval->rvalue.reg.bit_width = 32; + yylval->rvalue.bit_width = 32; + yylval->rvalue.signedness = UNSIGNED; + return REG; } +"fREAD_SA"[01] { yylval->rvalue.type = REGISTER; + yylval->rvalue.reg.type = CONTROL; + yylval->rvalue.reg.id = HEX_REG_SA0 + + (yytext[8] - '0') * 2; + yylval->rvalue.reg.bit_width = 32; + yylval->rvalue.bit_width = 32; + yylval->rvalue.signedness = UNSIGNED; + return REG; } +"SA"[01] { yylval->rvalue.type = REGISTER; + yylval->rvalue.reg.type = CONTROL; + yylval->rvalue.reg.id = HEX_REG_SA0 + + (yytext[2] - '0') * 2; + yylval->rvalue.reg.bit_width = 32; + yylval->rvalue.bit_width = 32; + yylval->rvalue.signedness = UNSIGNED; + return REG; } +"fREAD_P0()" { yylval->rvalue.type = PREDICATE; + yylval->rvalue.pred.id = '0'; + yylval->rvalue.bit_width = 32; + return PRED; } +[pP]{DIGIT} { yylval->rvalue.type = PREDICATE; + yylval->rvalue.pred.id = yytext[1]; + yylval->rvalue.bit_width = 32; + yylval->rvalue.is_dotnew = false; + return PRED; } +[pP]{DIGIT}[nN] { yylval->rvalue.type = PREDICATE; + yylval->rvalue.pred.id = yytext[1]; + yylval->rvalue.bit_width = 32; + yylval->rvalue.is_dotnew = true; + return PRED; } +"fLSBNEW" { return LSBNEW; } +"N" { yylval->rvalue.type = IMMEDIATE; + yylval->rvalue.bit_width = 32; + yylval->rvalue.imm.type = VARIABLE; + yylval->rvalue.imm.id = 'N'; + return IMM; } +"i" { yylval->rvalue.type = IMMEDIATE; + yylval->rvalue.bit_width = 32; + yylval->rvalue.signedness = SIGNED; + yylval->rvalue.imm.type = I; + return IMM; } +{SIGN_ID} { if (yytext[0] == 'u') { + yylval->signedness = UNSIGNED; + } else { + yylval->signedness = SIGNED; + } + return SIGN; + } +"0x"{HEX_DIGIT}+ | +{DIGIT}+ { yylval->rvalue.type = IMMEDIATE; + yylval->rvalue.bit_width = 32; + yylval->rvalue.signedness = SIGNED; + yylval->rvalue.imm.type = VALUE; + yylval->rvalue.imm.value = strtoull(yytext, NULL, 0); + return IMM; } +"0x"{HEX_DIGIT}+"ULL" | +{DIGIT}+"ULL" { yylval->rvalue.type = IMMEDIATE; + yylval->rvalue.bit_width = 64; + yylval->rvalue.signedness = UNSIGNED; + yylval->rvalue.imm.type = VALUE; + yylval->rvalue.imm.value = strtoull(yytext, NULL, 0); + return IMM; } +"fLOAD" { return LOAD; } +"fSTORE" { return STORE; } +"fROTL" { return ROTL; } +"fCARRY_FROM_ADD" { return CARRY_FROM_ADD; } +"fADDSAT64" { return ADDSAT64; } +"size"[1248][us]"_t" { /* Handles "size_t" variants of int types */ + const unsigned int bits_per_byte = 8; + const unsigned int bytes = yytext[4] - '0'; + yylval->rvalue.bit_width = bits_per_byte * bytes; + if (yytext[5] == 'u') { + yylval->rvalue.signedness = UNSIGNED; + } else { + yylval->rvalue.signedness = SIGNED; + } + return TYPE_SIZE_T; } +"unsigned" { return TYPE_UNSIGNED; } +"long" { return TYPE_LONG; } +"int" { return TYPE_INT; } +"const" { /* Emit no token */ } +{VAR_ID} { /* Variable name, we adopt the C names convention */ + yylval->rvalue.type = VARID; + yylval->rvalue.var.name = g_string_new(yytext); + /* Default to an unknown signedness and 0 width. */ + yylval->rvalue.bit_width = 0; + yylval->rvalue.signedness = UNKNOWN_SIGNEDNESS; + return VAR; } +"fatal("{STRING_LIT}")" { /* Emit no token */ } +"fHINTJR(RsV)" { /* Emit no token */ } +. { return yytext[0]; } + +%% diff --git a/target/hexagon/idef-parser/idef-parser.y b/target/hexagon/idef-parser/idef-parser.y new file mode 100644 index 0000000000..8be44a0ad1 --- /dev/null +++ b/target/hexagon/idef-parser/idef-parser.y @@ -0,0 +1,965 @@ +%{ +/* + * Copyright(c) 2019-2022 rev.ng Labs Srl. All Rights Reserved. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. + */ + +#include "idef-parser.h" +#include "parser-helpers.h" +#include "idef-parser.tab.h" +#include "idef-parser.yy.h" + +/* Uncomment this to disable yyasserts */ +/* #define NDEBUG */ + +#define ERR_LINE_CONTEXT 40 + +%} + +%lex-param {void *scanner} +%parse-param {void *scanner} +%parse-param {Context *c} + +%define parse.error verbose +%define parse.lac full +%define api.pure full + +%locations + +%union { + GString *string; + HexValue rvalue; + HexSat sat; + HexCast cast; + HexExtract extract; + HexMpy mpy; + HexSignedness signedness; + int index; +} + +/* Tokens */ +%start input + +%expect 1 + +%token IN INAME VAR +%token ABS CROUND ROUND CIRCADD COUNTONES INC DEC ANDA ORA XORA PLUSPLUS ASL +%token ASR LSR EQ NEQ LTE GTE MIN MAX ANDL FOR ICIRC IF MUN FSCR FCHK SXT +%token ZXT CONSTEXT LOCNT BREV SIGN LOAD STORE PC NPC LPCFG +%token LOAD_CANCEL CANCEL IDENTITY PART1 ROTL INSBITS SETBITS EXTRANGE +%token CAST4_8U FAIL CARRY_FROM_ADD ADDSAT64 LSBNEW +%token TYPE_SIZE_T TYPE_INT TYPE_SIGNED TYPE_UNSIGNED TYPE_LONG + +%token <rvalue> REG IMM PRED +%token <index> ELSE +%token <mpy> MPY +%token <sat> SAT +%token <cast> CAST DEPOSIT SETHALF +%token <extract> EXTRACT +%type <string> INAME +%type <rvalue> rvalue lvalue VAR assign_statement var var_decl var_type +%type <rvalue> FAIL +%type <rvalue> TYPE_SIGNED TYPE_UNSIGNED TYPE_INT TYPE_LONG TYPE_SIZE_T +%type <index> if_stmt IF +%type <signedness> SIGN + +/* Operator Precedences */ +%left MIN MAX +%left '(' +%left ',' +%left '=' +%right CIRCADD +%right INC DEC ANDA ORA XORA +%left '?' ':' +%left ANDL +%left '|' +%left '^' ANDOR +%left '&' +%left EQ NEQ +%left '<' '>' LTE GTE +%left ASL ASR LSR +%right ABS +%left '-' '+' +%left '*' '/' '%' MPY +%right '~' '!' +%left '[' +%right CAST +%right LOCNT BREV + +/* Bison Grammar */ +%% + +/* Input file containing the description of each hexagon instruction */ +input : instructions + { + YYACCEPT; + } + ; + +instructions : instruction instructions + | %empty + ; + +instruction : INAME + { + gen_inst(c, $1); + } + arguments + { + EMIT_SIG(c, ")"); + EMIT_HEAD(c, "{\n"); + } + code + { + gen_inst_code(c, &@1); + } + | error /* Recover gracefully after instruction compilation error */ + { + free_instruction(c); + } + ; + +arguments : '(' ')' + | '(' argument_list ')'; + +argument_list : argument_decl ',' argument_list + | argument_decl + ; + +var : VAR + { + track_string(c, $1.var.name); + $$ = $1; + } + ; + +/* + * Here the integer types are defined from valid combinations of + * `signed`, `unsigned`, `int`, and `long` tokens. The `signed` + * and `unsigned` tokens are here assumed to always be placed + * first in the type declaration, which is not the case in + * normal C. Similarly, `int` is assumed to always be placed + * last in the type. + */ +type_int : TYPE_INT + | TYPE_SIGNED + | TYPE_SIGNED TYPE_INT; +type_uint : TYPE_UNSIGNED + | TYPE_UNSIGNED TYPE_INT; +type_ulonglong : TYPE_UNSIGNED TYPE_LONG TYPE_LONG + | TYPE_UNSIGNED TYPE_LONG TYPE_LONG TYPE_INT; + +/* + * Here the various valid int types defined above specify + * their `signedness` and `bit_width`. The LP64 convention + * is assumed where longs are 64-bit, long longs are then + * assumed to also be 64-bit. + */ +var_type : TYPE_SIZE_T + { + yyassert(c, &@1, $1.bit_width <= 64, + "Variables with size > 64-bit are not supported!"); + $$ = $1; + } + | type_int + { + $$.signedness = SIGNED; + $$.bit_width = 32; + } + | type_uint + { + $$.signedness = UNSIGNED; + $$.bit_width = 32; + } + | type_ulonglong + { + $$.signedness = UNSIGNED; + $$.bit_width = 64; + } + ; + +/* Rule to capture declarations of VARs */ +var_decl : var_type IMM + { + /* + * Rule to capture "int i;" declarations since "i" is special + * and assumed to be always be IMM. Moreover, "i" is only + * assumed to be used in for-loops. + * + * Therefore we want to NOP these declarations. + */ + yyassert(c, &@2, $2.imm.type == I, + "Variable declaration with immedaties only allowed" + " for the loop induction variable \"i\""); + $$ = $2; + } + | var_type var + { + /* + * Allocate new variable, this checks that it hasn't already + * been declared. + */ + gen_varid_allocate(c, &@1, &$2, $1.bit_width, $1.signedness); + /* Copy var for variable name */ + $$ = $2; + /* Copy type info from var_type */ + $$.signedness = $1.signedness; + $$.bit_width = $1.bit_width; + } + ; + +/* Return the modified registers list */ +code : '{' statements '}' + { + c->inst.code_begin = c->input_buffer + @2.first_column - 1; + c->inst.code_end = c->input_buffer + @2.last_column - 1; + } + | '{' + { + /* Nop */ + } + '}' + ; + +argument_decl : REG + { + emit_arg(c, &@1, &$1); + /* Enqueue register into initialization list */ + g_array_append_val(c->inst.init_list, $1); + } + | PRED + { + emit_arg(c, &@1, &$1); + /* Enqueue predicate into initialization list */ + g_array_append_val(c->inst.init_list, $1); + } + | IN REG + { + emit_arg(c, &@2, &$2); + } + | IN PRED + { + emit_arg(c, &@2, &$2); + } + | IMM + { + EMIT_SIG(c, ", int %ciV", $1.imm.id); + } + ; + +code_block : '{' statements '}' + | '{' '}' + ; + +/* A list of one or more statements */ +statements : statements statement + | statement + ; + +/* Statements can be assignment (rvalue ';'), control or memory statements */ +statement : control_statement + | var_decl ';' + | rvalue ';' + { + gen_rvalue_free(c, &@1, &$1); + } + | code_block + | ';' + ; + +assign_statement : lvalue '=' rvalue + { + @1.last_column = @3.last_column; + gen_assign(c, &@1, &$1, &$3); + $$ = $1; + } + | var_decl '=' rvalue + { + @1.last_column = @3.last_column; + gen_assign(c, &@1, &$1, &$3); + $$ = $1; + } + | lvalue INC rvalue + { + @1.last_column = @3.last_column; + HexValue tmp = gen_bin_op(c, &@1, ADD_OP, &$1, &$3); + gen_assign(c, &@1, &$1, &tmp); + $$ = $1; + } + | lvalue DEC rvalue + { + @1.last_column = @3.last_column; + HexValue tmp = gen_bin_op(c, &@1, SUB_OP, &$1, &$3); + gen_assign(c, &@1, &$1, &tmp); + $$ = $1; + } + | lvalue ANDA rvalue + { + @1.last_column = @3.last_column; + HexValue tmp = gen_bin_op(c, &@1, ANDB_OP, &$1, &$3); + gen_assign(c, &@1, &$1, &tmp); + $$ = $1; + } + | lvalue ORA rvalue + { + @1.last_column = @3.last_column; + HexValue tmp = gen_bin_op(c, &@1, ORB_OP, &$1, &$3); + gen_assign(c, &@1, &$1, &tmp); + $$ = $1; + } + | lvalue XORA rvalue + { + @1.last_column = @3.last_column; + HexValue tmp = gen_bin_op(c, &@1, XORB_OP, &$1, &$3); + gen_assign(c, &@1, &$1, &tmp); + $$ = $1; + } + | PRED '=' rvalue + { + @1.last_column = @3.last_column; + gen_pred_assign(c, &@1, &$1, &$3); + } + | IMM '=' rvalue + { + @1.last_column = @3.last_column; + yyassert(c, &@1, $3.type == IMMEDIATE, + "Cannot assign non-immediate to immediate!"); + yyassert(c, &@1, $1.imm.type == VARIABLE, + "Cannot assign to non-variable!"); + /* Assign to the function argument */ + OUT(c, &@1, &$1, " = ", &$3, ";\n"); + $$ = $1; + } + | PC '=' rvalue + { + @1.last_column = @3.last_column; + yyassert(c, &@1, !is_inside_ternary(c), + "Assignment side-effect not modeled!"); + $3 = gen_rvalue_truncate(c, &@1, &$3); + $3 = rvalue_materialize(c, &@1, &$3); + OUT(c, &@1, "gen_write_new_pc(", &$3, ");\n"); + gen_rvalue_free(c, &@1, &$3); /* Free temporary value */ + } + | LOAD '(' IMM ',' IMM ',' SIGN ',' var ',' lvalue ')' + { + @1.last_column = @12.last_column; + yyassert(c, &@1, !is_inside_ternary(c), + "Assignment side-effect not modeled!"); + yyassert(c, &@1, $3.imm.value == 1, + "LOAD of arrays not supported!"); + gen_load(c, &@1, &$5, $7, &$9, &$11); + } + | STORE '(' IMM ',' IMM ',' var ',' rvalue ')' + /* Store primitive */ + { + @1.last_column = @10.last_column; + yyassert(c, &@1, !is_inside_ternary(c), + "Assignment side-effect not modeled!"); + yyassert(c, &@1, $3.imm.value == 1, + "STORE of arrays not supported!"); + gen_store(c, &@1, &$5, &$7, &$9); + } + | LPCFG '=' rvalue + { + @1.last_column = @3.last_column; + yyassert(c, &@1, !is_inside_ternary(c), + "Assignment side-effect not modeled!"); + $3 = gen_rvalue_truncate(c, &@1, &$3); + $3 = rvalue_materialize(c, &@1, &$3); + OUT(c, &@1, "SET_USR_FIELD(USR_LPCFG, ", &$3, ");\n"); + gen_rvalue_free(c, &@1, &$3); + } + | DEPOSIT '(' rvalue ',' rvalue ',' rvalue ')' + { + @1.last_column = @8.last_column; + yyassert(c, &@1, !is_inside_ternary(c), + "Assignment side-effect not modeled!"); + gen_deposit_op(c, &@1, &$5, &$7, &$3, &$1); + } + | SETHALF '(' rvalue ',' lvalue ',' rvalue ')' + { + @1.last_column = @8.last_column; + yyassert(c, &@1, !is_inside_ternary(c), + "Assignment side-effect not modeled!"); + gen_sethalf(c, &@1, &$1, &$3, &$5, &$7); + } + | SETBITS '(' rvalue ',' rvalue ',' rvalue ',' rvalue ')' + { + @1.last_column = @10.last_column; + yyassert(c, &@1, !is_inside_ternary(c), + "Assignment side-effect not modeled!"); + gen_setbits(c, &@1, &$3, &$5, &$7, &$9); + } + | INSBITS '(' lvalue ',' rvalue ',' rvalue ',' rvalue ')' + { + @1.last_column = @10.last_column; + yyassert(c, &@1, !is_inside_ternary(c), + "Assignment side-effect not modeled!"); + gen_rdeposit_op(c, &@1, &$3, &$9, &$7, &$5); + } + | IDENTITY '(' rvalue ')' + { + @1.last_column = @4.last_column; + $$ = $3; + } + ; + +control_statement : frame_check + | cancel_statement + | if_statement + | for_statement + | fpart1_statement + ; + +frame_check : FCHK '(' rvalue ',' rvalue ')' ';' + { + gen_rvalue_free(c, &@1, &$3); + gen_rvalue_free(c, &@1, &$5); + } + ; + +cancel_statement : LOAD_CANCEL + { + gen_load_cancel(c, &@1); + } + | CANCEL + { + gen_cancel(c, &@1); + } + ; + +if_statement : if_stmt + { + /* Fix else label */ + OUT(c, &@1, "gen_set_label(if_label_", &$1, ");\n"); + } + | if_stmt ELSE + { + @1.last_column = @2.last_column; + $2 = gen_if_else(c, &@1, $1); + } + statement + { + OUT(c, &@1, "gen_set_label(if_label_", &$2, ");\n"); + } + ; + +for_statement : FOR '(' IMM '=' IMM ';' IMM '<' IMM ';' IMM PLUSPLUS ')' + { + yyassert(c, &@3, + $3.imm.type == I && + $7.imm.type == I && + $11.imm.type == I, + "Loop induction variable must be \"i\""); + @1.last_column = @13.last_column; + OUT(c, &@1, "for (int ", &$3, " = ", &$5, "; ", + &$7, " < ", &$9); + OUT(c, &@1, "; ", &$11, "++) {\n"); + } + code_block + { + OUT(c, &@1, "}\n"); + } + ; + +fpart1_statement : PART1 + { + OUT(c, &@1, "if (insn->part1) {\n"); + } + '(' statements ')' + { + @1.last_column = @3.last_column; + OUT(c, &@1, "return; }\n"); + } + ; + +if_stmt : IF '(' rvalue ')' + { + @1.last_column = @3.last_column; + $1 = gen_if_cond(c, &@1, &$3); + } + statement + { + $$ = $1; + } + ; + +rvalue : FAIL + { + yyassert(c, &@1, false, "Encountered a FAIL token as rvalue.\n"); + } + | assign_statement + | REG + { + $$ = $1; + } + | IMM + { + $$ = $1; + } + | PRED + { + $$ = gen_rvalue_pred(c, &@1, &$1); + } + | PC + { + /* Read PC from the CR */ + HexValue rvalue; + memset(&rvalue, 0, sizeof(HexValue)); + rvalue.type = IMMEDIATE; + rvalue.imm.type = IMM_PC; + rvalue.bit_width = 32; + rvalue.signedness = UNSIGNED; + $$ = rvalue; + } + | NPC + { + /* + * NPC is only read from CALLs, so we can hardcode it + * at translation time + */ + HexValue rvalue; + memset(&rvalue, 0, sizeof(HexValue)); + rvalue.type = IMMEDIATE; + rvalue.imm.type = IMM_NPC; + rvalue.bit_width = 32; + rvalue.signedness = UNSIGNED; + $$ = rvalue; + } + | CONSTEXT + { + HexValue rvalue; + memset(&rvalue, 0, sizeof(HexValue)); + rvalue.type = IMMEDIATE; + rvalue.imm.type = IMM_CONSTEXT; + rvalue.signedness = UNSIGNED; + rvalue.is_dotnew = false; + rvalue.is_manual = false; + $$ = rvalue; + } + | var + { + $$ = gen_rvalue_var(c, &@1, &$1); + } + | MPY '(' rvalue ',' rvalue ')' + { + @1.last_column = @6.last_column; + $$ = gen_rvalue_mpy(c, &@1, &$1, &$3, &$5); + } + | rvalue '+' rvalue + { + @1.last_column = @3.last_column; + $$ = gen_bin_op(c, &@1, ADD_OP, &$1, &$3); + } + | rvalue '-' rvalue + { + @1.last_column = @3.last_column; + $$ = gen_bin_op(c, &@1, SUB_OP, &$1, &$3); + } + | rvalue '*' rvalue + { + @1.last_column = @3.last_column; + $$ = gen_bin_op(c, &@1, MUL_OP, &$1, &$3); + } + | rvalue ASL rvalue + { + @1.last_column = @3.last_column; + $$ = gen_bin_op(c, &@1, ASL_OP, &$1, &$3); + } + | rvalue ASR rvalue + { + @1.last_column = @3.last_column; + assert_signedness(c, &@1, $1.signedness); + if ($1.signedness == UNSIGNED) { + $$ = gen_bin_op(c, &@1, LSR_OP, &$1, &$3); + } else if ($1.signedness == SIGNED) { + $$ = gen_bin_op(c, &@1, ASR_OP, &$1, &$3); + } + } + | rvalue LSR rvalue + { + @1.last_column = @3.last_column; + $$ = gen_bin_op(c, &@1, LSR_OP, &$1, &$3); + } + | rvalue '&' rvalue + { + @1.last_column = @3.last_column; + $$ = gen_bin_op(c, &@1, ANDB_OP, &$1, &$3); + } + | rvalue '|' rvalue + { + @1.last_column = @3.last_column; + $$ = gen_bin_op(c, &@1, ORB_OP, &$1, &$3); + } + | rvalue '^' rvalue + { + @1.last_column = @3.last_column; + $$ = gen_bin_op(c, &@1, XORB_OP, &$1, &$3); + } + | rvalue ANDL rvalue + { + @1.last_column = @3.last_column; + $$ = gen_bin_op(c, &@1, ANDL_OP, &$1, &$3); + } + | MIN '(' rvalue ',' rvalue ')' + { + @1.last_column = @3.last_column; + $$ = gen_bin_op(c, &@1, MINI_OP, &$3, &$5); + } + | MAX '(' rvalue ',' rvalue ')' + { + @1.last_column = @3.last_column; + $$ = gen_bin_op(c, &@1, MAXI_OP, &$3, &$5); + } + | '~' rvalue + { + @1.last_column = @2.last_column; + $$ = gen_rvalue_not(c, &@1, &$2); + } + | '!' rvalue + { + @1.last_column = @2.last_column; + $$ = gen_rvalue_notl(c, &@1, &$2); + } + | SAT '(' IMM ',' rvalue ')' + { + @1.last_column = @6.last_column; + $$ = gen_rvalue_sat(c, &@1, &$1, &$3, &$5); + } + | CAST rvalue + { + @1.last_column = @2.last_column; + /* Assign target signedness */ + $2.signedness = $1.signedness; + $$ = gen_cast_op(c, &@1, &$2, $1.bit_width, $1.signedness); + } + | rvalue EQ rvalue + { + @1.last_column = @3.last_column; + $$ = gen_bin_cmp(c, &@1, TCG_COND_EQ, &$1, &$3); + } + | rvalue NEQ rvalue + { + @1.last_column = @3.last_column; + $$ = gen_bin_cmp(c, &@1, TCG_COND_NE, &$1, &$3); + } + | rvalue '<' rvalue + { + @1.last_column = @3.last_column; + + assert_signedness(c, &@1, $1.signedness); + assert_signedness(c, &@1, $3.signedness); + if ($1.signedness == UNSIGNED || $3.signedness == UNSIGNED) { + $$ = gen_bin_cmp(c, &@1, TCG_COND_LTU, &$1, &$3); + } else { + $$ = gen_bin_cmp(c, &@1, TCG_COND_LT, &$1, &$3); + } + } + | rvalue '>' rvalue + { + @1.last_column = @3.last_column; + + assert_signedness(c, &@1, $1.signedness); + assert_signedness(c, &@1, $3.signedness); + if ($1.signedness == UNSIGNED || $3.signedness == UNSIGNED) { + $$ = gen_bin_cmp(c, &@1, TCG_COND_GTU, &$1, &$3); + } else { + $$ = gen_bin_cmp(c, &@1, TCG_COND_GT, &$1, &$3); + } + } + | rvalue LTE rvalue + { + @1.last_column = @3.last_column; + + assert_signedness(c, &@1, $1.signedness); + assert_signedness(c, &@1, $3.signedness); + if ($1.signedness == UNSIGNED || $3.signedness == UNSIGNED) { + $$ = gen_bin_cmp(c, &@1, TCG_COND_LEU, &$1, &$3); + } else { + $$ = gen_bin_cmp(c, &@1, TCG_COND_LE, &$1, &$3); + } + } + | rvalue GTE rvalue + { + @1.last_column = @3.last_column; + + assert_signedness(c, &@1, $1.signedness); + assert_signedness(c, &@1, $3.signedness); + if ($1.signedness == UNSIGNED || $3.signedness == UNSIGNED) { + $$ = gen_bin_cmp(c, &@1, TCG_COND_GEU, &$1, &$3); + } else { + $$ = gen_bin_cmp(c, &@1, TCG_COND_GE, &$1, &$3); + } + } + | rvalue '?' + { + $1.is_manual = true; + Ternary t = { 0 }; + t.state = IN_LEFT; + t.cond = $1; + g_array_append_val(c->ternary, t); + } + rvalue ':' + { + Ternary *t = &g_array_index(c->ternary, Ternary, + c->ternary->len - 1); + t->state = IN_RIGHT; + } + rvalue + { + @1.last_column = @5.last_column; + $$ = gen_rvalue_ternary(c, &@1, &$1, &$4, &$7); + } + | FSCR '(' rvalue ')' + { + @1.last_column = @4.last_column; + $$ = gen_rvalue_fscr(c, &@1, &$3); + } + | SXT '(' rvalue ',' IMM ',' rvalue ')' + { + @1.last_column = @8.last_column; + yyassert(c, &@1, $5.type == IMMEDIATE && + $5.imm.type == VALUE, + "SXT expects immediate values\n"); + $$ = gen_extend_op(c, &@1, &$3, $5.imm.value, &$7, SIGNED); + } + | ZXT '(' rvalue ',' IMM ',' rvalue ')' + { + @1.last_column = @8.last_column; + yyassert(c, &@1, $5.type == IMMEDIATE && + $5.imm.type == VALUE, + "ZXT expects immediate values\n"); + $$ = gen_extend_op(c, &@1, &$3, $5.imm.value, &$7, UNSIGNED); + } + | '(' rvalue ')' + { + $$ = $2; + } + | ABS rvalue + { + @1.last_column = @2.last_column; + $$ = gen_rvalue_abs(c, &@1, &$2); + } + | CROUND '(' rvalue ',' rvalue ')' + { + @1.last_column = @6.last_column; + $$ = gen_convround_n(c, &@1, &$3, &$5); + } + | CROUND '(' rvalue ')' + { + @1.last_column = @4.last_column; + $$ = gen_convround(c, &@1, &$3); + } + | ROUND '(' rvalue ',' rvalue ')' + { + @1.last_column = @6.last_column; + $$ = gen_round(c, &@1, &$3, &$5); + } + | '-' rvalue + { + @1.last_column = @2.last_column; + $$ = gen_rvalue_neg(c, &@1, &$2); + } + | ICIRC '(' rvalue ')' ASL IMM + { + @1.last_column = @6.last_column; + $$ = gen_tmp(c, &@1, 32, UNSIGNED); + OUT(c, &@1, "gen_read_ireg(", &$$, ", ", &$3, ", ", &$6, ");\n"); + gen_rvalue_free(c, &@1, &$3); + } + | CIRCADD '(' rvalue ',' rvalue ',' rvalue ')' + { + @1.last_column = @8.last_column; + gen_circ_op(c, &@1, &$3, &$5, &$7); + } + | LOCNT '(' rvalue ')' + { + @1.last_column = @4.last_column; + /* Leading ones count */ + $$ = gen_locnt_op(c, &@1, &$3); + } + | COUNTONES '(' rvalue ')' + { + @1.last_column = @4.last_column; + /* Ones count */ + $$ = gen_ctpop_op(c, &@1, &$3); + } + | LPCFG + { + $$ = gen_tmp_value(c, &@1, "0", 32, UNSIGNED); + OUT(c, &@1, "GET_USR_FIELD(USR_LPCFG, ", &$$, ");\n"); + } + | EXTRACT '(' rvalue ',' rvalue ')' + { + @1.last_column = @6.last_column; + $$ = gen_extract_op(c, &@1, &$5, &$3, &$1); + } + | EXTRANGE '(' rvalue ',' rvalue ',' rvalue ')' + { + @1.last_column = @8.last_column; + yyassert(c, &@1, $5.type == IMMEDIATE && + $5.imm.type == VALUE && + $7.type == IMMEDIATE && + $7.imm.type == VALUE, + "Range extract needs immediate values!\n"); + $$ = gen_rextract_op(c, + &@1, + &$3, + $7.imm.value, + $5.imm.value - $7.imm.value + 1); + } + | CAST4_8U '(' rvalue ')' + { + @1.last_column = @4.last_column; + $$ = gen_rvalue_truncate(c, &@1, &$3); + $$.signedness = UNSIGNED; + $$ = rvalue_materialize(c, &@1, &$$); + $$ = gen_rvalue_extend(c, &@1, &$$); + } + | BREV '(' rvalue ')' + { + @1.last_column = @4.last_column; + $$ = gen_rvalue_brev(c, &@1, &$3); + } + | ROTL '(' rvalue ',' rvalue ')' + { + @1.last_column = @6.last_column; + $$ = gen_rotl(c, &@1, &$3, &$5); + } + | ADDSAT64 '(' rvalue ',' rvalue ',' rvalue ')' + { + @1.last_column = @8.last_column; + gen_addsat64(c, &@1, &$3, &$5, &$7); + } + | CARRY_FROM_ADD '(' rvalue ',' rvalue ',' rvalue ')' + { + @1.last_column = @8.last_column; + $$ = gen_carry_from_add(c, &@1, &$3, &$5, &$7); + } + | LSBNEW '(' rvalue ')' + { + @1.last_column = @4.last_column; + HexValue one = gen_imm_value(c, &@1, 1, 32, UNSIGNED); + $$ = gen_bin_op(c, &@1, ANDB_OP, &$3, &one); + } + ; + +lvalue : FAIL + { + @1.last_column = @1.last_column; + yyassert(c, &@1, false, "Encountered a FAIL token as lvalue.\n"); + } + | REG + { + $$ = $1; + } + | var + { + $$ = $1; + } + ; + +%% + +int main(int argc, char **argv) +{ + if (argc != 5) { + fprintf(stderr, + "Semantics: Hexagon ISA to tinycode generator compiler\n\n"); + fprintf(stderr, + "Usage: ./semantics IDEFS EMITTER_C EMITTER_H " + "ENABLED_INSTRUCTIONS_LIST\n"); + return 1; + } + + enum { + ARG_INDEX_ARGV0 = 0, + ARG_INDEX_IDEFS, + ARG_INDEX_EMITTER_C, + ARG_INDEX_EMITTER_H, + ARG_INDEX_ENABLED_INSTRUCTIONS_LIST + }; + + FILE *enabled_file = fopen(argv[ARG_INDEX_ENABLED_INSTRUCTIONS_LIST], "w"); + + FILE *output_file = fopen(argv[ARG_INDEX_EMITTER_C], "w"); + fputs("#include \"qemu/osdep.h\"\n", output_file); + fputs("#include \"qemu/log.h\"\n", output_file); + fputs("#include \"cpu.h\"\n", output_file); + fputs("#include \"internal.h\"\n", output_file); + fputs("#include \"tcg/tcg-op.h\"\n", output_file); + fputs("#include \"insn.h\"\n", output_file); + fputs("#include \"opcodes.h\"\n", output_file); + fputs("#include \"translate.h\"\n", output_file); + fputs("#define QEMU_GENERATE\n", output_file); + fputs("#include \"genptr.h\"\n", output_file); + fputs("#include \"tcg/tcg.h\"\n", output_file); + fputs("#include \"macros.h\"\n", output_file); + fprintf(output_file, "#include \"%s\"\n", argv[ARG_INDEX_EMITTER_H]); + + FILE *defines_file = fopen(argv[ARG_INDEX_EMITTER_H], "w"); + assert(defines_file != NULL); + fputs("#ifndef HEX_EMITTER_H\n", defines_file); + fputs("#define HEX_EMITTER_H\n", defines_file); + fputs("\n", defines_file); + fputs("#include \"insn.h\"\n\n", defines_file); + + /* Parser input file */ + Context context = { 0 }; + context.defines_file = defines_file; + context.output_file = output_file; + context.enabled_file = enabled_file; + /* Initialize buffers */ + context.out_str = g_string_new(NULL); + context.signature_str = g_string_new(NULL); + context.header_str = g_string_new(NULL); + context.ternary = g_array_new(FALSE, TRUE, sizeof(Ternary)); + /* Read input file */ + FILE *input_file = fopen(argv[ARG_INDEX_IDEFS], "r"); + fseek(input_file, 0L, SEEK_END); + long input_size = ftell(input_file); + context.input_buffer = (char *) calloc(input_size + 1, sizeof(char)); + fseek(input_file, 0L, SEEK_SET); + size_t read_chars = fread(context.input_buffer, + sizeof(char), + input_size, + input_file); + if (read_chars != (size_t) input_size) { + fprintf(stderr, "Error: an error occurred while reading input file!\n"); + return -1; + } + yylex_init(&context.scanner); + YY_BUFFER_STATE buffer; + buffer = yy_scan_string(context.input_buffer, context.scanner); + /* Start the parsing procedure */ + yyparse(context.scanner, &context); + if (context.implemented_insn != context.total_insn) { + fprintf(stderr, + "Warning: %d/%d meta instructions have been implemented!\n", + context.implemented_insn, + context.total_insn); + } + fputs("#endif " START_COMMENT " HEX_EMITTER_h " END_COMMENT "\n", + defines_file); + /* Cleanup */ + yy_delete_buffer(buffer, context.scanner); + yylex_destroy(context.scanner); + free(context.input_buffer); + g_string_free(context.out_str, TRUE); + g_string_free(context.signature_str, TRUE); + g_string_free(context.header_str, TRUE); + g_array_free(context.ternary, TRUE); + fclose(output_file); + fclose(input_file); + fclose(defines_file); + fclose(enabled_file); + + return 0; +} diff --git a/target/hexagon/idef-parser/macros.inc b/target/hexagon/idef-parser/macros.inc new file mode 100644 index 0000000000..6b697da87a --- /dev/null +++ b/target/hexagon/idef-parser/macros.inc @@ -0,0 +1,140 @@ +/* + * Copyright(c) 2019-2022 rev.ng Labs Srl. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. + */ + +/* Copy rules */ +#define fLSBOLD(VAL) (fGETBIT(0, VAL)) +#define fSATH(VAL) fSATN(16, VAL) +#define fSATUH(VAL) fSATUN(16, VAL) +#define fVSATH(VAL) fVSATN(16, VAL) +#define fVSATUH(VAL) fVSATUN(16, VAL) +#define fSATUB(VAL) fSATUN(8, VAL) +#define fSATB(VAL) fSATN(8, VAL) +#define fVSATUB(VAL) fVSATUN(8, VAL) +#define fVSATB(VAL) fVSATN(8, VAL) +#define fCALL(A) fWRITE_LR(fREAD_NPC()); fWRITE_NPC(A); +#define fCALLR(A) fWRITE_LR(fREAD_NPC()); fWRITE_NPC(A); +#define fCAST2_8s(A) fSXTN(16, 64, A) +#define fCAST2_8u(A) fZXTN(16, 64, A) +#define fVSATW(A) fVSATN(32, fCAST8_8s(A)) +#define fSATW(A) fSATN(32, fCAST8_8s(A)) +#define fVSAT(A) fVSATN(32, A) +#define fSAT(A) fSATN(32, A) + +/* Ease parsing */ +#define f8BITSOF(VAL) ((VAL) ? 0xff : 0x00) +#define fREAD_GP() (Constant_extended ? (0) : GP) +#define fCLIP(DST, SRC, U) (DST = fMIN((1 << U) - 1, fMAX(SRC, -(1 << U)))) +#define fBIDIR_ASHIFTL(SRC, SHAMT, REGSTYPE) \ + ((SHAMT > 0) ? \ + (fCAST##REGSTYPE##s(SRC) << SHAMT) : \ + (fCAST##REGSTYPE##s(SRC) >> -SHAMT)) + +#define fBIDIR_LSHIFTL(SRC, SHAMT, REGSTYPE) \ + ((SHAMT > 0) ? \ + (fCAST##REGSTYPE##u(SRC) << SHAMT) : \ + (fCAST##REGSTYPE##u(SRC) >>> -SHAMT)) + +#define fBIDIR_ASHIFTR(SRC, SHAMT, REGSTYPE) \ + ((SHAMT > 0) ? \ + (fCAST##REGSTYPE##s(SRC) >> SHAMT) : \ + (fCAST##REGSTYPE##s(SRC) << -SHAMT)) + +#define fBIDIR_SHIFTR(SRC, SHAMT, REGSTYPE) \ + (((SHAMT) < 0) ? ((fCAST##REGSTYPE(SRC) << ((-(SHAMT)) - 1)) << 1) \ + : (fCAST##REGSTYPE(SRC) >> (SHAMT))) + +#define fBIDIR_LSHIFTR(SRC, SHAMT, REGSTYPE) \ + fBIDIR_SHIFTR(SRC, SHAMT, REGSTYPE##u) + +#define fSATVALN(N, VAL) \ + fSET_OVERFLOW( \ + ((VAL) < 0) ? (-(1LL << ((N) - 1))) : ((1LL << ((N) - 1)) - 1) \ + ) + +#define fSAT_ORIG_SHL(A, ORIG_REG) \ + (((fCAST4s((fSAT(A)) ^ (fCAST4s(ORIG_REG)))) < 0) \ + ? fSATVALN(32, (fCAST4s(ORIG_REG))) \ + : ((((ORIG_REG) > 0) && ((A) == 0)) ? fSATVALN(32, (ORIG_REG)) \ + : fSAT(A))) + +#define fBIDIR_ASHIFTR_SAT(SRC, SHAMT, REGSTYPE) \ + (((SHAMT) < 0) ? fSAT_ORIG_SHL((fCAST##REGSTYPE##s(SRC) \ + << ((-(SHAMT)) - 1)) << 1, (SRC)) \ + : (fCAST##REGSTYPE##s(SRC) >> (SHAMT))) + +#define fBIDIR_ASHIFTL_SAT(SRC, SHAMT, REGSTYPE) \ + (((SHAMT) < 0) \ + ? ((fCAST##REGSTYPE##s(SRC) >> ((-(SHAMT)) - 1)) >> 1) \ + : fSAT_ORIG_SHL(fCAST##REGSTYPE##s(SRC) << (SHAMT), (SRC))) + +#define fEXTRACTU_BIDIR(INREG, WIDTH, OFFSET) \ + (fZXTN(WIDTH, 32, fBIDIR_LSHIFTR((INREG), (OFFSET), 4_8))) + +/* Least significant bit operations */ +#define fLSBNEW0 fLSBNEW(P0N) +#define fLSBNEW1 fLSBNEW(P1N) +#define fLSBOLDNOT(VAL) fGETBIT(0, ~VAL) +#define fLSBNEWNOT(PRED) (fLSBNEW(~PRED)) +#define fLSBNEW0NOT fLSBNEW(~P0N) +#define fLSBNEW1NOT fLSBNEW(~P1N) + +/* Assignments */ +#define fPCALIGN(IMM) (IMM = IMM & ~3) +#define fWRITE_LR(A) (LR = A) +#define fWRITE_FP(A) (FP = A) +#define fWRITE_SP(A) (SP = A) +/* + * Note: There is a rule in the parser that matches `PC = ...` and emits + * a call to `gen_write_new_pc`. We need to call `gen_write_new_pc` to + * get the correct semantics when there are multiple stores in a packet. + */ +#define fBRANCH(LOC, TYPE) (PC = LOC) +#define fJUMPR(REGNO, TARGET, TYPE) (PC = TARGET) +#define fWRITE_LOOP_REGS0(START, COUNT) SA0 = START; (LC0 = COUNT) +#define fWRITE_LOOP_REGS1(START, COUNT) SA1 = START; (LC1 = COUNT) +#define fWRITE_LC0(VAL) (LC0 = VAL) +#define fWRITE_LC1(VAL) (LC1 = VAL) +#define fSET_LPCFG(VAL) (USR.LPCFG = VAL) +#define fWRITE_P0(VAL) P0 = VAL; +#define fWRITE_P1(VAL) P1 = VAL; +#define fWRITE_P3(VAL) P3 = VAL; +#define fEA_RI(REG, IMM) (EA = REG + IMM) +#define fEA_RRs(REG, REG2, SCALE) (EA = REG + (REG2 << SCALE)) +#define fEA_IRs(IMM, REG, SCALE) (EA = IMM + (REG << SCALE)) +#define fEA_IMM(IMM) (EA = IMM) +#define fEA_REG(REG) (EA = REG) +#define fEA_BREVR(REG) (EA = fbrev(REG)) +#define fEA_GPI(IMM) (EA = fREAD_GP() + IMM) +#define fPM_I(REG, IMM) (REG = REG + IMM) +#define fPM_M(REG, MVAL) (REG = REG + MVAL) +#define fWRITE_NPC(VAL) (PC = VAL) + +/* Unary operators */ +#define fROUND(A) (A + 0x8000) + +/* Binary operators */ +#define fSCALE(N, A) (A << N) +#define fASHIFTR(SRC, SHAMT, REGSTYPE) (fCAST##REGSTYPE##s(SRC) >> SHAMT) +#define fLSHIFTR(SRC, SHAMT, REGSTYPE) (SRC >>> SHAMT) +#define fROTL(SRC, SHAMT, REGSTYPE) fROTL(SRC, SHAMT) +#define fASHIFTL(SRC, SHAMT, REGSTYPE) (fCAST##REGSTYPE##s(SRC) << SHAMT) + +/* Include fHIDE macros which hide type declarations */ +#define fHIDE(A) A + +/* Purge non-relavant parts */ +#define fBRANCH_SPECULATE_STALL(A, B, C, D, E) diff --git a/target/hexagon/idef-parser/parser-helpers.c b/target/hexagon/idef-parser/parser-helpers.c new file mode 100644 index 0000000000..8110686c51 --- /dev/null +++ b/target/hexagon/idef-parser/parser-helpers.c @@ -0,0 +1,2360 @@ +/* + * Copyright(c) 2019-2022 rev.ng Labs Srl. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. + */ + +#include <assert.h> +#include <inttypes.h> +#include <stdarg.h> +#include <stdbool.h> +#include <stdint.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> + +#include "idef-parser.h" +#include "parser-helpers.h" +#include "idef-parser.tab.h" +#include "idef-parser.yy.h" + +void yyerror(YYLTYPE *locp, + yyscan_t scanner __attribute__((unused)), + Context *c, + const char *s) +{ + const char *code_ptr = c->input_buffer; + + fprintf(stderr, "WARNING (%s): '%s'\n", c->inst.name->str, s); + + fprintf(stderr, "Problematic range: "); + for (int i = locp->first_column; i < locp->last_column; i++) { + if (code_ptr[i] != '\n') { + fprintf(stderr, "%c", code_ptr[i]); + } + } + fprintf(stderr, "\n"); + + for (unsigned i = 0; + i < 80 && + code_ptr[locp->first_column - 10 + i] != '\0' && + code_ptr[locp->first_column - 10 + i] != '\n'; + i++) { + fprintf(stderr, "%c", code_ptr[locp->first_column - 10 + i]); + } + fprintf(stderr, "\n"); + for (unsigned i = 0; i < 9; i++) { + fprintf(stderr, " "); + } + fprintf(stderr, "^"); + for (int i = 0; i < (locp->last_column - locp->first_column) - 1; i++) { + fprintf(stderr, "~"); + } + fprintf(stderr, "\n"); + c->inst.error_count++; +} + +bool is_direct_predicate(HexValue *value) +{ + return value->pred.id >= '0' && value->pred.id <= '3'; +} + +bool is_inside_ternary(Context *c) +{ + return c->ternary->len > 0; +} + +/* Print functions */ +void str_print(Context *c, YYLTYPE *locp, const char *string) +{ + (void) locp; + EMIT(c, "%s", string); +} + +void uint8_print(Context *c, YYLTYPE *locp, uint8_t *num) +{ + (void) locp; + EMIT(c, "%u", *num); +} + +void uint64_print(Context *c, YYLTYPE *locp, uint64_t *num) +{ + (void) locp; + EMIT(c, "%" PRIu64, *num); +} + +void int_print(Context *c, YYLTYPE *locp, int *num) +{ + (void) locp; + EMIT(c, "%d", *num); +} + +void uint_print(Context *c, YYLTYPE *locp, unsigned *num) +{ + (void) locp; + EMIT(c, "%u", *num); +} + +void tmp_print(Context *c, YYLTYPE *locp, HexTmp *tmp) +{ + (void) locp; + EMIT(c, "tmp_%d", tmp->index); +} + +void pred_print(Context *c, YYLTYPE *locp, HexPred *pred, bool is_dotnew) +{ + (void) locp; + char suffix = is_dotnew ? 'N' : 'V'; + EMIT(c, "P%c%c", pred->id, suffix); +} + +void reg_compose(Context *c, YYLTYPE *locp, HexReg *reg, char reg_id[5]) +{ + memset(reg_id, 0, 5 * sizeof(char)); + switch (reg->type) { + case GENERAL_PURPOSE: + reg_id[0] = 'R'; + break; + case CONTROL: + reg_id[0] = 'C'; + break; + case MODIFIER: + reg_id[0] = 'M'; + break; + case DOTNEW: + reg_id[0] = 'N'; + reg_id[1] = reg->id; + reg_id[2] = 'N'; + return; + } + switch (reg->bit_width) { + case 32: + reg_id[1] = reg->id; + reg_id[2] = 'V'; + break; + case 64: + reg_id[1] = reg->id; + reg_id[2] = reg->id; + reg_id[3] = 'V'; + break; + default: + yyassert(c, locp, false, "Unhandled register bit width!\n"); + } +} + +static void reg_arg_print(Context *c, YYLTYPE *locp, HexReg *reg) +{ + char reg_id[5]; + reg_compose(c, locp, reg, reg_id); + EMIT(c, "%s", reg_id); +} + +void reg_print(Context *c, YYLTYPE *locp, HexReg *reg) +{ + (void) locp; + EMIT(c, "hex_gpr[%u]", reg->id); +} + +void imm_print(Context *c, YYLTYPE *locp, HexImm *imm) +{ + switch (imm->type) { + case I: + EMIT(c, "i"); + break; + case VARIABLE: + EMIT(c, "%ciV", imm->id); + break; + case VALUE: + EMIT(c, "((int64_t) %" PRIu64 "ULL)", (int64_t) imm->value); + break; + case QEMU_TMP: + EMIT(c, "qemu_tmp_%" PRIu64, imm->index); + break; + case IMM_PC: + EMIT(c, "ctx->base.pc_next"); + break; + case IMM_NPC: + EMIT(c, "ctx->npc"); + break; + case IMM_CONSTEXT: + EMIT(c, "insn->extension_valid"); + break; + default: + yyassert(c, locp, false, "Cannot print this expression!"); + } +} + +void var_print(Context *c, YYLTYPE *locp, HexVar *var) +{ + (void) locp; + EMIT(c, "%s", var->name->str); +} + +void rvalue_print(Context *c, YYLTYPE *locp, void *pointer) +{ + HexValue *rvalue = (HexValue *) pointer; + switch (rvalue->type) { + case REGISTER: + reg_print(c, locp, &rvalue->reg); + break; + case REGISTER_ARG: + reg_arg_print(c, locp, &rvalue->reg); + break; + case TEMP: + tmp_print(c, locp, &rvalue->tmp); + break; + case IMMEDIATE: + imm_print(c, locp, &rvalue->imm); + break; + case VARID: + var_print(c, locp, &rvalue->var); + break; + case PREDICATE: + pred_print(c, locp, &rvalue->pred, rvalue->is_dotnew); + break; + default: + yyassert(c, locp, false, "Cannot print this expression!"); + } +} + +void out_assert(Context *c, YYLTYPE *locp, + void *dummy __attribute__((unused))) +{ + yyassert(c, locp, false, "Unhandled print type!"); +} + +/* Copy output code buffer */ +void commit(Context *c) +{ + /* Emit instruction pseudocode */ + EMIT_SIG(c, "\n" START_COMMENT " "); + for (char *x = c->inst.code_begin; x < c->inst.code_end; x++) { + EMIT_SIG(c, "%c", *x); + } + EMIT_SIG(c, " " END_COMMENT "\n"); + + /* Commit instruction code to output file */ + fwrite(c->signature_str->str, sizeof(char), c->signature_str->len, + c->output_file); + fwrite(c->header_str->str, sizeof(char), c->header_str->len, + c->output_file); + fwrite(c->out_str->str, sizeof(char), c->out_str->len, + c->output_file); + + fwrite(c->signature_str->str, sizeof(char), c->signature_str->len, + c->defines_file); + fprintf(c->defines_file, ";\n"); +} + +static void gen_c_int_type(Context *c, YYLTYPE *locp, unsigned bit_width, + HexSignedness signedness) +{ + const char *signstr = (signedness == UNSIGNED) ? "u" : ""; + OUT(c, locp, signstr, "int", &bit_width, "_t"); +} + +static HexValue gen_constant(Context *c, + YYLTYPE *locp, + const char *value, + unsigned bit_width, + HexSignedness signedness) +{ + HexValue rvalue; + assert(bit_width == 32 || bit_width == 64); + memset(&rvalue, 0, sizeof(HexValue)); + rvalue.type = TEMP; + rvalue.bit_width = bit_width; + rvalue.signedness = signedness; + rvalue.is_dotnew = false; + rvalue.is_manual = true; + rvalue.tmp.index = c->inst.tmp_count; + OUT(c, locp, "TCGv_i", &bit_width, " tmp_", &c->inst.tmp_count, + " = tcg_constant_i", &bit_width, "(", value, ");\n"); + c->inst.tmp_count++; + return rvalue; +} + +/* Temporary values creation */ +HexValue gen_tmp(Context *c, + YYLTYPE *locp, + unsigned bit_width, + HexSignedness signedness) +{ + HexValue rvalue; + assert(bit_width == 32 || bit_width == 64); + memset(&rvalue, 0, sizeof(HexValue)); + rvalue.type = TEMP; + rvalue.bit_width = bit_width; + rvalue.signedness = signedness; + rvalue.is_dotnew = false; + rvalue.is_manual = false; + rvalue.tmp.index = c->inst.tmp_count; + OUT(c, locp, "TCGv_i", &bit_width, " tmp_", &c->inst.tmp_count, + " = tcg_temp_new_i", &bit_width, "();\n"); + c->inst.tmp_count++; + return rvalue; +} + +HexValue gen_tmp_local(Context *c, + YYLTYPE *locp, + unsigned bit_width, + HexSignedness signedness) +{ + HexValue rvalue; + assert(bit_width == 32 || bit_width == 64); + memset(&rvalue, 0, sizeof(HexValue)); + rvalue.type = TEMP; + rvalue.bit_width = bit_width; + rvalue.signedness = signedness; + rvalue.is_dotnew = false; + rvalue.is_manual = false; + rvalue.tmp.index = c->inst.tmp_count; + OUT(c, locp, "TCGv_i", &bit_width, " tmp_", &c->inst.tmp_count, + " = tcg_temp_local_new_i", &bit_width, "();\n"); + c->inst.tmp_count++; + return rvalue; +} + +HexValue gen_tmp_value(Context *c, + YYLTYPE *locp, + const char *value, + unsigned bit_width, + HexSignedness signedness) +{ + HexValue rvalue; + assert(bit_width == 32 || bit_width == 64); + memset(&rvalue, 0, sizeof(HexValue)); + rvalue.type = TEMP; + rvalue.bit_width = bit_width; + rvalue.signedness = signedness; + rvalue.is_dotnew = false; + rvalue.is_manual = false; + rvalue.tmp.index = c->inst.tmp_count; + OUT(c, locp, "TCGv_i", &bit_width, " tmp_", &c->inst.tmp_count, + " = tcg_const_i", &bit_width, "(", value, ");\n"); + c->inst.tmp_count++; + return rvalue; +} + +static HexValue gen_tmp_value_from_imm(Context *c, + YYLTYPE *locp, + HexValue *value) +{ + HexValue rvalue; + assert(value->type == IMMEDIATE); + memset(&rvalue, 0, sizeof(HexValue)); + rvalue.type = TEMP; + rvalue.bit_width = value->bit_width; + rvalue.signedness = value->signedness; + rvalue.is_dotnew = false; + rvalue.is_manual = false; + rvalue.tmp.index = c->inst.tmp_count; + /* + * Here we output the call to `tcg_const_i<width>` in + * order to create the temporary value. Note, that we + * add a cast + * + * `tcg_const_i<width>`((int<width>_t) ...)` + * + * This cast is required to avoid implicit integer + * conversion warnings since all immediates are + * output as `((int64_t) 123ULL)`, even if the + * integer is 32-bit. + */ + OUT(c, locp, "TCGv_i", &rvalue.bit_width, " tmp_", &c->inst.tmp_count); + OUT(c, locp, " = tcg_const_i", &rvalue.bit_width, + "((int", &rvalue.bit_width, "_t) (", value, "));\n"); + + c->inst.tmp_count++; + return rvalue; +} + +HexValue gen_imm_value(Context *c __attribute__((unused)), + YYLTYPE *locp, + int value, + unsigned bit_width, + HexSignedness signedness) +{ + (void) locp; + HexValue rvalue; + assert(bit_width == 32 || bit_width == 64); + memset(&rvalue, 0, sizeof(HexValue)); + rvalue.type = IMMEDIATE; + rvalue.bit_width = bit_width; + rvalue.signedness = signedness; + rvalue.is_dotnew = false; + rvalue.is_manual = false; + rvalue.imm.type = VALUE; + rvalue.imm.value = value; + return rvalue; +} + +HexValue gen_imm_qemu_tmp(Context *c, YYLTYPE *locp, unsigned bit_width, + HexSignedness signedness) +{ + (void) locp; + HexValue rvalue; + assert(bit_width == 32 || bit_width == 64); + memset(&rvalue, 0, sizeof(HexValue)); + rvalue.type = IMMEDIATE; + rvalue.is_dotnew = false; + rvalue.is_manual = false; + rvalue.bit_width = bit_width; + rvalue.signedness = signedness; + rvalue.imm.type = QEMU_TMP; + rvalue.imm.index = c->inst.qemu_tmp_count++; + return rvalue; +} + +void gen_rvalue_free(Context *c, YYLTYPE *locp, HexValue *rvalue) +{ + if (rvalue->type == TEMP && !rvalue->is_manual) { + const char *bit_suffix = (rvalue->bit_width == 64) ? "i64" : "i32"; + OUT(c, locp, "tcg_temp_free_", bit_suffix, "(", rvalue, ");\n"); + } +} + +static void gen_rvalue_free_manual(Context *c, YYLTYPE *locp, HexValue *rvalue) +{ + rvalue->is_manual = false; + gen_rvalue_free(c, locp, rvalue); +} + +HexValue rvalue_materialize(Context *c, YYLTYPE *locp, HexValue *rvalue) +{ + if (rvalue->type == IMMEDIATE) { + HexValue res = gen_tmp_value_from_imm(c, locp, rvalue); + gen_rvalue_free(c, locp, rvalue); + return res; + } + return *rvalue; +} + +HexValue gen_rvalue_extend(Context *c, YYLTYPE *locp, HexValue *rvalue) +{ + assert_signedness(c, locp, rvalue->signedness); + if (rvalue->bit_width > 32) { + return *rvalue; + } + + if (rvalue->type == IMMEDIATE) { + HexValue res = gen_imm_qemu_tmp(c, locp, 64, rvalue->signedness); + bool is_unsigned = (rvalue->signedness == UNSIGNED); + const char *sign_suffix = is_unsigned ? "u" : ""; + gen_c_int_type(c, locp, 64, rvalue->signedness); + OUT(c, locp, " ", &res, " = "); + OUT(c, locp, "(", sign_suffix, "int64_t) "); + OUT(c, locp, "(", sign_suffix, "int32_t) "); + OUT(c, locp, rvalue, ";\n"); + return res; + } else { + HexValue res = gen_tmp(c, locp, 64, rvalue->signedness); + bool is_unsigned = (rvalue->signedness == UNSIGNED); + const char *sign_suffix = is_unsigned ? "u" : ""; + OUT(c, locp, "tcg_gen_ext", sign_suffix, + "_i32_i64(", &res, ", ", rvalue, ");\n"); + gen_rvalue_free(c, locp, rvalue); + return res; + } +} + +HexValue gen_rvalue_truncate(Context *c, YYLTYPE *locp, HexValue *rvalue) +{ + if (rvalue->type == IMMEDIATE) { + HexValue res = *rvalue; + res.bit_width = 32; + return res; + } else { + if (rvalue->bit_width == 64) { + HexValue res = gen_tmp(c, locp, 32, rvalue->signedness); + OUT(c, locp, "tcg_gen_trunc_i64_tl(", &res, ", ", rvalue, ");\n"); + gen_rvalue_free(c, locp, rvalue); + return res; + } + } + return *rvalue; +} + +/* + * Attempts to lookup the `Var` struct associated with the given `varid`. + * The `dst` argument is populated with the found name, bit_width, and + * signedness, given that `dst` is non-NULL. Returns true if the lookup + * succeeded and false otherwise. + */ +static bool try_find_variable(Context *c, YYLTYPE *locp, + HexValue *dst, + HexValue *varid) +{ + yyassert(c, locp, varid, "varid to lookup is NULL"); + yyassert(c, locp, varid->type == VARID, + "Can only lookup variables by varid"); + for (unsigned i = 0; i < c->inst.allocated->len; i++) { + Var *curr = &g_array_index(c->inst.allocated, Var, i); + if (g_string_equal(varid->var.name, curr->name)) { + if (dst) { + dst->var.name = curr->name; + dst->bit_width = curr->bit_width; + dst->signedness = curr->signedness; + } + return true; + } + } + return false; +} + +/* Calls `try_find_variable` and asserts succcess. */ +static void find_variable(Context *c, YYLTYPE *locp, + HexValue *dst, + HexValue *varid) +{ + bool found = try_find_variable(c, locp, dst, varid); + yyassert(c, locp, found, "Use of undeclared variable!\n"); +} + +/* Handle signedness, if both unsigned -> result is unsigned, else signed */ +static inline HexSignedness bin_op_signedness(Context *c, YYLTYPE *locp, + HexSignedness sign1, + HexSignedness sign2) +{ + assert_signedness(c, locp, sign1); + assert_signedness(c, locp, sign2); + return (sign1 == UNSIGNED && sign2 == UNSIGNED) ? UNSIGNED : SIGNED; +} + +void gen_varid_allocate(Context *c, + YYLTYPE *locp, + HexValue *varid, + unsigned bit_width, + HexSignedness signedness) +{ + const char *bit_suffix = (bit_width == 64) ? "i64" : "i32"; + bool found = try_find_variable(c, locp, NULL, varid); + Var new_var; + + memset(&new_var, 0, sizeof(Var)); + + yyassert(c, locp, !found, "Redeclaration of variables not allowed!"); + assert_signedness(c, locp, signedness); + + /* `varid` only carries name information */ + new_var.name = varid->var.name; + new_var.bit_width = bit_width; + new_var.signedness = signedness; + + EMIT_HEAD(c, "TCGv_%s %s", bit_suffix, varid->var.name->str); + EMIT_HEAD(c, " = tcg_temp_local_new_%s();\n", bit_suffix); + g_array_append_val(c->inst.allocated, new_var); +} + +enum OpTypes { + IMM_IMM = 0, + IMM_REG = 1, + REG_IMM = 2, + REG_REG = 3, +}; + +HexValue gen_bin_cmp(Context *c, + YYLTYPE *locp, + TCGCond type, + HexValue *op1, + HexValue *op2) +{ + HexValue op1_m = *op1; + HexValue op2_m = *op2; + enum OpTypes op_types = (op1_m.type != IMMEDIATE) << 1 + | (op2_m.type != IMMEDIATE); + + bool op_is64bit = op1_m.bit_width == 64 || op2_m.bit_width == 64; + const char *bit_suffix = op_is64bit ? "i64" : "i32"; + unsigned bit_width = (op_is64bit) ? 64 : 32; + HexValue res = gen_tmp(c, locp, bit_width, UNSIGNED); + + /* Extend to 64-bits, if required */ + if (op_is64bit) { + op1_m = gen_rvalue_extend(c, locp, &op1_m); + op2_m = gen_rvalue_extend(c, locp, &op2_m); + } + + switch (op_types) { + case IMM_IMM: + case IMM_REG: + yyassert(c, locp, false, "Binary comparisons between IMM op IMM and" + "IMM op REG not handled!"); + break; + case REG_IMM: + OUT(c, locp, "tcg_gen_setcondi_", bit_suffix, "("); + OUT(c, locp, cond_to_str(type), ", ", &res, ", ", &op1_m, ", ", &op2_m, + ");\n"); + break; + case REG_REG: + OUT(c, locp, "tcg_gen_setcond_", bit_suffix, "("); + OUT(c, locp, cond_to_str(type), ", ", &res, ", ", &op1_m, ", ", &op2_m, + ");\n"); + break; + default: + fprintf(stderr, "Error in evalutating immediateness!"); + abort(); + } + + /* Free operands */ + gen_rvalue_free(c, locp, &op1_m); + gen_rvalue_free(c, locp, &op2_m); + + return res; +} + +static void gen_simple_op(Context *c, YYLTYPE *locp, unsigned bit_width, + const char *bit_suffix, HexValue *res, + enum OpTypes op_types, + HexValue *op1, + HexValue *op2, + const char *imm_imm, + const char *imm_reg, + const char *reg_imm, + const char *reg_reg) +{ + switch (op_types) { + case IMM_IMM: { + HexSignedness signedness = bin_op_signedness(c, locp, + op1->signedness, + op2->signedness); + gen_c_int_type(c, locp, bit_width, signedness); + OUT(c, locp, " ", res, + " = ", op1, imm_imm, op2, ";\n"); + } break; + case IMM_REG: + OUT(c, locp, imm_reg, bit_suffix, + "(", res, ", ", op2, ", ", op1, ");\n"); + break; + case REG_IMM: + OUT(c, locp, reg_imm, bit_suffix, + "(", res, ", ", op1, ", ", op2, ");\n"); + break; + case REG_REG: + OUT(c, locp, reg_reg, bit_suffix, + "(", res, ", ", op1, ", ", op2, ");\n"); + break; + } + gen_rvalue_free(c, locp, op1); + gen_rvalue_free(c, locp, op2); +} + +static void gen_sub_op(Context *c, YYLTYPE *locp, unsigned bit_width, + const char *bit_suffix, HexValue *res, + enum OpTypes op_types, HexValue *op1, + HexValue *op2) +{ + switch (op_types) { + case IMM_IMM: { + HexSignedness signedness = bin_op_signedness(c, locp, + op1->signedness, + op2->signedness); + gen_c_int_type(c, locp, bit_width, signedness); + OUT(c, locp, " ", res, + " = ", op1, " - ", op2, ";\n"); + } break; + case IMM_REG: { + OUT(c, locp, "tcg_gen_subfi_", bit_suffix, + "(", res, ", ", op1, ", ", op2, ");\n"); + } break; + case REG_IMM: { + OUT(c, locp, "tcg_gen_subi_", bit_suffix, + "(", res, ", ", op1, ", ", op2, ");\n"); + } break; + case REG_REG: { + OUT(c, locp, "tcg_gen_sub_", bit_suffix, + "(", res, ", ", op1, ", ", op2, ");\n"); + } break; + } + gen_rvalue_free(c, locp, op1); + gen_rvalue_free(c, locp, op2); +} + +static void gen_asl_op(Context *c, YYLTYPE *locp, unsigned bit_width, + bool op_is64bit, const char *bit_suffix, + HexValue *res, enum OpTypes op_types, + HexValue *op1, HexValue *op2) +{ + HexValue op1_m = *op1; + HexValue op2_m = *op2; + switch (op_types) { + case IMM_IMM: { + HexSignedness signedness = bin_op_signedness(c, locp, + op1->signedness, + op2->signedness); + gen_c_int_type(c, locp, bit_width, signedness); + OUT(c, locp, " ", res, + " = ", op1, " << ", op2, ";\n"); + } break; + case REG_IMM: { + OUT(c, locp, "if (", op2, " >= ", &bit_width, ") {\n"); + OUT(c, locp, "tcg_gen_movi_", bit_suffix, "(", res, ", 0);\n"); + OUT(c, locp, "} else {\n"); + OUT(c, locp, "tcg_gen_shli_", bit_suffix, + "(", res, ", ", op1, ", ", op2, ");\n"); + OUT(c, locp, "}\n"); + } break; + case IMM_REG: + op1_m.bit_width = bit_width; + op1_m = rvalue_materialize(c, locp, &op1_m); + /* fallthrough */ + case REG_REG: { + OUT(c, locp, "tcg_gen_shl_", bit_suffix, + "(", res, ", ", &op1_m, ", ", op2, ");\n"); + } break; + } + if (op_types == IMM_REG || op_types == REG_REG) { + /* + * Handle left shift by 64/32 which hexagon-sim expects to clear out + * register + */ + HexValue zero = gen_constant(c, locp, "0", bit_width, UNSIGNED); + HexValue edge = gen_imm_value(c, locp, bit_width, bit_width, UNSIGNED); + edge = rvalue_materialize(c, locp, &edge); + if (op_is64bit) { + op2_m = gen_rvalue_extend(c, locp, &op2_m); + } + op1_m = rvalue_materialize(c, locp, &op1_m); + op2_m = rvalue_materialize(c, locp, &op2_m); + OUT(c, locp, "tcg_gen_movcond_i", &bit_width); + OUT(c, locp, "(TCG_COND_GEU, ", res, ", ", &op2_m, ", ", &edge); + OUT(c, locp, ", ", &zero, ", ", res, ");\n"); + gen_rvalue_free(c, locp, &edge); + } + gen_rvalue_free(c, locp, &op1_m); + gen_rvalue_free(c, locp, &op2_m); +} + +static void gen_asr_op(Context *c, YYLTYPE *locp, unsigned bit_width, + bool op_is64bit, const char *bit_suffix, + HexValue *res, enum OpTypes op_types, + HexValue *op1, HexValue *op2) +{ + HexValue op1_m = *op1; + HexValue op2_m = *op2; + switch (op_types) { + case IMM_IMM: + case IMM_REG: + yyassert(c, locp, false, "ASR between IMM op IMM, and IMM op REG" + " not handled!"); + break; + case REG_IMM: { + HexSignedness signedness = bin_op_signedness(c, locp, + op1->signedness, + op2->signedness); + OUT(c, locp, "{\n"); + gen_c_int_type(c, locp, bit_width, signedness); + OUT(c, locp, " shift = ", op2, ";\n"); + OUT(c, locp, "if (", op2, " >= ", &bit_width, ") {\n"); + OUT(c, locp, " shift = ", &bit_width, " - 1;\n"); + OUT(c, locp, "}\n"); + OUT(c, locp, "tcg_gen_sari_", bit_suffix, + "(", res, ", ", op1, ", shift);\n}\n"); + } break; + case REG_REG: + OUT(c, locp, "tcg_gen_sar_", bit_suffix, + "(", res, ", ", &op1_m, ", ", op2, ");\n"); + break; + } + if (op_types == REG_REG) { + /* Handle right shift by values >= bit_width */ + const char *offset = op_is64bit ? "63" : "31"; + HexValue tmp = gen_tmp(c, locp, bit_width, SIGNED); + HexValue zero = gen_constant(c, locp, "0", bit_width, SIGNED); + HexValue edge = gen_imm_value(c, locp, bit_width, bit_width, UNSIGNED); + + edge = rvalue_materialize(c, locp, &edge); + if (op_is64bit) { + op2_m = gen_rvalue_extend(c, locp, &op2_m); + } + op1_m = rvalue_materialize(c, locp, &op1_m); + op2_m = rvalue_materialize(c, locp, &op2_m); + + OUT(c, locp, "tcg_gen_extract_", bit_suffix, "(", + &tmp, ", ", &op1_m, ", ", offset, ", 1);\n"); + OUT(c, locp, "tcg_gen_sub_", bit_suffix, "(", + &tmp, ", ", &zero, ", ", &tmp, ");\n"); + OUT(c, locp, "tcg_gen_movcond_i", &bit_width); + OUT(c, locp, "(TCG_COND_GEU, ", res, ", ", &op2_m, ", ", &edge); + OUT(c, locp, ", ", &tmp, ", ", res, ");\n"); + gen_rvalue_free(c, locp, &edge); + gen_rvalue_free(c, locp, &tmp); + } + gen_rvalue_free(c, locp, &op1_m); + gen_rvalue_free(c, locp, &op2_m); +} + +static void gen_lsr_op(Context *c, YYLTYPE *locp, unsigned bit_width, + bool op_is64bit, const char *bit_suffix, + HexValue *res, enum OpTypes op_types, + HexValue *op1, HexValue *op2) +{ + HexValue op1_m = *op1; + HexValue op2_m = *op2; + switch (op_types) { + case IMM_IMM: + case IMM_REG: + yyassert(c, locp, false, "LSR between IMM op IMM, and IMM op REG" + " not handled!"); + break; + case REG_IMM: + OUT(c, locp, "if (", op2, " >= ", &bit_width, ") {\n"); + OUT(c, locp, "tcg_gen_movi_", bit_suffix, "(", res, ", 0);\n"); + OUT(c, locp, "} else {\n"); + OUT(c, locp, "tcg_gen_shri_", bit_suffix, + "(", res, ", ", op1, ", ", op2, ");\n"); + OUT(c, locp, "}\n"); + break; + case REG_REG: + OUT(c, locp, "tcg_gen_shr_", bit_suffix, + "(", res, ", ", &op1_m, ", ", op2, ");\n"); + break; + } + if (op_types == REG_REG) { + /* Handle right shift by values >= bit_width */ + HexValue zero = gen_constant(c, locp, "0", bit_width, UNSIGNED); + HexValue edge = gen_imm_value(c, locp, bit_width, bit_width, UNSIGNED); + edge = rvalue_materialize(c, locp, &edge); + if (op_is64bit) { + op2_m = gen_rvalue_extend(c, locp, &op2_m); + } + op1_m = rvalue_materialize(c, locp, &op1_m); + op2_m = rvalue_materialize(c, locp, &op2_m); + OUT(c, locp, "tcg_gen_movcond_i", &bit_width); + OUT(c, locp, "(TCG_COND_GEU, ", res, ", ", &op2_m, ", ", &edge); + OUT(c, locp, ", ", &zero, ", ", res, ");\n"); + gen_rvalue_free(c, locp, &edge); + } + gen_rvalue_free(c, locp, &op1_m); + gen_rvalue_free(c, locp, &op2_m); +} + +/* + * Note: This implementation of logical `and` does not mirror that in C. + * We do not short-circuit logical expressions! + */ +static void gen_andl_op(Context *c, YYLTYPE *locp, unsigned bit_width, + const char *bit_suffix, HexValue *res, + enum OpTypes op_types, HexValue *op1, + HexValue *op2) +{ + (void) bit_width; + HexValue tmp1, tmp2; + HexValue zero = gen_constant(c, locp, "0", 32, UNSIGNED); + memset(&tmp1, 0, sizeof(HexValue)); + memset(&tmp2, 0, sizeof(HexValue)); + switch (op_types) { + case IMM_IMM: + case IMM_REG: + case REG_IMM: + yyassert(c, locp, false, "ANDL between IMM op IMM, IMM op REG, and" + " REG op IMM, not handled!"); + break; + case REG_REG: + tmp1 = gen_bin_cmp(c, locp, TCG_COND_NE, op1, &zero); + tmp2 = gen_bin_cmp(c, locp, TCG_COND_NE, op2, &zero); + OUT(c, locp, "tcg_gen_and_", bit_suffix, + "(", res, ", ", &tmp1, ", ", &tmp2, ");\n"); + gen_rvalue_free_manual(c, locp, &zero); + gen_rvalue_free(c, locp, &tmp1); + gen_rvalue_free(c, locp, &tmp2); + break; + } +} + +static void gen_minmax_op(Context *c, YYLTYPE *locp, unsigned bit_width, + HexValue *res, enum OpTypes op_types, + HexValue *op1, HexValue *op2, bool minmax) +{ + const char *mm; + HexValue op1_m = *op1; + HexValue op2_m = *op2; + bool is_unsigned; + + assert_signedness(c, locp, res->signedness); + is_unsigned = res->signedness == UNSIGNED; + + if (minmax) { + /* Max */ + mm = is_unsigned ? "tcg_gen_umax" : "tcg_gen_smax"; + } else { + /* Min */ + mm = is_unsigned ? "tcg_gen_umin" : "tcg_gen_smin"; + } + switch (op_types) { + case IMM_IMM: + yyassert(c, locp, false, "MINMAX between IMM op IMM, not handled!"); + break; + case IMM_REG: + op1_m.bit_width = bit_width; + op1_m = rvalue_materialize(c, locp, &op1_m); + OUT(c, locp, mm, "_i", &bit_width, "("); + OUT(c, locp, res, ", ", &op1_m, ", ", op2, ");\n"); + break; + case REG_IMM: + op2_m.bit_width = bit_width; + op2_m = rvalue_materialize(c, locp, &op2_m); + /* Fallthrough */ + case REG_REG: + OUT(c, locp, mm, "_i", &bit_width, "("); + OUT(c, locp, res, ", ", op1, ", ", &op2_m, ");\n"); + break; + } + gen_rvalue_free(c, locp, &op1_m); + gen_rvalue_free(c, locp, &op2_m); +} + +/* Code generation functions */ +HexValue gen_bin_op(Context *c, + YYLTYPE *locp, + OpType type, + HexValue *op1, + HexValue *op2) +{ + /* Replicate operands to avoid side effects */ + HexValue op1_m = *op1; + HexValue op2_m = *op2; + enum OpTypes op_types; + bool op_is64bit; + HexSignedness signedness; + unsigned bit_width; + const char *bit_suffix; + HexValue res; + + memset(&res, 0, sizeof(HexValue)); + + /* + * If the operands are VARID's we need to look up the + * type information. + */ + if (op1_m.type == VARID) { + find_variable(c, locp, &op1_m, &op1_m); + } + if (op2_m.type == VARID) { + find_variable(c, locp, &op2_m, &op2_m); + } + + op_types = (op1_m.type != IMMEDIATE) << 1 + | (op2_m.type != IMMEDIATE); + op_is64bit = op1_m.bit_width == 64 || op2_m.bit_width == 64; + /* Shift greater than 32 are 64 bits wide */ + + if (type == ASL_OP && op2_m.type == IMMEDIATE && + op2_m.imm.type == VALUE && op2_m.imm.value >= 32) { + op_is64bit = true; + } + + bit_width = (op_is64bit) ? 64 : 32; + bit_suffix = op_is64bit ? "i64" : "i32"; + + /* Extend to 64-bits, if required */ + if (op_is64bit) { + op1_m = gen_rvalue_extend(c, locp, &op1_m); + op2_m = gen_rvalue_extend(c, locp, &op2_m); + } + + signedness = bin_op_signedness(c, locp, op1_m.signedness, op2_m.signedness); + if (op_types != IMM_IMM) { + res = gen_tmp(c, locp, bit_width, signedness); + } else { + res = gen_imm_qemu_tmp(c, locp, bit_width, signedness); + } + + switch (type) { + case ADD_OP: + gen_simple_op(c, locp, bit_width, bit_suffix, &res, + op_types, &op1_m, &op2_m, + " + ", + "tcg_gen_addi_", + "tcg_gen_addi_", + "tcg_gen_add_"); + break; + case SUB_OP: + gen_sub_op(c, locp, bit_width, bit_suffix, &res, op_types, + &op1_m, &op2_m); + break; + case MUL_OP: + gen_simple_op(c, locp, bit_width, bit_suffix, &res, + op_types, &op1_m, &op2_m, + " * ", + "tcg_gen_muli_", + "tcg_gen_muli_", + "tcg_gen_mul_"); + break; + case ASL_OP: + gen_asl_op(c, locp, bit_width, op_is64bit, bit_suffix, &res, op_types, + &op1_m, &op2_m); + break; + case ASR_OP: + gen_asr_op(c, locp, bit_width, op_is64bit, bit_suffix, &res, op_types, + &op1_m, &op2_m); + break; + case LSR_OP: + gen_lsr_op(c, locp, bit_width, op_is64bit, bit_suffix, &res, op_types, + &op1_m, &op2_m); + break; + case ANDB_OP: + gen_simple_op(c, locp, bit_width, bit_suffix, &res, + op_types, &op1_m, &op2_m, + " & ", + "tcg_gen_andi_", + "tcg_gen_andi_", + "tcg_gen_and_"); + break; + case ORB_OP: + gen_simple_op(c, locp, bit_width, bit_suffix, &res, + op_types, &op1_m, &op2_m, + " | ", + "tcg_gen_ori_", + "tcg_gen_ori_", + "tcg_gen_or_"); + break; + case XORB_OP: + gen_simple_op(c, locp, bit_width, bit_suffix, &res, + op_types, &op1_m, &op2_m, + " ^ ", + "tcg_gen_xori_", + "tcg_gen_xori_", + "tcg_gen_xor_"); + break; + case ANDL_OP: + gen_andl_op(c, locp, bit_width, bit_suffix, &res, op_types, &op1_m, + &op2_m); + break; + case MINI_OP: + gen_minmax_op(c, locp, bit_width, &res, op_types, &op1_m, &op2_m, + false); + break; + case MAXI_OP: + gen_minmax_op(c, locp, bit_width, &res, op_types, &op1_m, &op2_m, true); + break; + } + return res; +} + +HexValue gen_cast_op(Context *c, + YYLTYPE *locp, + HexValue *src, + unsigned target_width, + HexSignedness signedness) +{ + assert_signedness(c, locp, src->signedness); + if (src->bit_width == target_width) { + return *src; + } else if (src->type == IMMEDIATE) { + HexValue res = *src; + res.bit_width = target_width; + res.signedness = signedness; + return res; + } else { + HexValue res = gen_tmp(c, locp, target_width, signedness); + /* Truncate */ + if (src->bit_width > target_width) { + OUT(c, locp, "tcg_gen_trunc_i64_tl(", &res, ", ", src, ");\n"); + } else { + assert_signedness(c, locp, src->signedness); + if (src->signedness == UNSIGNED) { + /* Extend unsigned */ + OUT(c, locp, "tcg_gen_extu_i32_i64(", + &res, ", ", src, ");\n"); + } else { + /* Extend signed */ + OUT(c, locp, "tcg_gen_ext_i32_i64(", + &res, ", ", src, ");\n"); + } + } + gen_rvalue_free(c, locp, src); + return res; + } +} + + +/* + * Implements an extension when the `src_width` is an immediate. + * If the `value` to extend is also an immediate we use `extract/sextract` + * from QEMU `bitops.h`. If `value` is a TCGv then we rely on + * `tcg_gen_extract/tcg_gen_sextract`. + */ +static HexValue gen_extend_imm_width_op(Context *c, + YYLTYPE *locp, + HexValue *src_width, + unsigned dst_width, + HexValue *value, + HexSignedness signedness) +{ + /* + * If the source width is not an immediate value, we need to guard + * our extend op with if statements to handle the case where + * `src_width_m` is 0. + */ + const char *sign_prefix; + bool need_guarding; + + assert_signedness(c, locp, signedness); + assert(dst_width == 64 || dst_width == 32); + assert(src_width->type == IMMEDIATE); + + sign_prefix = (signedness == UNSIGNED) ? "" : "s"; + need_guarding = (src_width->imm.type != VALUE); + + if (src_width->imm.type == VALUE && + src_width->imm.value == 0) { + /* + * We can bail out early if the source width is known to be zero + * at translation time. + */ + return gen_imm_value(c, locp, 0, dst_width, signedness); + } + + if (value->type == IMMEDIATE) { + /* + * If both the value and source width are immediates, + * we can perform the extension at translation time + * using QEMUs bitops. + */ + HexValue res = gen_imm_qemu_tmp(c, locp, dst_width, signedness); + gen_c_int_type(c, locp, dst_width, signedness); + OUT(c, locp, " ", &res, " = 0;\n"); + if (need_guarding) { + OUT(c, locp, "if (", src_width, " != 0) {\n"); + } + OUT(c, locp, &res, " = ", sign_prefix, "extract", &dst_width); + OUT(c, locp, "(", value, ", 0, ", src_width, ");\n"); + if (need_guarding) { + OUT(c, locp, "}\n"); + } + + gen_rvalue_free(c, locp, value); + return res; + } else { + /* + * If the source width is an immediate and the value to + * extend is a TCGv, then use tcg_gen_extract/tcg_gen_sextract + */ + HexValue res = gen_tmp(c, locp, dst_width, signedness); + + /* + * If the width is an immediate value we know it is non-zero + * at this point, otherwise we need an if-statement + */ + if (need_guarding) { + OUT(c, locp, "if (", src_width, " != 0) {\n"); + } + OUT(c, locp, "tcg_gen_", sign_prefix, "extract_i", &dst_width); + OUT(c, locp, "(", &res, ", ", value, ", 0, ", src_width, + ");\n"); + if (need_guarding) { + OUT(c, locp, "} else {\n"); + OUT(c, locp, "tcg_gen_movi_i", &dst_width, "(", &res, + ", 0);\n"); + OUT(c, locp, "}\n"); + } + + gen_rvalue_free(c, locp, value); + return res; + } +} + +/* + * Implements an extension when the `src_width` is given by + * a TCGv. Here we need to reimplement the behaviour of + * `tcg_gen_extract` and the like using shifts and masks. + */ +static HexValue gen_extend_tcg_width_op(Context *c, + YYLTYPE *locp, + HexValue *src_width, + unsigned dst_width, + HexValue *value, + HexSignedness signedness) +{ + HexValue src_width_m = rvalue_materialize(c, locp, src_width); + HexValue zero = gen_constant(c, locp, "0", dst_width, UNSIGNED); + HexValue shift = gen_tmp(c, locp, dst_width, UNSIGNED); + HexValue res; + + assert_signedness(c, locp, signedness); + assert(dst_width == 64 || dst_width == 32); + assert(src_width->type != IMMEDIATE); + + res = gen_tmp(c, locp, dst_width, signedness); + + OUT(c, locp, "tcg_gen_subfi_i", &dst_width); + OUT(c, locp, "(", &shift, ", ", &dst_width, ", ", &src_width_m, ");\n"); + if (signedness == UNSIGNED) { + const char *mask_str = (dst_width == 32) + ? "0xffffffff" + : "0xffffffffffffffff"; + HexValue mask = gen_tmp_value(c, locp, mask_str, + dst_width, UNSIGNED); + OUT(c, locp, "tcg_gen_shr_i", &dst_width, "(", + &mask, ", ", &mask, ", ", &shift, ");\n"); + OUT(c, locp, "tcg_gen_and_i", &dst_width, "(", + &res, ", ", value, ", ", &mask, ");\n"); + gen_rvalue_free(c, locp, &mask); + } else { + OUT(c, locp, "tcg_gen_shl_i", &dst_width, "(", + &res, ", ", value, ", ", &shift, ");\n"); + OUT(c, locp, "tcg_gen_sar_i", &dst_width, "(", + &res, ", ", &res, ", ", &shift, ");\n"); + } + OUT(c, locp, "tcg_gen_movcond_i", &dst_width, "(TCG_COND_EQ, ", &res, + ", "); + OUT(c, locp, &src_width_m, ", ", &zero, ", ", &zero, ", ", &res, + ");\n"); + + gen_rvalue_free(c, locp, &src_width_m); + gen_rvalue_free(c, locp, value); + gen_rvalue_free(c, locp, &shift); + + return res; +} + +HexValue gen_extend_op(Context *c, + YYLTYPE *locp, + HexValue *src_width, + unsigned dst_width, + HexValue *value, + HexSignedness signedness) +{ + unsigned bit_width = (dst_width = 64) ? 64 : 32; + HexValue value_m = *value; + HexValue src_width_m = *src_width; + + assert_signedness(c, locp, signedness); + yyassert(c, locp, value_m.bit_width <= bit_width && + src_width_m.bit_width <= bit_width, + "Extending to a size smaller than the current size" + " makes no sense"); + + if (value_m.bit_width < bit_width) { + value_m = gen_rvalue_extend(c, locp, &value_m); + } + + if (src_width_m.bit_width < bit_width) { + src_width_m = gen_rvalue_extend(c, locp, &src_width_m); + } + + if (src_width_m.type == IMMEDIATE) { + return gen_extend_imm_width_op(c, locp, &src_width_m, bit_width, + &value_m, signedness); + } else { + return gen_extend_tcg_width_op(c, locp, &src_width_m, bit_width, + &value_m, signedness); + } +} + +/* + * Implements `rdeposit` for the special case where `width` + * is of TCGv type. In this case we need to reimplement the behaviour + * of `tcg_gen_deposit*` using binary operations and masks/shifts. + * + * Note: this is the only type of `rdeposit` that occurs, meaning the + * `width` is _NEVER_ of IMMEDIATE type. + */ +void gen_rdeposit_op(Context *c, + YYLTYPE *locp, + HexValue *dst, + HexValue *value, + HexValue *begin, + HexValue *width) +{ + /* + * Otherwise if the width is not known, we fallback on reimplementing + * desposit in TCG. + */ + HexValue begin_m = *begin; + HexValue value_m = *value; + HexValue width_m = *width; + const char *mask_str = (dst->bit_width == 32) + ? "0xffffffffUL" + : "0xffffffffffffffffUL"; + HexValue mask = gen_constant(c, locp, mask_str, dst->bit_width, + UNSIGNED); + const char *dst_width_str = (dst->bit_width == 32) ? "32" : "64"; + HexValue k64 = gen_constant(c, locp, dst_width_str, dst->bit_width, + UNSIGNED); + HexValue res; + HexValue zero; + + assert(dst->bit_width >= value->bit_width); + assert(begin->type == IMMEDIATE && begin->imm.type == VALUE); + assert(dst->type == REGISTER_ARG); + + yyassert(c, locp, width->type != IMMEDIATE, + "Immediate index to rdeposit not handled!"); + + yyassert(c, locp, value_m.bit_width == dst->bit_width && + begin_m.bit_width == dst->bit_width && + width_m.bit_width == dst->bit_width, + "Extension/truncation should be taken care of" + " before rdeposit!"); + + width_m = rvalue_materialize(c, locp, &width_m); + + /* + * mask = 0xffffffffffffffff >> (64 - width) + * mask = mask << begin + * value = (value << begin) & mask + * res = dst & ~mask + * res = res | value + * dst = (width != 0) ? res : dst + */ + k64 = gen_bin_op(c, locp, SUB_OP, &k64, &width_m); + mask = gen_bin_op(c, locp, LSR_OP, &mask, &k64); + begin_m.is_manual = true; + mask = gen_bin_op(c, locp, ASL_OP, &mask, &begin_m); + mask.is_manual = true; + value_m = gen_bin_op(c, locp, ASL_OP, &value_m, &begin_m); + value_m = gen_bin_op(c, locp, ANDB_OP, &value_m, &mask); + + OUT(c, locp, "tcg_gen_not_i", &dst->bit_width, "(", &mask, ", ", + &mask, ");\n"); + mask.is_manual = false; + res = gen_bin_op(c, locp, ANDB_OP, dst, &mask); + res = gen_bin_op(c, locp, ORB_OP, &res, &value_m); + + /* + * We don't need to truncate `res` here, since all operations involved use + * the same bit width. + */ + + /* If the width is zero, then return the identity dst = dst */ + zero = gen_constant(c, locp, "0", res.bit_width, UNSIGNED); + OUT(c, locp, "tcg_gen_movcond_i", &res.bit_width, "(TCG_COND_NE, ", + dst); + OUT(c, locp, ", ", &width_m, ", ", &zero, ", ", &res, ", ", dst, + ");\n"); + + gen_rvalue_free(c, locp, width); + gen_rvalue_free(c, locp, &res); +} + +void gen_deposit_op(Context *c, + YYLTYPE *locp, + HexValue *dst, + HexValue *value, + HexValue *index, + HexCast *cast) +{ + HexValue value_m = *value; + unsigned bit_width = (dst->bit_width == 64) ? 64 : 32; + unsigned width = cast->bit_width; + + yyassert(c, locp, index->type == IMMEDIATE, + "Deposit index must be immediate!\n"); + + /* + * Using tcg_gen_deposit_i**(dst, dst, ...) requires dst to be + * initialized. + */ + gen_inst_init_args(c, locp); + + /* If the destination value is 32, truncate the value, otherwise extend */ + if (dst->bit_width != value->bit_width) { + if (bit_width == 32) { + value_m = gen_rvalue_truncate(c, locp, &value_m); + } else { + value_m = gen_rvalue_extend(c, locp, &value_m); + } + } + value_m = rvalue_materialize(c, locp, &value_m); + OUT(c, locp, "tcg_gen_deposit_i", &bit_width, "(", dst, ", ", dst, ", "); + OUT(c, locp, &value_m, ", ", index, " * ", &width, ", ", &width, ");\n"); + gen_rvalue_free(c, locp, index); + gen_rvalue_free(c, locp, &value_m); +} + +HexValue gen_rextract_op(Context *c, + YYLTYPE *locp, + HexValue *src, + unsigned begin, + unsigned width) +{ + unsigned bit_width = (src->bit_width == 64) ? 64 : 32; + HexValue res = gen_tmp(c, locp, bit_width, UNSIGNED); + OUT(c, locp, "tcg_gen_extract_i", &bit_width, "(", &res); + OUT(c, locp, ", ", src, ", ", &begin, ", ", &width, ");\n"); + gen_rvalue_free(c, locp, src); + return res; +} + +HexValue gen_extract_op(Context *c, + YYLTYPE *locp, + HexValue *src, + HexValue *index, + HexExtract *extract) +{ + unsigned bit_width = (src->bit_width == 64) ? 64 : 32; + unsigned width = extract->bit_width; + const char *sign_prefix; + HexValue res; + + yyassert(c, locp, index->type == IMMEDIATE, + "Extract index must be immediate!\n"); + assert_signedness(c, locp, extract->signedness); + + sign_prefix = (extract->signedness == UNSIGNED) ? "" : "s"; + res = gen_tmp(c, locp, bit_width, extract->signedness); + + OUT(c, locp, "tcg_gen_", sign_prefix, "extract_i", &bit_width, + "(", &res, ", ", src); + OUT(c, locp, ", ", index, " * ", &width, ", ", &width, ");\n"); + + /* Some extract operations have bit_width != storage_bit_width */ + if (extract->storage_bit_width > bit_width) { + HexValue tmp = gen_tmp(c, locp, extract->storage_bit_width, + extract->signedness); + const char *sign_suffix = (extract->signedness == UNSIGNED) ? "u" : ""; + OUT(c, locp, "tcg_gen_ext", sign_suffix, "_i32_i64(", + &tmp, ", ", &res, ");\n"); + gen_rvalue_free(c, locp, &res); + res = tmp; + } + + gen_rvalue_free(c, locp, src); + gen_rvalue_free(c, locp, index); + return res; +} + +void gen_write_reg(Context *c, YYLTYPE *locp, HexValue *reg, HexValue *value) +{ + HexValue value_m = *value; + yyassert(c, locp, reg->type == REGISTER, "reg must be a register!"); + value_m = gen_rvalue_truncate(c, locp, &value_m); + value_m = rvalue_materialize(c, locp, &value_m); + OUT(c, + locp, + "gen_log_reg_write(", ®->reg.id, ", ", + &value_m, ");\n"); + OUT(c, + locp, + "ctx_log_reg_write(ctx, ", ®->reg.id, + ");\n"); + gen_rvalue_free(c, locp, reg); + gen_rvalue_free(c, locp, &value_m); +} + +void gen_assign(Context *c, + YYLTYPE *locp, + HexValue *dst, + HexValue *value) +{ + HexValue value_m = *value; + unsigned bit_width; + + yyassert(c, locp, !is_inside_ternary(c), + "Assign in ternary not allowed!"); + + if (dst->type == REGISTER) { + gen_write_reg(c, locp, dst, &value_m); + return; + } + + if (dst->type == VARID) { + find_variable(c, locp, dst, dst); + } + bit_width = dst->bit_width == 64 ? 64 : 32; + + if (bit_width != value_m.bit_width) { + if (bit_width == 64) { + value_m = gen_rvalue_extend(c, locp, &value_m); + } else { + value_m = gen_rvalue_truncate(c, locp, &value_m); + } + } + + const char *imm_suffix = (value_m.type == IMMEDIATE) ? "i" : ""; + OUT(c, locp, "tcg_gen_mov", imm_suffix, "_i", &bit_width, + "(", dst, ", ", &value_m, ");\n"); + + gen_rvalue_free(c, locp, &value_m); +} + +HexValue gen_convround(Context *c, + YYLTYPE *locp, + HexValue *src) +{ + HexValue src_m = *src; + unsigned bit_width = src_m.bit_width; + const char *size = (bit_width == 32) ? "32" : "64"; + HexValue res = gen_tmp(c, locp, bit_width, src->signedness); + HexValue mask = gen_constant(c, locp, "0x3", bit_width, UNSIGNED); + HexValue one = gen_constant(c, locp, "1", bit_width, UNSIGNED); + HexValue and; + HexValue src_p1; + + src_m.is_manual = true; + + and = gen_bin_op(c, locp, ANDB_OP, &src_m, &mask); + src_p1 = gen_bin_op(c, locp, ADD_OP, &src_m, &one); + + OUT(c, locp, "tcg_gen_movcond_i", size, "(TCG_COND_EQ, ", &res); + OUT(c, locp, ", ", &and, ", ", &mask, ", "); + OUT(c, locp, &src_p1, ", ", &src_m, ");\n"); + + /* Free src but use the original `is_manual` value */ + gen_rvalue_free(c, locp, src); + + /* Free the rest of the values */ + gen_rvalue_free(c, locp, &src_p1); + + return res; +} + +static HexValue gen_convround_n_b(Context *c, + YYLTYPE *locp, + HexValue *a, + HexValue *n) +{ + HexValue one = gen_constant(c, locp, "1", 32, UNSIGNED); + HexValue res = gen_tmp(c, locp, 64, UNSIGNED); + HexValue tmp = gen_tmp(c, locp, 32, UNSIGNED); + HexValue tmp_64 = gen_tmp(c, locp, 64, UNSIGNED); + + assert(n->type != IMMEDIATE); + OUT(c, locp, "tcg_gen_ext_i32_i64(", &res, ", ", a, ");\n"); + OUT(c, locp, "tcg_gen_shl_i32(", &tmp); + OUT(c, locp, ", ", &one, ", ", n, ");\n"); + OUT(c, locp, "tcg_gen_and_i32(", &tmp); + OUT(c, locp, ", ", &tmp, ", ", a, ");\n"); + OUT(c, locp, "tcg_gen_shri_i32(", &tmp); + OUT(c, locp, ", ", &tmp, ", 1);\n"); + OUT(c, locp, "tcg_gen_ext_i32_i64(", &tmp_64, ", ", &tmp, ");\n"); + OUT(c, locp, "tcg_gen_add_i64(", &res); + OUT(c, locp, ", ", &res, ", ", &tmp_64, ");\n"); + + gen_rvalue_free(c, locp, &tmp); + gen_rvalue_free(c, locp, &tmp_64); + + return res; +} + +static HexValue gen_convround_n_c(Context *c, + YYLTYPE *locp, + HexValue *a, + HexValue *n) +{ + HexValue res = gen_tmp(c, locp, 64, UNSIGNED); + HexValue one = gen_constant(c, locp, "1", 32, UNSIGNED); + HexValue tmp = gen_tmp(c, locp, 32, UNSIGNED); + HexValue tmp_64 = gen_tmp(c, locp, 64, UNSIGNED); + + OUT(c, locp, "tcg_gen_ext_i32_i64(", &res, ", ", a, ");\n"); + OUT(c, locp, "tcg_gen_subi_i32(", &tmp); + OUT(c, locp, ", ", n, ", 1);\n"); + OUT(c, locp, "tcg_gen_shl_i32(", &tmp); + OUT(c, locp, ", ", &one, ", ", &tmp, ");\n"); + OUT(c, locp, "tcg_gen_ext_i32_i64(", &tmp_64, ", ", &tmp, ");\n"); + OUT(c, locp, "tcg_gen_add_i64(", &res); + OUT(c, locp, ", ", &res, ", ", &tmp_64, ");\n"); + + gen_rvalue_free(c, locp, &one); + gen_rvalue_free(c, locp, &tmp); + gen_rvalue_free(c, locp, &tmp_64); + + return res; +} + +HexValue gen_convround_n(Context *c, + YYLTYPE *locp, + HexValue *src, + HexValue *pos) +{ + HexValue zero = gen_constant(c, locp, "0", 64, UNSIGNED); + HexValue l_32 = gen_constant(c, locp, "1", 32, UNSIGNED); + HexValue cond = gen_tmp(c, locp, 32, UNSIGNED); + HexValue cond_64 = gen_tmp(c, locp, 64, UNSIGNED); + HexValue mask = gen_tmp(c, locp, 32, UNSIGNED); + HexValue n_64 = gen_tmp(c, locp, 64, UNSIGNED); + HexValue res = gen_tmp(c, locp, 64, UNSIGNED); + /* If input is 64 bit cast it to 32 */ + HexValue src_casted = gen_cast_op(c, locp, src, 32, src->signedness); + HexValue pos_casted = gen_cast_op(c, locp, pos, 32, pos->signedness); + HexValue r1; + HexValue r2; + HexValue r3; + + src_casted = rvalue_materialize(c, locp, &src_casted); + pos_casted = rvalue_materialize(c, locp, &pos_casted); + + /* + * r1, r2, and r3 represent the results of three different branches. + * - r1 picked if pos_casted == 0 + * - r2 picked if (src_casted & ((1 << (pos_casted - 1)) - 1)) == 0), + * that is if bits 0, ..., pos_casted-1 are all 0. + * - r3 picked otherwise. + */ + r1 = gen_rvalue_extend(c, locp, &src_casted); + r2 = gen_convround_n_b(c, locp, &src_casted, &pos_casted); + r3 = gen_convround_n_c(c, locp, &src_casted, &pos_casted); + + /* + * Calculate the condition + * (src_casted & ((1 << (pos_casted - 1)) - 1)) == 0), + * which checks if the bits 0,...,pos-1 are all 0. + */ + OUT(c, locp, "tcg_gen_sub_i32(", &mask); + OUT(c, locp, ", ", &pos_casted, ", ", &l_32, ");\n"); + OUT(c, locp, "tcg_gen_shl_i32(", &mask); + OUT(c, locp, ", ", &l_32, ", ", &mask, ");\n"); + OUT(c, locp, "tcg_gen_sub_i32(", &mask); + OUT(c, locp, ", ", &mask, ", ", &l_32, ");\n"); + OUT(c, locp, "tcg_gen_and_i32(", &cond); + OUT(c, locp, ", ", &src_casted, ", ", &mask, ");\n"); + OUT(c, locp, "tcg_gen_extu_i32_i64(", &cond_64, ", ", &cond, ");\n"); + + OUT(c, locp, "tcg_gen_ext_i32_i64(", &n_64, ", ", &pos_casted, ");\n"); + + /* + * if the bits 0, ..., pos_casted-1 are all 0, then pick r2 otherwise, + * pick r3. + */ + OUT(c, locp, "tcg_gen_movcond_i64"); + OUT(c, locp, "(TCG_COND_EQ, ", &res, ", ", &cond_64, ", ", &zero); + OUT(c, locp, ", ", &r2, ", ", &r3, ");\n"); + + /* Lastly, if the pos_casted == 0, then pick r1 */ + OUT(c, locp, "tcg_gen_movcond_i64"); + OUT(c, locp, "(TCG_COND_EQ, ", &res, ", ", &n_64, ", ", &zero); + OUT(c, locp, ", ", &r1, ", ", &res, ");\n"); + + /* Finally shift back val >>= n */ + OUT(c, locp, "tcg_gen_shr_i64(", &res); + OUT(c, locp, ", ", &res, ", ", &n_64, ");\n"); + + gen_rvalue_free(c, locp, &src_casted); + gen_rvalue_free(c, locp, &pos_casted); + + gen_rvalue_free(c, locp, &r1); + gen_rvalue_free(c, locp, &r2); + gen_rvalue_free(c, locp, &r3); + + gen_rvalue_free(c, locp, &cond); + gen_rvalue_free(c, locp, &cond_64); + gen_rvalue_free(c, locp, &mask); + gen_rvalue_free(c, locp, &n_64); + + res = gen_rvalue_truncate(c, locp, &res); + return res; +} + +HexValue gen_round(Context *c, + YYLTYPE *locp, + HexValue *src, + HexValue *pos) +{ + HexValue zero = gen_constant(c, locp, "0", 64, UNSIGNED); + HexValue one = gen_constant(c, locp, "1", 64, UNSIGNED); + HexValue res; + HexValue n_m1; + HexValue shifted; + HexValue sum; + HexValue src_width; + HexValue a; + HexValue b; + + assert_signedness(c, locp, src->signedness); + yyassert(c, locp, src->bit_width <= 32, + "fRNDN not implemented for bit widths > 32!"); + + res = gen_tmp(c, locp, 64, src->signedness); + + src_width = gen_imm_value(c, locp, src->bit_width, 32, UNSIGNED); + a = gen_extend_op(c, locp, &src_width, 64, src, SIGNED); + a = rvalue_materialize(c, locp, &a); + + src_width = gen_imm_value(c, locp, 5, 32, UNSIGNED); + b = gen_extend_op(c, locp, &src_width, 64, pos, UNSIGNED); + b = rvalue_materialize(c, locp, &b); + + /* Disable auto-free of values used more than once */ + a.is_manual = true; + b.is_manual = true; + + n_m1 = gen_bin_op(c, locp, SUB_OP, &b, &one); + shifted = gen_bin_op(c, locp, ASL_OP, &one, &n_m1); + sum = gen_bin_op(c, locp, ADD_OP, &shifted, &a); + + OUT(c, locp, "tcg_gen_movcond_i64"); + OUT(c, locp, "(TCG_COND_EQ, ", &res, ", ", &b, ", ", &zero); + OUT(c, locp, ", ", &a, ", ", &sum, ");\n"); + + gen_rvalue_free_manual(c, locp, &a); + gen_rvalue_free_manual(c, locp, &b); + gen_rvalue_free(c, locp, &sum); + + return res; +} + +/* Circular addressing mode with auto-increment */ +void gen_circ_op(Context *c, + YYLTYPE *locp, + HexValue *addr, + HexValue *increment, + HexValue *modifier) +{ + HexValue cs = gen_tmp(c, locp, 32, UNSIGNED); + HexValue increment_m = *increment; + increment_m = rvalue_materialize(c, locp, &increment_m); + OUT(c, locp, "gen_read_reg(", &cs, ", HEX_REG_CS0 + MuN);\n"); + OUT(c, + locp, + "gen_helper_fcircadd(", + addr, + ", ", + addr, + ", ", + &increment_m, + ", ", + modifier); + OUT(c, locp, ", ", &cs, ");\n"); + gen_rvalue_free(c, locp, &increment_m); + gen_rvalue_free(c, locp, modifier); + gen_rvalue_free(c, locp, &cs); +} + +HexValue gen_locnt_op(Context *c, YYLTYPE *locp, HexValue *src) +{ + const char *bit_suffix = src->bit_width == 64 ? "64" : "32"; + HexValue src_m = *src; + HexValue res; + + assert_signedness(c, locp, src->signedness); + res = gen_tmp(c, locp, src->bit_width == 64 ? 64 : 32, src->signedness); + src_m = rvalue_materialize(c, locp, &src_m); + OUT(c, locp, "tcg_gen_not_i", bit_suffix, "(", + &res, ", ", &src_m, ");\n"); + OUT(c, locp, "tcg_gen_clzi_i", bit_suffix, "(", &res, ", ", &res, ", "); + OUT(c, locp, bit_suffix, ");\n"); + gen_rvalue_free(c, locp, &src_m); + return res; +} + +HexValue gen_ctpop_op(Context *c, YYLTYPE *locp, HexValue *src) +{ + const char *bit_suffix = src->bit_width == 64 ? "64" : "32"; + HexValue src_m = *src; + HexValue res; + assert_signedness(c, locp, src->signedness); + res = gen_tmp(c, locp, src->bit_width == 64 ? 64 : 32, src->signedness); + src_m = rvalue_materialize(c, locp, &src_m); + OUT(c, locp, "tcg_gen_ctpop_i", bit_suffix, + "(", &res, ", ", &src_m, ");\n"); + gen_rvalue_free(c, locp, &src_m); + return res; +} + +HexValue gen_rotl(Context *c, YYLTYPE *locp, HexValue *src, HexValue *width) +{ + const char *suffix = src->bit_width == 64 ? "i64" : "i32"; + HexValue amount = *width; + HexValue res; + assert_signedness(c, locp, src->signedness); + res = gen_tmp(c, locp, src->bit_width, src->signedness); + if (amount.bit_width < src->bit_width) { + amount = gen_rvalue_extend(c, locp, &amount); + } else { + amount = gen_rvalue_truncate(c, locp, &amount); + } + amount = rvalue_materialize(c, locp, &amount); + OUT(c, locp, "tcg_gen_rotl_", suffix, "(", + &res, ", ", src, ", ", &amount, ");\n"); + gen_rvalue_free(c, locp, src); + gen_rvalue_free(c, locp, &amount); + + return res; +} + +HexValue gen_carry_from_add(Context *c, + YYLTYPE *locp, + HexValue *op1, + HexValue *op2, + HexValue *op3) +{ + HexValue zero = gen_constant(c, locp, "0", 64, UNSIGNED); + HexValue res = gen_tmp(c, locp, 64, UNSIGNED); + HexValue cf = gen_tmp(c, locp, 64, UNSIGNED); + HexValue op1_m = rvalue_materialize(c, locp, op1); + HexValue op2_m = rvalue_materialize(c, locp, op2); + HexValue op3_m = rvalue_materialize(c, locp, op3); + op3_m = gen_rvalue_extend(c, locp, &op3_m); + + OUT(c, locp, "tcg_gen_add2_i64(", &res, ", ", &cf, ", ", &op1_m, ", ", + &zero); + OUT(c, locp, ", ", &op3_m, ", ", &zero, ");\n"); + OUT(c, locp, "tcg_gen_add2_i64(", &res, ", ", &cf, ", ", &res, ", ", &cf); + OUT(c, locp, ", ", &op2_m, ", ", &zero, ");\n"); + + gen_rvalue_free(c, locp, &op1_m); + gen_rvalue_free(c, locp, &op2_m); + gen_rvalue_free(c, locp, &op3_m); + gen_rvalue_free(c, locp, &res); + return cf; +} + +void gen_addsat64(Context *c, + YYLTYPE *locp, + HexValue *dst, + HexValue *op1, + HexValue *op2) +{ + HexValue op1_m = rvalue_materialize(c, locp, op1); + HexValue op2_m = rvalue_materialize(c, locp, op2); + OUT(c, locp, "gen_add_sat_i64(", dst, ", ", &op1_m, ", ", &op2_m, ");\n"); +} + +void gen_inst(Context *c, GString *iname) +{ + c->total_insn++; + c->inst.name = iname; + c->inst.allocated = g_array_new(FALSE, FALSE, sizeof(Var)); + c->inst.init_list = g_array_new(FALSE, FALSE, sizeof(HexValue)); + c->inst.strings = g_array_new(FALSE, FALSE, sizeof(GString *)); + EMIT_SIG(c, "void emit_%s(DisasContext *ctx, Insn *insn, Packet *pkt", + c->inst.name->str); +} + + +/* + * Initialize declared but uninitialized registers, but only for + * non-conditional instructions + */ +void gen_inst_init_args(Context *c, YYLTYPE *locp) +{ + if (!c->inst.init_list) { + return; + } + + for (unsigned i = 0; i < c->inst.init_list->len; i++) { + HexValue *val = &g_array_index(c->inst.init_list, HexValue, i); + if (val->type == REGISTER_ARG) { + char reg_id[5]; + reg_compose(c, locp, &val->reg, reg_id); + EMIT_HEAD(c, "tcg_gen_movi_i%u(%s, 0);\n", val->bit_width, reg_id); + } else if (val->type == PREDICATE) { + char suffix = val->is_dotnew ? 'N' : 'V'; + EMIT_HEAD(c, "tcg_gen_movi_i%u(P%c%c, 0);\n", val->bit_width, + val->pred.id, suffix); + } else { + yyassert(c, locp, false, "Invalid arg type!"); + } + } + + /* Free argument init list once we have initialized everything */ + g_array_free(c->inst.init_list, TRUE); + c->inst.init_list = NULL; +} + +void gen_inst_code(Context *c, YYLTYPE *locp) +{ + if (c->inst.error_count != 0) { + fprintf(stderr, + "Parsing of instruction %s generated %d errors!\n", + c->inst.name->str, + c->inst.error_count); + } else { + free_variables(c, locp); + c->implemented_insn++; + fprintf(c->enabled_file, "%s\n", c->inst.name->str); + emit_footer(c); + commit(c); + } + free_instruction(c); +} + +void gen_pred_assign(Context *c, YYLTYPE *locp, HexValue *left_pred, + HexValue *right_pred) +{ + char pred_id[2] = {left_pred->pred.id, 0}; + bool is_direct = is_direct_predicate(left_pred); + HexValue r = rvalue_materialize(c, locp, right_pred); + r = gen_rvalue_truncate(c, locp, &r); + yyassert(c, locp, !is_inside_ternary(c), + "Predicate assign not allowed in ternary!"); + /* Extract predicate TCGv */ + if (is_direct) { + *left_pred = gen_tmp_value(c, locp, "0", 32, UNSIGNED); + } + /* Extract first 8 bits, and store new predicate value */ + OUT(c, locp, "tcg_gen_mov_i32(", left_pred, ", ", &r, ");\n"); + OUT(c, locp, "tcg_gen_andi_i32(", left_pred, ", ", left_pred, + ", 0xff);\n"); + if (is_direct) { + OUT(c, locp, "gen_log_pred_write(ctx, ", pred_id, ", ", left_pred, + ");\n"); + OUT(c, locp, "ctx_log_pred_write(ctx, ", pred_id, ");\n"); + gen_rvalue_free(c, locp, left_pred); + } + /* Free temporary value */ + gen_rvalue_free(c, locp, &r); +} + +void gen_cancel(Context *c, YYLTYPE *locp) +{ + OUT(c, locp, "gen_cancel(insn->slot);\n"); +} + +void gen_load_cancel(Context *c, YYLTYPE *locp) +{ + gen_cancel(c, locp); + OUT(c, locp, "if (insn->slot == 0 && pkt->pkt_has_store_s1) {\n"); + OUT(c, locp, "ctx->s1_store_processed = false;\n"); + OUT(c, locp, "process_store(ctx, 1);\n"); + OUT(c, locp, "}\n"); +} + +void gen_load(Context *c, YYLTYPE *locp, HexValue *width, + HexSignedness signedness, HexValue *ea, HexValue *dst) +{ + char size_suffix[4] = {0}; + const char *sign_suffix; + /* Memop width is specified in the load macro */ + assert_signedness(c, locp, signedness); + sign_suffix = (width->imm.value > 4) + ? "" + : ((signedness == UNSIGNED) ? "u" : "s"); + /* If dst is a variable, assert that is declared and load the type info */ + if (dst->type == VARID) { + find_variable(c, locp, dst, dst); + } + + snprintf(size_suffix, 4, "%" PRIu64, width->imm.value * 8); + /* Lookup the effective address EA */ + find_variable(c, locp, ea, ea); + OUT(c, locp, "if (insn->slot == 0 && pkt->pkt_has_store_s1) {\n"); + OUT(c, locp, "probe_noshuf_load(", ea, ", ", width, ", ctx->mem_idx);\n"); + OUT(c, locp, "process_store(ctx, 1);\n"); + OUT(c, locp, "}\n"); + OUT(c, locp, "tcg_gen_qemu_ld", size_suffix, sign_suffix); + OUT(c, locp, "("); + if (dst->bit_width > width->imm.value * 8) { + /* + * Cast to the correct TCG type if necessary, to avoid implict cast + * warnings. This is needed when the width of the destination var is + * larger than the size of the requested load. + */ + OUT(c, locp, "(TCGv) "); + } + OUT(c, locp, dst, ", ", ea, ", ctx->mem_idx);\n"); + /* If the var in EA was truncated it is now a tmp HexValue, so free it. */ + gen_rvalue_free(c, locp, ea); +} + +void gen_store(Context *c, YYLTYPE *locp, HexValue *width, HexValue *ea, + HexValue *src) +{ + HexValue src_m = *src; + /* Memop width is specified in the store macro */ + unsigned mem_width = width->imm.value; + /* Lookup the effective address EA */ + find_variable(c, locp, ea, ea); + src_m = rvalue_materialize(c, locp, &src_m); + OUT(c, locp, "gen_store", &mem_width, "(cpu_env, ", ea, ", ", &src_m); + OUT(c, locp, ", insn->slot);\n"); + gen_rvalue_free(c, locp, &src_m); + /* If the var in ea was truncated it is now a tmp HexValue, so free it. */ + gen_rvalue_free(c, locp, ea); +} + +void gen_sethalf(Context *c, YYLTYPE *locp, HexCast *sh, HexValue *n, + HexValue *dst, HexValue *value) +{ + yyassert(c, locp, n->type == IMMEDIATE, + "Deposit index must be immediate!\n"); + if (dst->type == VARID) { + find_variable(c, locp, dst, dst); + } + + gen_deposit_op(c, locp, dst, value, n, sh); +} + +void gen_setbits(Context *c, YYLTYPE *locp, HexValue *hi, HexValue *lo, + HexValue *dst, HexValue *value) +{ + unsigned len; + HexValue tmp; + + yyassert(c, locp, hi->type == IMMEDIATE && + hi->imm.type == VALUE && + lo->type == IMMEDIATE && + lo->imm.type == VALUE, + "Range deposit needs immediate values!\n"); + + *value = gen_rvalue_truncate(c, locp, value); + len = hi->imm.value + 1 - lo->imm.value; + tmp = gen_tmp(c, locp, 32, value->signedness); + /* Emit an `and` to ensure `value` is either 0 or 1. */ + OUT(c, locp, "tcg_gen_andi_i32(", &tmp, ", ", value, ", 1);\n"); + /* Use `neg` to map 0 -> 0 and 1 -> 0xffff... */ + OUT(c, locp, "tcg_gen_neg_i32(", &tmp, ", ", &tmp, ");\n"); + OUT(c, locp, "tcg_gen_deposit_i32(", dst, ", ", dst, + ", ", &tmp, ", "); + OUT(c, locp, lo, ", ", &len, ");\n"); + + gen_rvalue_free(c, locp, &tmp); + gen_rvalue_free(c, locp, hi); + gen_rvalue_free(c, locp, lo); + gen_rvalue_free(c, locp, value); +} + +unsigned gen_if_cond(Context *c, YYLTYPE *locp, HexValue *cond) +{ + const char *bit_suffix; + /* Generate an end label, if false branch to that label */ + OUT(c, locp, "TCGLabel *if_label_", &c->inst.if_count, + " = gen_new_label();\n"); + *cond = rvalue_materialize(c, locp, cond); + bit_suffix = (cond->bit_width == 64) ? "i64" : "i32"; + OUT(c, locp, "tcg_gen_brcondi_", bit_suffix, "(TCG_COND_EQ, ", cond, + ", 0, if_label_", &c->inst.if_count, ");\n"); + gen_rvalue_free(c, locp, cond); + return c->inst.if_count++; +} + +unsigned gen_if_else(Context *c, YYLTYPE *locp, unsigned index) +{ + unsigned if_index = c->inst.if_count++; + /* Generate label to jump if else is not verified */ + OUT(c, locp, "TCGLabel *if_label_", &if_index, + " = gen_new_label();\n"); + /* Jump out of the else statement */ + OUT(c, locp, "tcg_gen_br(if_label_", &if_index, ");\n"); + /* Fix the else label */ + OUT(c, locp, "gen_set_label(if_label_", &index, ");\n"); + return if_index; +} + +HexValue gen_rvalue_pred(Context *c, YYLTYPE *locp, HexValue *pred) +{ + /* Predicted instructions need to zero out result args */ + gen_inst_init_args(c, locp); + + if (is_direct_predicate(pred)) { + bool is_dotnew = pred->is_dotnew; + char predicate_id[2] = { pred->pred.id, '\0' }; + char *pred_str = (char *) &predicate_id; + *pred = gen_tmp_value(c, locp, "0", 32, UNSIGNED); + if (is_dotnew) { + OUT(c, locp, "tcg_gen_mov_i32(", pred, + ", hex_new_pred_value["); + OUT(c, locp, pred_str, "]);\n"); + } else { + OUT(c, locp, "gen_read_preg(", pred, ", ", pred_str, ");\n"); + } + } + + return *pred; +} + +HexValue gen_rvalue_var(Context *c, YYLTYPE *locp, HexValue *var) +{ + find_variable(c, locp, var, var); + return *var; +} + +HexValue gen_rvalue_mpy(Context *c, YYLTYPE *locp, HexMpy *mpy, + HexValue *op1, HexValue *op2) +{ + HexValue res; + memset(&res, 0, sizeof(HexValue)); + + assert_signedness(c, locp, mpy->first_signedness); + assert_signedness(c, locp, mpy->second_signedness); + + *op1 = gen_cast_op(c, locp, op1, mpy->first_bit_width * 2, + mpy->first_signedness); + /* Handle fMPTY3216.. */ + if (mpy->first_bit_width == 32) { + *op2 = gen_cast_op(c, locp, op2, 64, mpy->second_signedness); + } else { + *op2 = gen_cast_op(c, locp, op2, mpy->second_bit_width * 2, + mpy->second_signedness); + } + res = gen_bin_op(c, locp, MUL_OP, op1, op2); + /* Handle special cases required by the language */ + if (mpy->first_bit_width == 16 && mpy->second_bit_width == 16) { + HexValue src_width = gen_imm_value(c, locp, 32, 32, UNSIGNED); + HexSignedness signedness = bin_op_signedness(c, locp, + mpy->first_signedness, + mpy->second_signedness); + res = gen_extend_op(c, locp, &src_width, 64, &res, + signedness); + } + return res; +} + +static inline HexValue gen_rvalue_simple_unary(Context *c, YYLTYPE *locp, + HexValue *value, + const char *c_code, + const char *tcg_code) +{ + unsigned bit_width = (value->bit_width == 64) ? 64 : 32; + HexValue res; + if (value->type == IMMEDIATE) { + res = gen_imm_qemu_tmp(c, locp, bit_width, value->signedness); + gen_c_int_type(c, locp, value->bit_width, value->signedness); + OUT(c, locp, " ", &res, " = ", c_code, "(", value, ");\n"); + } else { + res = gen_tmp(c, locp, bit_width, value->signedness); + OUT(c, locp, tcg_code, "_i", &bit_width, "(", &res, ", ", value, + ");\n"); + gen_rvalue_free(c, locp, value); + } + return res; +} + + +HexValue gen_rvalue_not(Context *c, YYLTYPE *locp, HexValue *value) +{ + return gen_rvalue_simple_unary(c, locp, value, "~", "tcg_gen_not"); +} + +HexValue gen_rvalue_notl(Context *c, YYLTYPE *locp, HexValue *value) +{ + unsigned bit_width = (value->bit_width == 64) ? 64 : 32; + HexValue res; + if (value->type == IMMEDIATE) { + res = gen_imm_qemu_tmp(c, locp, bit_width, value->signedness); + gen_c_int_type(c, locp, value->bit_width, value->signedness); + OUT(c, locp, " ", &res, " = !(", value, ");\n"); + } else { + HexValue zero = gen_constant(c, locp, "0", bit_width, UNSIGNED); + HexValue one = gen_constant(c, locp, "0xff", bit_width, UNSIGNED); + res = gen_tmp(c, locp, bit_width, value->signedness); + OUT(c, locp, "tcg_gen_movcond_i", &bit_width); + OUT(c, locp, "(TCG_COND_EQ, ", &res, ", ", value, ", ", &zero); + OUT(c, locp, ", ", &one, ", ", &zero, ");\n"); + gen_rvalue_free(c, locp, value); + } + return res; +} + +HexValue gen_rvalue_sat(Context *c, YYLTYPE *locp, HexSat *sat, + HexValue *width, HexValue *value) +{ + const char *unsigned_str; + const char *bit_suffix = (value->bit_width == 64) ? "i64" : "i32"; + HexValue res; + HexValue ovfl; + /* + * Note: all saturates are assumed to implicitly set overflow. + * This assumption holds for the instructions currently parsed + * by idef-parser. + */ + yyassert(c, locp, width->imm.value < value->bit_width, + "To compute overflow, source width must be greater than" + " saturation width!"); + yyassert(c, locp, !is_inside_ternary(c), + "Saturating from within a ternary is not allowed!"); + assert_signedness(c, locp, sat->signedness); + + unsigned_str = (sat->signedness == UNSIGNED) ? "u" : ""; + res = gen_tmp_local(c, locp, value->bit_width, sat->signedness); + ovfl = gen_tmp_local(c, locp, 32, sat->signedness); + OUT(c, locp, "gen_sat", unsigned_str, "_", bit_suffix, "_ovfl("); + OUT(c, locp, &ovfl, ", ", &res, ", ", value, ", ", &width->imm.value, + ");\n"); + OUT(c, locp, "gen_set_usr_field_if(USR_OVF,", &ovfl, ");\n"); + gen_rvalue_free(c, locp, value); + + return res; +} + +HexValue gen_rvalue_fscr(Context *c, YYLTYPE *locp, HexValue *value) +{ + HexValue key = gen_tmp(c, locp, 64, UNSIGNED); + HexValue res = gen_tmp(c, locp, 64, UNSIGNED); + HexValue frame_key = gen_tmp(c, locp, 32, UNSIGNED); + *value = gen_rvalue_extend(c, locp, value); + OUT(c, locp, "gen_read_reg(", &frame_key, ", HEX_REG_FRAMEKEY);\n"); + OUT(c, locp, "tcg_gen_concat_i32_i64(", + &key, ", ", &frame_key, ", ", &frame_key, ");\n"); + OUT(c, locp, "tcg_gen_xor_i64(", &res, ", ", value, ", ", &key, ");\n"); + gen_rvalue_free(c, locp, &key); + gen_rvalue_free(c, locp, &frame_key); + gen_rvalue_free(c, locp, value); + return res; +} + +HexValue gen_rvalue_abs(Context *c, YYLTYPE *locp, HexValue *value) +{ + return gen_rvalue_simple_unary(c, locp, value, "abs", "tcg_gen_abs"); +} + +HexValue gen_rvalue_neg(Context *c, YYLTYPE *locp, HexValue *value) +{ + return gen_rvalue_simple_unary(c, locp, value, "-", "tcg_gen_neg"); +} + +HexValue gen_rvalue_brev(Context *c, YYLTYPE *locp, HexValue *value) +{ + HexValue res; + yyassert(c, locp, value->bit_width <= 32, + "fbrev not implemented for 64-bit integers!"); + res = gen_tmp(c, locp, value->bit_width, value->signedness); + *value = rvalue_materialize(c, locp, value); + OUT(c, locp, "gen_helper_fbrev(", &res, ", ", value, ");\n"); + gen_rvalue_free(c, locp, value); + return res; +} + +HexValue gen_rvalue_ternary(Context *c, YYLTYPE *locp, HexValue *cond, + HexValue *true_branch, HexValue *false_branch) +{ + bool is_64bit = (true_branch->bit_width == 64) || + (false_branch->bit_width == 64); + unsigned bit_width = (is_64bit) ? 64 : 32; + HexValue zero = gen_constant(c, locp, "0", bit_width, UNSIGNED); + HexValue res = gen_tmp(c, locp, bit_width, UNSIGNED); + Ternary *ternary = NULL; + + if (is_64bit) { + *cond = gen_rvalue_extend(c, locp, cond); + *true_branch = gen_rvalue_extend(c, locp, true_branch); + *false_branch = gen_rvalue_extend(c, locp, false_branch); + } else { + *cond = gen_rvalue_truncate(c, locp, cond); + } + *cond = rvalue_materialize(c, locp, cond); + *true_branch = rvalue_materialize(c, locp, true_branch); + *false_branch = rvalue_materialize(c, locp, false_branch); + + OUT(c, locp, "tcg_gen_movcond_i", &bit_width); + OUT(c, locp, "(TCG_COND_NE, ", &res, ", ", cond, ", ", &zero); + OUT(c, locp, ", ", true_branch, ", ", false_branch, ");\n"); + + assert(c->ternary->len > 0); + ternary = &g_array_index(c->ternary, Ternary, c->ternary->len - 1); + gen_rvalue_free_manual(c, locp, &ternary->cond); + g_array_remove_index(c->ternary, c->ternary->len - 1); + + gen_rvalue_free(c, locp, cond); + gen_rvalue_free(c, locp, true_branch); + gen_rvalue_free(c, locp, false_branch); + return res; +} + +const char *cond_to_str(TCGCond cond) +{ + switch (cond) { + case TCG_COND_NEVER: + return "TCG_COND_NEVER"; + case TCG_COND_ALWAYS: + return "TCG_COND_ALWAYS"; + case TCG_COND_EQ: + return "TCG_COND_EQ"; + case TCG_COND_NE: + return "TCG_COND_NE"; + case TCG_COND_LT: + return "TCG_COND_LT"; + case TCG_COND_GE: + return "TCG_COND_GE"; + case TCG_COND_LE: + return "TCG_COND_LE"; + case TCG_COND_GT: + return "TCG_COND_GT"; + case TCG_COND_LTU: + return "TCG_COND_LTU"; + case TCG_COND_GEU: + return "TCG_COND_GEU"; + case TCG_COND_LEU: + return "TCG_COND_LEU"; + case TCG_COND_GTU: + return "TCG_COND_GTU"; + default: + abort(); + } +} + +void emit_arg(Context *c, YYLTYPE *locp, HexValue *arg) +{ + switch (arg->type) { + case REGISTER_ARG: + if (arg->reg.type == DOTNEW) { + EMIT_SIG(c, ", TCGv N%cN", arg->reg.id); + } else { + bool is64 = (arg->bit_width == 64); + const char *type = is64 ? "TCGv_i64" : "TCGv_i32"; + char reg_id[5]; + reg_compose(c, locp, &(arg->reg), reg_id); + EMIT_SIG(c, ", %s %s", type, reg_id); + /* MuV register requires also MuN to provide its index */ + if (arg->reg.type == MODIFIER) { + EMIT_SIG(c, ", int MuN"); + } + } + break; + case PREDICATE: + { + char suffix = arg->is_dotnew ? 'N' : 'V'; + EMIT_SIG(c, ", TCGv P%c%c", arg->pred.id, suffix); + } + break; + default: + { + fprintf(stderr, "emit_arg got unsupported argument!"); + abort(); + } + } +} + +void emit_footer(Context *c) +{ + EMIT(c, "}\n"); + EMIT(c, "\n"); +} + +void track_string(Context *c, GString *s) +{ + g_array_append_val(c->inst.strings, s); +} + +void free_variables(Context *c, YYLTYPE *locp) +{ + for (unsigned i = 0; i < c->inst.allocated->len; ++i) { + Var *var = &g_array_index(c->inst.allocated, Var, i); + const char *suffix = var->bit_width == 64 ? "i64" : "i32"; + OUT(c, locp, "tcg_temp_free_", suffix, "(", var->name->str, ");\n"); + } +} + +void free_instruction(Context *c) +{ + assert(!is_inside_ternary(c)); + /* Free the strings */ + g_string_truncate(c->signature_str, 0); + g_string_truncate(c->out_str, 0); + g_string_truncate(c->header_str, 0); + /* Free strings allocated by the instruction */ + for (unsigned i = 0; i < c->inst.strings->len; i++) { + g_string_free(g_array_index(c->inst.strings, GString*, i), TRUE); + } + g_array_free(c->inst.strings, TRUE); + /* Free INAME token value */ + g_string_free(c->inst.name, TRUE); + /* Free variables and registers */ + g_array_free(c->inst.allocated, TRUE); + /* Initialize instruction-specific portion of the context */ + memset(&(c->inst), 0, sizeof(Inst)); +} + +void assert_signedness(Context *c, + YYLTYPE *locp, + HexSignedness signedness) +{ + yyassert(c, locp, + signedness != UNKNOWN_SIGNEDNESS, + "Unspecified signedness"); +} diff --git a/target/hexagon/idef-parser/parser-helpers.h b/target/hexagon/idef-parser/parser-helpers.h new file mode 100644 index 0000000000..2766296417 --- /dev/null +++ b/target/hexagon/idef-parser/parser-helpers.h @@ -0,0 +1,376 @@ +/* + * Copyright(c) 2019-2022 rev.ng Labs Srl. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. + */ + +#ifndef PARSER_HELPERS_H +#define PARSER_HELPERS_H + +#include <assert.h> +#include <inttypes.h> +#include <stdarg.h> +#include <stdbool.h> +#include <stdint.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> + +#include "tcg/tcg-cond.h" + +#include "idef-parser.tab.h" +#include "idef-parser.yy.h" +#include "idef-parser.h" + +/* Decomment this to disable yyasserts */ +/* #define NDEBUG */ + +#define ERR_LINE_CONTEXT 40 + +#define START_COMMENT "/" "*" +#define END_COMMENT "*" "/" + +void yyerror(YYLTYPE *locp, + yyscan_t scanner __attribute__((unused)), + Context *c, + const char *s); + +#ifndef NDEBUG +#define yyassert(context, locp, condition, msg) \ + if (!(condition)) { \ + yyerror(locp, (context)->scanner, (context), (msg)); \ + } +#endif + +bool is_direct_predicate(HexValue *value); + +bool is_inside_ternary(Context *c); + +/** + * Print functions + */ + +void str_print(Context *c, YYLTYPE *locp, const char *string); + +void uint8_print(Context *c, YYLTYPE *locp, uint8_t *num); + +void uint64_print(Context *c, YYLTYPE *locp, uint64_t *num); + +void int_print(Context *c, YYLTYPE *locp, int *num); + +void uint_print(Context *c, YYLTYPE *locp, unsigned *num); + +void tmp_print(Context *c, YYLTYPE *locp, HexTmp *tmp); + +void pred_print(Context *c, YYLTYPE *locp, HexPred *pred, bool is_dotnew); + +void reg_compose(Context *c, YYLTYPE *locp, HexReg *reg, char reg_id[5]); + +void reg_print(Context *c, YYLTYPE *locp, HexReg *reg); + +void imm_print(Context *c, YYLTYPE *locp, HexImm *imm); + +void var_print(Context *c, YYLTYPE *locp, HexVar *var); + +void rvalue_print(Context *c, YYLTYPE *locp, void *pointer); + +void out_assert(Context *c, YYLTYPE *locp, void *dummy); + +/** + * Copies output code buffer into stdout + */ +void commit(Context *c); + +#define OUT_IMPL(c, locp, x) \ + _Generic(*(x), \ + char: str_print, \ + uint8_t: uint8_print, \ + uint64_t: uint64_print, \ + int: int_print, \ + unsigned: uint_print, \ + HexValue: rvalue_print, \ + default: out_assert \ + )(c, locp, x); + +/* FOREACH macro */ +#define FE_1(c, locp, WHAT, X) WHAT(c, locp, X) +#define FE_2(c, locp, WHAT, X, ...) \ + WHAT(c, locp, X)FE_1(c, locp, WHAT, __VA_ARGS__) +#define FE_3(c, locp, WHAT, X, ...) \ + WHAT(c, locp, X)FE_2(c, locp, WHAT, __VA_ARGS__) +#define FE_4(c, locp, WHAT, X, ...) \ + WHAT(c, locp, X)FE_3(c, locp, WHAT, __VA_ARGS__) +#define FE_5(c, locp, WHAT, X, ...) \ + WHAT(c, locp, X)FE_4(c, locp, WHAT, __VA_ARGS__) +#define FE_6(c, locp, WHAT, X, ...) \ + WHAT(c, locp, X)FE_5(c, locp, WHAT, __VA_ARGS__) +#define FE_7(c, locp, WHAT, X, ...) \ + WHAT(c, locp, X)FE_6(c, locp, WHAT, __VA_ARGS__) +#define FE_8(c, locp, WHAT, X, ...) \ + WHAT(c, locp, X)FE_7(c, locp, WHAT, __VA_ARGS__) +#define FE_9(c, locp, WHAT, X, ...) \ + WHAT(c, locp, X)FE_8(c, locp, WHAT, __VA_ARGS__) +/* repeat as needed */ + +#define GET_MACRO(_1, _2, _3, _4, _5, _6, _7, _8, _9, NAME, ...) NAME + +#define FOR_EACH(c, locp, action, ...) \ + do { \ + GET_MACRO(__VA_ARGS__, \ + FE_9, \ + FE_8, \ + FE_7, \ + FE_6, \ + FE_5, \ + FE_4, \ + FE_3, \ + FE_2, \ + FE_1)(c, locp, action, \ + __VA_ARGS__) \ + } while (0) + +#define OUT(c, locp, ...) FOR_EACH((c), (locp), OUT_IMPL, __VA_ARGS__) + +const char *cmp_swap(Context *c, YYLTYPE *locp, const char *type); + +/** + * Temporary values creation + */ + +HexValue gen_tmp(Context *c, + YYLTYPE *locp, + unsigned bit_width, + HexSignedness signedness); + +HexValue gen_tmp_value(Context *c, + YYLTYPE *locp, + const char *value, + unsigned bit_width, + HexSignedness signedness); + +HexValue gen_imm_value(Context *c __attribute__((unused)), + YYLTYPE *locp, + int value, + unsigned bit_width, + HexSignedness signedness); + +HexValue gen_imm_qemu_tmp(Context *c, YYLTYPE *locp, unsigned bit_width, + HexSignedness signedness); + +void gen_rvalue_free(Context *c, YYLTYPE *locp, HexValue *rvalue); + +HexValue rvalue_materialize(Context *c, YYLTYPE *locp, HexValue *rvalue); + +HexValue gen_rvalue_extend(Context *c, YYLTYPE *locp, HexValue *rvalue); + +HexValue gen_rvalue_truncate(Context *c, YYLTYPE *locp, HexValue *rvalue); + +void gen_varid_allocate(Context *c, + YYLTYPE *locp, + HexValue *varid, + unsigned bit_width, + HexSignedness signedness); + +/** + * Code generation functions + */ + +HexValue gen_bin_cmp(Context *c, + YYLTYPE *locp, + TCGCond type, + HexValue *op1, + HexValue *op2); + +HexValue gen_bin_op(Context *c, + YYLTYPE *locp, + OpType type, + HexValue *op1, + HexValue *op2); + +HexValue gen_cast_op(Context *c, + YYLTYPE *locp, + HexValue *src, + unsigned target_width, + HexSignedness signedness); + +/** + * gen_extend_op extends a region of src_width_ptr bits stored in a + * value_ptr to the size of dst_width. Note: src_width_ptr is a + * HexValue * to handle the special case where it is unknown at + * translation time. + */ +HexValue gen_extend_op(Context *c, + YYLTYPE *locp, + HexValue *src_width, + unsigned dst_width, + HexValue *value, + HexSignedness signedness); + +void gen_rdeposit_op(Context *c, + YYLTYPE *locp, + HexValue *dst, + HexValue *value, + HexValue *begin, + HexValue *width); + +void gen_deposit_op(Context *c, + YYLTYPE *locp, + HexValue *dst, + HexValue *value, + HexValue *index, + HexCast *cast); + +HexValue gen_rextract_op(Context *c, + YYLTYPE *locp, + HexValue *src, + unsigned begin, + unsigned width); + +HexValue gen_extract_op(Context *c, + YYLTYPE *locp, + HexValue *src, + HexValue *index, + HexExtract *extract); + +HexValue gen_read_reg(Context *c, YYLTYPE *locp, HexValue *reg); + +void gen_write_reg(Context *c, YYLTYPE *locp, HexValue *reg, HexValue *value); + +void gen_assign(Context *c, + YYLTYPE *locp, + HexValue *dst, + HexValue *value); + +HexValue gen_convround(Context *c, + YYLTYPE *locp, + HexValue *src); + +HexValue gen_round(Context *c, + YYLTYPE *locp, + HexValue *src, + HexValue *position); + +HexValue gen_convround_n(Context *c, + YYLTYPE *locp, + HexValue *src, + HexValue *pos); + +/** + * Circular addressing mode with auto-increment + */ +void gen_circ_op(Context *c, + YYLTYPE *locp, + HexValue *addr, + HexValue *increment, + HexValue *modifier); + +HexValue gen_locnt_op(Context *c, YYLTYPE *locp, HexValue *src); + +HexValue gen_ctpop_op(Context *c, YYLTYPE *locp, HexValue *src); + +HexValue gen_rotl(Context *c, YYLTYPE *locp, HexValue *src, HexValue *n); + +HexValue gen_deinterleave(Context *c, YYLTYPE *locp, HexValue *mixed); + +HexValue gen_interleave(Context *c, + YYLTYPE *locp, + HexValue *odd, + HexValue *even); + +HexValue gen_carry_from_add(Context *c, + YYLTYPE *locp, + HexValue *op1, + HexValue *op2, + HexValue *op3); + +void gen_addsat64(Context *c, + YYLTYPE *locp, + HexValue *dst, + HexValue *op1, + HexValue *op2); + +void gen_inst(Context *c, GString *iname); + +void gen_inst_init_args(Context *c, YYLTYPE *locp); + +void gen_inst_code(Context *c, YYLTYPE *locp); + +void gen_pred_assign(Context *c, YYLTYPE *locp, HexValue *left_pred, + HexValue *right_pred); + +void gen_cancel(Context *c, YYLTYPE *locp); + +void gen_load_cancel(Context *c, YYLTYPE *locp); + +void gen_load(Context *c, YYLTYPE *locp, HexValue *size, + HexSignedness signedness, HexValue *ea, HexValue *dst); + +void gen_store(Context *c, YYLTYPE *locp, HexValue *size, HexValue *ea, + HexValue *src); + +void gen_sethalf(Context *c, YYLTYPE *locp, HexCast *sh, HexValue *n, + HexValue *dst, HexValue *value); + +void gen_setbits(Context *c, YYLTYPE *locp, HexValue *hi, HexValue *lo, + HexValue *dst, HexValue *value); + +unsigned gen_if_cond(Context *c, YYLTYPE *locp, HexValue *cond); + +unsigned gen_if_else(Context *c, YYLTYPE *locp, unsigned index); + +HexValue gen_rvalue_pred(Context *c, YYLTYPE *locp, HexValue *pred); + +HexValue gen_rvalue_var(Context *c, YYLTYPE *locp, HexValue *var); + +HexValue gen_rvalue_mpy(Context *c, YYLTYPE *locp, HexMpy *mpy, HexValue *op1, + HexValue *op2); + +HexValue gen_rvalue_not(Context *c, YYLTYPE *locp, HexValue *value); + +HexValue gen_rvalue_notl(Context *c, YYLTYPE *locp, HexValue *value); + +HexValue gen_rvalue_sat(Context *c, YYLTYPE *locp, HexSat *sat, HexValue *n, + HexValue *value); + +HexValue gen_rvalue_fscr(Context *c, YYLTYPE *locp, HexValue *value); + +HexValue gen_rvalue_abs(Context *c, YYLTYPE *locp, HexValue *value); + +HexValue gen_rvalue_neg(Context *c, YYLTYPE *locp, HexValue *value); + +HexValue gen_rvalue_brev(Context *c, YYLTYPE *locp, HexValue *value); + +HexValue gen_rvalue_ternary(Context *c, YYLTYPE *locp, HexValue *cond, + HexValue *true_branch, HexValue *false_branch); + +const char *cond_to_str(TCGCond cond); + +void emit_header(Context *c); + +void emit_arg(Context *c, YYLTYPE *locp, HexValue *arg); + +void emit_footer(Context *c); + +void track_string(Context *c, GString *s); + +void free_variables(Context *c, YYLTYPE *locp); + +void free_instruction(Context *c); + +void assert_signedness(Context *c, + YYLTYPE *locp, + HexSignedness signedness); + +#endif /* PARSER_HELPERS_h */ diff --git a/target/hexagon/idef-parser/prepare b/target/hexagon/idef-parser/prepare new file mode 100755 index 0000000000..72d6fcbd21 --- /dev/null +++ b/target/hexagon/idef-parser/prepare @@ -0,0 +1,24 @@ +#!/bin/bash + +# +# Copyright(c) 2019-2021 rev.ng Labs Srl. All Rights Reserved. +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, see <http://www.gnu.org/licenses/>. +# + +set -e +set -o pipefail + +# Run the preprocessor and drop comments +cpp "$@" |