diff options
| author | Fabrice Desclaux <fabrice.desclaux@cea.fr> | 2020-02-20 09:01:17 +0100 |
|---|---|---|
| committer | Fabrice Desclaux <fabrice.desclaux@cea.fr> | 2020-02-20 09:34:17 +0100 |
| commit | 9016dd157ce2adc6d8107acc518daf8254633c15 (patch) | |
| tree | 67f8077f29509da718e6d11f35931aea967b0544 /doc/expression/expression.ipynb | |
| parent | e6ad6b76f9f40b00059cb4fb48a2d3f9e91aec48 (diff) | |
| download | miasm-9016dd157ce2adc6d8107acc518daf8254633c15.tar.gz miasm-9016dd157ce2adc6d8107acc518daf8254633c15.zip | |
Add some language rules
Diffstat (limited to 'doc/expression/expression.ipynb')
| -rw-r--r-- | doc/expression/expression.ipynb | 247 |
1 files changed, 46 insertions, 201 deletions
diff --git a/doc/expression/expression.ipynb b/doc/expression/expression.ipynb index 83baecee..5049a6a0 100644 --- a/doc/expression/expression.ipynb +++ b/doc/expression/expression.ipynb @@ -17,7 +17,14 @@ "* A minimal language;\n", "* The side effects are explicit, for example *A + B* will not implicitly update flags.\n", "\n", - "Miasm's IR implementation is located in `miasm.expression.expression`, with `Expr*` objects. Each object has a size, in bits." + "Miasm's IR implementation is located in `miasm.expression.expression`, with `Expr*` objects. Rules of the language will be enumerated along this documentation." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Rule: Each expression has a size (in bit). Some expressions need this size at creation time, others compute their size from their arguments.**" ] }, { @@ -130,7 +137,9 @@ { "cell_type": "code", "execution_count": 4, - "metadata": {}, + "metadata": { + "scrolled": true + }, "outputs": [ { "name": "stdout", @@ -159,6 +168,13 @@ "cell_type": "markdown", "metadata": {}, "source": [ + "**Rule: In _most_ cases, the arguments of an ExprOp must have the same size. The ExprOp size is then the size of its arguments. Some exceptions exist, for example the operator \"```==```\" has a size of 1 bit. Size exceptions are described here: https://github.com/cea-sec/miasm/blob/5ee76eed19b62983909d6a16966acc15b2c8726f/miasm/expression/expression.py#L1045**" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ "Helpers ease the creation of common operations." ] }, @@ -275,6 +291,13 @@ "cell_type": "markdown", "metadata": {}, "source": [ + "**Rule: The ```ExprAssign``` is not really a word of the language: it can be considered as a statement. Thus, it cannot be an argument of another expression.**" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ "The word `ExprCond` represents a ternary relation, equivalend to the Python `src1 if cond else src2`" ] }, @@ -308,6 +331,13 @@ "cell_type": "markdown", "metadata": {}, "source": [ + "**Rule: The sizes of ```src1``` and ```src2``` must be equal, but can differ from the size of ```cond```.**" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ "The following words manipulate the sizes:\n", "* `ExprSlice`: extracts a bits slice of an expression;\n", "* `ExprCompose`: concatenates two expressions." @@ -355,6 +385,13 @@ ] }, { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Rule: The size of an ```ExprSlice``` is equal to ```stop - start```.**" + ] + }, + { "cell_type": "code", "execution_count": 11, "metadata": {}, @@ -635,11 +672,11 @@ "output_type": "stream", "text": [ "(a + 0x10) & 0xFFFFFFFF\n", + "(a + 0x10) & 0xFFFFFFFF\n", + "0x10\n", "0xFFFFFFFF\n", "a\n", "a + 0x10\n", - "(a + 0x10) & 0xFFFFFFFF\n", - "0x10\n", "a + 0x10 -> a\n", "a + 0x10 -> 0x10\n", "(a + 0x10) & 0xFFFFFFFF -> a + 0x10\n", @@ -796,10 +833,10 @@ "name": "stdout", "output_type": "stream", "text": [ - "{<class 'miasm.expression.expression.ExprCond'>: [<function expr_simp_equal at 0x7f9af40ce290>],\n", - " <class 'miasm.expression.expression.ExprOp'>: [<function expr_simp_inverse at 0x7f9af40ce200>],\n", - " <class 'miasm.expression.expression.ExprSlice'>: [<function expr_simp_inf_signed at 0x7f9af40ce0e0>,\n", - " <function expr_simp_inf_unsigned_inversed at 0x7f9af40ce170>]}\n", + "{<class 'miasm.expression.expression.ExprCond'>: [<function expr_simp_equal at 0x7f4dd825daf0>],\n", + " <class 'miasm.expression.expression.ExprOp'>: [<function expr_simp_inverse at 0x7f4dd825da60>],\n", + " <class 'miasm.expression.expression.ExprSlice'>: [<function expr_simp_inf_signed at 0x7f4dd825d940>,\n", + " <function expr_simp_inf_unsigned_inversed at 0x7f4dd825d9d0>]}\n", "(((x ^ y) & (x ^ (x + -y))) ^ (x + -y))[31:32]\n", "x <s y\n" ] @@ -820,198 +857,6 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Exercice 1 : Add a new transformation rule" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The current goal is to create a new simplification rule. The rule is informally the following:\n", - "\n", - "*A left shift of *n* bits followed by a right shift of *size - n* bits is equivalent to a rotation to the left of *n* bits.*\n", - "\n", - "For example, `a << 14 | a >> 18` becomes `a <<< 14` (if `a` is on 32 bits and `<<<` is the left rotation operator)." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In Miasm, a transformation rule is actually a function whose two parameters are:\n", - "* An `ExpressionSimplifier` instance used for applying simplifications ;\n", - "* An `Expr` to simplify\n", - "\n", - "The function **must always return an Expr**. If nothing has changed, the function will return its second argument.\n", - "\n", - "A transformation rule must generate a *new* expression. Indeed, the Miasm expressions are immutables, so if a modification occurs, a new instance must be created to include the modifications." - ] - }, - { - "cell_type": "code", - "execution_count": 26, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "a\n", - "a\n", - "a << 0x4\n", - "a << 0x4\n", - "(a << 0x4) | (a >> 0x1C)\n", - "(a << 0x4) | (a >> 0x1C)\n" - ] - }, - { - "ename": "AssertionError", - "evalue": "", - "output_type": "error", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mAssertionError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m<ipython-input-26-f5572590f9c2>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 35\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 36\u001b[0m \u001b[0;31m# Tests launch\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 37\u001b[0;31m \u001b[0mcheck\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtests\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0msimp_engine\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 38\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;32m<ipython-input-26-f5572590f9c2>\u001b[0m in \u001b[0;36mcheck\u001b[0;34m(tests, custom_expr_simp)\u001b[0m\n\u001b[1;32m 19\u001b[0m \u001b[0mgot\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcustom_expr_simp\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minp\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 20\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mgot\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 21\u001b[0;31m \u001b[0;32massert\u001b[0m \u001b[0mout\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcanonize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0mgot\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcanonize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 22\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 23\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mcollections\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mCounter\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;31mAssertionError\u001b[0m: " - ] - } - ], - "source": [ - "a = ExprId(\"a\", 32)\n", - "b = ExprId(\"b\", 32)\n", - "cst1 = ExprInt(16, 32)\n", - "\n", - "# Tests vectors\n", - "tests = [\n", - " # (entrée, sortie attendue)\n", - " (a, a),\n", - " (a << ExprInt(4, 32), (a << ExprInt(4, 32))),\n", - " ((a << ExprInt(4, 32) | a >> ExprInt(28, 32)), ExprOp(\"<<<\", a, ExprInt(4, 32))),\n", - " ((a << ExprInt(4, 32) | a >> ExprInt(27, 32)), (a << ExprInt(4, 32) | a >> ExprInt(27, 32))),\n", - " ((a >> ExprInt(28, 32) | a << ExprInt(4, 32)), ExprOp(\"<<<\", a, ExprInt(4, 32))),\n", - "]\n", - "\n", - "# Vérification\n", - "def check(tests, custom_expr_simp):\n", - " for inp, out in tests:\n", - " print(inp)\n", - " got = custom_expr_simp(inp)\n", - " print(got)\n", - " assert out.canonize() == got.canonize()\n", - "\n", - "from collections import Counter\n", - "\n", - "def mysimplification(simp_engine, expr):\n", - " # TODO\n", - " return expr\n", - "\n", - "from miasm.expression.simplifications import ExpressionSimplifier\n", - "simp_engine = ExpressionSimplifier()\n", - "\n", - "simp_engine.enable_passes({\n", - " ExprOp: [mysimplification]\n", - "})\n", - "\n", - "# Tests launch\n", - "check(tests, simp_engine)\n", - "\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Add a more complex test case\n", - "\n", - "tests += [\n", - " ((a << ExprInt(4, 32)) | a >> (ExprInt(32, 32) - ExprInt(4, 32)), ExprOp(\"<<<\", a, ExprInt(4, 32))),\n", - "]\n", - "\n", - "# TODO: Activate constant propagation passes\n", - "\n", - "# Tests launch\n", - "check(tests, simp_engine)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Expression matching" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Miasm embeds a tiny regular expression engine, called `match_expr`, in order to ease the writing of transformation rules.\n", - "\n", - "Its parameters are:\n", - "* The expression to analyze;\n", - "* An expression describing the pattern to match;\n", - "* A list of *jokers*, i.e. the expressions that can be replaced inside the *pattern*\n", - "\n", - "For example, if we want to match the expression `X + (X * Y) + EAX`, where `X` and `Y` are placeholders, the `match_expr`instanciation will look like:\n", - "\n", - "`match_expr(expr_to_match, X + (X * Y) + EAX, [X, Y])`.\n", - "\n", - "The results is a dictionnary associating each joker with the matching subexpressions found. If the operator is commutative, the order can change." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Exercice 2 : expression matching" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Implement the following simplification rule using the match_expr: `((x & y) + (x | y))` -> `(x + y)`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from miasm.expression.expression import match_expr\n", - "x = ExprId(\"x\", 32)\n", - "y = ExprId(\"y\", 32)\n", - "# tests vectors\n", - "tests = [\n", - " (((x & y) + (x | y)), (x + y)),\n", - " (((x & y) + (x & y)), ((x & y) + (x & y))),\n", - " (((cst1 >> a) | (a ^ cst1)) + (((a ^ cst1) & (cst1 >> a))), (cst1 >> a) + (a ^ cst1)),\n", - "]\n", - "\n", - "# jokers\n", - "X = ExprId(\"X\", 32)\n", - "Y = ExprId(\"Y\", 32)\n", - "\n", - "def my(e_s, expr):\n", - " # TODO: transformation rule\n", - " res = match_expr(...)\n", - " if res:\n", - " return res[X] + res[Y]\n", - " return expr\n", - "\n", - "simp_engine = ExpressionSimplifier()\n", - "simp_engine.enable_passes({ExprOp:[my]})\n", - "\n", - "check(tests, simp_engine)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ "## Going further" ] }, @@ -1044,7 +889,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.7.4" + "version": "3.8.1" } }, "nbformat": 4, |