about summary refs log tree commit diff stats
path: root/doc/expression/expression.ipynb
diff options
context:
space:
mode:
authorFabrice Desclaux <fabrice.desclaux@cea.fr>2020-02-20 09:01:17 +0100
committerFabrice Desclaux <fabrice.desclaux@cea.fr>2020-02-20 09:34:17 +0100
commit9016dd157ce2adc6d8107acc518daf8254633c15 (patch)
tree67f8077f29509da718e6d11f35931aea967b0544 /doc/expression/expression.ipynb
parente6ad6b76f9f40b00059cb4fb48a2d3f9e91aec48 (diff)
downloadfocaccia-miasm-9016dd157ce2adc6d8107acc518daf8254633c15.tar.gz
focaccia-miasm-9016dd157ce2adc6d8107acc518daf8254633c15.zip
Add some language rules
Diffstat (limited to 'doc/expression/expression.ipynb')
-rw-r--r--doc/expression/expression.ipynb247
1 files changed, 46 insertions, 201 deletions
diff --git a/doc/expression/expression.ipynb b/doc/expression/expression.ipynb
index 83baecee..5049a6a0 100644
--- a/doc/expression/expression.ipynb
+++ b/doc/expression/expression.ipynb
@@ -17,7 +17,14 @@
     "* A minimal language;\n",
     "* The side effects are explicit, for example *A + B* will not implicitly update flags.\n",
     "\n",
-    "Miasm's IR implementation is located in `miasm.expression.expression`, with `Expr*` objects. Each object has a size, in bits."
+    "Miasm's IR implementation is located in `miasm.expression.expression`, with `Expr*` objects. Rules of the language will be enumerated along this documentation."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Rule: Each expression has a size (in bit). Some expressions need this size at creation time, others compute their size from their arguments.**"
    ]
   },
   {
@@ -130,7 +137,9 @@
   {
    "cell_type": "code",
    "execution_count": 4,
-   "metadata": {},
+   "metadata": {
+    "scrolled": true
+   },
    "outputs": [
     {
      "name": "stdout",
@@ -159,6 +168,13 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
+    "**Rule: In _most_ cases, the arguments of an ExprOp must have the same size. The ExprOp size is then the size of its arguments. Some exceptions exist, for example the operator \"```==```\" has a size of 1 bit. Size exceptions are described here: https://github.com/cea-sec/miasm/blob/5ee76eed19b62983909d6a16966acc15b2c8726f/miasm/expression/expression.py#L1045**"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
     "Helpers ease the creation of common operations."
    ]
   },
@@ -275,6 +291,13 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
+    "**Rule: The ```ExprAssign``` is not really a word of the language: it can be considered as a statement. Thus, it cannot be an argument of another expression.**"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
     "The word `ExprCond` represents a ternary relation, equivalend to the Python `src1 if cond else src2`"
    ]
   },
@@ -308,6 +331,13 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
+    "**Rule: The sizes of ```src1``` and ```src2``` must be equal, but can differ from the size of ```cond```.**"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
     "The following words manipulate the sizes:\n",
     "* `ExprSlice`: extracts a bits slice of an expression;\n",
     "* `ExprCompose`: concatenates two expressions."
@@ -355,6 +385,13 @@
    ]
   },
   {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Rule: The size of an ```ExprSlice``` is equal to ```stop - start```.**"
+   ]
+  },
+  {
    "cell_type": "code",
    "execution_count": 11,
    "metadata": {},
@@ -635,11 +672,11 @@
      "output_type": "stream",
      "text": [
       "(a + 0x10) & 0xFFFFFFFF\n",
+      "(a + 0x10) & 0xFFFFFFFF\n",
+      "0x10\n",
       "0xFFFFFFFF\n",
       "a\n",
       "a + 0x10\n",
-      "(a + 0x10) & 0xFFFFFFFF\n",
-      "0x10\n",
       "a + 0x10 -> a\n",
       "a + 0x10 -> 0x10\n",
       "(a + 0x10) & 0xFFFFFFFF -> a + 0x10\n",
@@ -796,10 +833,10 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "{<class 'miasm.expression.expression.ExprCond'>: [<function expr_simp_equal at 0x7f9af40ce290>],\n",
-      " <class 'miasm.expression.expression.ExprOp'>: [<function expr_simp_inverse at 0x7f9af40ce200>],\n",
-      " <class 'miasm.expression.expression.ExprSlice'>: [<function expr_simp_inf_signed at 0x7f9af40ce0e0>,\n",
-      "                                                   <function expr_simp_inf_unsigned_inversed at 0x7f9af40ce170>]}\n",
+      "{<class 'miasm.expression.expression.ExprCond'>: [<function expr_simp_equal at 0x7f4dd825daf0>],\n",
+      " <class 'miasm.expression.expression.ExprOp'>: [<function expr_simp_inverse at 0x7f4dd825da60>],\n",
+      " <class 'miasm.expression.expression.ExprSlice'>: [<function expr_simp_inf_signed at 0x7f4dd825d940>,\n",
+      "                                                   <function expr_simp_inf_unsigned_inversed at 0x7f4dd825d9d0>]}\n",
       "(((x ^ y) & (x ^ (x + -y))) ^ (x + -y))[31:32]\n",
       "x <s y\n"
      ]
@@ -820,198 +857,6 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### Exercice 1 : Add a new transformation rule"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "The current goal is to create a new simplification rule. The rule is informally the following:\n",
-    "\n",
-    "*A left shift of *n* bits followed by a right shift of *size - n* bits is equivalent to a rotation to the left of *n* bits.*\n",
-    "\n",
-    "For example, `a << 14 | a >> 18` becomes `a <<< 14` (if `a` is on 32 bits and `<<<` is the left rotation operator)."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "In Miasm, a transformation rule is actually a function whose two parameters are:\n",
-    "* An `ExpressionSimplifier` instance used for applying simplifications ;\n",
-    "* An `Expr` to simplify\n",
-    "\n",
-    "The function **must always return an Expr**. If nothing has changed, the function will return its second argument.\n",
-    "\n",
-    "A transformation rule must generate a *new* expression. Indeed, the Miasm expressions are immutables, so if a modification occurs, a new instance must be created to include the modifications."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 26,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "a\n",
-      "a\n",
-      "a << 0x4\n",
-      "a << 0x4\n",
-      "(a << 0x4) | (a >> 0x1C)\n",
-      "(a << 0x4) | (a >> 0x1C)\n"
-     ]
-    },
-    {
-     "ename": "AssertionError",
-     "evalue": "",
-     "output_type": "error",
-     "traceback": [
-      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
-      "\u001b[0;31mAssertionError\u001b[0m                            Traceback (most recent call last)",
-      "\u001b[0;32m<ipython-input-26-f5572590f9c2>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m     35\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     36\u001b[0m \u001b[0;31m# Tests launch\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 37\u001b[0;31m \u001b[0mcheck\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtests\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0msimp_engine\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m     38\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
-      "\u001b[0;32m<ipython-input-26-f5572590f9c2>\u001b[0m in \u001b[0;36mcheck\u001b[0;34m(tests, custom_expr_simp)\u001b[0m\n\u001b[1;32m     19\u001b[0m         \u001b[0mgot\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcustom_expr_simp\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0minp\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     20\u001b[0m         \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mgot\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 21\u001b[0;31m         \u001b[0;32massert\u001b[0m \u001b[0mout\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcanonize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0mgot\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcanonize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m     22\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     23\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mcollections\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mCounter\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
-      "\u001b[0;31mAssertionError\u001b[0m: "
-     ]
-    }
-   ],
-   "source": [
-    "a = ExprId(\"a\", 32)\n",
-    "b = ExprId(\"b\", 32)\n",
-    "cst1 = ExprInt(16, 32)\n",
-    "\n",
-    "# Tests vectors\n",
-    "tests = [\n",
-    "    # (entrée, sortie attendue)\n",
-    "    (a, a),\n",
-    "    (a << ExprInt(4, 32), (a << ExprInt(4, 32))),\n",
-    "    ((a << ExprInt(4, 32) | a >> ExprInt(28, 32)), ExprOp(\"<<<\", a, ExprInt(4, 32))),\n",
-    "    ((a << ExprInt(4, 32) | a >> ExprInt(27, 32)), (a << ExprInt(4, 32) | a >> ExprInt(27, 32))),\n",
-    "    ((a >> ExprInt(28, 32) | a << ExprInt(4, 32)), ExprOp(\"<<<\", a, ExprInt(4, 32))),\n",
-    "]\n",
-    "\n",
-    "# Vérification\n",
-    "def check(tests, custom_expr_simp):\n",
-    "    for inp, out in tests:\n",
-    "        print(inp)\n",
-    "        got = custom_expr_simp(inp)\n",
-    "        print(got)\n",
-    "        assert out.canonize() == got.canonize()\n",
-    "\n",
-    "from collections import Counter\n",
-    "\n",
-    "def mysimplification(simp_engine, expr):\n",
-    "    # TODO\n",
-    "    return expr\n",
-    "\n",
-    "from miasm.expression.simplifications import ExpressionSimplifier\n",
-    "simp_engine = ExpressionSimplifier()\n",
-    "\n",
-    "simp_engine.enable_passes({\n",
-    "    ExprOp: [mysimplification]\n",
-    "})\n",
-    "\n",
-    "# Tests launch\n",
-    "check(tests, simp_engine)\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Add a more complex test case\n",
-    "\n",
-    "tests += [\n",
-    "    ((a << ExprInt(4, 32)) | a >> (ExprInt(32, 32) - ExprInt(4, 32)), ExprOp(\"<<<\", a, ExprInt(4, 32))),\n",
-    "]\n",
-    "\n",
-    "# TODO: Activate constant propagation passes\n",
-    "\n",
-    "# Tests launch\n",
-    "check(tests, simp_engine)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Expression matching"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Miasm embeds a tiny regular expression engine, called `match_expr`, in order to ease the writing of transformation rules.\n",
-    "\n",
-    "Its parameters are:\n",
-    "* The expression to analyze;\n",
-    "* An expression describing the pattern to match;\n",
-    "* A list of *jokers*, i.e. the expressions that can be replaced inside the *pattern*\n",
-    "\n",
-    "For example, if we want to match the expression `X + (X * Y) + EAX`, where `X` and `Y` are placeholders, the `match_expr`instanciation will look like:\n",
-    "\n",
-    "`match_expr(expr_to_match, X + (X * Y) + EAX, [X, Y])`.\n",
-    "\n",
-    "The results is a dictionnary associating each joker with the matching subexpressions found. If the operator is commutative, the order can change."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Exercice 2 : expression matching"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Implement the following simplification rule using the match_expr: `((x & y) + (x | y))` -> `(x + y)`"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from miasm.expression.expression import match_expr\n",
-    "x = ExprId(\"x\", 32)\n",
-    "y = ExprId(\"y\", 32)\n",
-    "# tests vectors\n",
-    "tests = [\n",
-    "    (((x & y) + (x | y)), (x + y)),\n",
-    "    (((x & y) + (x & y)), ((x & y) + (x & y))),\n",
-    "    (((cst1 >> a) | (a ^ cst1)) + (((a ^ cst1) & (cst1 >> a))), (cst1 >> a) + (a ^ cst1)),\n",
-    "]\n",
-    "\n",
-    "# jokers\n",
-    "X = ExprId(\"X\", 32)\n",
-    "Y = ExprId(\"Y\", 32)\n",
-    "\n",
-    "def my(e_s, expr):\n",
-    "    # TODO: transformation rule\n",
-    "    res = match_expr(...)\n",
-    "    if res:\n",
-    "        return res[X] + res[Y]\n",
-    "    return expr\n",
-    "\n",
-    "simp_engine = ExpressionSimplifier()\n",
-    "simp_engine.enable_passes({ExprOp:[my]})\n",
-    "\n",
-    "check(tests, simp_engine)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
     "## Going further"
    ]
   },
@@ -1044,7 +889,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.7.4"
+   "version": "3.8.1"
   }
  },
  "nbformat": 4,