about summary refs log tree commit diff stats
path: root/doc/expression/expression.ipynb
diff options
context:
space:
mode:
Diffstat (limited to 'doc/expression/expression.ipynb')
-rw-r--r--doc/expression/expression.ipynb1118
1 files changed, 1118 insertions, 0 deletions
diff --git a/doc/expression/expression.ipynb b/doc/expression/expression.ipynb
new file mode 100644
index 00000000..0babf990
--- /dev/null
+++ b/doc/expression/expression.ipynb
@@ -0,0 +1,1118 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Représentation intermédiaire"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Miasm utilise une [représentation intermédiaire](https://en.wikipedia.org/wiki/Intermediate_representation) (*intermediate representation*, *IR*) pour abstraire les effets de bords d'un programme (comme LLVM par exemple). Les avantages étant :\n",
+    "* une représentation unique, quelque soit l'architecture de départ\n",
+    "* un *vocabulaire* minimaliste\n",
+    "* tous les effets de bords sont explicites (un *A + B* ne va pas mettre à jour des flags)\n",
+    "\n",
+    "L'IR de Miasm est implémentée dans `miasm.expression.expression`, sous forme d'`Expr*`. Une taille, en bits, leur est associée."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Vocabulaire"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Les mots les plus simples du vocabulaire sont :\n",
+    "* `ExprId` : représente un identifiant. Par exemple, le registre `EAX` sera représenté par un `ExprId` de 32 bits.\n",
+    "* `ExprInt` : représente un entier non signé."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "a\n",
+      "ExprId('a', 32)\n",
+      "a\n",
+      "32\n"
+     ]
+    }
+   ],
+   "source": [
+    "from miasm.expression.expression import *\n",
+    "\n",
+    "a = ExprId(\"a\", 32)\n",
+    "print a\n",
+    "print repr(a)\n",
+    "\n",
+    "# Accès à l'identifiant\n",
+    "print a.name\n",
+    "print a.size"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "0x10\n",
+      "0xFFFFFFFF\n",
+      "16\n"
+     ]
+    }
+   ],
+   "source": [
+    "cst1 = ExprInt(16, 32)\n",
+    "print cst1\n",
+    "cst2 = ExprInt(-1, 32)\n",
+    "print cst2\n",
+    "\n",
+    "# Accès à la valeur associée\n",
+    "print int(cst1)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Ensuite, le mot `ExprMem` permet de représenter un accès mémoire, d'une taille définie en bit."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "@16[0x11223344]\n",
+      "0x11223344\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Accès mémoire de 16 bits, à l'addresse 0x11223344 sur 32 bits\n",
+    "addr = ExprInt(0x11223344, 32)\n",
+    "mem1 = ExprMem(addr, 16)\n",
+    "print mem1\n",
+    "\n",
+    "# Accès à l'addresse\n",
+    "print mem1.ptr"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Le mot `ExprOp` permet de définir des opérations n-aires entre expressions. L'opération est une chaîne de caractère, on peut donc en définir des nouvelles au besoin. Certaines opérations (`+`, `*`, `|`, `parity`, ...) sont déjà utilisées par Miasm. Une opération est toujours faite entre éléments de même taille, et a la taille de ses arguments."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "a + 0x10\n",
+      "(ExprId('a', 32), ExprInt(0x10, 32))\n",
+      "MyCustomOp(a)\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Définition d'une opération\n",
+    "op1 = ExprOp(\"+\", a, cst1)\n",
+    "print op1\n",
+    "\n",
+    "# Accès aux arguments\n",
+    "print op1.args\n",
+    "\n",
+    "# Définition d'une opération custom\n",
+    "op2 = ExprOp(\"MyCustomOp\", a)\n",
+    "print op2"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Des helpers sont présents pour faciliter la création de certaines opérations courantes"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "a + 0x10\n",
+      "a * 0x10\n",
+      "-a\n",
+      "a | 0x10\n",
+      "a & 0x10\n"
+     ]
+    }
+   ],
+   "source": [
+    "print a + cst1\n",
+    "print a * cst1\n",
+    "print - a\n",
+    "print a | cst1\n",
+    "print a & cst1"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Attention, même si les Expressions permettent de \"tout\" représenter, Miasm fait quelques hypothèses sur la représentation de certaines opérations :\n",
+    "* les opération associative (`+`, `^`, `|`, ...) sont des opérations n-aire\n",
+    "* le `-` est toujours unaire"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "a + -0x10\n"
+     ]
+    }
+   ],
+   "source": [
+    "print a - cst1"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "* `parity` est toujours de taille 1, c'est une des rares exceptions"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 18,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "32\n",
+      "1\n"
+     ]
+    }
+   ],
+   "source": [
+    "p = ExprOp(\"parity\", a)\n",
+    "print a.size\n",
+    "print p.size"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "L'opération `=` est gérée à part, par un mot dédié `ExprAssign`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 19,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "a = 0x10\n",
+      "0x10\n",
+      "a\n"
+     ]
+    }
+   ],
+   "source": [
+    "assign = ExprAssign(a, cst1)\n",
+    "print assign\n",
+    "\n",
+    "# Source, destination\n",
+    "print assign.src\n",
+    "print assign.dst"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Le mot `ExprCond` permet de représenter une condition ternaire, équivalent au Python `src1 if cond else src2`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 20,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "a?(0x10,0xFFFFFFFF)\n",
+      "a\n",
+      "0x10\n",
+      "0xFFFFFFFF\n"
+     ]
+    }
+   ],
+   "source": [
+    "cond = ExprCond(a, cst1, cst2)\n",
+    "print cond\n",
+    "\n",
+    "# Accès aux éléments\n",
+    "print cond.cond\n",
+    "print cond.src1\n",
+    "print cond.src2"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "La manipulation des tailles est faite grâce aux mots :\n",
+    "* `ExprSlice`: extraction d'un tranche de bits d'une expression\n",
+    "* `ExprCompose`: composition d'expression (comme un sandwich)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 21,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "a[6:8]\n",
+      "2\n",
+      "a\n",
+      "6\n",
+      "8\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "True"
+      ]
+     },
+     "execution_count": 21,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "sl = ExprSlice(a, 6, 8)\n",
+    "print sl\n",
+    "print sl.size\n",
+    "\n",
+    "# Accès aux éléments\n",
+    "print sl.arg\n",
+    "print sl.start\n",
+    "print sl.stop\n",
+    "\n",
+    "# Forme plus simple\n",
+    "sl == a[6:8]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 22,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{a 0 32, 0x10 32 64}\n",
+      "64\n",
+      "(ExprId('a', 32), ExprInt(0x10, 32))\n",
+      "[(0, ExprId('a', 32)), (32, ExprInt(0x10, 32))]\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Représente la concaténation de a (bit 0 à 31) avec cst1 (bit 32 à 63)\n",
+    "comp = ExprCompose(a, cst1)\n",
+    "print comp\n",
+    "print comp.size\n",
+    "\n",
+    "# Accès aux éléments\n",
+    "print comp.args\n",
+    "# Accès au bit de départ, et à l'argument associé\n",
+    "print list(comp.iter_args())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Enfin, le mot `ExprLoc` permet de représenter un endroit (*location*) de la mémoire, du binaire, etc.\n",
+    "Par exemple, il permet de désigner la destination d'un saut ou d'un appel de fonction.\n",
+    "\n",
+    "Un endroit est désigné par un élément unique (de type `LocKey`), qui peut être vu comme une clé permettant d'accèder aux autres infos liées à ce lieux : son offset, un nom (\"main\"), etc.\n",
+    "`ExprLoc` peut alors être vu comme un conteneur pour un `LocKey`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 23,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "loc_key_1\n"
+     ]
+    }
+   ],
+   "source": [
+    "loc = ExprLoc(LocKey(1), 32)\n",
+    "print loc"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "En résumé, les différents mots sont :\n",
+    "\n",
+    "| Mot | Ecriture |\n",
+    "|-----|----------|\n",
+    "|ExprAssign|A=B|\n",
+    "|ExprInt|0x18|\n",
+    "|ExprId|EAX|\n",
+    "|ExprLoc|label_1|\n",
+    "|ExprCond|A ? B : C|\n",
+    "|ExprMem|@16[ESI]|\n",
+    "|ExprOp|A + B|\n",
+    "|ExprSlice|AH = EAX[8 :16]|\n",
+    "|ExprCompose|AX = AH.AL|"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Helpers communs"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 24,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "ExprInt(0xFFFFFFFF, 32)"
+      ]
+     },
+     "execution_count": 24,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Masque de la bonne taille\n",
+    "a.mask"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 25,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "32"
+      ]
+     },
+     "execution_count": 25,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Taille de l'expression\n",
+    "a.size"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 26,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "a 0x10\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Version affichable\n",
+    "print a, cst1"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 27,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "ExprId('a', 32) ExprOp('+', ExprId('a', 32), ExprInt(0x10, 32))\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Représentation (pour pouvoir re-copier dans le code)\n",
+    "print repr(a), repr(a + cst1)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 28,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "zeroExt_64(0x10)\n",
+      "signExt_64(0x10)\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Extension de taille (non signé, signé)\n",
+    "print cst1.zeroExtend(64)\n",
+    "print cst1.signExtend(64)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 29,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "a[31:32]\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Most significant bit\n",
+    "print a.msb()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 30,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "a + a + 0x10\n",
+      "0xFFFFFFFF + 0xFFFFFFFF + 0x10\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Remplacement\n",
+    "expr1 = a + a + cst1\n",
+    "print expr1\n",
+    "expr2 = expr1.replace_expr({a: cst2})\n",
+    "print expr2"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 31,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "True\n",
+      "False\n",
+      "True\n",
+      "True\n",
+      "True\n",
+      "False\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Test de type\n",
+    "print a.is_id()\n",
+    "print a.is_int()\n",
+    "print cst1.is_int()\n",
+    "print op1.is_op()\n",
+    "print op1.is_op(\"+\")\n",
+    "print op1.is_op(\"&\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Expression sous forme de graphe"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Les expressions Miasm ont une structure récursive, et peuvent être représentée et manipulées sous la forme de graphe.\n",
+    "L'objet obtenu est un `DiGraph`, implémenté dans `miasm.core.graph` et offrant les méthodes standards de manipulation de graphes (accès au noeuds, arrêtes, prédécesseurs, succésseurs, dominance, post-dominance, représentation en DOT, ...)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 32,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "(a + 0x10) & 0xFFFFFFFF\n",
+      "0x10\n",
+      "(a + 0x10) & 0xFFFFFFFF\n",
+      "a + 0x10\n",
+      "a\n",
+      "0xFFFFFFFF\n",
+      "a + 0x10 -> a\n",
+      "a + 0x10 -> 0x10\n",
+      "(a + 0x10) & 0xFFFFFFFF -> a + 0x10\n",
+      "(a + 0x10) & 0xFFFFFFFF -> 0xFFFFFFFF\n"
+     ]
+    }
+   ],
+   "source": [
+    "expr3 = a + cst1 & cst2\n",
+    "print expr3\n",
+    "graph = expr3.graph()\n",
+    "print graph"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 33,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "image/svg+xml": [
+       "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
+       "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
+       " \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
+       "<!-- Generated by graphviz version 2.40.1 (20161225.0304)\n",
+       " -->\n",
+       "<!-- Title: asm_graph Pages: 1 -->\n",
+       "<svg width=\"249pt\" height=\"191pt\"\n",
+       " viewBox=\"0.00 0.00 248.50 191.00\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
+       "<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 187)\">\n",
+       "<title>asm_graph</title>\n",
+       "<polygon fill=\"#ffffff\" stroke=\"transparent\" points=\"-4,4 -4,-187 244.5,-187 244.5,4 -4,4\"/>\n",
+       "<!-- 2528525675489518808 -->\n",
+       "<g id=\"node1\" class=\"node\">\n",
+       "<title>2528525675489518808</title>\n",
+       "<path fill=\"none\" stroke=\"#000000\" d=\"M12,-.5C12,-.5 48,-.5 48,-.5 54,-.5 60,-6.5 60,-12.5 60,-12.5 60,-24.5 60,-24.5 60,-30.5 54,-36.5 48,-36.5 48,-36.5 12,-36.5 12,-36.5 6,-36.5 0,-30.5 0,-24.5 0,-24.5 0,-12.5 0,-12.5 0,-6.5 6,-.5 12,-.5\"/>\n",
+       "<text text-anchor=\"start\" x=\"13\" y=\"-14.3\" font-family=\"Courier New\" font-size=\"14.00\" fill=\"#000000\">0x10</text>\n",
+       "</g>\n",
+       "<!-- 17816670189519931537 -->\n",
+       "<g id=\"node2\" class=\"node\">\n",
+       "<title>17816670189519931537</title>\n",
+       "<path fill=\"none\" stroke=\"#000000\" d=\"M30,-146.5C30,-146.5 222,-146.5 222,-146.5 228,-146.5 234,-152.5 234,-158.5 234,-158.5 234,-170.5 234,-170.5 234,-176.5 228,-182.5 222,-182.5 222,-182.5 30,-182.5 30,-182.5 24,-182.5 18,-176.5 18,-170.5 18,-170.5 18,-158.5 18,-158.5 18,-152.5 24,-146.5 30,-146.5\"/>\n",
+       "<text text-anchor=\"start\" x=\"31\" y=\"-160.3\" font-family=\"Courier New\" font-size=\"14.00\" fill=\"#000000\">(a + 0x10) &amp; 0xFFFFFFFF</text>\n",
+       "</g>\n",
+       "<!-- 13033141602480874210 -->\n",
+       "<g id=\"node3\" class=\"node\">\n",
+       "<title>13033141602480874210</title>\n",
+       "<path fill=\"none\" stroke=\"#000000\" d=\"M32.5,-73.5C32.5,-73.5 101.5,-73.5 101.5,-73.5 107.5,-73.5 113.5,-79.5 113.5,-85.5 113.5,-85.5 113.5,-97.5 113.5,-97.5 113.5,-103.5 107.5,-109.5 101.5,-109.5 101.5,-109.5 32.5,-109.5 32.5,-109.5 26.5,-109.5 20.5,-103.5 20.5,-97.5 20.5,-97.5 20.5,-85.5 20.5,-85.5 20.5,-79.5 26.5,-73.5 32.5,-73.5\"/>\n",
+       "<text text-anchor=\"start\" x=\"34\" y=\"-87.3\" font-family=\"Courier New\" font-size=\"14.00\" fill=\"#000000\">a + 0x10</text>\n",
+       "</g>\n",
+       "<!-- 17816670189519931537&#45;&gt;13033141602480874210 -->\n",
+       "<g id=\"edge3\" class=\"edge\">\n",
+       "<title>17816670189519931537&#45;&gt;13033141602480874210</title>\n",
+       "<path fill=\"none\" stroke=\"#000000\" d=\"M111.4157,-146.4551C104.3952,-137.7686 95.8452,-127.1898 88.1285,-117.642\"/>\n",
+       "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"90.629,-115.1677 81.621,-109.5904 85.1848,-119.5678 90.629,-115.1677\"/>\n",
+       "</g>\n",
+       "<!-- 7773804816832915623 -->\n",
+       "<g id=\"node5\" class=\"node\">\n",
+       "<title>7773804816832915623</title>\n",
+       "<path fill=\"none\" stroke=\"#000000\" d=\"M143.5,-73.5C143.5,-73.5 228.5,-73.5 228.5,-73.5 234.5,-73.5 240.5,-79.5 240.5,-85.5 240.5,-85.5 240.5,-97.5 240.5,-97.5 240.5,-103.5 234.5,-109.5 228.5,-109.5 228.5,-109.5 143.5,-109.5 143.5,-109.5 137.5,-109.5 131.5,-103.5 131.5,-97.5 131.5,-97.5 131.5,-85.5 131.5,-85.5 131.5,-79.5 137.5,-73.5 143.5,-73.5\"/>\n",
+       "<text text-anchor=\"start\" x=\"145\" y=\"-87.3\" font-family=\"Courier New\" font-size=\"14.00\" fill=\"#000000\">0xFFFFFFFF</text>\n",
+       "</g>\n",
+       "<!-- 17816670189519931537&#45;&gt;7773804816832915623 -->\n",
+       "<g id=\"edge4\" class=\"edge\">\n",
+       "<title>17816670189519931537&#45;&gt;7773804816832915623</title>\n",
+       "<path fill=\"none\" stroke=\"#000000\" d=\"M140.8315,-146.4551C148.0431,-137.6809 156.8417,-126.9759 164.751,-117.353\"/>\n",
+       "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"167.4854,-119.5382 171.1312,-109.5904 162.0776,-115.0934 167.4854,-119.5382\"/>\n",
+       "</g>\n",
+       "<!-- 13033141602480874210&#45;&gt;2528525675489518808 -->\n",
+       "<g id=\"edge2\" class=\"edge\">\n",
+       "<title>13033141602480874210&#45;&gt;2528525675489518808</title>\n",
+       "<path fill=\"none\" stroke=\"#000000\" d=\"M57.8539,-73.4551C53.5846,-65.0319 48.4135,-54.8292 43.6914,-45.5128\"/>\n",
+       "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"46.812,-43.9277 39.1691,-36.5904 40.5682,-47.0924 46.812,-43.9277\"/>\n",
+       "</g>\n",
+       "<!-- 11913676708980208381 -->\n",
+       "<g id=\"node4\" class=\"node\">\n",
+       "<title>11913676708980208381</title>\n",
+       "<path fill=\"none\" stroke=\"#000000\" d=\"M90,-.5C90,-.5 120,-.5 120,-.5 126,-.5 132,-6.5 132,-12.5 132,-12.5 132,-24.5 132,-24.5 132,-30.5 126,-36.5 120,-36.5 120,-36.5 90,-36.5 90,-36.5 84,-36.5 78,-30.5 78,-24.5 78,-24.5 78,-12.5 78,-12.5 78,-6.5 84,-.5 90,-.5\"/>\n",
+       "<text text-anchor=\"start\" x=\"101\" y=\"-14.3\" font-family=\"Courier New\" font-size=\"14.00\" fill=\"#000000\">a</text>\n",
+       "</g>\n",
+       "<!-- 13033141602480874210&#45;&gt;11913676708980208381 -->\n",
+       "<g id=\"edge1\" class=\"edge\">\n",
+       "<title>13033141602480874210&#45;&gt;11913676708980208381</title>\n",
+       "<path fill=\"none\" stroke=\"#000000\" d=\"M76.3933,-73.4551C80.7779,-65.0319 86.0889,-54.8292 90.9386,-45.5128\"/>\n",
+       "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"94.0703,-47.0766 95.5831,-36.5904 87.8611,-43.8445 94.0703,-47.0766\"/>\n",
+       "</g>\n",
+       "</g>\n",
+       "</svg>\n"
+      ],
+      "text/plain": [
+       "<graphviz.files.Source at 0x7f3244ccecd0>"
+      ]
+     },
+     "execution_count": 33,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "dot = graph.dot()\n",
+    "from graphviz import Source\n",
+    "src = Source(dot)\n",
+    "src"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Simplification d'expression"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "La simplification d'expression dans Miasm permet d'appliquer des règles de transformation à une expression, tant qu'il reste des règles appliquables.\n",
+    "Ce mécanisme est fait via un `ExpressionSimplifier`, implémenté dans `miasm.expression.simplifications`.\n",
+    "\n",
+    "Quelques règles de transformations basiques sont déjà présentes, et activées dans l'instance `expr_simp` du même module."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 34,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "0x10 + 0xFFFFFFFF\n",
+      "0xF\n",
+      "0x10[4:5]\n",
+      "0x1\n",
+      "a + a + -a\n",
+      "a\n",
+      "a + 0x10\n",
+      "0x10 + 0x10\n",
+      "0x20\n",
+      "a * 0x4\n"
+     ]
+    }
+   ],
+   "source": [
+    "from miasm.expression.simplifications import expr_simp\n",
+    "\n",
+    "# 0x10 + (-1) = 0xF\n",
+    "op3 = cst1 + cst2\n",
+    "print op3\n",
+    "cst3 = expr_simp(op3)\n",
+    "print cst3\n",
+    "\n",
+    "# 5ème bit de 0x10 = 1\n",
+    "sl2 = cst1[4:5]\n",
+    "print sl2\n",
+    "cst4 = expr_simp(sl2)\n",
+    "print cst4\n",
+    "\n",
+    "# a + a - a = a\n",
+    "op4 = a + a - a\n",
+    "print op4\n",
+    "print expr_simp(op4)\n",
+    "assert expr_simp(op4) == a\n",
+    "\n",
+    "# Utilisation pour évaluer une expression (ici, a + 0x10 évalué avec a = 0x10)\n",
+    "print op1\n",
+    "print op1.replace_expr({a: cst1})\n",
+    "print expr_simp(op1.replace_expr({a: cst1}))\n",
+    "print expr_simp(a + a +a + a)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Des règles de transformations peuvent être ajoutées, via `enable_passes`. Elles sont données sous la forme de fonction associées à un type d'expression.\n",
+    "\n",
+    "Ci-dessous, on cherche à ajouter des passes permettant de transformer les expressions booléennes des conditions en opération du type `<`.\n",
+    "L'expression arithmético-booléenne correspondante est:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 35,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "((x + -y) ^ ((x ^ y) & ((x + -y) ^ x)))[31:32]\n",
+      "True\n",
+      "False\n",
+      "True\n"
+     ]
+    }
+   ],
+   "source": [
+    "x = ExprId(\"x\", 32)\n",
+    "y = ExprId(\"y\", 32)\n",
+    "\n",
+    "inf_signed = ((x - y) ^ ((x ^ y) & ((x - y) ^ x)))[31:32]\n",
+    "print inf_signed\n",
+    "\n",
+    "def is_inf(x_val, y_val):\n",
+    "    new_val = expr_simp(inf_signed.replace_expr({\n",
+    "        x: x_val,\n",
+    "        y: y_val,\n",
+    "    }))\n",
+    "    assert new_val.is_int()\n",
+    "    return int(new_val) == 1\n",
+    "\n",
+    "# 0 < 10\n",
+    "print is_inf(ExprInt(0, 32), ExprInt(10, 32))\n",
+    "# 10 !< 10\n",
+    "print is_inf(ExprInt(10, 32), ExprInt(10, 32))\n",
+    "# -1 < 0\n",
+    "print is_inf(ExprInt(0xFFFFFFFF, 32), ExprInt(0, 32))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "On ajoute les règles de transformations correspondantes (déjà implémentée dans le framework, mais non activées par défaut) :"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 36,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{<class 'miasm.expression.expression.ExprCond'>: [<function expr_simp_equal at 0x7f3244ca7cf8>],\n",
+      " <class 'miasm.expression.expression.ExprSlice'>: [<function expr_simp_inf_signed at 0x7f3244ca7b90>,\n",
+      "                                                   <function expr_simp_inf_unsigned_inversed at 0x7f3244ca7c08>],\n",
+      " <class 'miasm.expression.expression.ExprOp'>: [<function expr_simp_inverse at 0x7f3244ca7c80>]}\n",
+      "(((x ^ y) & (x ^ (x + -y))) ^ (x + -y))[31:32]\n",
+      "x <s y\n"
+     ]
+    }
+   ],
+   "source": [
+    "from pprint import pprint as pp\n",
+    "from miasm.expression.simplifications import ExpressionSimplifier\n",
+    "pp(ExpressionSimplifier.PASS_COND)\n",
+    "\n",
+    "print expr_simp(inf_signed)\n",
+    "expr_simp_cond = ExpressionSimplifier()\n",
+    "expr_simp_cond.enable_passes(ExpressionSimplifier.PASS_COND)\n",
+    "print expr_simp_cond(inf_signed)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Exercice 1 : ajout d'une règle de transformation"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Le but est maintenant d'ajouter notre propre règle de simplification, qui est la suivante (informellement) :\n",
+    "\n",
+    "*Un décalage à gauche de *n* et composé avec un décalage à droite de *taille - n* est une rotation à gauche de *n.\n",
+    "\n",
+    "Autrement dit, `a << 14 | a >> 18` devient `a <<< 14` (si `a` est sur 32 bits, où `<<<` est l'opération de rotation à gauche)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Une règle de transformation se présente sous la forme d'une fonction, qui prend en paramètre :\n",
+    "* l'instance actuelle de l'`ExpressionSimplifier` utilisé (pour lancer des simplifications récursivement si besoin)\n",
+    "* l'expression à simplifier\n",
+    "\n",
+    "La fonction doit **toujours retourner une expression**. Si elle ne fait aucun changement, elle retournera donc directement son deuxième argument.\n",
+    "\n",
+    "Une règle de transformation doit retourner une *nouvelle* expression. En effet, les expressions de Miasm sont immutables, il faut donc recréer une nouvelle expression pour pouvoir appliquer une modification."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "a = ExprId(\"a\", 32)\n",
+    "b = ExprId(\"b\", 32)\n",
+    "cst1 = ExprInt(16, 32)\n",
+    "\n",
+    "# Vecteurs de tests\n",
+    "tests = [\n",
+    "    # (entrée, sortie attendue)\n",
+    "    (a, a),\n",
+    "    (a << ExprInt(4, 32), (a << ExprInt(4, 32))),\n",
+    "    ((a << ExprInt(4, 32) | a >> ExprInt(28, 32)), ExprOp(\"<<<\", a, ExprInt(4, 32))),\n",
+    "    ((a << ExprInt(4, 32) | a >> ExprInt(27, 32)), (a << ExprInt(4, 32) | a >> ExprInt(27, 32))),\n",
+    "    ((a >> ExprInt(28, 32) | a << ExprInt(4, 32)), ExprOp(\"<<<\", a, ExprInt(4, 32))),\n",
+    "]\n",
+    "\n",
+    "# Vérification\n",
+    "def check(tests, custom_expr_simp):\n",
+    "    for inp, out in tests:\n",
+    "        print inp\n",
+    "        got = custom_expr_simp(inp)\n",
+    "        print got\n",
+    "        assert out.canonize() == got.canonize()\n",
+    "\n",
+    "from collections import Counter\n",
+    "\n",
+    "def masimplification(simp_engine, expr):\n",
+    "    # TODO\n",
+    "    return expr\n",
+    "\n",
+    "from miasm.expression.simplifications import ExpressionSimplifier\n",
+    "simp_engine = ExpressionSimplifier()\n",
+    "\n",
+    "simp_engine.enable_passes({\n",
+    "    ExprOp: [masimplification]\n",
+    "})\n",
+    "\n",
+    "# Lancement des tests\n",
+    "check(tests, simp_engine)\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Ajout d'un test un peu plus évolué\n",
+    "\n",
+    "tests += [\n",
+    "    ((a << ExprInt(4, 32)) | a >> (ExprInt(32, 32) - ExprInt(4, 32)), ExprOp(\"<<<\", a, ExprInt(4, 32))),\n",
+    "]\n",
+    "\n",
+    "# TODO: Activation des passes de propagation de constante\n",
+    "\n",
+    "# Lancement des tests\n",
+    "check(tests, simp_engine)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Matching d'expression"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Pour faciliter l'écriture de règle de transformation, Miasm embarque un mini moteur \"d'expression régulière\" sur les expressions, appelé `match_expr`.\n",
+    "\n",
+    "Ses arguments sont :\n",
+    "* L'expression que l'on veut analyser\n",
+    "* Une expression décrivant ce que l'on veut matcher (le *pattern*)\n",
+    "* La liste des *jokers*, c'est à dire les expressions qui peuvent être remplacées dans le *pattern*\n",
+    "\n",
+    "Par exemple, si l'on veut matcher l'expression `X + (X * Y) + EAX`, où `X` et `Y` sont des *placeholders*, on va utiliser :\n",
+    "\n",
+    "`match_expr(expr_to_match, X + (X * Y) + EAX, [X, Y])`.\n",
+    "\n",
+    "Le résultat est un dictionnaire associant chaque joker avec la sous-expression correspondante.\n",
+    "Il est capable de faire varier l'ordre des expressions lorsque l'opérateur est commutatif."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Exercice 2 : Matching d'expression"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Implémentez la règle de simplification suivante, à l'aide d'un MatchExpr :\n",
+    "`((x & y) + (x | y))` -> `(x + y)`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 24,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "(x & y) + (x | y)\n",
+      "{ExprId('X', 32): ExprId('x', 32), ExprId('Y', 32): ExprId('y', 32)}\n",
+      "x + y\n",
+      "(x & y) + (x & y)\n",
+      "(x & y) + (x & y)\n",
+      "((0x10 >> a) | (a ^ 0x10)) + ((a ^ 0x10) & (0x10 >> a))\n",
+      "{ExprId('X', 32): ExprOp('>>', ExprInt(0x10, 32), ExprId('a', 32)), ExprId('Y', 32): ExprOp('^', ExprId('a', 32), ExprInt(0x10, 32))}\n",
+      "(0x10 >> a) + (a ^ 0x10)\n"
+     ]
+    }
+   ],
+   "source": [
+    "from miasm.expression.expression import match_expr\n",
+    "x = ExprId(\"x\", 32)\n",
+    "y = ExprId(\"y\", 32)\n",
+    "# Vecteur de tests\n",
+    "tests = [\n",
+    "    (((x & y) + (x | y)), (x + y)),\n",
+    "    (((x & y) + (x & y)), ((x & y) + (x & y))),\n",
+    "    (((cst1 >> a) | (a ^ cst1)) + (((a ^ cst1) & (cst1 >> a))), (cst1 >> a) + (a ^ cst1)),\n",
+    "]\n",
+    "\n",
+    "# jokers\n",
+    "X = ExprId(\"X\", 32)\n",
+    "Y = ExprId(\"Y\", 32)\n",
+    "\n",
+    "def my(e_s, expr):\n",
+    "    # TODO: règle de transformation\n",
+    "    res = match_expr(...)\n",
+    "    if res:\n",
+    "        return res[X] + res[Y]\n",
+    "    return expr\n",
+    "\n",
+    "simp_engine = ExpressionSimplifier()\n",
+    "simp_engine.enable_passes({ExprOp:[my]})\n",
+    "\n",
+    "check(tests, simp_engine)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Pour aller plus loin..."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Quelques fonctionnalités relative aux expressions mais non détaillées ici (se référer aux exemples) :\n",
+    "* `SymbolicExecutionEngine` : émulation symbolique\n",
+    "* `Translators` : traduction des expressions vers du C, Python, \"Miasm like\", z3\n",
+    "* `expr_range` : Analyse du range de valeurs possibles d'une expression\n",
+    "* `AssignBlock`, `IRBlock`, `DiGraphDefUse`, `dead_simp`, ... : accumulation d'expression pour la description d'effet de bord d'un programme, et traitements associés\n",
+    "* `miasm.arch.*.sem.py`, `SemBuilder` : description des sémantique des architectures, c'est à dire des effets de bords associés à un mnémonique"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 1
+}