44 files changed, 906 insertions, 160 deletions
diff --git a/.gitignore b/.gitignore index 6fc6b959..34b94382 100644 --- a/.gitignore +++ b/.gitignore @@ -1,11 +1,18 @@ # Build directory /build/* +dist/ +sdists/ # Emacs files *~ # Compiled python files +__pycache__/ *.py[cod] # Generated files *.egg* **.dot **.so -VERSION \ No newline at end of file +VERSION +# Virtual environments +venv*/ +.env/ +.venv*/ \ No newline at end of file diff --git a/README.md b/README.md index 50b8de59..63c1d4b9 100644 --- a/README.md +++ b/README.md @@ -516,7 +516,7 @@ instance to emulate library functions effects. Documentation ============= -TODO +Some documentation ressources are available in the [doc](doc) folder. An auto-generated documentation is available: * [Doxygen](http://miasm.re/miasm_doxygen) diff --git a/doc/README.md b/doc/README.md index e9c1c9d2..e125491c 100644 --- a/doc/README.md +++ b/doc/README.md @@ -36,6 +36,7 @@ class LocationDB(builtins.object) - examples for the main features (see `/example`) - interactive tutorials (IPython Notebooks) on the following topics: + - Emulation API: [notebook](jitter/jitter.ipynb) - Miasm's IR bricks known as `Expr`: [notebook](expression/expression.ipynb) - Lifting from assembly to IR: [notebook](ir/lift.ipynb) - `LocationDB` usage, the database for locations: [notebook](locationdb/locationdb.ipynb) diff --git a/doc/ir/lift.ipynb b/doc/ir/lift.ipynb index aaa20a0b..56ce6fbd 100644 --- a/doc/ir/lift.ipynb +++ b/doc/ir/lift.ipynb @@ -1729,7 +1729,7 @@ "\n", "```\n", "\n", - "This is the generic code used in `x86_64` to model function calls. But you can finely model functions. For example, suppose you are analysing code on `x86_32` with `stdcall` convention. Suppose you know the callee clean its stack arguments. Supppose as well you know for each function how many arguments it has. 
You can then customize the model to match the callee and compute the correct stack modification, as well as getting the arguments from stack:\n", + "This is the generic code used in `x86_64` to model function calls. But you can finely model functions. For example, suppose you are analysing code on `x86_32` with `stdcall` convention. Suppose you know the callee clean its stack arguments. Suppose as well you know for each function how many arguments it has. You can then customize the model to match the callee and compute the correct stack modification, as well as getting the arguments from stack:\n", "\n", "\n", "\n" diff --git a/doc/jitter/jitter.ipynb b/doc/jitter/jitter.ipynb new file mode 100644 index 00000000..adab4c5b --- /dev/null +++ b/doc/jitter/jitter.ipynb @@ -0,0 +1,543 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Miasm Emulation Engine\n", + "\n", + "This short document provides an introduction to the APIs of the Miasm emulation engine. This emulation engine is commonly referred to as **jitter** throughout the project's code (as it is based on [JiT](https://en.wikipedia.org/wiki/Just-in-time_compilation) methods)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The first step is to obtain an instance of the jitter. To do this, we will go through an intermediate object, `Machine`. It will allow us to instantiate a set of elements linked to a given architecture supported by Miasm (x86, arm, mips, ...)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['arml',\n", + " 'armb',\n", + " 'armtl',\n", + " 'armtb',\n", + " 'sh4',\n", + " 'x86_16',\n", + " 'x86_32',\n", + " 'x86_64',\n", + " 'msp430',\n", + " 'mips32b',\n", + " 'mips32l',\n", + " 'aarch64l',\n", + " 'aarch64b',\n", + " 'ppc32b',\n", + " 'mepl',\n", + " 'mepb']" + ] + }, + "execution_count": 1, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from miasm.analysis.machine import Machine\n", + "Machine.available_machine()" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "x86_32\n" + ] + } + ], + "source": [ + "machine = Machine(\"x86_32\")\n", + "print(machine.name)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `Machine` class in Miasm provides various interfaces:\n", + "\n", + "* `jitter`: an emulation engine\n", + "* `dis_engine`: a disassembly engine\n", + "* `lifter`: a lifting engine, to lift assembly code to the corresponding Miasm internal representation (IR)\n", + "* `lifter_model_call`: as `lifter`, but function calls are abstracted\n", + "* `mn`: low level object to interact with an architecture (assembly, disassembly of only a few bytes, etc.)\n", + "* `gdbserver`: a GDB remote debugging server (to link with a `jitter` instance)\n", + "\n", + "The IR related objects are already discussed in [this notebook](../ir/lift.ipynb)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To obtain an emulator instance, we first need to:\n", + "\n", + "1. instanciate a *symbol table* (`LocationDB` in Miasm, already explained in [this notebook](../locationdb/locationdb.ipynb)) ;\n", + "1. 
(optionnaly) choose a backed for the emulation (more specifically, to JiT the code):\n", + "\n", + " * \"python\": a backend entirely in Python, slow\n", + " * \"gcc\": a backend based on the compiler GCC (Assembly $\\rightarrow$ IR Miasm $\\rightarrow$ C $\\rightarrow$ GCC), fast\n", + " * \"llvm\": a backend based on the LLVM JiT engine (Assembly $\\rightarrow$ IR Miasm $\\rightarrow$ IR LLVM $\\rightarrow$ LLVM JiT), fast\n", + "\n", + "Note: using a JiT backend as `llvm` or `gcc` greatly improve the performance over the `python` one. As a result, without any argument, instanciating an emulator will automatically try to look for and choose the best suited backend. " + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "<miasm.arch.x86.jit.jitter_x86_32 object at 0x7f195fe07d60>\n" + ] + } + ], + "source": [ + "from miasm.core.locationdb import LocationDB\n", + "\n", + "loc_db = LocationDB()\n", + "jitter = machine.jitter(loc_db)\n", + "print(jitter)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For now, our emulator is an empty box. It has:\n", + "\n", + "* registers, reachable from the `.cpu` attribute. These are initialized to 0.\n", + "* a virtual memory, reachable from the `.vm` attribute. It starts empty." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### CPU state\n", + "\n", + "Let's manipulate a few registers:" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0\n", + "1\n" + ] + } + ], + "source": [ + "# Read a register\n", + "print(jitter.cpu.EAX)\n", + "\n", + "# Write a register\n", + "jitter.cpu.EAX = 1\n", + "print(jitter.cpu.EAX)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "A few *helpers* are also available. 
For instance, one can get every registers using:" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{'RAX': 1, 'RBX': 0, 'RCX': 0, 'RDX': 0, 'RSI': 0, 'RDI': 0, 'RSP': 0, 'RBP': 0, 'R8': 0, 'R9': 0, 'R10': 0, 'R11': 0, 'R12': 0, 'R13': 0, 'R14': 0, 'R15': 0, 'RIP': 0, 'zf': 0, 'nf': 0, 'pf': 0, 'of': 0, 'cf': 0, 'af': 0, 'df': 0, 'ES': 0, 'CS': 0, 'SS': 0, 'DS': 0, 'FS': 0, 'GS': 0, 'MM0': 0, 'MM1': 0, 'MM2': 0, 'MM3': 0, 'MM4': 0, 'MM5': 0, 'MM6': 0, 'MM7': 0, 'XMM0': 0, 'XMM1': 0, 'XMM2': 0, 'XMM3': 0, 'XMM4': 0, 'XMM5': 0, 'XMM6': 0, 'XMM7': 0, 'XMM8': 0, 'XMM9': 0, 'XMM10': 0, 'XMM11': 0, 'XMM12': 0, 'XMM13': 0, 'XMM14': 0, 'XMM15': 0, 'tsc': 1234605616436508552}\n" + ] + } + ], + "source": [ + "# GPReg : General Purpose registers\n", + "regs = jitter.cpu.get_gpreg()\n", + "print(regs)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note: in Miasm, `x86_32` implementation is merged with the `x86_64` one. As a result, the register names are the x64 ones. Still, the computation is made accordingly to the 32bits implementation, and registers can be accessed either using the x64 name or the 32bits name (ie `.EAX` and `.RAX`)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Virtual memory\n", + "\n", + "Initially, the virtual memory is empty. The *repr* output offers a summary of the availables pages: " + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Addr Size Access Comment" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "jitter.vm" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Memory pages can then be added.\n", + "The **memory address is arbitrary** (no alignment requirement). 
**The size is also arbitrary** (byte precision).\n", + "\n", + "Optionnaly, a comment associated to the page can be provided." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Addr Size Access Comment\n", + "0x1000 0x1000 RWX test page\n", + "0x112233 0x666 RWX no alignment, byte precision" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from miasm.jitter.csts import PAGE_READ, PAGE_WRITE, PAGE_EXEC\n", + "jitter.vm.add_memory_page(0x1000, PAGE_READ | PAGE_WRITE | PAGE_EXEC, b\"\\x00\" * 0x1000, \"test page\")\n", + "jitter.vm.add_memory_page(0x112233, PAGE_READ | PAGE_WRITE | PAGE_EXEC, b\"\\x00\" * 0x666, \"no alignment, byte precision\")\n", + "jitter.vm" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The memory can now be accessed, read and write:" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "b'\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00'" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "jitter.vm.get_mem(0x1000, 0x10)" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "b'toto\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00'" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "jitter.vm.set_mem(0x1000, b\"toto\")\n", + "jitter.vm.get_mem(0x1000, 0x10)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note: having a byte-precision memory has some advantages. For instance:\n", + "* when mapping a partially known structure, instead of defaulting unknown fields to 0, we can instead only map in memory the known field. 
As a result, we obtain a sparse memory layout and if the programm try to read an unknown field, the execution will stop right on the responsible instruction, helping the reversing work\n", + "* allocation can be made byte-wise. Thus, it could be easier to detect overflows and underflows" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Emulation\n", + "\n", + "We will now run the emulator.\n", + "First, actual instructions are needed:" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [], + "source": [ + "jitter.vm.set_mem(0x1000, bytes.fromhex(\"B844332211C3\"))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We will also enable debug logging.\n", + "By default, the logger will logs executed instruction and register values." + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [], + "source": [ + "jitter.set_trace_log()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": true + }, + "source": [ + "The emulation is initialized with `jitter.init_run(address)` and resumed with `jitter.continue_run()`.\n", + "For convenience, `jitter.run(address)` is usually used, wrapping these two API." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": { + "scrolled": false + }, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "[WARNING ]: [Errno cannot get mem ad] 0x1337beef\n", + "WARNING: address 0x1337BEEF is not mapped in virtual memory:\n", + "[WARNING ]: cannot disasm at 1337BEEF\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "00001000 MOV EAX, 0x11223344\n", + "EAX 11223344 EBX 00000000 ECX 00000000 EDX 00000000 ESI 00000000 EDI 00000000 ESP 88880FFC EBP 00000000 EIP 00001005 zf 0 nf 0 of 0 cf 0\n", + "00001005 RET \n", + "EAX 11223344 EBX 00000000 ECX 00000000 EDX 00000000 ESI 00000000 EDI 00000000 ESP 88881000 EBP 00000000 EIP 1337BEEF zf 0 nf 0 of 0 cf 0\n" + ] + } + ], + "source": [ + "jitter.vm.add_memory_page(0x88880000, PAGE_READ | PAGE_WRITE | PAGE_EXEC, b\"\\x00\" * 0x1000, \"stack\")\n", + "jitter.vm.set_mem(0x88880000 + 0x1000 - 4, b\"\\xef\\xbe\\x37\\x13\")\n", + "jitter.cpu.ESP = 0x88880000 + 0x1000 - 4\n", + "\n", + "jitter.run(0x1000)\n", + "\n", + "# The execution ends with an error, which is expected.\n", + "# Indeed, we RET on 0x1337beef, which is not mapped in memory,\n", + "# hence the \"WARNING: address 0x1337BEEF is not mapped in virtual memory\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Breakpoints\n", + "\n", + "One can register breakpoints to be raised just before a given address will be executed.\n", + "\n", + "As we control the CPU, the breakpoint implementation is invisible from the emulated environnement and we can use an arbitrary number of them." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": {}, + "outputs": [], + "source": [ + "# Breakpoints are actually callbacks which receive the emulation instance in argument\n", + "def hello_world(jitter):\n", + " print(\"Hello, world!\")\n", + " print(\"EAX value is %d\" % jitter.cpu.EAX)\n", + " # Stop execution right here\n", + " return False\n", + "\n", + "jitter.add_breakpoint(0x1005, hello_world)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If a breakpoint callback returns `True`, the emulation continues.\n", + "Otherwise, the emulation will stop and the value returned by the `jitter.continue_run()` call is the callback return value.\n", + "\n", + "Note: if one need to add more arguments to the callback, there are many Pythonic way to do it.\n", + "For instance, one can use `lambda` to capture an argument or object instance (ie. `jitter.add_breakpoint(address, obj.callback)` with a `def callback(self, jitter)` inside the `obj` definition).\n", + "\n", + "The breakpoint mechanism is used to implement several features. For instance:\n", + "* it can be used to hook a function, by breakpointing on the function address, emulating a few side effects then returning to the return address. Indeed, modifying the programm counter (usually using `jitter.pc = XXX`) in a breakpoint will resume the execution on this new address\n", + "* in Miasm examples, a breakpoint on a fake address is often used to properly stop the emulation. The callback is usually named `code_sentinelle` and allow to get back control after the emulation of a code snippet" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "collapsed": true + }, + "source": [ + "### Exercices\n", + "\n", + "For the interested reader, this section introduce a few exercice to practice the API.\n", + "\n", + "#### Exercice 1\n", + "\n", + "Starting from a new instance:\n", + "1. Add a memory page which will act as a stack page\n", + "1. 
Update the stack pointer accordingly\n", + "1. Manually push a fake return address\n", + "1. Map a `RET` instruction (0xC3) in another page\n", + "1. Run the emulation and ensure it finish on your fake address" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Exercice 1 - Spoiler\n", + "\n", + "A few helpers are actually availables to reduce the burden of these manual actions:" + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0x1000\n" + ] + } + ], + "source": [ + "# Init a stack and set stack pointer accordingly (architecture agnostic way of writing)\n", + "jitter.init_stack()\n", + "\n", + "# Push 0x1000 on the stack\n", + "jitter.push_uint32_t(0x1000)\n", + "\n", + "# Pop 0x1000 from the stack\n", + "print(hex(jitter.pop_uint32_t()))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Exercice 2\n", + "\n", + "Same a Exercice 1, but instead of ending on an error, we will this time add a breakpoint on the fake return address to properly stop the execution." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "#### Exercice 2 - Spoiler\n", + "\n", + "The example `jitter/x86_32.py` is exactly doing this." 
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/doc/locationdb/locationdb.ipynb b/doc/locationdb/locationdb.ipynb index 33a18930..09e47ca6 100644 --- a/doc/locationdb/locationdb.ipynb +++ b/doc/locationdb/locationdb.ipynb @@ -5,13 +5,13 @@ "metadata": {}, "source": [ "# LocationDB object \n", - "The `LocationDB` is the Miasm object responsible of the symbols' management. A `Location` is an object representing a code or data (or anywhere else) position. As the name explicits it, the `LocationDB` is a database of locations. Here are some rules:\n", + "The `LocationDB` is the Miasm object responsible of the symbols' management. A `Location` is an object representing a code or data (or anywhere else) position. As the name says, the `LocationDB` is a database of locations. 
Here are some rules:\n", "- each location has exactly *one* associated `LocKey`\n", "- a `LocKey` is linked to a unique `LocationDB` (and must not be used in another `LocationDB`)\n", "- a `LocKey` is very similar to primary key object in a database.\n", "- a `LocKey` can have an optional *offset*.\n", "- a `LocKey` can have multiple symbol names\n", - "- two `Lockey`s cannot share an identic offset\n", + "- two `Lockey`s cannot share an identical offset\n", "- two `LocKey`s cannot share a symbol name\n", "\n", "Below are manipulations of the `LocationDB`" diff --git a/example/disasm/dis_binary.py b/example/disasm/dis_binary.py index af140f28..cf927adb 100644 --- a/example/disasm/dis_binary.py +++ b/example/disasm/dis_binary.py @@ -8,7 +8,7 @@ fdesc = open(sys.argv[1], 'rb') loc_db = LocationDB() # The Container will provide a *bin_stream*, bytes source for the disasm engine -# It will prodive a view from a PE or an ELF. +# It will provide a view from a PE or an ELF. cont = Container.from_stream(fdesc, loc_db) # The Machine, instantiated with the detected architecture, will provide tools diff --git a/example/jitter/memory_breakpoint.py b/example/jitter/memory_breakpoint.py index 900b1621..291682a4 100644 --- a/example/jitter/memory_breakpoint.py +++ b/example/jitter/memory_breakpoint.py @@ -34,15 +34,15 @@ sb.jitter.vm.add_memory_breakpoint(address, size, access_type) def memory_breakpoint_handler(jitter): memory_read = jitter.vm.get_memory_read() if len(memory_read) > 0: - print("Read at instruction 0x%s:" % jitter.pc) + print("Read at instruction 0x%x:" % jitter.pc) for start_address, end_address in memory_read: - print("- from %s to %s" % (hex(start_address), hex(end_address))) + print("- from 0x%x to 0x%x" % (start_address, end_address)) memory_write = jitter.vm.get_memory_write() if len(memory_write) > 0: - print("Write at instruction 0x%s:" % jitter.pc) + print("Write at instruction 0x%x:" % jitter.pc) for start_address, end_address in memory_write: - 
print("- from %s to %s" % (hex(start_address), hex(end_address))) + print("- from 0x%x to 0x%x" % (start_address, end_address)) # Cleanup jitter.vm.set_exception(jitter.vm.get_exception() ^ EXCEPT_BREAKPOINT_MEMORY) diff --git a/example/jitter/unpack_generic.py b/example/jitter/unpack_generic.py index 3329d2a9..e389a4b5 100644 --- a/example/jitter/unpack_generic.py +++ b/example/jitter/unpack_generic.py @@ -35,7 +35,7 @@ def stop(jitter): if options.oep: # Set callbacks sb.jitter.add_breakpoint(int(options.oep, 0), stop) - + # Run until an error is encountered - IT IS UNLIKELY THE ORIGINAL ENTRY POINT try: sb.run() diff --git a/example/symbol_exec/symbol_exec.py b/example/symbol_exec/symbol_exec.py index 1d3f4576..f11f095a 100644 --- a/example/symbol_exec/symbol_exec.py +++ b/example/symbol_exec/symbol_exec.py @@ -54,5 +54,5 @@ cur_addr = symb.run_at(ircfg, addr, step=options.steps) # Modified elements print('Modified registers:') symb.dump(mems=False) -print('Modified memory (should be empty):') +print('Modified memory:') symb.dump(ids=False) diff --git a/miasm/__init__.py b/miasm/__init__.py index 417a6268..309a1ae7 100644 --- a/miasm/__init__.py +++ b/miasm/__init__.py @@ -40,13 +40,13 @@ def _version_from_git_describe(): if process.returncode == 0: tag = out.decode().strip() - match = re.match('^v?(.+?)-(\\d+)-g[a-f0-9]+$', tag) + match = re.match(r'^v?(.+?)-(\d+)-g[a-f0-9]+$', tag) if match: # remove the 'v' prefix and add a '.devN' suffix return '%s.dev%s' % (match.group(1), match.group(2)) else: # just remove the 'v' prefix - return re.sub('^v', '', tag) + return re.sub(r'^v', '', tag) else: raise subprocess.CalledProcessError(process.returncode, err) @@ -71,7 +71,7 @@ def _version(): # See 'man gitattributes' for more details. 
git_archive_id = '$Format:%h %d$' sha1 = git_archive_id.strip().split()[0] - match = re.search('tag:(\\S+)', git_archive_id) + match = re.search(r'tag:(\S+)', git_archive_id) if match: return "git-archive.dev" + match.group(1) elif sha1: diff --git a/miasm/analysis/data_flow.py b/miasm/analysis/data_flow.py index 06453264..23d0b3dd 100644 --- a/miasm/analysis/data_flow.py +++ b/miasm/analysis/data_flow.py @@ -1910,7 +1910,7 @@ class State(object): def may_interfer(self, dsts, src): """ - Return True is @src may interfer with expressions in @dsts + Return True if @src may interfere with expressions in @dsts @dsts: Set of Expressions @src: expression to test """ @@ -2085,7 +2085,7 @@ class State(object): to_del = set() for node in list(classes.nodes()): if self.may_interfer(dsts, node): - # Interfer with known equivalence class + # Interfere with known equivalence class self.equivalence_classes.del_element(node) if node.is_id() or node.is_mem(): self.undefined.add(node) @@ -2137,7 +2137,7 @@ class State(object): undefined = set(node for node in self.undefined if node.is_id() or node.is_mem()) undefined.update(set(node for node in other.undefined if node.is_id() or node.is_mem())) # Should we compute interference between srcs and undefined ? 
- # Nop => should already interfer in other state + # Nop => should already interfere in other state components1 = classes1.get_classes() components2 = classes2.get_classes() @@ -2173,7 +2173,7 @@ class State(object): continue if common: # Intersection contains multiple nodes - # Here, common nodes don't interfer with any undefined + # Here, common nodes don't interfere with any undefined nodes_ok.update(common) out.append(common) diff = component1.difference(common) diff --git a/miasm/analysis/debugging.py b/miasm/analysis/debugging.py index f114d901..d5f59d49 100644 --- a/miasm/analysis/debugging.py +++ b/miasm/analysis/debugging.py @@ -377,7 +377,7 @@ class DebugCmd(cmd.Cmd, object): args = arg.split(" ") if args[-1].lower() not in ["on", "off"]: - self.print_warning("/!\ %s not in 'on' / 'off'" % args[-1]) + self.print_warning("[!] %s not in 'on' / 'off'" % args[-1]) return mode = args[-1].lower() == "on" d = {} diff --git a/miasm/analysis/dse.py b/miasm/analysis/dse.py index 5e6c4e8d..11674734 100644 --- a/miasm/analysis/dse.py +++ b/miasm/analysis/dse.py @@ -234,7 +234,7 @@ class DSEEngine(object): def handle(self, cur_addr): r"""Handle destination @cur_addr: Expr of the next address in concrete execution - /!\ cur_addr may be a loc_key + [!] cur_addr may be a loc_key In this method, self.symb is in the "just before branching" state """ @@ -475,7 +475,7 @@ class DSEEngine(object): @cpu: (optional) if set, update registers' value @mem: (optional) if set, update memory value - /!\ all current states will be loss. + [!] all current states will be loss. This function is usually called when states are no more synchronized (at the beginning, returning from an unstubbed syscall, ...) 
""" diff --git a/miasm/arch/aarch64/sem.py b/miasm/arch/aarch64/sem.py index 8cbab90b..eaa01228 100644 --- a/miasm/arch/aarch64/sem.py +++ b/miasm/arch/aarch64/sem.py @@ -179,7 +179,7 @@ system_regs = { (3, 0, 2, 3, 0): APGAKeyLo_EL1, (3, 0, 2, 3, 1): APGAKeyHi_EL1, - + (3, 0, 4, 1, 0): SP_EL0, (3, 0, 4, 6, 0): ICC_PMR_EL1, # Alias ICV_PMR_EL1 @@ -285,7 +285,7 @@ system_regs = { (3, 0, 0, 0, 1): CTR_EL0, (3, 3, 0, 0, 7): DCZID_EL0, - + (3, 3, 4, 4, 0): FPCR, (3, 3, 4, 4, 1): FPSR, @@ -1578,13 +1578,13 @@ def msr(ir, instr, arg1, arg2, arg3, arg4, arg5, arg6): e.append(ExprAssign(zf, arg6[30:31])) e.append(ExprAssign(cf, arg6[29:30])) e.append(ExprAssign(of, arg6[28:29])) - + elif arg1.is_int(3) and arg2.is_int(3) and arg3.is_id("c4") and arg4.is_id("c2") and arg5.is_int(7): e.append(ExprAssign(tco, arg6[25:26])) elif arg1.is_int(3) and arg2.is_int(3) and arg3.is_id("c4") and arg4.is_id("c2") and arg5.is_int(0): e.append(ExprAssign(dit, arg6[24:25])) - + elif arg1.is_int(3) and arg2.is_int(0) and arg3.is_id("c4") and arg4.is_id("c2") and arg5.is_int(4): e.append(ExprAssign(uao, arg6[23:24])) @@ -1599,7 +1599,7 @@ def msr(ir, instr, arg1, arg2, arg3, arg4, arg5, arg6): e.append(ExprAssign(af, arg6[8:9])) e.append(ExprAssign(iff, arg6[7:8])) e.append(ExprAssign(ff, arg6[6:7])) - + elif arg1.is_int(3) and arg2.is_int(0) and arg3.is_id("c4") and arg4.is_id("c2") and arg5.is_int(2): e.append(ExprAssign(cur_el, arg6[2:4])) diff --git a/miasm/arch/arm/arch.py b/miasm/arch/arm/arch.py index 5ccf5eca..91c22bd5 100644 --- a/miasm/arch/arm/arch.py +++ b/miasm/arch/arm/arch.py @@ -1148,8 +1148,12 @@ class arm_op2(arm_arg): shift_op = ExprInt(amount, 32) a = regs_expr[rm] if shift_op == ExprInt(0, 32): + #rrx if shift_type == 3: self.expr = ExprOp(allshifts[4], a) + #asr, lsr + elif shift_type == 1 or shift_type == 2: + self.expr = ExprOp(allshifts[shift_type], a, ExprInt(32, 32)) else: self.expr = a else: diff --git a/miasm/arch/arm/sem.py b/miasm/arch/arm/sem.py index 
e507a045..a138ef91 100644 --- a/miasm/arch/arm/sem.py +++ b/miasm/arch/arm/sem.py @@ -2,6 +2,7 @@ from builtins import range from future.utils import viewitems, viewvalues from miasm.expression.expression import * +from miasm.expression.simplifications import expr_simp from miasm.ir.ir import Lifter, IRBlock, AssignBlock from miasm.arch.arm.arch import mn_arm, mn_armt from miasm.arch.arm.regs import * @@ -253,6 +254,15 @@ def update_flag_zn(a): return e +# Instructions which use shifter's carry flag: ANDS, BICS, EORS, MOVS/RRX, MVNS, ORNS (TODO), ORRS, TEQ, TST +def compute_rrx_carry(operation): + """ + Returns a tuple (result, carry) corresponding to the RRX computation + @operation: The ExprOp operation + """ + new_cf = operation.args[0][:1] + res = ExprCompose(operation.args[0][1:], cf) + return res, new_cf # XXX TODO: set cf if ROT imm in argument @@ -272,12 +282,12 @@ def update_flag_add_of(op1, op2): def update_flag_sub_cf(op1, op2): - "Compote CF in @op1 - @op2" + "Compute CF in @op1 - @op2" return [ExprAssign(cf, ExprOp("FLAG_SUB_CF", op1, op2) ^ ExprInt(1, 1))] def update_flag_sub_of(op1, op2): - "Compote OF in @op1 - @op2" + "Compute OF in @op1 - @op2" return [ExprAssign(of, ExprOp("FLAG_SUB_OF", op1, op2))] @@ -381,6 +391,9 @@ def update_flag_arith_subwc_co(arg1, arg2, arg3): e += update_flag_subwc_of(arg1, arg2, arg3) return e +# Utility function for flag computation when it depends on the mode +def isThumb(lifter): + return isinstance(lifter, (Lifter_Armtl, Lifter_Armtb)) def get_dst(a): @@ -395,6 +408,8 @@ def adc(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) arg1, arg2 = b, c r = b + c + cf.zeroExtend(32) if instr.name == 'ADCS' and a != PC: @@ -411,6 +426,8 @@ def add(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) arg1, arg2 = b, c r = b + c if instr.name == 'ADDS' and a != PC: @@ -424,11 +441,18 @@ def add(ir, instr, a, b, 
c=None): def l_and(ir, instr, a, b, c=None): - e = [] + setflags = (instr.name == 'ANDS') and a != PC if c is None: b, c = a, b + if c.is_op(): + e, extra_ir = _shift_rotate_tpl(ir, instr, a, c, onlyCarry=setflags) + # get back the result + c = e.pop(0).src + else: + e = [] + extra_ir = [] r = b & c - if instr.name == 'ANDS' and a != PC: + if setflags: e += [ExprAssign(zf, ExprOp('FLAG_EQ_AND', b, c))] e += update_flag_nf(r) @@ -436,13 +460,14 @@ def l_and(ir, instr, a, b, c=None): dst = get_dst(a) if dst is not None: e.append(ExprAssign(ir.IRDst, r)) - return e, [] - + return e, extra_ir def sub(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) r = b - c e.append(ExprAssign(a, r)) dst = get_dst(a) @@ -455,6 +480,8 @@ def subs(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) arg1, arg2 = b, c r = b - c e += update_flag_arith_sub_zn(arg1, arg2) @@ -470,6 +497,8 @@ def eor(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) r = b ^ c e.append(ExprAssign(a, r)) dst = get_dst(a) @@ -479,9 +508,16 @@ def eor(ir, instr, a, b, c=None): def eors(ir, instr, a, b, c=None): - e = [] + setflags = a != PC if c is None: b, c = a, b + if c.is_op(): + e, extra_ir = _shift_rotate_tpl(ir, instr, a, c, onlyCarry=setflags) + # get back the result + c = e.pop(0).src + else: + e = [] + extra_ir = [] arg1, arg2 = b, c r = arg1 ^ arg2 @@ -492,13 +528,15 @@ def eors(ir, instr, a, b, c=None): dst = get_dst(a) if dst is not None: e.append(ExprAssign(ir.IRDst, r)) - return e, [] + return e, extra_ir def rsb(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) arg1, arg2 = c, b r = arg1 - arg2 e.append(ExprAssign(a, r)) @@ -512,6 +550,8 @@ def rsbs(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) arg1, 
arg2 = c, b r = arg1 - arg2 e += update_flag_arith_sub_zn(arg1, arg2) @@ -527,6 +567,8 @@ def sbc(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) arg1, arg2 = b, c r = arg1 - (arg2 + (~cf).zeroExtend(32)) e.append(ExprAssign(a, r)) @@ -540,6 +582,8 @@ def sbcs(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) arg1, arg2 = b, c r = arg1 - (arg2 + (~cf).zeroExtend(32)) @@ -557,6 +601,8 @@ def rsc(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) arg1, arg2 = c, b r = arg1 - (arg2 + (~cf).zeroExtend(32)) e.append(ExprAssign(a, r)) @@ -570,6 +616,8 @@ def rscs(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) arg1, arg2 = c, b r = arg1 - (arg2 + (~cf).zeroExtend(32)) @@ -585,18 +633,32 @@ def rscs(ir, instr, a, b, c=None): def tst(ir, instr, a, b): - e = [] + setflags = a != PC + if b.is_op(): + e, extra_ir = _shift_rotate_tpl(ir, instr, a, b, onlyCarry=setflags) + # get back the result + b = e.pop(0).src + else: + e = [] + extra_ir = [] arg1, arg2 = a, b r = arg1 & arg2 e += [ExprAssign(zf, ExprOp('FLAG_EQ_AND', arg1, arg2))] e += update_flag_nf(r) - return e, [] + return e, extra_ir def teq(ir, instr, a, b, c=None): - e = [] + setflags = a != PC + if b.is_op(): + e, extra_ir = _shift_rotate_tpl(ir, instr, a, b, onlyCarry=setflags) + # get back the result + b = e.pop(0).src + else: + e = [] + extra_ir = [] if c is None: b, c = a, b arg1, arg2 = b, c @@ -605,7 +667,7 @@ def teq(ir, instr, a, b, c=None): e += [ExprAssign(zf, ExprOp('FLAG_EQ_CMP', arg1, arg2))] e += update_flag_nf(r) - return e, [] + return e, extra_ir def l_cmp(ir, instr, a, b, c=None): @@ -622,6 +684,8 @@ def cmn(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) arg1, arg2 = b, c e += 
update_flag_arith_add_zn(arg1, arg2) e += update_flag_arith_add_co(arg1, arg2) @@ -632,6 +696,8 @@ def orr(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) r = b | c e.append(ExprAssign(a, r)) dst = get_dst(a) @@ -644,6 +710,8 @@ def orn(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) r = ~(b | c) e.append(ExprAssign(a, r)) dst = get_dst(a) @@ -653,9 +721,16 @@ def orn(ir, instr, a, b, c=None): def orrs(ir, instr, a, b, c=None): - e = [] + setflags = a != PC if c is None: b, c = a, b + if c.is_op(): + e, extra_ir = _shift_rotate_tpl(ir, instr, a, c, onlyCarry=setflags) + # get back the result + c = e.pop(0).src + else: + e = [] + extra_ir = [] arg1, arg2 = b, c r = arg1 | arg2 @@ -666,10 +741,13 @@ def orrs(ir, instr, a, b, c=None): dst = get_dst(a) if dst is not None: e.append(ExprAssign(ir.IRDst, r)) - return e, [] + return e, extra_ir def mov(ir, instr, a, b): + if b.is_op(): + return _shift_rotate_tpl(ir, instr, a, b, setflags=False) + # TODO handle cf e = [ExprAssign(a, b)] dst = get_dst(a) if dst is not None: @@ -686,10 +764,78 @@ def movt(ir, instr, a, b): return e, [] +def _shift_rotate_tpl(ir, instr, dst, shift_operation, setflags=False, is_not=False, onlyCarry=False): + """ + Template to generate a shift/rotate + A temporary basic block is generated to handle 0-shift + @dst: destination + @shift_operation: the shift/rotate operation (ExprOp) + @setflags: (optional) if set, flags are updated (ZNC) + @onlyCarry: (optional) if set, Z and N flags won't be updated except if setflags is set. 
+ @is_not: (optional) if set, behaves as MVN/MVNS + """ + op = shift_operation.op + # Compute carry (+ result for rrx) + if op == 'rrx': + res, new_cf = compute_rrx_carry(shift_operation) + shifter = ExprInt(1, 8) + elif op in ['<<', '>>', 'a>>']: + shifter = shift_operation.args[1] + if setflags or onlyCarry: + new_cf = ExprOp(op, shift_operation.args[0], shifter - ExprInt(1, size=shifter.size)) + left = op[-1] == '<' + new_cf = new_cf.msb() if left else new_cf[:1] + res = shift_operation + elif op == '>>>': + shifter = shift_operation.args[1] + if setflags or onlyCarry: + new_cf = shift_operation.msb() + res = shift_operation + else: + raise NotImplementedError(f"Unknown shift / rotate operation : {op}") + + # NOT the result and use it for ZN flags computations + if is_not: + res ^= ExprInt(-1, res.size) + # Build basic blocks + e_do = [] + e = [ExprAssign(dst, res)] + if setflags: + e += update_flag_zn(res) + if setflags or onlyCarry: + e_do += [ExprAssign(cf, expr_simp(new_cf))] + # Don't generate conditional shifter on constant + if shifter.is_int(): + if shifter.is_int(0): + # assignement + flags if setflags except cf + return (e, []) + else: + # assignement + flags if setflags + return (e + e_do, []) + + loc_do, loc_do_expr = ir.gen_loc_key_and_expr(ir.IRDst.size) + loc_skip = ir.get_next_loc_key(instr) + loc_skip_expr = ExprLoc(loc_skip, ir.IRDst.size) + isPC = get_dst(dst) + if isPC is not None: + # Not really a Loc in this case + loc_skip_expr = res + e_do.append(ExprAssign(ir.IRDst, loc_skip_expr)) + e.append(ExprAssign( + ir.IRDst, ExprCond(shifter, loc_do_expr, loc_skip_expr))) + return (e, [IRBlock(ir.loc_db, loc_do, [AssignBlock(e_do, instr)])]) + + + def movs(ir, instr, a, b): e = [] + # handle shift / rotate + if b.is_op(): + return _shift_rotate_tpl(ir, instr, a, b, setflags=a != PC) + + e.append(ExprAssign(a, b)) - # XXX TODO check + # TODO handle cf e += [ExprAssign(zf, ExprOp('FLAG_EQ', b))] e += update_flag_nf(b) @@ -700,7 +846,10 @@ def 
movs(ir, instr, a, b): def mvn(ir, instr, a, b): + if b.is_op(): + return _shift_rotate_tpl(ir, instr, a, b, setflags=False, is_not=True) r = b ^ ExprInt(-1, 32) + # TODO handle cf e = [ExprAssign(a, r)] dst = get_dst(a) if dst is not None: @@ -709,10 +858,12 @@ def mvn(ir, instr, a, b): def mvns(ir, instr, a, b): + if b.is_op(): + return _shift_rotate_tpl(ir, instr, a, b, setflags= a != PC, is_not=True) e = [] r = b ^ ExprInt(-1, 32) e.append(ExprAssign(a, r)) - # XXX TODO check + # TODO handle cf e += [ExprAssign(zf, ExprOp('FLAG_EQ', r))] e += update_flag_nf(r) @@ -765,6 +916,8 @@ def bic(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) r = b & (c ^ ExprInt(-1, 32)) e.append(ExprAssign(a, r)) dst = get_dst(a) @@ -774,9 +927,16 @@ def bic(ir, instr, a, b, c=None): def bics(ir, instr, a, b, c=None): - e = [] + setflags = a != PC if c is None: b, c = a, b + if c.is_op(): + e, extra_ir = _shift_rotate_tpl(ir, instr, a, c, onlyCarry=setflags) + # get back the result + c = e.pop(0).src + else: + e = [] + extra_ir = [] tmp1, tmp2 = b, ~c r = tmp1 & tmp2 @@ -787,13 +947,15 @@ def bics(ir, instr, a, b, c=None): dst = get_dst(a) if dst is not None: e.append(ExprAssign(ir.IRDst, r)) - return e, [] + return e, extra_ir def sdiv(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) loc_div = ExprLoc(ir.loc_db.add_location(), ir.IRDst.size) loc_except = ExprId(ir.loc_db.add_location(), ir.IRDst.size) @@ -825,6 +987,8 @@ def udiv(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) @@ -888,6 +1052,8 @@ def mul(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) r = b * c e.append(ExprAssign(a, r)) dst = get_dst(a) @@ -900,6 +1066,8 @@ def muls(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = 
compute_rrx_carry(c) r = b * c e += update_flag_zn(r) e.append(ExprAssign(a, r)) @@ -1169,100 +1337,65 @@ def und(ir, instr, a, b): e = [] return e, [] -# TODO XXX implement correct CF for shifters def lsr(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b - r = b >> c - e.append(ExprAssign(a, r)) - dst = get_dst(a) - if dst is not None: - e.append(ExprAssign(ir.IRDst, r)) - return e, [] + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) + return _shift_rotate_tpl(ir, instr, a, b >> c, setflags=False) def lsrs(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b - r = b >> c - e.append(ExprAssign(a, r)) - - e += [ExprAssign(zf, ExprOp('FLAG_EQ', r))] - e += update_flag_nf(r) - - dst = get_dst(a) - if dst is not None: - e.append(ExprAssign(ir.IRDst, r)) - return e, [] + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) + return _shift_rotate_tpl(ir, instr, a, b >> c, setflags= a != PC) def asr(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) r = ExprOp("a>>", b, c) - e.append(ExprAssign(a, r)) - dst = get_dst(a) - if dst is not None: - e.append(ExprAssign(ir.IRDst, r)) - return e, [] + return _shift_rotate_tpl(ir, instr, a, r, setflags=False) def asrs(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) r = ExprOp("a>>", b, c) - e.append(ExprAssign(a, r)) - - e += [ExprAssign(zf, ExprOp('FLAG_EQ', r))] - e += update_flag_nf(r) - - dst = get_dst(a) - if dst is not None: - e.append(ExprAssign(ir.IRDst, r)) - return e, [] + return _shift_rotate_tpl(ir, instr, a, r, setflags= a != PC) def lsl(ir, instr, a, b, c=None): e = [] if c is None: b, c = a, b - r = b << c - e.append(ExprAssign(a, r)) - dst = get_dst(a) - if dst is not None: - e.append(ExprAssign(ir.IRDst, r)) - return e, [] + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) + return _shift_rotate_tpl(ir, instr, a, b << c, setflags=False) + def lsls(ir, instr, a, b, c=None): e = [] if 
c is None: b, c = a, b - r = b << c - e.append(ExprAssign(a, r)) - - e += [ExprAssign(zf, ExprOp('FLAG_EQ', r))] - e += update_flag_nf(r) - - dst = get_dst(a) - if dst is not None: - e.append(ExprAssign(ir.IRDst, r)) - return e, [] + if c.is_op('rrx'): + c, _ = compute_rrx_carry(c) + return _shift_rotate_tpl(ir, instr, a, b << c, setflags= a != PC) def rors(ir, instr, a, b): e = [] r = ExprOp(">>>", a, b) - e.append(ExprAssign(a, r)) - - e += [ExprAssign(zf, ExprOp('FLAG_EQ', r))] - e += update_flag_nf(r) + return _shift_rotate_tpl(ir, instr, a, r, setflags= a != PC) - dst = get_dst(a) - if dst is not None: - e.append(ExprAssign(ir.IRDst, r)) - return e, [] def push(ir, instr, a): @@ -1962,9 +2095,7 @@ class Lifter_Arml(Lifter): args = instr.args # ir = get_mnemo_expr(self, self.name.lower(), *args) if len(args) and isinstance(args[-1], ExprOp): - if args[-1].op == 'rrx': - args[-1] = ExprCompose(args[-1].args[0][1:], cf) - elif (args[-1].op in ['<<', '>>', '<<a', 'a>>', '<<<', '>>>'] and + if (args[-1].op in ['<<', '>>', '<<a', 'a>>', '<<<', '>>>'] and isinstance(args[-1].args[-1], ExprId)): args[-1] = ExprOp(args[-1].op, args[-1].args[0], diff --git a/miasm/arch/mep/regs.py b/miasm/arch/mep/regs.py index b7fa2a78..be195b61 100644 --- a/miasm/arch/mep/regs.py +++ b/miasm/arch/mep/regs.py @@ -44,7 +44,7 @@ csr_names = ["PC", "LP", "SAR", "S3", "RPB", "RPE", "RPC", "HI", "LO", csr_exprs, csr_inits, csr_infos = gen_regs(csr_names, globals()) # Define aliases to control/special registers -PC = csr_exprs[0] # Program Conter. On MeP, it is the special register R0 +PC = csr_exprs[0] # Program Counter. On MeP, it is the special register R0 LP = csr_exprs[1] # Link Pointer. On MeP, it is the special register R1 SAR = csr_exprs[2] # Shift Amount Register. On MeP, it is the special register R2 RPB = csr_exprs[4] # Repeat Begin. 
On MeP, it is the special register R4 diff --git a/miasm/arch/x86/arch.py b/miasm/arch/x86/arch.py index dabd0c82..c5ff9b63 100644 --- a/miasm/arch/x86/arch.py +++ b/miasm/arch/x86/arch.py @@ -428,7 +428,7 @@ def offsize(p): def get_prefix(s): - g = re.search('(\S+)(\s+)', s) + g = re.search(r'(\S+)(\s+)', s) if not g: return None, s prefix, b = g.groups() diff --git a/miasm/arch/x86/sem.py b/miasm/arch/x86/sem.py index ffa2641c..81e45e7e 100644 --- a/miasm/arch/x86/sem.py +++ b/miasm/arch/x86/sem.py @@ -5057,7 +5057,7 @@ def ldmxcsr(ir, instr, dst): def _select4(src, control): - # Implementation inspired from Intel Intrisics Guide + # Implementation inspired from Intel Intrinsics Guide # @control is already resolved (was an immediate) if control == 0: diff --git a/miasm/core/cpu.py b/miasm/core/cpu.py index 7a1cacff..7df9f991 100644 --- a/miasm/core/cpu.py +++ b/miasm/core/cpu.py @@ -393,7 +393,7 @@ variable = pyparsing.Word(pyparsing.alphas + "_$.", pyparsing.alphanums + "_") variable.setParseAction(cb_parse_id) operand = str_int | variable -base_expr = pyparsing.operatorPrecedence(operand, +base_expr = pyparsing.infixNotation(operand, [(notop, 1, pyparsing.opAssoc.RIGHT, cb_op_not), (andop, 2, pyparsing.opAssoc.RIGHT, cb_op_and), (xorop, 2, pyparsing.opAssoc.RIGHT, cb_op_xor), @@ -408,7 +408,7 @@ default_prio = 0x1337 def isbin(s): - return re.match('[0-1]+$', s) + return re.match(r'[0-1]+$', s) def int2bin(i, l): @@ -1301,7 +1301,7 @@ class cls_mn(with_metaclass(metamn, object)): @classmethod def fromstring(cls, text, loc_db, mode = None): global total_scans - name = re.search('(\S+)', text).groups() + name = re.search(r'(\S+)', text).groups() if not name: raise ValueError('cannot find name', text) name = name[0] diff --git a/miasm/core/graph.py b/miasm/core/graph.py index 0dfd7e6a..e680894c 100644 --- a/miasm/core/graph.py +++ b/miasm/core/graph.py @@ -20,7 +20,7 @@ class DiGraph(object): # N -> Nodes N2 with a edge (N2 -> N) self._nodes_pred = {} - 
self.escape_chars = re.compile('[' + re.escape('{}') + '&|<>' + ']') + self.escape_chars = re.compile(r'[\{\}&|<>]') def __repr__(self): diff --git a/miasm/core/sembuilder.py b/miasm/core/sembuilder.py index 24470656..9843ee6a 100644 --- a/miasm/core/sembuilder.py +++ b/miasm/core/sembuilder.py @@ -22,8 +22,8 @@ class MiasmTransformer(ast.NodeTransformer): """ # Parsers - parse_integer = re.compile("^i([0-9]+)$") - parse_mem = re.compile("^mem([0-9]+)$") + parse_integer = re.compile(r"^i([0-9]+)$") + parse_mem = re.compile(r"^mem([0-9]+)$") # Visitors def visit_Call(self, node): diff --git a/miasm/core/utils.py b/miasm/core/utils.py index 41bf78c1..eb170576 100644 --- a/miasm/core/utils.py +++ b/miasm/core/utils.py @@ -26,7 +26,7 @@ COLOR_OP = "black" COLOR_MNEMO = "blue1" -ESCAPE_CHARS = re.compile('[' + re.escape('{}') + '&|<>' + ']') +ESCAPE_CHARS = re.compile(r'[\{\}&|<>]') def set_html_text_color(text, color): return '<font color="%s">%s</font>' % (color, text) diff --git a/miasm/expression/expression.py b/miasm/expression/expression.py index c507f19f..e5debb34 100644 --- a/miasm/expression/expression.py +++ b/miasm/expression/expression.py @@ -2146,7 +2146,7 @@ def expr_is_sNaN(expr): def expr_is_float_lower(op1, op2): """Return 1 on 1 bit if @op1 < @op2, 0 otherwise. - /!\ Assume @op1 and @op2 are not NaN + [!] Assume @op1 and @op2 are not NaN Comparison is the floating point one, defined in IEEE754 """ sign1, sign2 = op1.msb(), op2.msb() @@ -2160,7 +2160,7 @@ def expr_is_float_lower(op1, op2): def expr_is_float_equal(op1, op2): """Return 1 on 1 bit if @op1 == @op2, 0 otherwise. - /!\ Assume @op1 and @op2 are not NaN + [!] 
Assume @op1 and @op2 are not NaN Comparison is the floating point one, defined in IEEE754 """ sign1, sign2 = op1.msb(), op2.msb() diff --git a/miasm/expression/expression_helper.py b/miasm/expression/expression_helper.py index 299e52e6..81fc5c90 100644 --- a/miasm/expression/expression_helper.py +++ b/miasm/expression/expression_helper.py @@ -89,7 +89,7 @@ op_propag_cst = ['+', '*', '^', '&', '|', '>>', def is_pure_int(e): """ return True if expr is only composed with integers - /!\ ExprCond returns True is src1 and src2 are integers + [!] ExprCond returns True if src1 and src2 are integers """ def modify_cond(e): if isinstance(e, m2_expr.ExprCond): @@ -344,7 +344,7 @@ class ExprRandom(object): compose_max_layer = 5 # Maximum size of memory address in bits memory_max_address_size = 32 - # Re-use already generated elements to mimic a more realistic behavior + # Reuse already generated elements to mimic a more realistic behavior reuse_element = True generated_elements = {} # (depth, size) -> [Expr] @@ -444,13 +444,13 @@ class ExprRandom(object): """Internal function for generating sub-expression according to options @size: (optional) Operation size @depth: (optional) Expression depth - /!\ @generated_elements is left modified + [!] 
@generated_elements is left modified """ # Perfect tree handling if not cls.perfect_tree: depth = random.randint(max(0, depth - 2), depth) - # Element re-use + # Element reuse if cls.reuse_element and random.choice([True, False]) and \ (depth, size) in cls.generated_elements: return random.choice(cls.generated_elements[(depth, size)]) diff --git a/miasm/expression/simplifications_common.py b/miasm/expression/simplifications_common.py index 835f8723..9156ee67 100644 --- a/miasm/expression/simplifications_common.py +++ b/miasm/expression/simplifications_common.py @@ -1146,7 +1146,7 @@ def simp_cmp_bijective_op(expr_simp, expr): # a + b + c == a + b if not args_b: return ExprOp(TOK_EQUAL, ExprOp(op, *args_a), ExprInt(0, args_a[0].size)) - + arg_a = ExprOp(op, *args_a) arg_b = ExprOp(op, *args_b) return ExprOp(TOK_EQUAL, arg_a, arg_b) @@ -1275,7 +1275,7 @@ def simp_cond_eq_zero(_, expr): def simp_sign_inf_zeroext(expr_s, expr): """ - /!\ Ensure before: X.zeroExt(X.size) => X + [!] Ensure before: X.zeroExt(X.size) => X X.zeroExt() <s 0 => 0 X.zeroExt() <=s 0 => X == 0 @@ -1782,7 +1782,7 @@ def simp_bcdadd_cf(_, expr): for i in range(0,16,4): nib_1 = (arg1.arg >> i) & (0xF) nib_2 = (arg2.arg >> i) & (0xF) - + j = (carry + nib_1 + nib_2) if (j >= 10): carry = 1 @@ -1807,7 +1807,7 @@ def simp_bcdadd(_, expr): for i in range(0,16,4): nib_1 = (arg1.arg >> i) & (0xF) nib_2 = (arg2.arg >> i) & (0xF) - + j = (carry + nib_1 + nib_2) if (j >= 10): carry = 1 diff --git a/miasm/ir/ir.py b/miasm/ir/ir.py index e9b86899..d26c5d1d 100644 --- a/miasm/ir/ir.py +++ b/miasm/ir/ir.py @@ -48,7 +48,7 @@ def _expr_loc_to_symb(expr, loc_db): return m2_expr.ExprId(name, expr.size) -ESCAPE_CHARS = re.compile('[' + re.escape('{}') + '&|<>' + ']') +ESCAPE_CHARS = re.compile(r'[\{\}&|<>]') class TranslatorHtml(Translator): __LANG__ = "custom_expr_color" diff --git a/miasm/ir/translators/z3_ir.py b/miasm/ir/translators/z3_ir.py index 4b674c4e..c72ff36f 100644 --- a/miasm/ir/translators/z3_ir.py +++ 
b/miasm/ir/translators/z3_ir.py @@ -1,10 +1,11 @@ from builtins import map from builtins import range -import imp +import importlib.util import logging # Raise an ImportError if z3 is not available WITHOUT actually importing it -imp.find_module("z3") +if importlib.util.find_spec("z3") is None: + raise ImportError("No module named 'z3'") from miasm.ir.translators.translator import Translator diff --git a/miasm/jitter/jitload.py b/miasm/jitter/jitload.py index fb1c1f72..99e4429d 100644 --- a/miasm/jitter/jitload.py +++ b/miasm/jitter/jitload.py @@ -476,7 +476,7 @@ class Jitter(object): def get_exception(self): return self.cpu.get_exception() | self.vm.get_exception() - # commun functions + # common functions def get_c_str(self, addr, max_char=None): """Get C str from vm. @addr: address in memory diff --git a/miasm/jitter/loader/pe.py b/miasm/jitter/loader/pe.py index c988fc59..9af068e4 100644 --- a/miasm/jitter/loader/pe.py +++ b/miasm/jitter/loader/pe.py @@ -23,12 +23,12 @@ log.setLevel(logging.INFO) def get_pe_dependencies(pe_obj): """Collect the shared libraries upon which this PE depends. - + @pe_obj: pe object Returns a set of strings of DLL names. - + Example: - + pe = miasm.analysis.binary.Container.from_string(buf) deps = miasm.jitter.loader.pe.get_pe_dependencies(pe.executable) assert sorted(deps)[0] == 'api-ms-win-core-appcompat-l1-1-0.dll' @@ -63,12 +63,12 @@ def get_import_address_pe(e): """Compute the addresses of imported symbols. @e: pe object Returns a dict mapping from tuple (dll name string, symbol name string) to set of virtual addresses. 
- + Example: - + pe = miasm.analysis.binary.Container.from_string(buf) imports = miasm.jitter.loader.pe.get_import_address_pe(pe.executable) - assert imports[('api-ms-win-core-rtlsupport-l1-1-0.dll', 'RtlCaptureStackBackTrace')] == {0x6b88a6d0} + assert imports[('api-ms-win-core-rtlsupport-l1-1-0.dll', 'RtlCaptureStackBackTrace')] == {0x6b88a6d0} """ import2addr = defaultdict(set) if e.DirImport.impdesc is None: @@ -732,7 +732,7 @@ class ImpRecStateMachine(object): "entry_module_addr": func_addr, "entry_memory_addr": self.cur_address, } - + def transition(self, data): if self.state == self.STATE_SEARCH: if data in self.func_addrs: @@ -760,7 +760,7 @@ class ImpRecStateMachine(object): self.transition(data) else: raise ValueError() - + def run(self): while True: data, address = yield @@ -804,7 +804,7 @@ class ImpRecStrategy(object): @update_libs: if set (default), update `libs` object with founded addresses @align_hypothesis: if not set (default), do not consider import addresses are written on aligned addresses - + Return the list of candidates """ candidates = [] diff --git a/miasm/jitter/loader/utils.py b/miasm/jitter/loader/utils.py index 73809141..7f913d76 100644 --- a/miasm/jitter/loader/utils.py +++ b/miasm/jitter/loader/utils.py @@ -65,7 +65,7 @@ class libimp(object): # imp_ord_or_name = vm_get_str(imp_ord_or_name, 0x100) # imp_ord_or_name = imp_ord_or_name[:imp_ord_or_name.find('\x00')] - #/!\ can have multiple dst ad + #[!] can have multiple dst ad if not imp_ord_or_name in self.lib_imp2dstad[libad]: self.lib_imp2dstad[libad][imp_ord_or_name] = set() if dst_ad is not None: diff --git a/miasm/loader/minidump.py b/miasm/loader/minidump.py index fbb7bde5..c16473b4 100644 --- a/miasm/loader/minidump.py +++ b/miasm/loader/minidump.py @@ -388,7 +388,7 @@ class Context_AMD64(CStruct): ("MxCsr", "u32"), # Segment & processor - # /!\ activation depends on multiple flags + # [!] 
activation depends on multiple flags ("SegCs", "u16", is_activated("CONTEXT_CONTROL")), ("SegDs", "u16", is_activated("CONTEXT_SEGMENTS")), ("SegEs", "u16", is_activated("CONTEXT_SEGMENTS")), @@ -406,7 +406,7 @@ class Context_AMD64(CStruct): ("Dr7", "u64", is_activated("CONTEXT_DEBUG_REGISTERS")), # Integer registers - # /!\ activation depends on multiple flags + # [!] activation depends on multiple flags ("Rax", "u64", is_activated("CONTEXT_INTEGER")), ("Rcx", "u64", is_activated("CONTEXT_INTEGER")), ("Rdx", "u64", is_activated("CONTEXT_INTEGER")), diff --git a/miasm/loader/pe.py b/miasm/loader/pe.py index ea7cbc52..1252e70e 100644 --- a/miasm/loader/pe.py +++ b/miasm/loader/pe.py @@ -1110,7 +1110,7 @@ class DirDelay(CStruct): if isfromva(tmp_thunk[j].rva & 0x7FFFFFFF) == func: return isfromva(entry.firstthunk) + j * 4 else: - raise ValueError('unknown func tpye %r' % func) + raise ValueError('unknown func type %r' % func) def get_funcvirt(self, addr): rva = self.get_funcrva(addr) diff --git a/miasm/os_dep/linux/environment.py b/miasm/os_dep/linux/environment.py index 808fc847..3ba4382f 100644 --- a/miasm/os_dep/linux/environment.py +++ b/miasm/os_dep/linux/environment.py @@ -13,7 +13,7 @@ from miasm.core.interval import interval from miasm.jitter.csts import PAGE_READ, PAGE_WRITE -REGEXP_T = type(re.compile('')) +REGEXP_T = type(re.compile(r'')) StatInfo = namedtuple("StatInfo", [ "st_dev", "st_ino", "st_nlink", "st_mode", "st_uid", "st_gid", "st_rdev", @@ -262,7 +262,7 @@ class FileSystem(object): expr.flags, exc_info=True, ) - return re.compile('$X') + return re.compile(r'$X') return expr # Remove '../', etc. 
diff --git a/miasm/os_dep/win_api_x86_32.py b/miasm/os_dep/win_api_x86_32.py index e9c5fd4a..6e568abb 100644 --- a/miasm/os_dep/win_api_x86_32.py +++ b/miasm/os_dep/win_api_x86_32.py @@ -623,10 +623,10 @@ def kernel32_CreateFile(jitter, funcname, get_str): elif fname.upper() in ['NUL']: ret = winobjs.module_cur_hwnd else: - # sandox path + # sandbox path sb_fname = windows_to_sbpath(fname) if args.access & 0x80000000 or args.access == 1: - # read + # read and maybe write if args.dwcreationdisposition == 2: # create_always if os.access(sb_fname, os.R_OK): @@ -642,7 +642,10 @@ def kernel32_CreateFile(jitter, funcname, get_str): if stat.S_ISDIR(s.st_mode): ret = winobjs.handle_pool.add(sb_fname, 0x1337) else: - h = open(sb_fname, 'r+b') + open_mode = 'rb' + if (args.access & 0x40000000) or args.access == 2: + open_mode = 'r+b' + h = open(sb_fname, open_mode) ret = winobjs.handle_pool.add(sb_fname, h) else: log.warning("FILE %r (%s) DOES NOT EXIST!", fname, sb_fname) @@ -671,8 +674,8 @@ def kernel32_CreateFile(jitter, funcname, get_str): raise NotImplementedError("Untested case") else: raise NotImplementedError("Untested case") - elif args.access & 0x40000000: - # write + elif (args.access & 0x40000000) or args.access == 2: + # write but not read if args.dwcreationdisposition == 3: # open existing if is_original_file: @@ -684,7 +687,7 @@ def kernel32_CreateFile(jitter, funcname, get_str): # open dir ret = winobjs.handle_pool.add(sb_fname, 0x1337) else: - h = open(sb_fname, 'r+b') + h = open(sb_fname, 'wb') ret = winobjs.handle_pool.add(sb_fname, h) else: raise NotImplementedError("Untested case") # to test @@ -2452,7 +2455,7 @@ def user32_GetKeyboardType(jitter): jitter.func_ret_stdcall(ret_ad, ret) - + class startupinfo(object): """ typedef struct _STARTUPINFOA { @@ -2528,7 +2531,7 @@ def kernel32_GetStartupInfo(jitter, funcname, set_str): Retrieves the contents of the STARTUPINFO structure that was specified when the calling process was created. 
- + https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-getstartupinfow """ diff --git a/miasm/runtime/int_lib.h b/miasm/runtime/int_lib.h index 7f5eb799..873296d4 100644 --- a/miasm/runtime/int_lib.h +++ b/miasm/runtime/int_lib.h @@ -48,7 +48,7 @@ #define XSTR(a) STR(a) #define SYMBOL_NAME(name) XSTR(__USER_LABEL_PREFIX__) #name -#if defined(__ELF__) || defined(__MINGW32__) || defined(__wasm__) +#if defined(__ELF__) || defined(__MINGW32__) || defined(__CYGWIN__) || defined(__wasm__) #define COMPILER_RT_ALIAS(name, aliasname) \ COMPILER_RT_ABI __typeof(name) aliasname __attribute__((__alias__(#name))); #elif defined(__APPLE__) diff --git a/requirements.txt b/requirements.txt index 5db3c2a8..b518400d 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,2 +1,2 @@ -pyparsing~=2.0 +pyparsing>=2.4.1 future diff --git a/setup.py b/setup.py index a33f608f..e1e54434 100644 --- a/setup.py +++ b/setup.py @@ -325,7 +325,7 @@ def build_all(): "VERSION" ] }, - install_requires=["future", "pyparsing~=2.0"], + install_requires=["future", "pyparsing>=2.4.1"], cmdclass={"install_data": smart_install_data}, ext_modules = ext_modules, # Metadata diff --git a/test/analysis/depgraph.py b/test/analysis/depgraph.py index 57a73a5f..9760e717 100644 --- a/test/analysis/depgraph.py +++ b/test/analysis/depgraph.py @@ -108,7 +108,7 @@ class IRATest(LifterModelCall): def bloc2graph(irgraph, label=False, lines=True): """Render dot graph of @blocks""" - escape_chars = re.compile('[' + re.escape('{}') + ']') + escape_chars = re.compile(r'[\{\}]') label_attr = 'colspan="2" align="center" bgcolor="grey"' edge_attr = 'label = "%s" color="%s" style="bold"' td_attr = 'align="left"' @@ -179,7 +179,7 @@ def bloc2graph(irgraph, label=False, lines=True): def dg2graph(graph, label=False, lines=True): """Render dot graph of @blocks""" - escape_chars = re.compile('[' + re.escape('{}') + ']') + escape_chars = re.compile(r'[\{\}]') label_attr = 'colspan="2" 
align="center" bgcolor="grey"' edge_attr = 'label = "%s" color="%s" style="bold"' td_attr = 'align="left"' diff --git a/test/arch/arm/arch.py b/test/arch/arm/arch.py index 42e80772..a3f3a974 100644 --- a/test/arch/arm/arch.py +++ b/test/arch/arm/arch.py @@ -241,7 +241,12 @@ reg_tests_arm = [ '110f111e'), ('XXXXXXXX MCRCC p15, 0x0, R8, c2, c0, 0x1', '308f023e'), - + ('XXXXXXXX MOV R4, R4 ASR 0x20', + '4440a0e1'), + ('XXXXXXXX MOV R2, R5 LSR 0x20', + '2520a0e1'), + ('XXXXXXXX MOVS R2, R5 LSR 0x20', + '2520b0e1'), ] ts = time.time() diff --git a/test/arch/arm/sem.py b/test/arch/arm/sem.py index a5b6d5eb..343bc063 100755 --- a/test/arch/arm/sem.py +++ b/test/arch/arm/sem.py @@ -81,7 +81,7 @@ class TestARMSemantic(unittest.TestCase): self.assertEqual( compute('MOV R4, R4 LSR 31', {R4: 0xDEADBEEF, }), {R4: 0x00000001, }) self.assertEqual( - compute('MOV R4, R4 LSR 32', {R4: 0xDEADBEEF, }), {R4: 0xDEADBEEF, }) + compute('MOV R4, R4 LSR 32', {R4: 0xDEADBEEF, }), {R4: 0x0, }) self.assertRaises(ValueError, compute, 'MOV R4, R4 LSR 33') self.assertEqual( compute('MOV R4, R4 LSR R5', {R4: 0xDEADBEEF, R5: 0xBADBAD01, }), {R4: 0x6F56DF77, R5: 0xBADBAD01, }) @@ -93,7 +93,7 @@ class TestARMSemantic(unittest.TestCase): self.assertEqual( compute('MOV R4, R4 ASR 31', {R4: 0xDEADBEEF, }), {R4: 0xFFFFFFFF, }) self.assertEqual( - compute('MOV R4, R4 ASR 32', {R4: 0xDEADBEEF, }), {R4: 0xDEADBEEF, }) + compute('MOV R4, R4 ASR 32', {R4: 0xDEADBEEF, }), {R4: 0xFFFFFFFF, }) self.assertRaises(ValueError, compute, 'MOV R4, R4 ASR 33') self.assertEqual( compute('MOV R4, R4 ASR R5', {R4: 0xDEADBEEF, R5: 0xBADBAD01, }), {R4: 0xEF56DF77, R5: 0xBADBAD01, }) @@ -111,6 +111,57 @@ class TestARMSemantic(unittest.TestCase): cf: 0, R4: 0x6F56DF77, }) self.assertEqual(compute('MOV R4, R4 RRX ', {cf: 1, R4: 0xDEADBEEF, }), { cf: 1, R4: 0xEF56DF77, }) + # S + self.assertEqual( + compute('MOVS R4, R4 ', {R4: 0xDEADBEEF, }), {R4: 0xDEADBEEF, nf: 1, zf: 0,}) + self.assertRaises(ValueError, compute, 'MOVS R4, 
R4 LSL 0') + self.assertEqual( + compute('MOVS R4, R4 LSL 1', {R4: 0xDEADBEEF, }), {R4: 0xBD5B7DDE, nf: 1, zf: 0, cf: 1,}) + self.assertEqual( + compute('MOVS R4, R4 LSL 16', {R4: 0xDEADBEEF, }), {R4: 0xBEEF0000, nf: 1, zf: 0, cf: 1,}) + self.assertEqual( + compute('MOVS R4, R4 LSL 31', {R4: 0xDEADBEEF, }), {R4: 0x80000000, nf: 1, zf: 0, cf: 1,}) + self.assertRaises(ValueError, compute, 'MOVS R4, R4 LSL 32') + self.assertEqual( + compute('MOVS R4, R4 LSL R5', {R4: 0xDEADBEEF, R5: 0xBADBAD01, }), {R4: 0xBD5B7DDE, R5: 0xBADBAD01, nf: 1, zf: 0, cf: 1,}) + self.assertRaises(ValueError, compute, 'MOVS R4, R4 LSR 0') + self.assertEqual( + compute('MOVS R4, R4 LSR 1', {R4: 0xDEADBEEF, }), {R4: 0x6F56DF77, nf: 0, zf: 0, cf: 1,}) + self.assertEqual( + compute('MOVS R4, R4 LSR 16', {R4: 0xDEADBEEF, }), {R4: 0x0000DEAD, nf: 0, zf: 0, cf: 1,}) + self.assertEqual( + compute('MOVS R4, R4 LSR 31', {R4: 0xDEADBEEF, }), {R4: 0x00000001, nf: 0, zf: 0, cf: 1,}) + self.assertEqual( + compute('MOVS R4, R4 LSR 32', {R4: 0xDEADBEEF, }), {R4: 0x0, nf: 0, zf: 1, cf: 1,}) + self.assertRaises(ValueError, compute, 'MOVS R4, R4 LSR 33') + self.assertEqual( + compute('MOVS R4, R4 LSR R5', {R4: 0xDEADBEEF, R5: 0xBADBAD01, }), {R4: 0x6F56DF77, R5: 0xBADBAD01, nf: 0, zf: 0, cf: 1,}) + self.assertRaises(ValueError, compute, 'MOVS R4, R4 ASR 0') + self.assertEqual( + compute('MOVS R4, R4 ASR 1', {R4: 0xDEADBEEF, }), {R4: 0xEF56DF77, nf: 1, zf: 0, cf: 1,}) + self.assertEqual( + compute('MOVS R4, R4 ASR 16', {R4: 0xDEADBEEF, }), {R4: 0xFFFFDEAD, nf: 1, zf: 0, cf: 1,}) + self.assertEqual( + compute('MOVS R4, R4 ASR 31', {R4: 0xDEADBEEF, }), {R4: 0xFFFFFFFF, nf: 1, zf: 0, cf: 1,}) + self.assertEqual( + compute('MOVS R4, R4 ASR 32', {R4: 0xDEADBEEF, }), {R4: 0xFFFFFFFF, nf: 1, zf: 0, cf: 1,}) + self.assertRaises(ValueError, compute, 'MOVS R4, R4 ASR 33') + self.assertEqual( + compute('MOVS R4, R4 ASR R5', {R4: 0xDEADBEEF, R5: 0xBADBAD01, }), {R4: 0xEF56DF77, R5: 0xBADBAD01, nf: 1, zf: 0, cf: 1,}) + 
self.assertRaises(ValueError, compute, 'MOVS R4, R4 ROR 0') + self.assertEqual( + compute('MOVS R4, R4 ROR 1', {R4: 0xDEADBEEF, }), {R4: 0xEF56DF77, nf: 1, zf: 0, cf: 1,}) + self.assertEqual( + compute('MOVS R4, R4 ROR 16', {R4: 0xDEADBEEF, }), {R4: 0xBEEFDEAD, nf: 1, zf: 0, cf: 1,}) + self.assertEqual( + compute('MOVS R4, R4 ROR 31', {R4: 0xDEADBEEF, }), {R4: 0xBD5B7DDF, nf: 1, zf: 0, cf: 1,}) + self.assertRaises(ValueError, compute, 'MOVS R4, R4 ROR 32') + self.assertEqual( + compute('MOVS R4, R4 ROR R5', {R4: 0xDEADBEEF, R5: 0xBADBAD01, }), {R4: 0xEF56DF77, R5: 0xBADBAD01, nf: 1, zf: 0, cf: 1,}) + self.assertEqual(compute('MOVS R4, R4 RRX ', {cf: 0, R4: 0xDEADBEEF, }), { + cf: 1, R4: 0x6F56DF77, zf: 0, nf: 0}) + self.assertEqual(compute('MOVS R4, R4 RRX ', {cf: 1, R4: 0xDEADBEEF, }), { + cf: 1, R4: 0xEF56DF77, zf: 0, nf: 1}) def test_ADC(self): # §A8.8.1: ADC{S}{<c>}{<q>} {<Rd>,} <Rn>, #<const> diff --git a/test/arch/mep/asm/ut_helpers_asm.py b/test/arch/mep/asm/ut_helpers_asm.py index 9f6dc5c2..2ebd0622 100644 --- a/test/arch/mep/asm/ut_helpers_asm.py +++ b/test/arch/mep/asm/ut_helpers_asm.py @@ -27,7 +27,7 @@ def check_instruction(mn_str, mn_hex, multi=None, offset=0): """Try to disassemble and assemble this instruction""" # Rename objdump registers names - mn_str = re.sub("\$([0-9]+)", lambda m: "R"+m.group(1), mn_str) + mn_str = re.sub(r"\$([0-9]+)", lambda m: "R"+m.group(1), mn_str) mn_str = mn_str.replace("$", "") # Disassemble diff --git a/test/arch/mep/ir/test_loadstore.py b/test/arch/mep/ir/test_loadstore.py index 87343fcb..e7b211bd 100644 --- a/test/arch/mep/ir/test_loadstore.py +++ b/test/arch/mep/ir/test_loadstore.py @@ -83,7 +83,7 @@ class TestLoadStore(object): [(ExprMem(ExprInt(0x1010, 32), 32), ExprInt(0xABC7, 32))]) def test_lb(self): - """Test LB executon""" + """Test LB execution""" # LB Rn,(Rm) exec_instruction("LB R1, (R2)", |