diff options
Diffstat (limited to 'README.md')
| -rw-r--r-- | README.md | 113 |
1 files changed, 73 insertions, 40 deletions
diff --git a/README.md b/README.md index 68033d9..443ec1a 100644 --- a/README.md +++ b/README.md @@ -1,26 +1,15 @@ # Focaccia -This repository contains initial code for comprehensive testing of binary -translators. +This repository contains the source code for Focaccia, a comprehensive validator for CPU emulators +and binary translators. ## Requirements -For Python dependencies, see the `requirements.txt`. We also require at least LLDB version 17 for `fs_base`/`gs_base` -register support. +Python dependencies are handled via pyproject and uv. We provide first-class support for Nix via our +flake, which integrates with our Python uv environment via uv2nix. -I had to compile LLDB myself; these are the steps I had to take (you also need swig version >= 4): - -``` -git clone https://github.com/llvm/llvm-project <llvm-path> -cd <llvm-path> -cmake -S llvm -B build -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang;lldb" -DLLDB_ENABLE_PYTHON=TRUE -DLLDB_ENABLE_SWIG=TRUE -cmake --build build/ --parallel $(nproc) - -# Add the built LLDB python bindings to your PYTHONPATH: -PYTHONPATH="$PYTHONPATH:$(./build/bin/lldb -P)" -``` - -It will take a while to compile. +We do not support any other build system officially but Focaccia has been known to work on various +other systems also, as long as its Python dependencies are provided. ## How To Use @@ -34,21 +23,60 @@ A number of additional tools are included to simplify use when validating QEMU: ```bash capture-transforms -o oracle.trace bug.out qemu-x86_64 -g 12345 bug.out & -validate-qemu --symb-trace oracle.trace localhost 12345 +validate-qemu --symb-trace oracle.trace --remote localhost:12345 ``` -Alternatively if you have access to the focaccia QEMU plugin: +The above workflow works for reproducing most QEMU bugs but cannot handle the following two cases: + +1. Optimization bugs + +2. Bugs in non-deterministic programs + +We provide alternative approaches for dealing with optimization bugs. Focaccia currently does not +handle bugs in non-deterministic programs. + +### QEMU Optimization bugs + +When a bug is suspected to be an optimization bug, you can use the Focaccia QEMU plugin. The QEMU +plugin is exposed, along with the QEMU version corresponding to it, under the qemu-plugin package in +the Nix flake. + +It is used as follows: ```bash -validation_server.py --symb-trace oracle.trace --use-socket=/tmp/focaccia.sock --guest_arch=<arch> +validate-qemu --symb-trace oracle.trace --use-socket=/tmp/focaccia.sock --guest_arch=arch ``` -After you see `Listening for QEMU Plugin connection at /tmp/focaccia.sock...` you can start QEMU like this: + +Once the server prints `Listening for QEMU Plugin connection at /tmp/focaccia.sock...`, QEMU can be +started in debug mode: + ```bash -qemu-<arch> [-one-insn-per-tb] --plugin build/contrib/plugins/libfocaccia.so <bug.out> +qemu-<arch> [-one-insn-per-tb] --plugin result/lib/plugins/libfocaccia.so bug.out ``` +Note: the above workflow assumes that you used `nix build .#qemu-plugin` to build the plugin under +`result`. + Using this workflow, Focaccia can determine whether a mistranslation occured in that particular QEMU run. +Focaccia includes support for tracing non-deterministic programs using the RR debugger, requiring a +similar workflow: + +```bash +rr record -o bug.rr.out +rr replay -s 12345 bug.rr.out +capture-transforms --remote localhost:12345 --deterministic-log bug.rr.out -o oracle.trace bug.out +``` + +Note: the `rr replay` call prints the correct binary name to use when invoking `capture-transforms`, +it also prints program output. As such, it should be invoked separately as a foreground process. + +Note: `rr record` may fail on Zen and Zen+ AMD CPUs. It is generally possible to continue using it +by specifying flag `-F` but keep in mind that replaying may fail unexpectedly sometimes on such +CPUs. + +Note: we currently do not support validating such programs on QEMU. + ### Box64 For validating Box64, we create the oracle and test traces and compare them @@ -72,31 +100,34 @@ The `tools/` directory contains additional utility scripts to work with focaccia The following files belong to a rough framework for the snapshot comparison engine: - - `focaccia/snapshot.py`: Structures used to work with snapshots. The `ProgramState` class is our primary -representation of program snapshots. + - `focaccia/snapshot.py`: Structures used to work with snapshots. The `ProgramState` class is our + primary representation of program snapshots. - `focaccia/compare.py`: The central algorithms that work on snapshots. - - `focaccia/arch/`: Abstractions over different processor architectures. Currently we have x86 and aarch64. + - `focaccia/arch/`: Abstractions over different processor architectures. Currently we have x86 and + aarch64. ### Concolic execution The following files belong to a prototype of a data-dependency generator based on symbolic execution: - - `focaccia/symbolic.py`: Algorithms and data structures to compute and manipulate symbolic program transformations. -This handles the symbolic part of "concolic" execution. + - `focaccia/symbolic.py`: Algorithms and data structures to compute and manipulate symbolic program + transformations. This handles the symbolic part of "concolic" execution. - - `focaccia/lldb_target.py`: Tools for executing a program concretely and tracking its execution using -[LLDB](https://lldb.llvm.org/). This handles the concrete part of "concolic" execution. + - `focaccia/lldb_target.py`: Tools for executing a program concretely and tracking its execution + using [LLDB](https://lldb.llvm.org/). This handles the concrete part + of "concolic" execution. - - `focaccia/miasm_util.py`: Tools to evaluate Miasm's symbolic expressions based on a concrete state. Ties the symbolic -and concrete parts together into "concolic" execution. + - `focaccia/miasm_util.py`: Tools to evaluate Miasm's symbolic expressions based on a concrete + state. Ties the symbolic and concrete parts together into "concolic" + execution. ### Helpers - - `focaccia/parser.py`: Utilities for parsing logs from Arancini and QEMU, as well as serializing/deserializing to/from -our own log format. + - `focaccia/parser.py`: Utilities for parsing logs from Arancini and QEMU, as well as + serializing/deserializing to/from our own log format. - `focaccia/match.py`: Algorithms for trace matching. @@ -104,14 +135,16 @@ our own log format. To add support for an architecture <arch>, do the following: - - Add a file `focaccia/arch/<arch>.py`. This module declares the architecture's description, such as register names and -an architecture class. The convention is to declare state flags (e.g. flags in RFLAGS for x86) as separate registers. + - Add a file `focaccia/arch/<arch>.py`. This module declares the architecture's description, such + as register names and an architecture class. The convention is to declare state flags (e.g. flags + in RFLAGS for x86) as separate registers. - Add the class to the `supported_architectures` dict in `focaccia/arch/__init__.py`. - - Depending on Miasm's support for <arch>, add register name aliases to the `MiasmSymbolResolver.miasm_flag_aliases` -dict in `focaccia/miasm_util.py`. + - Depending on Miasm's support for <arch>, add register name aliases to the + `MiasmSymbolResolver.miasm_flag_aliases` dict in `focaccia/miasm_util.py`. + + - Depending on the existence of a flags register in <arch>, implement conversion from the flags + register's value to values of single logical flags (e.g. implement the operation `RFLAGS['OF']`) + in the respective concrete targets (LLDB, GDB, ...). - - Depending on the existence of a flags register in <arch>, implement conversion from the flags register's value to -values of single logical flags (e.g. implement the operation `RFLAGS['OF']`) in the respective concrete targets (LLDB, -GDB, ...). |