diff options
| author | Peter Maydell <peter.maydell@linaro.org> | 2020-11-10 12:23:05 +0000 |
|---|---|---|
| committer | Peter Maydell <peter.maydell@linaro.org> | 2020-11-10 12:23:05 +0000 |
| commit | 879860ca706fa1ef47ba511c49a6e2b1b49be9b7 (patch) | |
| tree | f6557aa49be6e488920c865482ce4c874337cc38 /docs/devel/fuzzing.txt | |
| parent | 6c8e801f076109a31d864fdbeec57badd159fb06 (diff) | |
| parent | a58cabd0e355fc51f18db359ba260da268df26ef (diff) | |
| download | focaccia-qemu-879860ca706fa1ef47ba511c49a6e2b1b49be9b7.tar.gz focaccia-qemu-879860ca706fa1ef47ba511c49a6e2b1b49be9b7.zip | |
Merge remote-tracking branch 'remotes/huth-gitlab/tags/pull-request-2020-11-10' into staging
* Some small qtest fixes * Oss-fuzz updates * Publish the docs built during gitlab CI to the user's gitlab.io page * Update the OpenBSD VM test to v6.8 * Fix the device-crash-test script to run with the meson build system * Some small s390x fixes # gpg: Signature made Tue 10 Nov 2020 11:05:06 GMT # gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5 # gpg: issuer "thuth@redhat.com" # gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full] # gpg: aka "Thomas Huth <thuth@redhat.com>" [full] # gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full] # gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown] # Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5 * remotes/huth-gitlab/tags/pull-request-2020-11-10: s390x: Avoid variable size warning in ipl.h s390x: fix clang 11 warnings in cpu_models.c qtest: Update references to parse_escape() in comments fuzz: add virtio-blk fuzz target docs: add "page source" link to sphinx documentation gitlab: force enable docs build in Fedora, Ubuntu, Debian gitlab: publish the docs built during CI configure: surface deprecated targets in the help output fuzz: Make fork_fuzz.ld compatible with LLVM's LLD scripts/oss-fuzz: give all fuzzers -target names docs/fuzz: update fuzzing documentation post-meson docs/fuzz: rST-ify the fuzzing documentation MAINTAINERS: Add gitlab-pipeline-status script to GitLab CI section gitlab-ci: Drop generic cache rule tests/qtest/tpm: Remove redundant check in the tpm_test_swtpm_test() qtest: Fix bad printf format specifiers device-crash-test: Check if path is actually an executable file tests/vm: update openbsd to release 6.8 meson: always include contrib/libvhost-user Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Diffstat (limited to 'docs/devel/fuzzing.txt')
| -rw-r--r-- | docs/devel/fuzzing.txt | 214 |
1 files changed, 0 insertions, 214 deletions
diff --git a/docs/devel/fuzzing.txt b/docs/devel/fuzzing.txt deleted file mode 100644 index 03585c1a9b..0000000000 --- a/docs/devel/fuzzing.txt +++ /dev/null @@ -1,214 +0,0 @@ -= Fuzzing = - -== Introduction == - -This document describes the virtual-device fuzzing infrastructure in QEMU and -how to use it to implement additional fuzzers. - -== Basics == - -Fuzzing operates by passing inputs to an entry point/target function. The -fuzzer tracks the code coverage triggered by the input. Based on these -findings, the fuzzer mutates the input and repeats the fuzzing. - -To fuzz QEMU, we rely on libfuzzer. Unlike other fuzzers such as AFL, libfuzzer -is an _in-process_ fuzzer. For the developer, this means that it is their -responsibility to ensure that state is reset between fuzzing-runs. - -== Building the fuzzers == - -NOTE: If possible, build a 32-bit binary. When forking, the 32-bit fuzzer is -much faster, since the page-map has a smaller size. This is due to the fact that -AddressSanitizer mmaps ~20TB of memory, as part of its detection. This results -in a large page-map, and a much slower fork(). - -To build the fuzzers, install a recent version of clang: -Configure with (substitute the clang binaries with the version you installed). -Here, enable-sanitizers, is optional but it allows us to reliably detect bugs -such as out-of-bounds accesses, use-after-frees, double-frees etc. - - CC=clang-8 CXX=clang++-8 /path/to/configure --enable-fuzzing \ - --enable-sanitizers - -Fuzz targets are built similarly to system/softmmu: - - make i386-softmmu/fuzz - -This builds ./i386-softmmu/qemu-fuzz-i386 - -The first option to this command is: --fuzz-target=FUZZ_NAME -To list all of the available fuzzers run qemu-fuzz-i386 with no arguments. - -For example: - ./i386-softmmu/qemu-fuzz-i386 --fuzz-target=virtio-scsi-fuzz - -Internally, libfuzzer parses all arguments that do not begin with "--". -Information about these is available by passing -help=1 - -Now the only thing left to do is wait for the fuzzer to trigger potential -crashes. - -== Useful libFuzzer flags == - -As mentioned above, libFuzzer accepts some arguments. Passing -help=1 will list -the available arguments. In particular, these arguments might be helpful: - -$CORPUS_DIR/ : Specify a directory as the last argument to libFuzzer. libFuzzer -stores each "interesting" input in this corpus directory. The next time you run -libFuzzer, it will read all of the inputs from the corpus, and continue fuzzing -from there. You can also specify multiple directories. libFuzzer loads existing -inputs from all specified directories, but will only write new ones to the -first one specified. - --max_len=4096 : specify the maximum byte-length of the inputs libFuzzer will -generate. - --close_fd_mask={1,2,3} : close, stderr, or both. Useful for targets that -trigger many debug/error messages, or create output on the serial console. - --jobs=4 -workers=4 : These arguments configure libFuzzer to run 4 fuzzers in -parallel (4 fuzzing jobs in 4 worker processes). Alternatively, with only --jobs=N, libFuzzer automatically spawns a number of workers less than or equal -to half the available CPU cores. Replace 4 with a number appropriate for your -machine. Make sure to specify a $CORPUS_DIR, which will allow the parallel -fuzzers to share information about the interesting inputs they find. - --use_value_profile=1 : For each comparison operation, libFuzzer computes -(caller_pc&4095) | (popcnt(Arg1 ^ Arg2) << 12) and places this in the coverage -table. Useful for targets with "magic" constants. If Arg1 came from the fuzzer's -input and Arg2 is a magic constant, then each time the Hamming distance -between Arg1 and Arg2 decreases, libFuzzer adds the input to the corpus. - --shrink=1 : Tries to make elements of the corpus "smaller". Might lead to -better coverage performance, depending on the target. - -Note that libFuzzer's exact behavior will depend on the version of -clang and libFuzzer used to build the device fuzzers. - -== Generating Coverage Reports == -Code coverage is a crucial metric for evaluating a fuzzer's performance. -libFuzzer's output provides a "cov: " column that provides a total number of -unique blocks/edges covered. To examine coverage on a line-by-line basis we -can use Clang coverage: - - 1. Configure libFuzzer to store a corpus of all interesting inputs (see - CORPUS_DIR above) - 2. ./configure the QEMU build with: - --enable-fuzzing \ - --extra-cflags="-fprofile-instr-generate -fcoverage-mapping" - 3. Re-run the fuzzer. Specify $CORPUS_DIR/* as an argument, telling libfuzzer - to execute all of the inputs in $CORPUS_DIR and exit. Once the process - exits, you should find a file, "default.profraw" in the working directory. - 4. Execute these commands to generate a detailed HTML coverage-report: - llvm-profdata merge -output=default.profdata default.profraw - llvm-cov show ./path/to/qemu-fuzz-i386 -instr-profile=default.profdata \ - --format html -output-dir=/path/to/output/report - -== Adding a new fuzzer == -Coverage over virtual devices can be improved by adding additional fuzzers. -Fuzzers are kept in tests/qtest/fuzz/ and should be added to -tests/qtest/fuzz/Makefile.include - -Fuzzers can rely on both qtest and libqos to communicate with virtual devices. - -1. Create a new source file. For example ``tests/qtest/fuzz/foo-device-fuzz.c``. - -2. Write the fuzzing code using the libqtest/libqos API. See existing fuzzers -for reference. - -3. Register the fuzzer in ``tests/fuzz/Makefile.include`` by appending the -corresponding object to fuzz-obj-y - -Fuzzers can be more-or-less thought of as special qtest programs which can -modify the qtest commands and/or qtest command arguments based on inputs -provided by libfuzzer. Libfuzzer passes a byte array and length. Commonly the -fuzzer loops over the byte-array interpreting it as a list of qtest commands, -addresses, or values. - -== The Generic Fuzzer == -Writing a fuzz target can be a lot of effort (especially if a device driver has -not be built-out within libqos). Many devices can be fuzzed to some degree, -without any device-specific code, using the generic-fuzz target. - -The generic-fuzz target is capable of fuzzing devices over their PIO, MMIO, -and DMA input-spaces. To apply the generic-fuzz to a device, we need to define -two env-variables, at minimum: - -QEMU_FUZZ_ARGS= is the set of QEMU arguments used to configure a machine, with -the device attached. For example, if we want to fuzz the virtio-net device -attached to a pc-i440fx machine, we can specify: -QEMU_FUZZ_ARGS="-M pc -nodefaults -netdev user,id=user0 \ - -device virtio-net,netdev=user0" - -QEMU_FUZZ_OBJECTS= is a set of space-delimited strings used to identify the -MemoryRegions that will be fuzzed. These strings are compared against -MemoryRegion names and MemoryRegion owner names, to decide whether each -MemoryRegion should be fuzzed. These strings support globbing. For the -virtio-net example, we could use QEMU_FUZZ_OBJECTS= - * 'virtio-net' - * 'virtio*' - * 'virtio* pcspk' (Fuzz the virtio devices and the PC speaker...) - * '*' (Fuzz the whole machine) - -The "info mtree" and "info qom-tree" monitor commands can be especially useful -for identifying the MemoryRegion and Object names used for matching. - -As a generic rule-of-thumb, the more MemoryRegions/Devices we match, the greater -the input-space, and the smaller the probability of finding crashing inputs for -individual devices. As such, it is usually a good idea to limit the fuzzer to -only a few MemoryRegions. - -To ensure that these env variables have been configured correctly, we can use: - -./qemu-fuzz-i386 --fuzz-target=generic-fuzz -runs=0 - -The output should contain a complete list of matched MemoryRegions. - -= Implementation Details = - -== The Fuzzer's Lifecycle == - -The fuzzer has two entrypoints that libfuzzer calls. libfuzzer provides it's -own main(), which performs some setup, and calls the entrypoints: - -LLVMFuzzerInitialize: called prior to fuzzing. Used to initialize all of the -necessary state - -LLVMFuzzerTestOneInput: called for each fuzzing run. Processes the input and -resets the state at the end of each run. - -In more detail: - -LLVMFuzzerInitialize parses the arguments to the fuzzer (must start with two -dashes, so they are ignored by libfuzzer main()). Currently, the arguments -select the fuzz target. Then, the qtest client is initialized. If the target -requires qos, qgraph is set up and the QOM/LIBQOS modules are initialized. -Then the QGraph is walked and the QEMU cmd_line is determined and saved. - -After this, the vl.c:qemu__main is called to set up the guest. There are -target-specific hooks that can be called before and after qemu_main, for -additional setup(e.g. PCI setup, or VM snapshotting). - -LLVMFuzzerTestOneInput: Uses qtest/qos functions to act based on the fuzz -input. It is also responsible for manually calling the main loop/main_loop_wait -to ensure that bottom halves are executed and any cleanup required before the -next input. - -Since the same process is reused for many fuzzing runs, QEMU state needs to -be reset at the end of each run. There are currently two implemented -options for resetting state: -1. Reboot the guest between runs. - Pros: Straightforward and fast for simple fuzz targets. - Cons: Depending on the device, does not reset all device state. If the - device requires some initialization prior to being ready for fuzzing - (common for QOS-based targets), this initialization needs to be done after - each reboot. - Example target: i440fx-qtest-reboot-fuzz -2. Run each test case in a separate forked process and copy the coverage - information back to the parent. This is fairly similar to AFL's "deferred" - fork-server mode [3] - Pros: Relatively fast. Devices only need to be initialized once. No need - to do slow reboots or vmloads. - Cons: Not officially supported by libfuzzer. Does not work well for devices - that rely on dedicated threads. - Example target: virtio-net-fork-fuzz |