Halide has a set of fuzz-testing harnesses that can be used to uncover
hard-to-find edge cases and bugs that a regular unit-testing suite would
otherwise miss. At the moment these fuzz tests are housed in the test/fuzz
directory. The fuzz-testing suite uses the common libFuzzer interface for
fuzz tests.
Fuzz testing requires specific instrumentation across the entire build. To get this, we use a fuzzing-specific toolchain/preset, e.g.
cmake -B build --preset linux-x64-fuzzer -DLLVM_ROOT=/path/to/llvminstall
cmake --build ./build -j$(nproc)
Note that the LLVM install that you use must be built with
-D LLVM_ENABLE_RUNTIMES="compiler-rt" set if you want to build the fuzzer
tests (otherwise the build will fail at configure time). Not all prebuilt
LLVM installs include this, so you may need to build LLVM from source to run
the fuzz tests locally.
Fuzz-testing harnesses differ from traditional unit tests in that they have no definitive end of test. In other words, a fuzz test will run:
- for an infinite amount of time (the default),
- for a user-specified maximum amount of time,
- until the fuzzer finds a bug and crashes, or
- until you manually kill the process (e.g. with Ctrl-C).
Once you have built the fuzz-testing suite using the commands listed above, you can list the fuzz-testing harnesses using the command:
ls ./build/test/fuzz/fuzz_*
To run a fuzzer, simply run the fuzz-testing harness with no arguments, e.g.
./build/test/fuzz/fuzz_simplify
By default this will run the fuzz test on a single core and discard whatever temporary corpus is created.
To reuse a given corpus (recommended), create a new directory to store the corpus generated by your fuzz-testing harness and pass that directory to your fuzzer, e.g.
mkdir -p fuzz_simplify_corpus
./build/test/fuzz/fuzz_simplify fuzz_simplify_corpus
This saves the state of the fuzzer between runs, so any progress that your fuzzer makes in improving code coverage persists on disk.
Up until this point the fuzzer has only been running on a single core. To speed things up a little, let's run the fuzzer in parallel across all available cores on our machine.
./build/test/fuzz/fuzz_simplify fuzz_simplify_corpus -fork=$(nproc)
An important part of fuzz testing is reproducing the crashing input. To support this, a libFuzzer-based fuzz harness will create a crash file whenever the fuzzer exits unexpectedly. This will look something like:
crash-<some_random_hash>
To reproduce a crash we simply rerun our fuzz harness with our crash file as the first argument.
./build/test/fuzz/fuzz_simplify crash-<some_random_hash>
So long as your fuzz harness and library are deterministic, this should reproduce the original crash.
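One common way to lose determinism is to seed a random number generator from the clock or from std::random_device inside the harness. A minimal sketch of the alternative, deriving the seed from the fuzzer input so that a crash file replays the same sequence, might look like the following. The names seed_from_input and check_commutes are hypothetical stand-ins and are not part of Halide or libFuzzer; LLVMFuzzerTestOneInput is the standard libFuzzer entry point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical helper: fold up to the first four input bytes into a seed.
// The same input always yields the same seed.
static uint32_t seed_from_input(const uint8_t *data, size_t size) {
    uint32_t seed = 0;
    for (size_t i = 0; i < size && i < sizeof(seed); i++) {
        seed |= static_cast<uint32_t>(data[i]) << (8 * i);
    }
    return seed;
}

// Hypothetical property check standing in for a call into the real library.
static void check_commutes(uint32_t a, uint32_t b) {
    assert(a + b == b + a);
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Seed from the input, not from time() or std::random_device.
    std::mt19937 rng(seed_from_input(data, size));
    // Draw in separate statements so the order of the draws is fixed.
    const uint32_t a = static_cast<uint32_t>(rng());
    const uint32_t b = static_cast<uint32_t>(rng());
    check_commutes(a, b);
    return 0;
}
```

Because every random choice is a function of the input bytes alone, rerunning the harness on a saved crash file should take the same path as the run that produced it.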
A bare-bones fuzzer will look something like the following:
#include <stdint.h>
#include <stddef.h>
#include <my_library.h>
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Randomly throw data at our function and hope it doesn't crash.
    foo(data, size);
    return 0;
}
This assumes that our function foo takes a buffer and the size of that
buffer. In many cases, however, we would like to use more structured data,
e.g. a string or a vector of integers. Thankfully, libFuzzer provides a
handy helper for converting a raw buffer into common structured data types:
the FuzzedDataProvider class.
For examples of how to use this class, see test/fuzz/simplify.cpp.