-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
testing: Add basic fuzz test harnesses #1807
Conversation
Thanks for opening this pull request! It might take a while before we look at it, so don't worry if there seems to be no feedback. We'll get to it. |
See #1802 for more details :) |
@@ -269,6 +270,10 @@ if (QL_BUILD_TEST_SUITE OR QL_BUILD_BENCHMARK) | |||
add_subdirectory(test-suite) | |||
endif() | |||
|
|||
if (QL_BUILD_FUZZ_TEST_SUITE AND CMAKE_CXX_COMPILER_ID MATCHES "Clang") | |||
add_subdirectory(fuzz-test-suite) | |||
endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If QL_BUILD_FUZZ_TEST_SUITE
is set for an unsupported compiler then CMake configuration should fail. Also, is this feature supported with clang-cl?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. I admit my cmake foo is somewhat lacking. I'll see what I can work out but if you have suggestions on how you'd like to do that then I'd appreciate your guidance 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar with clang-cl so I'd have to look into that a bit more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like the following should be good enough:
if (QL_BUILD_FUZZ_TEST_SUITE)
if (CMAKE_CXX_COMPILER_ID MATCHES "Clang")
add_subdirectory(fuzz-test-suite)
else()
message(FATAL_ERROR "QL_BUILD_FUZZ_TEST_SUITE is currently only supported for Clang")
endif()
endif()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I didn't see this and I ended up using cmake_dependent_option instead. Let me know if that works otherwise, I'll put this in instead as it looks like it achieves roughly the same thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am ok with this change, I just wonder what the strategy is. Both approaches either using latest CMake version or supporting older system CMake version has pros and cons, so there is no straightforward answer. It is up to the maintainers what to choose here :) Lot's of interesting discussion about this exist in for example reddit :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, honestly the change in version was only to allow for cmake_dependent_option. While the ergonomics are a little more clunky we could revert back to the earlier version, this would just become a regular option
and an if statement which is a fairly minor change. I don't feel strongly about going either way.
Worth noting however that I did make/test this change on Ubuntu 20.04, although I had forgotten that I was using the pip version of cmake, which backports a newer version to older systems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll open a PR soon, reverting to an earlier version of cmake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rolled back here #1831
Just curious, did you encounter any fuzzer test failures when you ran this locally? |
Not yet, I only ran it for around 10min though. I'd say it's reasonably robust if it can handle that. But if we go down the route of integration with google/oss-fuzz it'll run for about 4hrs every day on a distributed cluster, pulling in new code changes as they come. So even if it doesn't find bugs now, it might quickly catch new bugs if they happen. |
104f31f
to
0b41640
Compare
I think I've addressed most of you feedback, thanks for being so responsive :) |
endif() | ||
|
||
# Define the fuzz target | ||
add_executable(DateParserFuzzer dateparserfuzzer.cpp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If more fuzzer test cases will be added in the future, then perhaps the target can be renamed to something more generic like ql_fuzz_test_suite
like the existing test suite target.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool I'll take a look, I admit I haven't extended cmake in this way before, but I'll take a swing at it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After looking into this a little I don't think that this makes sense. The boost harnesses that you are referencing here can be bundled because each unit-test has a finite runtime. So each test can be either serialized or run in parallel from a single executable. In contrast, a fuzz-test will run;
- Indefinitely.
- Or until a crashing input is found.
- Or you manually exit.
- Or if you manually set a timeout.
This poses a problem when you are bundling testing harnesses, as you can't neccesarily say that a fuzz harness is going to complete/exit or not. So if you bundle 3 harnesses, you might just get stuck on harness 1, resulting in 2 and 3 not getting fuzzed at all. So you typically want a single executable per functionality/functionalilty group.
0b41640
to
94cd7d5
Compare
Sorry I think I used cmake dependent option wrong on the last version. I've updated it, and tested it locally using both gcc/clang so I'm hoping the latest changes should fix the CI problems. |
constexpr int kMaxString = 100; | ||
try { | ||
(void)QuantLib::DateParser::parseFormatted(fdp.ConsumeRandomLengthString(kMaxString), "%Y-%m-%d"); | ||
(void)QuantLib::DateParser::parseISO(fdp.ConsumeRandomLengthString(kMaxString)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a random string of length up to 100? If so, it's unlikely that it will ever be a date in ISO format. Is the purpose of this test only to throw random data at the library and check that it doesn't crash or overwrite memory it doesn't own?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's mostly correct. Although I would add that the fuzzer will likely "learn" what the iso date format is. So it'll start off passing in seemingly random data, and then it'll measure the code coverage. The inputs that acheive the highest code-coverage will then be mutated to create new inputs. So it's likely that after a period of time the fuzzer will "learn" (through genetic programming), what the iso date format is, as that will produce the highest code coverage. We can help this process along and make it a bit quicker by providing a seed-corpus or a dictionary. But I haven't included either in the PR as I didn't want to complicate things to early on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks. Right now we're not training it, though, are we? I'm guessing we should add something in the catch clause that tells the fuzzer the result of the attempt, right?
In your experience, is this something that mostly works for simple classes like this one, or does it scale to cases with a few dozens of inputs? Real-world cases for this library would look at least like https://github.com/lballabio/QuantLib/blob/master/Examples/MulticurveBootstrapping/MulticurveBootstrapping.cpp which needs a whole lot of quotes and swap data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks. Right now we're not training it, though, are we? I'm guessing we should add something in the catch clause that tells the fuzzer the result of the attempt, right?
Well no, you wouldn't train the fuzzer in the same way that you would train a classification machine learning model. Instead you can think of fuzzing as more of an optimisation algorithm, where the goal state of the optimizer is to provide a set of inputs that result in 100% of code being executed (rarely happens in practice). So in effect the fuzzer will train itself. The "result" of our fuzz harness is therefore is the resulting code-coverage not the resulting value of the software under test. But it is a good point that we could add some extra assertions to the catch clause, e.g. we could assert the type of exception, this is useful if some exceptions are acceptable but others are not.
It is worth noting however that while the fuzzer "learns" just by running it, it isn't stateful by default, so as soon as the fuzzer exits it'll loose the "learning progress" that it's made. We can save the state between runs by providing a corpus directory. e.g.
mkdir date_parser_corpus -p
# Share corpus/learning-state between runs.
./fuzz-test-suite/DateParserFuzzer date_parser_corpus
If we go down the route of integrating with google/oss-fuzz, all of this corpus management is handled automatically.
In your experience, is this something that mostly works for simple classes like this one, or does it scale to cases with a few dozens of inputs? Real-world cases for this library would look at least like https://github.com/lballabio/QuantLib/blob/master/Examples/MulticurveBootstrapping/MulticurveBootstrapping.cpp which needs a whole lot of quotes and swap data.
Yes in practice fuzzing is very scalable to more complex algorithms/applications. I think the largest scale fuzzer I've seen is google/syzkaller which fuzz's the entire Linux kernel. There are some things that are difficult to fuzz e.g. software with a lot of; non-determinism or global state (where it's hard to reset). But I think fuzzing the example you've linked would be relatively straightforward. While I'm not familiar with multicurve bootstrapping in general it looks like a good candidate for fuzzing with some property based testing. So as a toy example let's say your algorithm is to take an average of a vector, you might add a property assertion something like assert(average(&my_vec) >= min(&my_vec))
, where my_vec
is constructed from fuzzed data. In this case you're testing an invariant property that should be true for all non-empty inputs to average.
94cd7d5
to
333c79c
Compare
I thought I should add a few things in general. There are a couple of prerequisites to building this
The fuzz harness will run indefinitely or;
You can get more detailed help for the fuzzer by running;
It's also best to configure cmake to run with |
Added a basic integration as a draft PR with google/oss-fuzz google/oss-fuzz#11123. I'll need to make a few modifications to the PR if/when this PR is accepted but it is largely working with a few minor builds failures which will be solved once it's all pulled together. |
FuzzedDataProvider fdp(data, size); | ||
constexpr int kMaxString = 100; | ||
try { | ||
(void)QuantLib::DateParser::parseFormatted(fdp.ConsumeRandomLengthString(kMaxString), "%Y-%m-%d"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it turns out there is a bug in parseFormatted, which results in an integer overflow, specifically if the string input is empty i.e. ""
you'll end up with an integer overflow when parsing. I compiled the fuzzer with -fsanitize=undefined -fno-sanitize-recover=all
, and it quickly found the bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this simplest way to reproduce this would be something like;
#include <ql/time/asx.hpp>
#include <ql/time/date.hpp>
#include <ql/time/ecb.hpp>
#include <ql/time/imm.hpp>
#include <ql/time/timeunit.hpp>
#include <ql/utilities/dataparsers.hpp>
#include <string>
int main() {
(void)QuantLib::DateParser::parseFormatted("", "%Y-%m-%d");
return 0;
}
Then compile with the flags as above -fsanitize=undefined -fno-sanitize-recover=all
. Not a security risk, but it might give you some funky results.
@lballabio friendly ping, is this something you are interested in. Or shall I close this PR? |
Hi—sorry, yes, let's move forward. Do you need to make more changes in this PR? |
Yeah I think I'm happy with where things are at. There are a few conversations that are still open, but I think I've addressed everything. This PR is mostly intended to lay the groundwork for more involved fuzzing of the actual financial algorithms in quantlib itself. |
Ok. Warning: as of now, the fuzzing test suite won't be in the distributed release (we'd have to add some info to the autotools files for that) but it shouldn't be a problem right now, am I correct? |
I'm not sure I understand your question. The fuzz tests will be built by default, if the compiler is clang. I can easily change this. I'm not familiar with autotools so I'm not really sure how that relates to the distributed releases. As a general note, I don't think it would make sense to have these fuzz-tests included in a distributed release. Does that answer your question? I'm happy to make some further tweaks if necessary, but I'll need some guidance on what needs to be done. |
The idea is that the .tar.gz for download at the top of a given release page are not simply a dump of the git repository at a given revision. They include some generated files which are not in the repo, and don't include some files which are in the repo. Since we didn't add your new files to any of the |
Yeah makes sense to me I'll flick that switch then |
Done! the fuzz tests won't be built by default only if explicitly requested on the command line. |
@@ -42,6 +44,7 @@ set(QL_INSTALL_CMAKEDIR "lib/cmake/${PACKAGE_NAME}" CACHE STRING | |||
option(QL_BUILD_BENCHMARK "Build benchmark" ON) | |||
option(QL_BUILD_EXAMPLES "Build examples" ON) | |||
option(QL_BUILD_TEST_SUITE "Build test suite" ON) | |||
cmake_dependent_option(QL_BUILD_FUZZ_TEST_SUITE "Build fuzz test suite" OFF "'${CMAKE_CXX_COMPILER_ID}' MATCHES 'Clang'" OFF) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I think at this point this can just be
option(QL_BUILD_FUZZ_TEST_SUITE "Build fuzz test suite" OFF)
Right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be, although this will still likely fail to build if you are not using clang, as gcc doesn't support libfuzzer out of the box. The dependent_option will only make that option visible if the compiler is clang.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. I misinterpreted the semantics of cmake_dependent_option
.
Congratulations on your first merged pull request! |
No description provided.