There is unrest in the testing community. Several thousand fuzz test values have declared their intentions to attack the fuzz target software. But how do we get the values to it?
You may think that this is straightforward…
Think again: what if the fuzz target is a software component deeply embedded in a cyber-physical system? Or the file parsing module of a GUI program?
Are you convinced enough to keep reading?
Then follow me down a short road filled with pointy rocks, dead ends, roadblocks, and potholes. And lots of hammers.
Well, that depends… – It’s all about the circumstances
Up to this point, we did not talk about what our target actually is – besides it being some amorphous blob of software.
But for this post, we have to take a closer look at that question, as it dictates the circumstances we have to keep in mind when setting up our fuzzer.
I will do so by defining three abstract properties of the target and how they influence the process of setting up fuzzers for it.
- Distance: This isn’t the distance at which we sit from the monitor, but rather how many non-target software layers sit between tester and target. In a unit-test-like situation, this distance is very short, or even non-existent if we can directly access the API of the target software (component). If, however, our target is a deeply embedded cyber-physical system that we can only address via some kind of network, the distance is very long indeed.
Why is this important? Because any intermediate layer may transform or even block our test values in some way or other, which is bad for our test efficiency and effectiveness.
- Interface: This isn’t (necessarily) the API you are searching for. I am talking about the type of interface available to feed test data to the target. Do we have a file-based target? Or is it network-based?
Please let it be a simple software API. It will make our lives so much easier and testing a lot faster…
- Control: Do we have the target source code (and the means to make changes)? That is: are we doing a white-box test or not? (See my previous post for my definitions of black-, grey-, and white-box testing.) We may need to make small changes to the target to facilitate testing, which is impossible in a black-box scenario, of course.
A nail for the hammer: The Fuzz target
Great, so we know a little more about our target software.
But, strangely enough, it isn’t built to take input from a fuzzer. Huh, weird.
So we have a mighty hammer (our fuzzer) and a nice, fresh piece of wood (the target software). Something is missing here…
Our nail is called the fuzz target. The glue between target software and fuzzer.
We have to craft this nail so that it takes a test input from the fuzzer and feeds it to the target software via its available interface.
In a “normal” software test of some library API, this can be as simple as writing a small program to load up the lib and call the interface with parameters passed in via the command line.
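As a minimal sketch of such a nail in Python: `parse_record` is a hypothetical stand-in for the real library call (it is not part of any actual library), and `fuzz_one` is the fuzz target wrapped around it. The test value is read from a file named on the command line, which is one assumed way of passing it in.

```python
import sys


def parse_record(data: bytes) -> dict:
    """Hypothetical stand-in for the library API under test."""
    key, sep, value = data.partition(b"=")
    if not sep:
        raise ValueError("missing '=' separator")
    return {key.decode("utf-8"): value.decode("utf-8")}


def fuzz_one(data: bytes) -> bool:
    """Fuzz target: feed one test value to the API.

    Returns True if the input was accepted and False if it was rejected
    cleanly. Any other exception propagates and counts as a finding.
    """
    try:
        parse_record(data)
        return True
    except ValueError:
        # Clean rejections of malformed input are not bugs.
        # (UnicodeDecodeError is a subclass of ValueError.)
        return False


# Usage: python fuzz_target.py <file-containing-one-test-value>
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], "rb") as handle:
        print("accepted" if fuzz_one(handle.read()) else "rejected")
```

The important design choice is which exceptions count as "expected": everything the fuzz target swallows is invisible to the fuzzer, so the list should stay as short as the API's documented behavior allows.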
If the interface to our target is file-based, we need a bit of code to store the output of the fuzzer in the respective file format and location. And another bit of code to tell the target the path of the latest test file to open and process.
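A rough sketch of those two bits of code, assuming the target is a command-line program that accepts the path of the file to process as its last argument. `TARGET_CMD` is a hypothetical placeholder (here a Python one-liner that merely reads the file) and would be replaced by the real program.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the real file-based target: any command that
# takes the path of the file to process as its last argument.
TARGET_CMD = [sys.executable, "-c",
              "import sys; open(sys.argv[1], 'rb').read()"]


def run_file_case(data: bytes, suffix: str = ".bin",
                  timeout: float = 30.0) -> int:
    """Store one fuzz value as a file and let the target process it."""
    with tempfile.TemporaryDirectory() as workdir:
        # Bit one: write the fuzzer output in the expected location/format.
        case_path = Path(workdir) / f"case{suffix}"
        case_path.write_bytes(data)
        # Bit two: tell the target which file to open and process.
        result = subprocess.run(TARGET_CMD + [str(case_path)],
                                capture_output=True, timeout=timeout)
    # On POSIX, a negative return code means the target died from a signal.
    return result.returncode
```

Starting a fresh process per test case is simple but slow; it is a reasonable first version before any optimization.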
And finally, for a network-based interface we need to write code which takes the fuzz values and sends them via the respective network to the target software.
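For the network case, a sketch under the assumption that the target speaks plain TCP and answers after the client has finished sending. `start_echo_target` is a hypothetical stand-in target that only exists so the example can run on its own; a real setup would point `send_case` at the actual host and port.

```python
import socket
import threading


def send_case(host: str, port: int, data: bytes,
              timeout: float = 2.0) -> bytes:
    """Send one fuzz value over TCP and return the reply (possibly empty).

    Raises a timeout error if the target stays silent for too long.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(data)
        conn.shutdown(socket.SHUT_WR)  # signal the end of the test value
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


def start_echo_target() -> int:
    """Hypothetical stand-in target: echoes one connection, returns its port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)

    def serve_once() -> None:
        conn, _ = server.accept()
        with conn, server:
            while True:
                chunk = conn.recv(4096)
                if not chunk:
                    break
                conn.sendall(chunk)

    threading.Thread(target=serve_once, daemon=True).start()
    return server.getsockname()[1]
```

A call such as `send_case("127.0.0.1", start_echo_target(), b"ping")` exercises the whole round trip locally.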
Crafting this nail well is important.
But as it was for the nail-smiths of old, it may also be a lengthy and painfully frustrating task. Especially if you don’t know the target well.
Available (unit) tests can be a great help in getting started.
Better yet, have your fuzz target generated by clever tooling. This only works if you have high control – that is to say the complete target source code.
Of roadblocks and potholes
There are several code constructs which are troublesome for fuzzers to overcome, that is, to reach the code areas guarded by such constructs. I talked about this briefly in my previous post, and about why some fuzzing techniques are better prepared to overcome such challenges than others.
But with a high control over the fuzz target, we can try to disable such roadblocks upfront.
- Magic value checks: Conditional statements in the form if (x == 42) or if (y.Equals("Universal Truth")) are troublesome. Many advanced fuzzers have the option to make such magic values known to the fuzzer by simply filling out a dictionary. The fuzzer will then use entries from the dictionary as part of its test generation strategy. Some fuzzers can even learn such values automatically, e.g. by extracting constants from the source code.
- Cryptography: Things like hashes, encryption, or checksums can be all but impossible for a fuzzer to get past; after all, they are designed to be hard to predict. Disabling (or stubbing) such checks will allow the fuzzed values to reach deeper into the target logic. Another way is to compute the respective values as part of the fuzz target and insert them into the test values at the respective position.
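Both workarounds can be sketched in a few lines of Python. Everything here is an illustrative assumption: the dictionary entries are simply the two magic values from the example above, the message layout (payload followed by a 4-byte big-endian CRC32) is made up, and `target_accepts` is a hypothetical stand-in for the check inside the target.

```python
import random
import struct
import zlib

# Magic values the fuzzer should know about (from the examples above).
DICTIONARY = [b"42", b"Universal Truth"]


def mutate_with_dictionary(data: bytes, rng: random.Random) -> bytes:
    """Insert one dictionary entry at a random position in the input."""
    token = rng.choice(DICTIONARY)
    pos = rng.randint(0, len(data))
    return data[:pos] + token + data[pos:]


def add_checksum(payload: bytes) -> bytes:
    """Append a CRC32 so the target's integrity check accepts the payload.

    Assumed layout: payload followed by a 4-byte big-endian CRC32.
    """
    return payload + struct.pack(">I", zlib.crc32(payload))


def target_accepts(message: bytes) -> bool:
    """Hypothetical target-side check that the fix-up has to satisfy."""
    if len(message) < 4:
        return False
    payload = message[:-4]
    (crc,) = struct.unpack(">I", message[-4:])
    return zlib.crc32(payload) == crc
```

The fuzz target would call `add_checksum` on every value just before handing it to the target, so the fuzzer itself never has to guess a valid checksum.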
Again! Hit it again! The test harness
Remember the introduction post? Fuzzing is all about automated testing.
But to achieve automated software testing, we need a test harness.
That is a piece (or collection) of software (and maybe hardware!) to embed our target in.
The job of the harness is to provide the “natural” environment for the target to run in. That is to say, all its dependencies, IO and state information.
Going with the picture of hammer, nail, and wood, the harness would be the workbench preventing the wood from jumping off while we are hard at work swinging our mighty hammer.
For a library test, we want to ensure that all dependencies are in place and (re-)initialized to their “correct” state before calling the test functions.
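For the library case, the re-initialization step might look like the following sketch. `CounterLib` is a hypothetical stateful library invented for illustration; the point is only that the harness keeps a pristine copy of the target and restores it before each test case.

```python
import copy


class CounterLib:
    """Hypothetical stateful library used as the target."""

    def __init__(self):
        self.seen = []

    def process(self, data: bytes) -> int:
        # State leaks from call to call unless someone resets it.
        self.seen.append(data)
        return len(self.seen)


class Harness:
    """Restores the target to a known state before every test case."""

    def __init__(self, target: CounterLib):
        self._pristine = copy.deepcopy(target)
        self.target = target

    def run_case(self, data: bytes) -> int:
        # Start from a fresh copy so earlier cases cannot influence this one.
        self.target = copy.deepcopy(self._pristine)
        return self.target.process(data)
```

Without the reset, the second test case would see the leftovers of the first, and a crash found later might not be reproducible from its input alone.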
For a GUI program we may need some kind of script-able input tool to automatically trigger the functions we want to test.
And to test embedded systems, it is usually necessary to simulate at least some part of the surrounding system our target expects to interact with. Otherwise the target will be very upset with its current life and won’t do anything. At least not the “normal” work we want to test.
Setting up a test harness can be a lot of work – even more, if you’re unfamiliar with the target software.
If we use one of the many available fuzzers out there (and don’t write our own), setting up fuzz target and harness will actually be the biggest task in the entire fuzzing process. And it’s all upfront work. Potentially frustrating days of upfront work.
A clean software interface and short target distance (as defined above) will make our lives considerably easier. As do existing (unit) tests and CI pipelines we can use as reference.
Need for speed – Testing at scale
Most fuzzers rely on hitting the target with thousands upon thousands of test cases. This means that test speed is a critical factor. We don’t want to wait for the fuzzer to work for months on end after all.
In principle there are two ways to scale up our fuzzing efficiency: optimization and parallelization.
- Optimization: cut down test time as much as possible. This can be achieved in multiple ways.
  - Reduce input size: the larger the test input, the longer the target will (usually) take to process it. By cutting the input short where possible, we can reduce the test time.
  - Reduce distance: if we have good control, we may be able to cut out the target module and thus cut short the total distance. By eliminating intermediate, non-target software layers we can reduce execution time, potentially drastically.
  - Stub heavy lifters: with good control, we may identify computationally heavy non-target procedures, disable them for the duration of testing, and replace them with stubs. Typical candidates are logging, file or network access, and cryptographic operations. Obviously this requires expert knowledge about the target to make sure the introduced changes do not alter the actual target behavior.
- Parallelization: More hammers + more nails + more wood = more hammer time!
In times of virtualization, Docker containers, cloud computing, CI pipelines and multi-core systems, parallelization is the icing on the fuzzing cake. The big players throw entire data centers’ worth of computation power against their high-value targets.
This obviously only works really well for pure software testing. Cyber-physical test targets have the big drawback of requiring physical components and connections, which cannot simply be parallelized.
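Coming back to the "stub heavy lifters" idea from the optimization list, here is a small Python sketch. All names are hypothetical: `slow_digest` plays the expensive non-target step, `handle_message` the target logic, and the stub is passed in as a parameter, which assumes we have enough control over the target to make that step replaceable.

```python
import hashlib


def slow_digest(data: bytes) -> bytes:
    """Hypothetical heavy, non-target step (expensive key derivation)."""
    return hashlib.pbkdf2_hmac("sha256", data, b"salt", 100_000)


def stub_digest(data: bytes) -> bytes:
    """Cheap stub with the same output length as the real step."""
    return b"\x00" * 32


def handle_message(data: bytes, digest=slow_digest) -> int:
    """Hypothetical target logic; the digest step is replaceable."""
    tag = digest(data)
    return len(data) + len(tag)


def fuzz_one_stubbed(data: bytes) -> int:
    """Fuzz target that runs the target logic without the heavy step."""
    return handle_message(data, digest=stub_digest)
```

The stub keeps the output length of the real function so that the downstream logic behaves the same way. That is exactly the kind of detail that requires knowing the target well.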
Bottom line: Keep hammering!
And with that, we have all we need for a good hammer time, right?
A hammer – the fuzzer.
A nail – the fuzz target.
A nice piece of wood – the software to be tested.
Great, so … swing away!
Slam – Slam – Slam.
As the first joy of swinging away slowly starts to wear off, you may ask yourself how successful you were.
That is, you would look down at the piece of wood in front of you and wonder how deep the nail has sunk in already.
You observe your work.
Observing, or Monitoring as we will call it, is the next topic ahead of us!