Protocol fuzz testing – Order 66 the stack

Jochen Kreissl

Every single bit exchanged over any kind of network needs to follow some kind of well-defined structure: a communication protocol. Otherwise, replies will be like: “Dude, stop talking Klingon”. And there are no handy golden robots able to universally translate anything in the real world (yet).
Such protocols can become quite complex and many incorporate things like state machines, timeouts, variable message flows, and optional content. This makes them not exactly straightforward targets for fuzz testing. But at the same time, it makes them extremely tempting test targets! After all: Complexity breeds errors.
They are virtually everywhere: cellular communication, anything Internet-of-Things, modern cars, charging stations, commerce, medical, even in airplanes and power plants; the list just goes on. And if we take a look at CVE lists, it becomes (painfully) obvious that it’s hard to correctly implement and test all the protocols that are commonly used in modern communication stacks.
In this article, I will take a look at the challenges and potentials of applying fuzzing against the implementation of communication protocols.

Communication protocols 101

Communication protocols dictate how participants exchange data over whichever connection they are using. Be it via cable or some radio-based technology like WiFi, cellular or Bluetooth.
Just like a natural language defines words and grammar to allow people to understand each other.
And just like common decency (aka Etiquette) demanding people to say “Hello” at the start of a conversation. Instead of just blasting away their undying love for the God-Emperor of mankind.

Communication protocols: A matter of common decency in machine-machine communication.

In fact, before your browser finished loading this article, my blog post had to squeeze through a veritable maze of communication protocols: HTTPS, TLS, TCP, IP (v4 or v6), Ethernet and several more auxiliary ones to boot.
Whew, what a bunch of acronyms right there, right?
And each of them stands for a protocol, defined by some multi-hundred page standard. Which regularly references several other multi-hundred page standards. And has several different versions. Sometimes with major (breaking) changes to the protocol and messages (looking at you, TLS 1.3).
And errata.
Never heard of errata before? Lucky you.
They are basically hot-fixes for an already released document. Better not miss the fact that the standard you are about to implement has a bunch of errata, explaining some misleading section or changing some definitions or other stuff all over the place…

Anyway.
Fact is: there are many protocols. Building on top of each other and relying on the correct working of the next lower layer to provide their function.
And to get anything done we need all of them.

Why (fuzz) test communication protocols?

TL;DR: Because you really should fuzz all the things. (What answer did you expect from me, seriously...)

You need a bit more?
Well, consider you have some critical communication going on.
Say, you want to charge your new-now-empty battery-electric land speeder at a public charging station.
You attach your charging cable. At once, your vehicle and the station start talking with each other.
(Over the power cable by the way. No additional “data” cable required. Really interesting stuff but also really off topic.)
How much power should be transferred?
And at which voltage?
How long are you likely to stay and charge?
Payment details (if required) and more.

Now, what if your car’s communication stack had a (fatal) vulnerability in one of its lower-level communication protocols?
And that charging station was actually placed there by some imaginative Jawas?
Who promptly exploit said vulnerability to hijack your car and disable it. Until their Dunecrawler appears over the horizon to hook up your speeder (and whatever droids you may have hidden inside), hell-bent on selling it to the highest bidder. Which may or may not be you.
We don’t want that, right?

Right, so better make sure that all those protocols with the fancy abbreviations are correctly implemented, tested and robust against attacks. Remember: we need all of them to be (relatively) bug-free for the system to be secure.
It is the old crux with security:

For something to be secure, every component must be secure.
For something to be insecure, only one thing needs to be insecure.

General-purpose fuzz testing of protocols – start the engine!

“Okay, okay,” I hear you groan, “then just start up one of the Fuzzers out there and let ’em do their thing then”.
You can do that.
In fact, you probably should do that. Can’t hurt and they are great tools that may eventually find their way around.
We have to keep in mind, however, that those Fuzzers were designed to test (file format) parsers. Which means they were never intended to target a stateful system.

In fact, we can do (even) better.
The thing is, protocols by their very nature are (more or less) well defined and (hopefully) standardized.
Thus we know exactly what the exchanged messages look like and how communication should play out.
That is different from the more general situation that Fuzzers usually face when testing (arbitrary) code.

So, we can fuzz smarter rather than harder!
If you remember my earlier post about different fuzzing techniques, we could use grammar-based Fuzzers in cases where we know the input grammar of a program.
Well, we do know the input for an implementation of a communication stack – since it is all standardized!
(Crafting an actual grammar out of it may be quite challenging, however.)
As noted in the earlier post, a Fuzzer with access to the input format of the target program will usually be much faster than a general-purpose Fuzzer (like libFuzzer or AFL).
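To make the grammar-based idea a bit more tangible, here is a minimal sketch in Python. The grammar is a made-up toy request format (the `TOY/1.x` version strings, the verbs, and the `generate` helper are all assumptions for illustration, not taken from any real standard). It only shows how a generator can expand grammar rules so that every produced message is syntactically valid.

```python
import random

# Hypothetical toy grammar, NOT taken from any real standard.
# Keys are non-terminals; each maps to a list of alternative expansions.
GRAMMAR = {
    "<message>": [["<verb>", " ", "<resource>", " ", "<version>", "\r\n"]],
    "<verb>": [["GET"], ["PUT"], ["DELETE"]],
    "<resource>": [["/"], ["/", "<name>"]],
    "<name>": [["a"], ["b"], ["<name>", "<name>"]],
    "<version>": [["TOY/1.0"], ["TOY/1.1"]],
}


def generate(symbol, rng, depth=0, max_depth=8):
    """Expand `symbol` into a string by recursively picking alternatives."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit as-is
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Force termination by taking the shortest alternative.
        alternatives = [min(alternatives, key=len)]
    choice = rng.choice(alternatives)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in choice)


if __name__ == "__main__":
    rng = random.Random(1234)
    for _ in range(3):
        print(repr(generate("<message>", rng)))
```

Every output passes the toy syntax check by construction, so the Fuzzer spends its time behind the parser instead of being rejected at the front door. A real protocol grammar would be considerably larger, of course.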

Stateful protocol testing – Where is your permit A38?

Some challenges still remain, however.
Remember that I said above that communication protocols are mostly stateful?
That is, they have some kind of internal state machine which dictates the types of messages that are acceptable in said state. Other messages will simply be discarded, even if they are syntactically correct.

As an example why this is so important, let us take a quick look at TLS. It is the protocol of choice to establish a secure, reliable connection between endpoints. HTTPS and many other higher-level protocols rely upon TLS.
It describes a series of messages and also defines a strict exchange sequence.

The TLS (1.2) protocol describes a handshake sequence in which messages have to be exchanged in a fixed order. An out-of-order message is not merely ignored: the recipient is required to abort the handshake.

First, the client must send a ClientHello upon which the server will reply with a ServerHello message, and so on.
If your implementation has a bug in the handling logic of one of the later messages, we will have a hard time triggering it with “normal” fuzzing techniques.
That is because the Fuzzer would have to create a series of correct messages (all prior to the one with a bug in the parsing logic) and then create a faulty message which triggers the bug.
Which is pretty unlikely and thus takes a long time to reach.
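A small sketch illustrates how quickly the odds drop. It assumes a heavily simplified, hypothetical message order (loosely modelled on what a TLS 1.2 client sends) and a toy server that aborts on the first out-of-order message. The names `ToyServer`, `random_run` and `reach_rate` are made up for this example.

```python
import random

# Simplified, hypothetical message order, loosely modelled on what a TLS 1.2
# client sends. Real handshakes have more (and optional) messages.
HANDSHAKE = ["ClientHello", "ClientKeyExchange", "ChangeCipherSpec", "Finished"]


class ToyServer:
    """Accepts messages only in HANDSHAKE order; anything else aborts."""

    def __init__(self):
        self.step = 0          # index of the next expected message
        self.aborted = False

    def receive(self, msg_type):
        if self.aborted:
            return  # connection is dead, nothing gets processed anymore
        if self.step < len(HANDSHAKE) and msg_type == HANDSHAKE[self.step]:
            self.step += 1
        else:
            self.aborted = True


def random_run(rng):
    """Send random (but well-formed!) message types; return the depth reached."""
    server = ToyServer()
    for _ in range(len(HANDSHAKE)):
        server.receive(rng.choice(HANDSHAKE))
    return server.step


def reach_rate(trials, seed=0):
    """Fraction of random runs that complete the whole handshake."""
    rng = random.Random(seed)
    done = sum(random_run(rng) == len(HANDSHAKE) for _ in range(trials))
    return done / trials
```

With four message types and four steps, a blind guesser completes the toy handshake in roughly 1 of 256 attempts. And this sketch is generous: it only guesses message types, whereas a byte-level Fuzzer also has to get every field of every message right.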

Protocol specific (fuzz) testing – time to brain up

Alright, so what does this mean for our (fuzz) tester?
It is highly advantageous for our tester to understand the protocol it wants to test.
And not only understand but actually “speak” it correctly – until it specifically does something wrong.
Our tester has to be clever and cunning!

Put another way: The Fuzzer must behave like an infiltrator attempting to sneak into an enemy stronghold.
He must look like the guards. Speak like them and behave like them too.
Until he reaches his goal and can do whatever nasty business he intended to.
Like blowing up their fancy new death star or something.

To accomplish this, the Fuzzer has to implement the target protocol by itself.

Making a protocol-aware Fuzzer. It needs to be aware of message structures, state machine(s), and the general protocol logic.

Implement it in a specific way, which allows the tester to knowingly stretch or outright break all the rules at any time.
We want it to:

Change the message sending order
Omit and replay complete messages
Change message content
Send (too) large or (too) small messages
Perform protocol-conformant communication – up until a certain point – and only then start behaving erroneously.
Explicitly break the requirements of the protocol. Many specifications have parts reading like this: “Message X MUST NOT be sent before message Y”. Obviously we want the Fuzzer to send X before Y at some point.
Just for snaps and giggles (and to see whether the test target spontaneously combusts).
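The wish list above can be sketched as a handful of mutators that operate on a known-good message sequence. This is only a rough illustration under stated assumptions: the `Message` type, the mutator names, and `mutate_after` are hypothetical, and a real tester would work on properly encoded protocol messages rather than on bare payload bytes.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Message:
    kind: str       # hypothetical message type name
    payload: bytes  # hypothetical, already-encoded content


def swap_order(seq, rng):
    """Change the sending order by swapping two messages."""
    if len(seq) >= 2:
        i, j = rng.sample(range(len(seq)), 2)
        seq[i], seq[j] = seq[j], seq[i]
    return seq


def omit(seq, rng):
    """Drop one complete message."""
    if seq:
        del seq[rng.randrange(len(seq))]
    return seq


def replay(seq, rng):
    """Send one message twice in a row."""
    if seq:
        i = rng.randrange(len(seq))
        seq.insert(i, seq[i])
    return seq


def corrupt(seq, rng):
    """Change message content by inverting one payload byte."""
    if seq:
        i = rng.randrange(len(seq))
        data = bytearray(seq[i].payload or b"\x00")
        data[rng.randrange(len(data))] ^= 0xFF
        seq[i] = replace(seq[i], payload=bytes(data))
    return seq


def resize(seq, rng):
    """Make one payload empty or 1000 times its original size."""
    if seq:
        i = rng.randrange(len(seq))
        seq[i] = replace(seq[i], payload=rng.choice([b"", seq[i].payload * 1000]))
    return seq


MUTATORS = [swap_order, omit, replay, corrupt, resize]


def mutate_after(valid_seq, conform_until, rng):
    """Behave correctly for `conform_until` messages, then break one rule."""
    prefix = list(valid_seq[:conform_until])
    tail = list(valid_seq[conform_until:])
    return prefix + rng.choice(MUTATORS)(tail, rng)
```

The `conform_until` parameter is the infiltrator part: the tester plays by the rules for the first few messages and only misbehaves once the target has reached the state we want to attack.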

This is a lot of (initial) work. And we need to do it for every protocol we want to test. Uff.
Hence my recommendation to just start blasting away with a general-purpose Fuzzer for starters.
But when you put in this extra work… you will get a truly powerful, highly specialized protocol testing engine.
Of course, we can design the Fuzzer in such a way as to be easily extensible for new protocols. Thus greatly reducing the effort required to adapt it to another protocol. After all, the requirements for the (fuzz) testing engine remain the same, even if the target protocols vary considerably.

Distance matters – sometimes

You may rightfully ask: “Hey, is this really necessary if I can do API testing?”
The typical answer is decisive: “It depends”.
General-purpose Fuzzers can find many bugs when doing proper API testing of the protocol implementation.
Parsing errors in particular.
Errors in the stateful protocol logic are more difficult to detect, however.

But then again, there are several prerequisites for doing white-box fuzz testing in the first place.
Namely: full access to source code and compilation tooling.
Plus the skill (and time) to write a proper fuzz target that exercises your target program code.
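For readers who have not written one yet, a fuzz target is essentially a small function that takes raw bytes and feeds them to the code under test. Here is a minimal sketch, assuming a made-up record format (1-byte type, 2-byte big-endian length, payload). `parse_record`, `ParseError` and `fuzz_target` are hypothetical names, and how the function gets hooked up to a fuzzing engine depends on the engine you pick.

```python
import struct


class ParseError(Exception):
    """Raised by the toy parser for input it rejects on purpose."""


def parse_record(data: bytes) -> tuple[int, bytes]:
    """Hypothetical parser: 1-byte type, 2-byte big-endian length, payload."""
    if len(data) < 3:
        raise ParseError("header too short")
    msg_type, length = struct.unpack_from(">BH", data)
    payload = data[3:3 + length]
    if len(payload) != length:
        raise ParseError("truncated payload")
    return msg_type, payload


def fuzz_target(data: bytes) -> None:
    """Entry point a fuzzing engine would call with each generated input."""
    try:
        parse_record(data)
    except ParseError:
        pass  # a clean rejection is expected behaviour, not a finding
    # Any other exception propagates, so the engine records it as a finding.
```

Note that this harness tests exactly one parsing function in isolation. That is great for finding parsing errors, but it says nothing about whether the stateful logic around that function behaves correctly.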

But what if you do not have the source code? Because you bought it from a third party.
Or it is written in Ook! by long-gone-Jack and no one wants to touch it with a pointy stick, let alone a text editor.
Maybe you simply lack the time (or experienced personnel) to set up a white-box fuzzing environment.
Picking up a dedicated protocol-specific Fuzzer can be a highly effective solution.

Another use case would be during a system-level test. There, you feed input via the system’s nominal boundary interfaces (either physical or public API) rather than against internal functions.
Imagine you are starship manufacturer Incom. You need to make sure that all the auxiliary power couplings, targeting computers, and navigation droid input jacks of the new X-Wing are working properly. And that they do not come with spyware from the Empire on board. Having a set of dedicated protocol testing tools at hand to quickly verify all those parts bought from different suppliers can cut a lot of time from your testing phase. This equates to more X-Wings for the rebels at the end of the day!

Monitoring in protocol testing

A nice consequence of doing specialized protocol testing is what it means for monitoring. As discussed in my previous article, monitoring in the general case is highly target-dependent and not (fully) solvable by a generalized tool.
But when we test an implementation of a specific protocol, we have a pretty good idea of how the target should react to erroneous messages.
Because standards (should!) define the error behavior in various situations. Maybe it is as simple as ignoring any malformed or unexpected messages. But maybe it calls for other actions, like sending a (specific) alert message or shutting down the connection by sending a goodbye-type message.
As a tester, we can monitor such behavior and detect misbehavior.

Advanced monitoring during protocol testing. As standards define expected error behavior, we can define a protocol-aware monitor. It can detect behavior that does not conform to the standard and report it as an error.

Imagine the Fuzzer created a message bearing an invalid “Version” field of the protocol. If the target just accepts the message and continues (because it does not properly check the field) a protocol-aware Fuzzer can notice this and mark it as an error.
Using a general-purpose Fuzzer, detecting such an error would be much harder. It is unlikely to cause a crash, so it will not be trivially detectable. We as testers would have to write a specific monitoring condition to detect it. And who wants to do that?
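In a protocol-aware tester, such a check can be as small as a lookup of which reactions the standard permits for a given test case. The following is a rough sketch only. The `Reaction` values, the `Expectation` type, and the rule that an invalid version must be answered with an alert or a closed connection are assumptions made for this example; the real rule comes from whatever standard you are testing against.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Reaction(Enum):
    ACCEPTED = auto()   # target processed the message and carried on
    IGNORED = auto()    # target silently dropped the message
    ALERT = auto()      # target answered with an error/alert message
    CLOSED = auto()     # target shut the connection down


@dataclass(frozen=True)
class Expectation:
    description: str
    allowed: frozenset  # reactions the (hypothetical) standard permits


def check(expectation, observed):
    """Return None if the reaction is permitted, else a finding string."""
    if observed in expectation.allowed:
        return None
    permitted = ", ".join(sorted(r.name for r in expectation.allowed))
    return (f"{expectation.description}: observed {observed.name}, "
            f"standard permits {permitted}")


# Assumed rule for this sketch: an invalid Version field must be answered
# with an alert or by closing the connection.
INVALID_VERSION = Expectation(
    "message with invalid Version field",
    frozenset({Reaction.ALERT, Reaction.CLOSED}),
)

if __name__ == "__main__":
    # The target happily continued, so the monitor reports a finding.
    print(check(INVALID_VERSION, Reaction.ACCEPTED))
```

No crash is needed for this to fire: the target merely has to react in a way the standard does not allow.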

Hunt down faulty stacks – execute Order 66

That’s all for now.
Time to go and check the stacks you are deploying. In particular if the stack in question is not one of the tried and tested ones, like those used in, say, the Linux or Windows kernel.
And make no mistake: even in those stacks, bugs are (still) occasionally found.

After all, wouldn’t it be just too embarrassing to have an attacker circumvent your critical system with advanced security features because of a bug in your twelve-year-old, never-patched IP stack…?
Good hunting.
And remember to shoot any questions you may have (regarding protocol testing or fuzzing) below.
