To bit or not to bit?

Source code is present in pretty much everything we interact with today: phone apps, vehicle GPS, aircraft multimedia or flight controls, technological gizmos, services, video games. It’s everywhere.

And deep down, at machine level, it is indeed represented in binary form, as “0”s and “1”s, just as every techno-thriller movie shows.

But we developers do not spit out binary code (although I remember typing bytes long ago to “draw” video game sprites 😊).

No, source code is written in a language, with a vocabulary to learn and a grammar to follow. And just like the human languages we learn at school, we can end up being (more or less) fluent in a programming language.

A Rosetta Stone for developers

There are many programming languages, from the hundreds tracked by TIOBE, to the thousands listed by The Language List.

You obviously can’t learn them all, but many of them can be grouped and categorized, and they still represent a great variety of ways to make a computer do your bidding.

So, do we actually have to be fluent in each language to grasp the meaning of any piece of code? Could there be some kind of Rosetta Stone, a representation that helps us understand code without knowing the language used, and even compare code written in different languages?

Enter Al-Khwarizmi

Al-Khwarizmi not only pioneered algebra, but also our beloved algorithms. And such sequences of instructions are what most, if not all, programming languages intend to produce.

If we look at an algorithm, regardless of the supporting language, it can be broken down into abstract elements such as statements, conditions, control flow changes, and exception management.
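
As a rough illustration of that breakdown, here is a minimal Python sketch that uses the standard `ast` module to sort the constructs of a small function into those four families. The grouping of node types, the `signature_counts` helper and the sample function are all assumptions made for this example; they do not describe how any particular analysis tool works.

```python
import ast
from collections import Counter

# Hypothetical grouping of Python AST node types into the four families
# named above; a real analyzer may draw these lines differently.
CONDITIONS = (ast.If, ast.While, ast.For)
FLOW_CHANGES = (ast.Break, ast.Continue, ast.Return, ast.Raise)
EXCEPTIONS = (ast.Try, ast.ExceptHandler)
# Definitions are containers rather than instructions, so they are skipped.
DEFINITIONS = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def signature_counts(source: str) -> Counter:
    """Count how many constructs of each family appear in the source."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, CONDITIONS):
            counts["condition"] += 1
        elif isinstance(node, FLOW_CHANGES):
            counts["flow_change"] += 1
        elif isinstance(node, EXCEPTIONS):
            counts["exception"] += 1
        elif isinstance(node, ast.stmt) and not isinstance(node, DEFINITIONS):
            # Any remaining statement (assignment, call, pass, ...).
            counts["statement"] += 1
    return counts


# Sample input, invented for this example.
SAMPLE = '''
def first_positive(values):
    checked = 0
    for v in values:
        checked += 1
        if v <= 0:
            continue
        return v
    raise ValueError("no positive value")
'''

print(signature_counts(SAMPLE))
```

For the sample function, this counts two plain statements, two conditions and three control flow changes. The same idea applies to any language for which a parser is available; only the mapping table changes.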

Granted, this representation does not convey the whole code content, but it can produce a visualization of its inner logic, and assist advanced analysis functions.

Go with the flow

But enough exposition, here are some examples.

This is just one statement.
Pretty boring, right?
This one is more interesting.
And there’s more color!

Now that you get the idea, let’s finish with real applications, showing you source code’s inner beauty.

Pyramids, cities out of a sci-fi movie

And last, a glimpse at your two favorite operating systems:

A File Manager function from Windows (3.0)

Flow graph of the ‘AppCommandProc’ function, a 1,300-line “switch..case” that handles all editing commands

A GPU Driver function from the Linux Kernel (5.6-rc6)

Flow graph of the ‘calculate_bandwidth’ function, an extensive, 2,000-line graphics algorithm

Is a picture worth a thousand bytes?

These visual representations of algorithms are compelling, even mesmerizing. Some of them could be displayed in a ‘binary gallery’ (hey, that’s an idea!)

But appearances aside, there is real value in these control flow graphs:

  • Numerical data, as metrics related to the algorithm structure (depth, vocabulary frequency, nesting level, etc.)
  • The structure of nodes itself, which can be viewed as patterns to analyze and categorize
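
To make the first point a little more concrete, here is a small Python sketch that derives two such metrics, the number of decision points and the maximum nesting level, from a function's syntax tree. The choice of which constructs open a nesting level, the `structure_metrics` helper and the sample function are assumptions made for illustration, not the definitions used by any specific tool.

```python
import ast

# Constructs treated as opening a new nesting level (an assumption of this
# sketch). Note that Python parses "elif" as an If nested in the else branch,
# so an elif chain counts as deeper nesting here.
NESTING = (ast.If, ast.For, ast.While, ast.Try, ast.With)
# Constructs counted as decision points.
DECISIONS = (ast.If, ast.For, ast.While)


def max_nesting(node: ast.AST, depth: int = 0) -> int:
    """Return the deepest level of nested control structures under node."""
    if isinstance(node, NESTING):
        depth += 1
    deepest = depth
    for child in ast.iter_child_nodes(node):
        deepest = max(deepest, max_nesting(child, depth))
    return deepest


def structure_metrics(source: str) -> dict:
    """Compute two simple structural metrics for a piece of Python source."""
    tree = ast.parse(source)
    decisions = sum(isinstance(n, DECISIONS) for n in ast.walk(tree))
    return {"decision_points": decisions, "max_nesting": max_nesting(tree)}


# Sample input, invented for this example.
SAMPLE = '''
def scan(rows):
    for row in rows:
        for cell in row:
            if cell is None:
                return False
    return True
'''

print(structure_metrics(SAMPLE))
```

The sample has three decision points (two loops and one test) nested three levels deep. Numbers like these are cheap to compute and easy to compare across functions, which is what makes them useful as a starting point for analysis.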

We will come back to these concepts soon, with interesting use cases and applications!


Legal Notice

This text is the intellectual property of the author and is copyrighted by coderskitchen.com. You are welcome to reuse the thoughts from this blog post. However, the author must always be mentioned with a link to this post!

5 thoughts on “Anatomy of a source code”

    • Thank you for your comment.

      To generate a graph, the algorithm is first analyzed and broken down into elementary components.
      This ‘algorithmic signature’ can then be translated into visual tokens.

      Such tokens represent sequential statements, conditions (If, Switch, While …), control flow changes (Break, Continue, Throw …), and exceptions (Try, Finally, Catch …).

      • Thank you – I was curious whether you know of any tools for generating such flow graphs please?

        (I hope that nobody had to build the Linux graph by hand!)

    • No, none of the presented graphs were drawn by hand 🙂
      The graphs were produced by the Squore tool, which, among other things, embeds a code analyzer and the display features.

