We provide a 3D animation environment for the development and testing of advanced driver assistance systems (ADAS) and autonomous vehicles. Developing and maintaining this environment is a complex task. Unlike games, where a minor bug or a slightly different rendering is acceptable, these simulations must be deterministic and free of errors. Although our code is covered by unit tests, these cannot catch everything that might go wrong when using a game engine, and they are poorly suited to discovering visual errors. In our first approach to visual testing, testers manually executed test cases and compared their results to a reference image.

Unity-based 3D environment for development and testing of ADAS systems and autonomous vehicles

We use image-based testing to verify many different aspects of our simulation:

  • The visual appearance of objects (vehicles, humans, buildings, vegetation) that represent the virtual world.
  • Lights of objects and the lighting of virtual scenes.
  • All sorts of specialized shaders used for rendering. 
  • Runtime geometry generation of objects. For instance, generating road networks on the fly from OpenDRIVE files. 
  • Updates to the latest version of the rendering engine (Unity in our case). These updates are often required to benefit from the latest improvements and features but may break or change our visualization.

Manual execution of these test cases is rather time-consuming and monotonous. Therefore, we decided to automate the execution. This allows our testers to focus on more complex test cases and eliminates human errors caused by vague test descriptions or by testers missing details. Our visual testing framework detects even small differences in hue and saturation that may be imperceptible to humans, which is important when developing a tool that will itself be used to test automatic safety features in cars.

Architecture of the visual testing framework

Four parts make up the visual testing framework:

  • A remote API in our application that exposes a large variety of automated test steps.
  • A test description format to specify the test steps for a given test case.
  • A tool to compare one or more previously defined reference images to the current graphical outcome of the executed test case.
  • A report that contains the results of all executed tests.

Test description and execution

We provide a test file for each test case in which we list the test steps that must be performed for this test. The simplest possible test consists of two commands. The first command loads a visualization project file which contains a scene in which we set up the features that are to be tested within the current test. The second command creates a screenshot of the currently rendered scene and compares it to a previously generated reference image, failing the test if a difference is detected.

More complex scenarios can be tested by chaining several commands together. Supported commands include, for example, a camera change to obtain a different perspective of an object or skipping to a certain time in a pre-recorded simulation. Multiple compare commands can be inserted between the other commands to verify correctness at every step.
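As a sketch, a test description can be thought of as an ordered list of commands that the framework dispatches one by one to the remote API. The command names, arguments, and the `FakeRemoteApi` stub below are illustrative and not the framework's actual format:

```python
# Illustrative sketch of a chained test case. Command names, file names,
# and the FakeRemoteApi stub are hypothetical.

TEST_STEPS = [
    ("load_project", {"file": "crossing_scene.vproj"}),
    ("set_camera",   {"name": "front_left"}),
    ("compare",      {"reference": "crossing_front_left.png"}),
    ("seek_time",    {"seconds": 12.5}),
    ("compare",      {"reference": "crossing_t12.png"}),
]

class FakeRemoteApi:
    """Stand-in for the remote API inside the visualization application."""
    def load_project(self, file):  return True
    def set_camera(self, name):    return True
    def seek_time(self, seconds):  return True
    def compare(self, reference):  return True   # pretend the images match

def run_test(steps, api):
    """Dispatch each step to the API; a failed compare fails the test."""
    for command, args in steps:
        ok = getattr(api, command)(**args)
        if command == "compare" and not ok:
            return False
    return True
```

Keeping the test file purely declarative like this means new scenarios can be added without touching the execution code.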

The part of the visual testing framework that is responsible for test execution and image comparison is written in Python. Our visualization application is based on the Unity engine and is therefore written in C#. The Thrift framework is used to send the commands specified in the test file to the visualization application. Thrift was chosen for its ability to connect applications written in different programming languages and for its backward compatibility across versions.

Image comparison

For every compare command in a test we generate a reference image that displays the expected result. During a test run every one of these compare commands produces a comparison image. The comparison image is matched to the reference image and the test is marked as a failure if the difference between the two is sufficiently large. For every failed test we generate a result image that displays the difference between the two images. This image aids in debugging as it highlights even small or subtle differences.
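The result image can be produced with a simple per-pixel absolute difference; the amplification factor below is an illustrative choice to make subtle deviations stand out, not the framework's actual value:

```python
import numpy as np

def diff_image(reference: np.ndarray, comparison: np.ndarray,
               amplify: int = 4) -> np.ndarray:
    """Per-pixel absolute difference, amplified so subtle changes stand out.

    The amplification factor is illustrative; without it, a difference of a
    few intensity levels would be nearly invisible in the result image.
    """
    delta = np.abs(reference.astype(np.int16) - comparison.astype(np.int16))
    return np.clip(delta * amplify, 0, 255).astype(np.uint8)
```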


We use two different comparison methods from the OpenCV library to compare images. The faster Mean Squared Error (MSE) method is used to detect whether there is any change at all, but it is not ideal for detecting feature changes. The slower Structural Similarity Index (SSIM) is used if MSE detected a change; it gives a better estimate of how large the differences between the two images are. If SSIM produces a difference value that exceeds a certain threshold, the corresponding test is marked as a failure. For more details on the applied methods, see the third-party how-to on comparing two images.
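The two-stage check can be sketched as follows. This sketch uses plain NumPy with a simplified global SSIM (no sliding window), whereas the real pipeline uses the windowed OpenCV/scikit-image implementations; the threshold is likewise illustrative:

```python
import numpy as np

def mse(a: np.ndarray, b: np.ndarray) -> float:
    """Mean Squared Error: a fast screen for any pixel change at all."""
    return float(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))

def global_ssim(a: np.ndarray, b: np.ndarray, data_range: float = 255.0) -> float:
    """Simplified global SSIM (no sliding window) to estimate severity."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    c1 = (0.01 * data_range) ** 2   # standard SSIM stabilization constants
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (a.var() + b.var() + c2)
    return num / den

def images_match(reference, comparison, ssim_threshold=0.98):
    """Fast path via MSE; fall back to SSIM only when something changed."""
    if mse(reference, comparison) == 0.0:
        return True
    return global_ssim(reference, comparison) >= ssim_threshold
```

Running MSE first keeps the common case (no change) cheap, since the more expensive structural comparison is only needed when a difference exists.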

To further increase performance, we also tested an approach that uses the MD5 hash of the images to skip comparison of identical images. Unfortunately, rendering on different machines introduced subtle changes that alter the MD5 hash without causing any visible differences.
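The hash shortcut amounts to a few lines; `image_digest` is a hypothetical helper operating on raw pixel bytes:

```python
import hashlib

def image_digest(pixel_bytes: bytes) -> str:
    """MD5 over raw pixel data: equal bytes yield an equal digest, so
    byte-identical renders could skip the expensive comparison entirely."""
    return hashlib.md5(pixel_bytes).hexdigest()
```

The flip side is exactly the problem described above: a single changed byte, such as machine-dependent rounding in the renderer, produces a completely different digest even when the images look identical.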

Visual testing reports

A JUnit result file is generated to allow easy integration into a continuous integration (CI) tool – Jenkins in our case. This allows the CI to execute the tests automatically and mark each test run as a failure or success. Additionally, an HTML report is generated, which is easier for humans to read. It contains the reference, comparison, and difference images of all failed tests. This way, a developer can quickly gain an overview of all failed tests and the differences that caused them to fail.
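A minimal JUnit-style result file can be emitted with the Python standard library alone. The element names follow the common JUnit XML convention; the test names and messages below are made up:

```python
import xml.etree.ElementTree as ET

def junit_xml(results):
    """Build a JUnit-style XML report.

    results: list of (test_name, passed, message) tuples.
    """
    failures = sum(1 for _, passed, _ in results if not passed)
    suite = ET.Element("testsuite", name="visual-tests",
                       tests=str(len(results)), failures=str(failures))
    for name, passed, message in results:
        case = ET.SubElement(suite, "testcase", name=name)
        if not passed:
            ET.SubElement(case, "failure", message=message)
    return ET.tostring(suite, encoding="unicode")

# Example: one passing and one failing visual test (hypothetical names).
report = junit_xml([
    ("road_network_top_view", True, ""),
    ("headlight_glow", False, "SSIM 0.91 below threshold 0.98"),
])
```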


We came up with a visual testing solution that we use in our everyday development process. Currently, our application is covered by more than 1,000 visual tests, and the framework has detected several bugs during development. We can cover most of our use cases with the current features. However, there are still some limitations in our testing framework. The following features cannot be tested so far:

  • Effects that depend on passing time, such as walking animations of pedestrians.
  • Rendering effects that depend on some kind of randomness, for example particle effects used to render rain or snow.

What are your experiences on automated testing concepts besides unit testing? Are you using a testing framework similar to the one presented in this post?
