Visual testing of a Unity-based 3D world


We provide a 3D animation environment for the development and testing of advanced driver assistance systems (ADAS) and autonomous vehicles. Developing and maintaining this environment is a complex task. Unlike games, where minor bugs or slight rendering differences are acceptable, these simulations need to be deterministic and free of errors. Although our code is covered by unit tests, these cannot catch everything that might go wrong when using a game engine, and they are not well suited to discovering visual errors. In our first approach to visual testing, testers manually executed test cases and compared their results to a reference image.

Unity-based 3D environment for development and testing of ADAS systems and autonomous vehicles

We used image-based testing to test many different aspects of our simulation:

  • The visual appearance of the objects (vehicles, humans, buildings, vegetation) that make up the virtual world.
  • Lights of objects and the lighting of virtual scenes.
  • All sorts of specialized shaders used for rendering. 
  • Runtime geometry generation of objects. For instance, generating road networks on the fly from OpenDRIVE files. 
  • Updates to the latest version of the rendering engine (Unity in our case). These updates are often required to benefit from the latest improvements and features but may break or change our visualization.

Manual execution of these test cases is rather time-consuming and monotonous. Therefore, we decided to automate the execution. This allows our testers to focus on more complex test cases and eliminates human errors caused by vague test descriptions or testers missing details. Our visual testing framework detects even small differences in hue and saturation that may be imperceptible to humans, which is important when developing a tool that will be used to test automatic safety features in cars.

Architecture of the visual testing framework

Four parts make up the visual testing framework:

  • A remote API in our application that exposes a large variety of automated test steps.
  • A test description format to specify the test steps for a given test case.
  • A tool to compare one or more previously defined reference images to the current graphical outcome of the executed test case.
  • A report that contains the results of all executed tests.

Test description and execution

We provide a test file for each test case in which we list the test steps that must be performed for this test. The simplest possible test consists of two commands. The first command loads a visualization project file which contains a scene in which we set up the features that are to be tested within the current test. The second command creates a screenshot of the currently rendered scene and compares it to a previously generated reference image, failing the test if a difference is detected.
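The actual test-file format is not shown in the post; the sketch below is a hypothetical illustration of the idea in Python, assuming a simple one-command-per-line format with invented command names (`load_project`, `compare_screenshot`):

```python
# Hypothetical sketch of a test-description parser; the real command names
# and file format used by the framework are assumptions for illustration.
from dataclasses import dataclass


@dataclass
class Step:
    command: str
    argument: str


def parse_test_file(text: str) -> list[Step]:
    """Parse one command per line: '<command> <argument>'."""
    steps = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        command, _, argument = line.partition(" ")
        steps.append(Step(command, argument))
    return steps


# The simplest possible test: load a project, then compare a screenshot
# against a previously generated reference image.
simplest_test = """
load_project scenes/highway.vizproj
compare_screenshot references/highway_overview.png
"""

steps = parse_test_file(simplest_test)
```

A runner would then dispatch each parsed step as a remote call to the visualization application.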

More complex scenarios can be tested by chaining several commands together. Supported commands include, for example, a camera change to obtain a different perspective of an object or skipping to a certain time in a pre-recorded simulation. Multiple compare commands can be inserted between the other commands to verify correctness at every step.

The part of the visual testing framework that is responsible for test execution and image comparison is written in Python, while our visualization application is based on the Unity engine and is therefore written in C#. The visual testing framework uses the Apache Thrift framework to send the commands specified in the test file to the visualization application. We chose Thrift for its ability to communicate between applications written in different programming languages and for its backwards compatibility across versions.
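To give an idea of how Thrift bridges the two languages, a hypothetical interface definition might look like the following (the service and method names are invented for illustration; the real IDL is not shown in the post). Thrift generates a matching Python client and C# server skeleton from this single definition:

```thrift
// Hypothetical sketch; service and method names are assumptions.
service VisualizationControl {
  void loadProject(1: string projectPath),
  void setCamera(1: string cameraName),
  void skipToTime(1: double seconds),
  binary captureScreenshot()
}
```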

Image comparison

For every compare command in a test we generate a reference image that displays the expected result. During a test run every one of these compare commands produces a comparison image. The comparison image is matched to the reference image and the test is marked as a failure if the difference between the two is sufficiently large. For every failed test we generate a result image that displays the difference between the two images. This image aids in debugging as it highlights even small or subtle differences.
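A result image of this kind can be produced with a per-pixel absolute difference whose contrast is stretched so that even subtle deviations become clearly visible. The sketch below uses NumPy; the framework's actual implementation is not shown in the post:

```python
# Minimal sketch of a difference image, assuming 8-bit grayscale or RGB
# arrays of equal shape; the real framework's method is not public.
import numpy as np


def difference_image(reference: np.ndarray, comparison: np.ndarray) -> np.ndarray:
    """Per-pixel absolute difference, contrast-stretched to full range."""
    diff = np.abs(
        reference.astype(np.int16) - comparison.astype(np.int16)
    ).astype(np.float64)
    if diff.max() > 0:
        diff *= 255.0 / diff.max()  # stretch so subtle differences stand out
    return diff.astype(np.uint8)
```

Identical images yield an all-black result, while any deviating pixel is amplified toward white.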

Visual testing : Image comparison

We use two different comparison methods from the OpenCV library to compare images. The faster Mean Squared Error (MSE) method detects whether there is any change at all, but it is not well suited to identifying structural changes. The slower Structural Similarity Index (SSIM) is applied only if MSE detected a change; it gives a better estimate of how large the differences between the two images are. If SSIM produces a difference value that exceeds a certain threshold, the corresponding test is marked as a failure. For more details on the applied methods, see a third-party how-to about comparing two images.
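The two-stage check can be sketched in NumPy as follows. Note the simplifications: this uses a global, single-window SSIM rather than the windowed implementation used in practice, and the 0.98 threshold is an invented placeholder, not the framework's actual value:

```python
# Sketch of the two-stage comparison: cheap MSE gate, then SSIM estimate.
# Global SSIM and the threshold value are assumptions for illustration.
import numpy as np

C1 = (0.01 * 255) ** 2  # standard SSIM stabilization constants
C2 = (0.03 * 255) ** 2


def mse(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))


def global_ssim(a: np.ndarray, b: np.ndarray) -> float:
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + C1) * (2 * cov + C2)) / (
        (mu_a ** 2 + mu_b ** 2 + C1) * (var_a + var_b + C2)
    )


def images_match(reference: np.ndarray, comparison: np.ndarray,
                 ssim_threshold: float = 0.98) -> bool:
    if mse(reference, comparison) == 0.0:
        return True  # fast path: no change at all, skip the slower SSIM
    # SSIM yields 1.0 for identical images; fail when similarity drops
    # below the threshold (i.e. the difference is too large).
    return global_ssim(reference, comparison) >= ssim_threshold
```

The MSE gate keeps the common case (unchanged images) cheap, and SSIM is only paid for when something actually differs.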

To further increase performance, we also tested an approach that used the MD5 hash of the images to avoid comparing identical images. Unfortunately, rendering on different machines produced subtle variations that changed the MD5 hash without causing any visible difference.
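The hashing idea, and why it failed, is easy to demonstrate: a hash over the raw pixel buffer short-circuits the comparison for bit-identical images, but any single-value change (far below the threshold of visibility) already produces a different digest:

```python
# Sketch of the rejected MD5 shortcut over raw pixel buffers.
import hashlib

import numpy as np


def image_digest(pixels: np.ndarray) -> str:
    """MD5 over the raw pixel buffer; any single-bit change alters the hash."""
    return hashlib.md5(pixels.tobytes()).hexdigest()


# Identical buffers short-circuit the expensive comparison...
a = np.zeros((4, 4), dtype=np.uint8)
b = a.copy()
same = image_digest(a) == image_digest(b)

# ...but a one-value change that is invisible to the eye already breaks it.
b[0, 0] = 1
still_same = image_digest(a) == image_digest(b)
```

Since cross-machine rendering is never bit-identical, the fast path almost never triggered, which is why the approach was dropped.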

Visual testing reports

A JUnit result file is generated to allow easy integration into a continuous integration (CI) tool – Jenkins in our case. This allows the CI to execute the tests automatically and mark a test execution as failed or successful. Additionally, an HTML report is generated which is easier for humans to read. It contains the reference, comparison, and difference images of all failed tests. This way, a developer can quickly gain an overview of all failed tests and the differences that caused them to fail.
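A minimal JUnit-style result file can be emitted with the standard library; the field names and failure messages below are invented placeholders, not the framework's actual report contents:

```python
# Sketch of JUnit XML generation with xml.etree; test names and the
# failure message are hypothetical examples.
import xml.etree.ElementTree as ET


def junit_report(results: dict) -> str:
    """results maps test name -> failure message (None means passed)."""
    failures = sum(1 for msg in results.values() if msg is not None)
    suite = ET.Element("testsuite", name="visual-tests",
                       tests=str(len(results)), failures=str(failures))
    for name, message in results.items():
        case = ET.SubElement(suite, "testcase", name=name)
        if message is not None:
            ET.SubElement(case, "failure", message=message)
    return ET.tostring(suite, encoding="unicode")


xml = junit_report({
    "highway_overview": None,
    "roundabout_night": "SSIM 0.91 below threshold 0.98",
})
```

Jenkins (and most other CI tools) can ingest this format directly and flag the build accordingly.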


We came up with a visual testing solution that we use in our everyday development process. Currently, our application is covered by more than 1,000 visual tests, and the visual testing framework has detected several bugs during development. We can cover most of our use cases with the current features. However, there are still some limitations; the following cannot be tested so far:

  • Effects that depend on passing time, such as the walking animations of pedestrians.
  • Rendering effects that depend on some kind of randomness, for example particle effects used to render rain or snow.

What are your experiences on automated testing concepts besides unit testing? Are you using a testing framework similar to the one presented in this post?

Legal Notice

This text is the intellectual property of the author and is copyrighted. You are welcome to reuse the thoughts from this blog post. However, the author must always be mentioned with a link to this post!
