Most asked questions on data collection for ADAS validation

By Giusy Troncone

About 1.25 million people die each year in road traffic crashes worldwide: that’s over 3,400 deaths a day. Road traffic injuries are even the leading cause of death globally among people aged 15–29 years.

Road traffic deaths

These numbers speak for themselves: this is a public health and development crisis that is expected to worsen unless action is taken. Without it, road traffic crashes are predicted to become the seventh leading cause of death by 2030.

Road traffic crashes cause grief, suffering, and economic losses to victims, their families, communities, and nations as a whole, costing countries on average 3 to 5% of their gross national product. Indirect costs, such as loss of productivity, damage to vehicles and property, reduced quality of life and other factors, add to the true cost to society.

More than ever, increasing road safety is on top of the priorities list of many countries worldwide. However, road safety is not just the result of one single factor.

When you look at the 25 most common causes of car accidents, you see that besides technical causes (design defects, tire blowouts, etc.) and environmental causes (rain, snow, etc.), 16 causes are human-related. These include speeding, distracted and drunk driving, wrong-way driving, improper turns, etc.

Car accidents - common causes

That is why Advanced Driver Assistance Systems (ADAS for short) such as Adaptive Cruise Control (ACC), Lane Keep Assist (LKA), or highway pilots are so important and are increasingly becoming standard features in new vehicles. Some of them are even becoming mandatory under new legislation, such as the General Safety Regulation (GSR). Engineers develop these advanced safety systems to automate and enhance aspects of the driving experience, increasing overall safety and encouraging safe driving habits. Research shows that ADAS technology reduces the number of road fatalities by reducing the chance of human error.

However, building reliable ADAS systems is not always straightforward.

A critical challenge for ADAS development engineers is accessing the data needed to validate the complete system. The sheer number of traffic scenarios, road layouts, and environmental conditions these systems need to handle requires extensive ADAS testing and data collection campaigns. However, not all data is useful: most of it is redundant, so finding the relevant corner cases becomes a real challenge.

In one of our previous blog posts, we explained why ADAS sensor data collection is crucial to developing self-driving cars.

In this blog post, we take a closer look at some of the most frequently asked questions on data collection for ADAS validation.

What are the typical data collection use cases for an ADAS validation system?

The first critical step in developing trustworthy ADAS systems is to collect data from real-life traffic situations and train these systems to react to them as efficiently and safely as possible.

A critical component of this process is, of course, the data recorder.  This device interfaces with the different sensors mounted on or inside the vehicle, and it collects all their data streams​.

Depending on the current stage of the development process, there are different use cases for collecting in-vehicle data​.

Use cases for in-vehicle data collection

A first use case for an ADAS validation system is ‘standalone’ recording. In this case, experts connect the different sensors directly to the recorder and do not directly evaluate the control functions. Instead, the goal of this campaign is to collect data in real-life traffic conditions. This data is used to train new perception algorithms (which will define how the vehicle needs to react to a particular traffic situation). Later, by reusing the data in the validation process, it is possible to check whether these algorithms are working correctly.

Here, a key feature of the recording hardware is its flexibility in connecting to and communicating with a wide range of ADAS sensors. There are many different cameras, lidars and radars on the market, and a wide variety of interfaces and protocols exist today. Please check below for some further information.

A second use case is the so-called “tapping” case. Here, the experts insert the recorder between the sensors and the vehicle’s ADAS Electronic Control Unit (ECU), so that it can monitor what enters and what leaves the ECU. Such a setup makes it possible to assess the in-field operation of a new(er) version of the ECU software, and to evaluate the ability of the software stack to react to a specific situation that unfolds around the vehicle. In this tapping mode, transparency of the recorder is key: it should not affect the normal operation of the sensors and ECUs.

Towards the end of development, third parties (e.g., EuroNCAP) assess the vehicle to accredit it before it can go on public roads. In this case, the recorder needs to be entirely independent of the internal vehicle systems to exclude any influence it might have. If a specific test fails, the ADAS engineer would, of course, like to understand why it failed. That is why there is still a need for an on-board data recorder that records additional sensors, such as high-accuracy localization systems. This data helps to further optimize the on-board ADAS systems and to objectively assess their safety and comfort.

Finally, production fleets can also generate insightful datasets. They can help to better understand how the system is used in the field and how customers interact with it. However, a full-blown ADAS recorder is typically not present in every production vehicle for cost reasons. Some car manufacturers use the data from the onboard ADAS systems themselves. The data is sent over the air to the company servers, where it can potentially help to optimize the ADAS systems. However, this requires extensive data logistics and planning. It can also be very costly, as it relies on over-the-air data streaming via 4G/5G.

Why is accurate data synchronization for sensor fusion setups so important and how can this be achieved?

When you start using more than one sensor inside your vehicle, synchronization of their data becomes critical.  For so-called sensor fusion systems, the ECU makes decisions based on the data originating from complementary sensor technologies to ensure maximum reliability.  In that case, you need to make sure that you’re capturing data at the exact right moment.  Even small delays between different sensor readings may directly affect control decisions and cause a potentially unsafe traffic situation.
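
To get a feel for why small delays matter, the short calculation below converts a timestamp offset between two sensors into the distance the vehicle covers in that time. The speeds and offsets are illustrative assumptions, not figures from a specific ADAS system.

```python
def distance_travelled_m(speed_kmh: float, offset_ms: float) -> float:
    """Distance in metres covered at speed_kmh during offset_ms milliseconds."""
    speed_m_per_s = speed_kmh / 3.6
    return speed_m_per_s * (offset_ms / 1000.0)


if __name__ == "__main__":
    # Illustrative values only: urban and highway speeds, small sync errors.
    for speed_kmh in (50, 130):
        for offset_ms in (1, 10, 50):
            d = distance_travelled_m(speed_kmh, offset_ms)
            print(f"{speed_kmh} km/h, {offset_ms} ms offset -> {d:.2f} m")
```

At 130 km/h, a 10 ms misalignment between a camera frame and a radar reading already corresponds to roughly 0.36 m of vehicle travel, which is why the two readings may no longer describe the same scene.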

That is why accurate timestamping of all data is critical when designing complex sensor fusion systems. It is key to capture potential time fluctuations when gathering data during ADAS system development and validation.

High levels of confidence in the acquired timestamps require system-wide time synchronization. The IEEE 1588 standard is often used in the industry.  It transmits timing information from a reference clock to all units to align their internal clocks.

The recorder hardware itself or a GNSS receiver can provide a reference time (Figure 1 below).

The Simcenter SCAPTOR XTSS software provides time synchronization across a complete recording setup
Figure 1. Time synchronization across a complete recording setup

The latter provides an absolute time base. Engineers can then synchronize other recording devices to it, as well as vehicle ECUs or sensors that support the IEEE 1588 standard.
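
As a rough illustration of how IEEE 1588 aligns a device clock to the reference clock, the sketch below applies the standard delay request-response arithmetic to four timestamps. It assumes a symmetric network path, and the function name and example values are hypothetical; it is not a description of how any particular recorder implements the protocol.

```python
def ptp_offset_and_delay(t1: int, t2: int, t3: int, t4: int) -> tuple[float, float]:
    """Estimate clock offset and one-way path delay (same unit as the inputs).

    t1: reference clock sends Sync          (reference time)
    t2: device receives Sync                (device time)
    t3: device sends Delay_Req              (device time)
    t4: reference clock receives Delay_Req  (reference time)
    Assumes the path delay is the same in both directions.
    """
    forward = t2 - t1    # path delay + offset
    backward = t4 - t3   # path delay - offset
    offset = (forward - backward) / 2
    delay = (forward + backward) / 2
    return offset, delay


if __name__ == "__main__":
    # Made-up timestamps in nanoseconds: device clock runs 1500 ns ahead,
    # and the one-way path delay is 400 ns.
    offset_ns, delay_ns = ptp_offset_and_delay(1_000_000, 1_001_900, 1_010_000, 1_008_900)
    print(f"offset: {offset_ns} ns, path delay: {delay_ns} ns")
```

The device then corrects its clock by the estimated offset, so that timestamps from different units refer to the same time base.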

How can you scale your ADAS validation system for different test campaigns and levels of automation?

As indicated above, a modern vehicle designed for assisted or autonomous driving typically carries widely varying sensor types. These include cameras, radars, lidars, ultrasonics, IMUs, GPS, CAN-bus systems, etc. Not all sensors have the same interfaces: they typically offer a broad range of physical connections, and/or they use different communication protocols.

To measure all these sensors, you need an ADAS validation system that is flexible in terms of sensor connectivity. Well-known interfaces include Ethernet, CAN-bus, Automotive Ethernet, and CAN-FD. Some camera sensors used during early development can also leverage USB or other Ethernet-based protocols such as GigE Vision. Production-grade cameras will tend to use specific protocols such as GMSL2 and FPD-Link III.

The more sensors you want to measure, the more data is streamed to the data recorder. Therefore, besides increased sensor connectivity, you also need to consider the overall bandwidth. As you progress from level 2 up to level 3, 4, or 5 of automated driving, the number of cameras goes up, together with the bandwidth of the data they produce. You are talking about terabytes of generated data per vehicle. This adds a scalability dimension to the overall challenge.

Data scalability

For example, a Full HD camera recording images in the RAW 12 format at 40 frames per second produces approximately 120 megabytes of data per second. Newer generations of radar outputting raw radar data cube information can also generate around 220 megabytes per second. When you have 6 of these cameras and 6 of these radars in your setup, the complete vehicle will produce around 2 gigabytes of data per second. For a full day of testing (8 hours), that’s equivalent to 58 terabytes. That’s an enormous amount of information per vehicle. This kind of sensor setup is more likely in a level 4 or level 5 vehicle. Today’s level 2 vehicles are typically a little easier to capture, but you still need a system that can handle that kind of bandwidth and offers extensive scalability.
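
The figures above can be reproduced with a few lines of arithmetic. The sketch below assumes tightly packed 12-bit pixels at 1920 x 1080, decimal units (1 MB = 10^6 bytes), and no protocol or metadata overhead; the function and variable names are illustrative only.

```python
def camera_rate_mb_s(width: int, height: int, bits_per_pixel: int, fps: float) -> float:
    """Raw camera data rate in MB/s (1 MB = 10**6 bytes), no overhead."""
    bytes_per_frame = width * height * bits_per_pixel / 8
    return bytes_per_frame * fps / 1e6


CAMERA_MB_S = camera_rate_mb_s(1920, 1080, 12, 40)  # Full HD, RAW 12, 40 fps
RADAR_MB_S = 220.0        # raw radar data cube rate quoted in the text
SENSORS_PER_TYPE = 6      # 6 cameras and 6 radars
TEST_DAY_HOURS = 8

total_mb_s = SENSORS_PER_TYPE * (CAMERA_MB_S + RADAR_MB_S)
total_tb_per_day = total_mb_s * TEST_DAY_HOURS * 3600 / 1e6

if __name__ == "__main__":
    print(f"one camera : {CAMERA_MB_S:.1f} MB/s")
    print(f"vehicle    : {total_mb_s / 1000:.2f} GB/s")
    print(f"per 8 h day: {total_tb_per_day:.1f} TB")
```

Without rounding the intermediate values, this gives about 124 MB/s per camera, about 2.07 GB/s per vehicle, and just under 60 TB per 8-hour day, in line with the rounded figures quoted above.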

In conclusion, your recording system needs to be scalable, both in terms of the recording hardware and the storage used.

How can you get the acquired data as quickly and securely as possible to the ADAS development and validation engineers?

Once collected, the data needs to be shared with the development and validation engineers (Figure 2).

Four key steps to setup a smart and efficient ADAS testing workflow
Figure 2: Get recorded data efficiently and quickly from test vehicles to the hands of development and validation engineers

Data ingestion is a critical process that gets the data from the test vehicle into the hands of the engineers, who then use it to generate and train new perception algorithms and validate their performance through data replay. Efficient ingestion requires high-bandwidth data transfers and minimal human action, and it should maximize the availability of test vehicles.

When the storage cartridge is full, the test engineer can easily swap it with an empty one and immediately continue the measurement. The full storage cartridge can then be plugged into the ingestion station, which transfers all data with one button-click or fully autonomously. The data is copied reliably to any desired location, such as internal servers or cloud services.
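
The text does not say how an ingestion station verifies its copies. One common way to gain confidence in a transfer is to compare checksums of the source and the destination, as in the minimal sketch below. The function names are hypothetical and this is an assumption about a general technique, not a description of any specific product.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_verified(src: Path, dst: Path) -> str:
    """Copy src to dst and raise if the checksums differ afterwards."""
    shutil.copyfile(src, dst)
    src_hash, dst_hash = sha256_of(src), sha256_of(dst)
    if src_hash != dst_hash:
        raise IOError(f"checksum mismatch after copying {src} to {dst}")
    return dst_hash
```

Reading the files in chunks keeps memory use constant, which matters when individual recordings are many gigabytes in size.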

On top of that, an ingestion station can support several commonly used high-speed interfaces such as 10GbE, USB3.1, and eSATA. It can also transfer the data in a parallel or sequential way to multiple target devices.  This makes it easy for the test engineer to transfer the recorded data to an on-premise data center or a remote cloud provider. In this case, no external PC or peripherals are required. Instead, a simple purpose-built user interface is used, which reduces the risk of costly user manipulation errors.
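
To see why high-bandwidth interfaces matter here, the sketch below estimates how long it takes to offload a day of recordings over a given link. The 58 TB figure is the one from the scalability example earlier in this post; the efficiency factor and the function name are illustrative assumptions, since real throughput depends on the storage media, protocol overhead, and the target system.

```python
def offload_hours(data_tb: float, link_gbit_s: float, efficiency: float = 0.8) -> float:
    """Hours needed to transfer data_tb terabytes over a link_gbit_s link.

    efficiency is the assumed fraction of the nominal line rate that is
    actually achieved (1.0 = ideal, no overhead).
    """
    total_bytes = data_tb * 1e12
    bytes_per_second = link_gbit_s * 1e9 / 8 * efficiency
    return total_bytes / bytes_per_second / 3600


if __name__ == "__main__":
    for eff in (1.0, 0.8):
        hours = offload_hours(58, 10, eff)
        print(f"58 TB over 10GbE at {eff:.0%} efficiency: {hours:.1f} h")
```

Even at the full 10 Gbit/s line rate, a single link needs close to 13 hours for 58 TB, which illustrates why parallel transfers and swappable storage cartridges help keep test vehicles on the road.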

  • Blog post: Relying on ADAS framework for autonomous driving
  • Webinar: Tackle the challenges of high-speed ADAS sensor and autonomous vehicle data collection for testing and validation
This article first appeared on the Siemens Digital Industries Software blog at