Simulation-based automatic object annotation

When it comes to autonomous vehicles, there is no margin for error. To guarantee safety and user confidence, the AV needs to detect everything from physical objects to micro-discrepancies in visibility levels. Even variables on the car itself, such as tire and brake wear, need to be anticipated. The purest way of getting an AV to react correctly to these elements is to place it on the road. But even after millions of kilometers, real-world trials can only do so much: virtual testing is required to complete the picture.

Simulation does have its own limitations, but these are gradually being addressed. One problem being solved right now concerns data annotation. Typically, vehicle developers would collect data sets from driving scenarios in the real world before applying these to a simulation program. To apply the data to objects that appear in the virtual environment, such as pedestrians and cars, they would need a team of people to manually annotate everything, one frame of footage at a time. However, new simulation approaches are set to relieve developers of this task and give them more time to refine their perception products, from cars to cameras to chips.

British simulation specialist rFpro produces digital-twin models in which users can conduct tests. With its new method of simulation-based automatic object annotation, rFpro claims it can slash the costs and error risk associated with the standard manual technique and open AV technology development to more participants.

“By setting up the simulation at the start with classifications and segmentations, you can then run thousands of different iterations and still get perfectly annotated data, because you’ve already set it up in the simulation,” says rFpro managing director Matt Daley. “Every single time you collect some real-world data, or you move a camera and collect it, you’ve got to go back and manually annotate it again. The key part here is that it is the simulation software that’s allowing you to automatically annotate the data. There’s a really fundamental difference between taking real-world data, which is a picture where you have to tell what everything is, versus simulated data where you’ve already decided on the picture you’re going to draw before you’ve drawn it.”
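Daley's point can be illustrated with a toy example. The sketch below is not rFpro's software or API: the `SceneObject` class, the `render_with_annotations` function and the flat rectangle "renderer" are hypothetical names invented for illustration. It assumes only that each object is given its class once, when the scene is set up, so that every rendered frame comes with a segmentation mask and bounding boxes at no extra labeling effort, even after the camera moves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    """A hypothetical simulated object; its class is fixed at scene set-up."""
    label: str
    x: int  # left edge, in pixels
    y: int  # top edge, in pixels
    w: int  # width, in pixels
    h: int  # height, in pixels


def render_with_annotations(objects, width, height, dx=0):
    """Draw the scene and return (mask, boxes).

    mask[row][col] is 0 for background, or the 1-based index of the object
    visible at that pixel. dx shifts every object sideways to imitate a
    camera move between runs.
    """
    mask = [[0] * width for _ in range(height)]
    for idx, obj in enumerate(objects, start=1):
        for row in range(max(0, obj.y), min(height, obj.y + obj.h)):
            for col in range(max(0, obj.x + dx), min(width, obj.x + dx + obj.w)):
                mask[row][col] = idx  # later objects are drawn on top

    boxes = []
    for idx, obj in enumerate(objects, start=1):
        pixels = [(r, c) for r in range(height) for c in range(width)
                  if mask[r][c] == idx]
        if not pixels:
            continue  # fully hidden or off-screen: nothing to annotate
        rows = [r for r, _ in pixels]
        cols = [c for _, c in pixels]
        # Box format: (label, x_min, y_min, x_max, y_max), inclusive pixels.
        boxes.append((obj.label, min(cols), min(rows), max(cols), max(rows)))
    return mask, boxes


# The classification is done once...
scene = [SceneObject("car", 2, 3, 6, 3), SceneObject("pedestrian", 10, 2, 2, 5)]

# ...and every new camera position is annotated without any manual work.
for dx in (0, 1, 2):
    _, boxes = render_with_annotations(scene, width=16, height=10, dx=dx)
    print(dx, boxes)
```

The labels never have to be inferred from the image because the renderer already knows which object produced each pixel, which is the difference Daley draws between annotating a photograph and deciding on the picture before drawing it.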

rFpro's approach, known as data farming, is designed to serve the ‘perception teams’ of autonomous vehicle developers, who must hone their algorithms using masses of collected data. rFpro claims that data farming annotates 10,000 times more quickly than manual annotation, which takes around 30 minutes per frame.
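To put those claimed figures in perspective, here is a short back-of-the-envelope calculation. It assumes the 10,000-fold speed-up applies per frame, and the batch size of 1,000 frames is a hypothetical number chosen for illustration; neither detail comes from rFpro.

```python
# Figures quoted in the article (rFpro's claims, not measurements).
MANUAL_SECONDS_PER_FRAME = 30 * 60  # "around 30 minutes per frame"
CLAIMED_SPEEDUP = 10_000            # "10,000 times more quickly"


def annotation_hours(frames, seconds_per_frame):
    """Total annotation time, in hours, for a given number of frames."""
    return frames * seconds_per_frame / 3600


# Implied automatic rate, assuming the speed-up applies to each frame.
automatic_seconds_per_frame = MANUAL_SECONDS_PER_FRAME / CLAIMED_SPEEDUP

frames = 1_000  # hypothetical batch size
manual_hours = annotation_hours(frames, MANUAL_SECONDS_PER_FRAME)
automatic_hours = annotation_hours(frames, automatic_seconds_per_frame)

print(f"automatic: {automatic_seconds_per_frame:.2f} s per frame")
print(f"manual:    {manual_hours:.0f} hours for {frames} frames")
print(f"automatic: {automatic_hours * 60:.0f} minutes for {frames} frames")
```

Under these assumptions, the implied rate is 0.18 seconds per frame, so a batch that would take 500 person-hours by hand would be annotated in about three minutes.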

With data farming tech, the frame movement is constant, meaning the annotation is too. And because this task can be left for a computer to run, the error rate is slashed from 10% to zero. Daley describes the finished product as a “long-term investment”, the culmination of two decades of research: “The first half of rFpro’s life was completely focused on driver-in-the-loop simulation, and then bringing in hardware-in-the-loop. Everything was focused on real-time simulation.

“The key switch in the past few years has been that we don’t need to constrain ourselves to real time. Our technology can be run whenever, and that’s what data farming is shouting about. It can be used in a sequenced software-in-the-loop simulation, as well as a real-time-constrained driver-in-the-loop.”

This freedom to simulate at will has proved attractive to Denso and Ambarella, two companies that helped cultivate and are now using data farming. Denso develops lidar and radar system products at its European research and development hubs, while Silicon Valley firm Ambarella – following its 2015 acquisition of VisLab – makes vision processors for different levels of autonomy. But it’s not just the Tier 1s taking note. rFpro is used by academic departments such as WMG at the University of Warwick, and there’s also curiosity from industry regulators that have a pressing need to ensure AV safety before greenlighting public roll-outs.

These organizations all have different simulation needs: for instance, suppliers want to simulate across several hardware units, while faculties might only need to engage a single computer. To address this, rFpro has developed a dedicated server for its software, which handles communications to external hardware and enables the user to expand across multiple GPUs and CPUs. The creation of a synchronized, scalable simulation program has been crucial to establishing a widespread market for data farming.

rFpro is one of several companies fielding more accessible simulation practices. Israel’s Cognata builds environments and synthetic data sets for AV developers to use in their validations through a cloud platform. The company has commercialized a life rendering method that saves on manually annotating real-world data by involving deep neural networks.

“We use deep neural networks to learn how sensor technologies react in the real world,” explains Cognata CEO Danny Atsmon. “We are then using the networks to transfer the synthetically generated data from our engine into something that looks like the actual sensor. It’s called transfer learning. You can utilize this data throughout the simulation, so it saves lots of manual annotation and lots of time, plus you get higher quality data annotation because it’s a process that is fully automated.”

British AV developer Five has created its own simulation software to leverage the data collected during real-world tests. Its self-driving cars are seen routinely around South London as they gain knowledge of the crowded city streets.

“It’s impractical to test your system solely on the road,” says Five’s director of assurance, Iain Whiteside. “We build large test suites of scenarios, which are short sequences of vehicle behaviors. This suite of thousands of scenarios is a stress test to our AV stack. We also advocate running ‘less directed’ virtual worlds in simulation, so rather than forcing behaviors, you generate agents that have some degree of autonomy, and you look for any bubble in that behavior that turns out to be dangerous for the vehicle.”
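The ‘less directed’ idea Whiteside describes can be sketched in a few lines. This is a toy illustration and not Five's software: the one-dimensional road, the braking rule, the `min_gap` and `find_dangerous_runs` functions and every numeric value are assumptions made for the example. A lead agent chooses its own accelerations at random, a very simple ego policy follows it, and the runs in which the gap becomes too small are flagged for closer inspection.

```python
import random


def min_gap(seed, steps=50, dt=0.1):
    """Run one randomized scenario; return the smallest gap in meters."""
    rng = random.Random(seed)  # seeded, so any flagged run can be replayed
    ego_pos, ego_speed = 0.0, 10.0
    lead_pos, lead_speed = 20.0, 10.0
    smallest = lead_pos - ego_pos
    for _ in range(steps):
        # The lead agent is not scripted: it picks its own acceleration.
        lead_speed = max(0.0, lead_speed + rng.uniform(-6.0, 2.0) * dt)
        # Toy ego policy (an assumption): brake whenever the gap is under 15 m.
        if lead_pos - ego_pos < 15.0:
            ego_speed = max(0.0, ego_speed - 4.0 * dt)
        lead_pos += lead_speed * dt
        ego_pos += ego_speed * dt
        smallest = min(smallest, lead_pos - ego_pos)
    return smallest


def find_dangerous_runs(seeds, threshold=10.0):
    """Return the seeds whose run let the gap fall below the threshold."""
    return [seed for seed in seeds if min_gap(seed) < threshold]


flagged = find_dangerous_runs(range(1000))
print(f"{len(flagged)} of 1000 randomized runs need a closer look")
```

Because each run is driven by a seed, a flagged run can be replayed exactly and, in principle, turned into a fixed scenario for a regression suite.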

Despite its data gathering being rooted in real-world testing, Five can apply its results to drive simulations in a time-saving manner. It’s also working on solving saliency and fidelity problems to address further challenges within simulation.

“We’ve got lots of technology that we use to create automatic, extremely accurate annotations that can be used by quite a lot of our processes,” explains Whiteside. “This includes something called scenario extraction, which enables us to extract new scenarios from the real world, which can then be run in simulation. This allows you to test how a new version of the stack did.”

The gradual process of overcoming the constraints of simulation is likely to be a breakthrough for the AV movement. Time will tell, but reducing hardware costs and run time through automatic annotation should make the sector more accessible. This could, in turn, accelerate AV technology development across the board and enable the industry to break into widespread Level 4 autonomy and beyond.

“The interest level we’ve seen has been about the inclusion of people who didn’t think they could afford to do this before,” notes Daley. “They now realize they can have simulation capability at a scale that suits them. The industry has been constrained by the fact that it has heavily relied on manually annotated real-world data. You’ve only got to look at the players in the market who have made any sort of impression: it’s the ones with the unlimited cash piles. But they’re not in a position yet to sell their products on the road. So even with huge resources, it’s not enough to just complete the full engineering task of developing and validating an autonomous vehicle. Every person in the market has now acknowledged that simulation forms the backbone for how we can actually complete that big engineering challenge.”
