Scroll Top

Companies are turning to synthetic data to fill in gaps in their AI models

WHY THIS MATTERS IN BRIEF

When you don’t have real data to work with why not create some? That’s synthetic data …

 

Love the Exponential Future? Join our XPotential Community, future proof yourself with courses from XPotential Universityconnect, watch a keynote, read our codexes, or browse my blog.

Deep learning has pushed the capabilities of Artificial Intelligence (AI) to new levels, but there are still some kinks to straighten out, such as AI bias, as well as how to best train AI models to handle safety-critical applications such as self-driving cars.

If an AI recommendation engine gets its predictions wrong and puts a strange advert in your browser window, you might raise an eyebrow. But no long-term damage would have been done, but things are very different when algorithms get behind the wheel and encounter something they’ve never seen before.

 

See also
Researchers use mind-reading AI to put people's thoughts on TV

 

Unsurprisingly rare events, or edge cases as they’re known, represent an especially tricky problem for self-driving car developers so now many of them are using Synthetic Data – which is “fake” data generated by AI’s that’s based on lifelike simulations of real-world events – to help them train their AI’s better and faster.

Examples include solutions such as NVIDIA’s Omniverse Replicator, which I’ve spoken about before, that lets developers augment real-world environments with digitally rendered scenarios such as a layer of thick snow covering the road or obscuring street signs. Another illustration of its capabilities is to digitally simulate a child running into the road chasing after a ball.

Developers could, of course, use crash test dummies and various props to achieve the same thing, but the time and expense of training an AI this way generally outweighs any benefits. Plus, if things went wrong, you’d risk damaging the vehicle and its sensors, whereas in a digitally simulated environment everything can be simply refreshed and rerun if things don’t go according to plan.

 

See also
Seoul to become the world's first Metaverse city

 

Elsewhere firms such as Synthesis AI have shown how synthetic data can be used to test the effectiveness of vehicle driver safety monitoring systems. These tools work by tracking the driver’s face to identify signs of drowsiness or distraction and then the output can be linked to Advanced Driver Assistance Systems (ADAS), for example to prime pre-crash mechanisms if the safety monitoring alerts fail to trigger a response from the driver.

Naturally, developers wouldn’t ask a test driver to fall asleep at the wheel on purpose – as a vehicle speeds along the road – so that they could put a potential facial detection algorithm and the mitigations that go with it to the test instead.

The result of Synthesis AI’s newest algorithm is a service called FaceAPI. The tool allows users to create millions of unique 3D driver models with different facial expressions “FaceAPI is already able to produce a wide variety of emotions, including, of course, closed eyes and drowsiness,” write the creators. Now, expanding on the capabilities of the synthetic data-generating software, the model can also represent a driver looking down at their phone or turning to talk to a passenger rather than focus on the road ahead.

 

See also
DeepMind’s AI now has human like 3D vision

 

Undoubtedly the availability of realistic synthetic data for AI training can give firms a helping hand in entering markets where competitors may hold large datasets that would otherwise provide a high barrier to entry. Making it straightforward for start-ups to generate useful AI training sets based on synthetic data gives newer companies the capacity to quickly build momentum without needing to invest large amounts of capital.

Synthetic data also goes beyond just recreating scenarios that would be problematic in the real world, it can let developers dig into all manner of real world areas where training AI’s would ordinarily be expensive, time-consuming, or both.

Related Posts

Leave a comment

FREE! DOWNLOAD THE 2024 EMERGING TECHNOLOGY AND TRENDS CODEXES!DOWNLOAD

Awesome! You're now subscribed.

Pin It on Pinterest

Share This