SynthAI: synthetic data for all

By Spencer Acain

The world of AI is constantly growing and evolving. New practices, technologies, and ideas are being developed, tested, and adopted at an incredible pace. In this sea of emergent technology, synthetic data shows great promise across all fields of AI training. In the broadest sense, synthetic data is any data generated by a computer. Something as simple as an Excel file filled with names and birthdates qualifies. However, the truly exciting use of synthetic data is mimicking more complex and potentially difficult-to-obtain real-world data, especially visual data. Creating synthetic image data is no simple feat, so many companies are turning to service providers for the necessary skills and technical ability. Now, with SynthAI, synthetic data is just a click away.

Image data is among the most difficult kinds of data to work with in AI training, but in many fields, especially manufacturing, it is of vital importance. In a factory, a robotics system might be required to perform assembly, QC, movement, or any number of other tasks on an object. In each of these cases, nothing is possible until the AI system can actually recognize the required object or feature, and that is only achieved through long training and vast quantities of visual data. Gathering this visual data is a major challenge when implementing AI in the factory, since it requires setting up an expensive dedicated testing environment, halting work on production equipment, or both. This is where synthetic data can take over, generating photorealistic images in a highly accurate virtual environment to replace expensive and difficult-to-obtain real images.

Training a machine learning algorithm can be broken down into two broad steps. First, an algorithm is selected or trained to broadly recognize the relevant objects and perform the required operations. Then the algorithm is trained to perform its tasks in the specific environment where it will be deployed. Synthetic data can likewise be broken down into two broad types. The first type is Domain Randomized data, which seeks to provide as much variety as possible in the sample data to ensure an algorithm can function in a wide range of scenarios. The second type is Close to Real data, which seeks to mimic as closely as possible the real-world conditions an AI algorithm will be operating under. Creating Close to Real data requires a detailed virtual representation of the entire production system or even the factory (such as a comprehensive digital twin).
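
To make the idea of Domain Randomized data more concrete, here is a minimal Python sketch of how the scene settings for each rendered image might be varied. It is an illustration only and not how SynthAI works internally: the parameter names, value ranges, and background labels are all assumptions made up for this example, and a real pipeline would pass these settings to a 3D renderer.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneParams:
    """Hypothetical settings for one synthetic image (illustrative only)."""
    light_intensity: float    # relative brightness, 1.0 = nominal
    camera_distance_m: float  # distance from camera to object, in meters
    object_yaw_deg: float     # rotation of the object about its vertical axis
    background: str           # label of the backdrop to render behind the object


# Assumed backdrop labels; a real setup would use its own scene assets.
BACKGROUNDS = ("conveyor", "workbench", "bin", "plain")


def sample_scene(rng: random.Random) -> SceneParams:
    """Draw one randomized scene so that no two images look alike."""
    return SceneParams(
        light_intensity=rng.uniform(0.3, 2.0),
        camera_distance_m=rng.uniform(0.5, 3.0),
        object_yaw_deg=rng.uniform(0.0, 360.0),
        background=rng.choice(BACKGROUNDS),
    )


def sample_dataset(n: int, seed: int = 0) -> list[SceneParams]:
    """Draw n scenes; a fixed seed makes the dataset reproducible."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n)]


if __name__ == "__main__":
    for params in sample_dataset(3):
        print(params)
```

Close to Real data would take the opposite approach: instead of sampling wide ranges, the same settings would be fixed to values measured in the actual production environment.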

Synthetic data is an excellent tool that not only allows key visual data to be gathered without tying up expensive equipment, but also lets designers test scenarios that would be impossible to stage in the real world. While synthetic data is a powerful tool that exemplifies the move to a more customizable and adaptable manufacturing landscape, for many companies it also represents a substantial investment of time and money. Creating synthetic visual data accurate enough to take the place of real images requires expertise in 3D modeling, visual effects, rendering, and other computer graphics disciplines not normally associated with manufacturing.

Enter SynthAI. A new product from Siemens, SynthAI can help reduce or even eliminate many of the costs associated with synthetic data and AI implementation in the factory. SynthAI is a web-based application that accepts a simple CAD file and generates a complete set of accurately annotated synthetic images, and even a ready-to-use trained ML algorithm, as an easy download. With SynthAI, a purpose-trained AI system becomes a service to be purchased rather than a major project to be undertaken. Of course, a service like SynthAI cannot take the place of Close to Real synthetic data or images captured from the real production environment. But what it does offer is a simple and accessible solution for building a large data set for basic training needs. Studies show synthetic data can deliver up to a 90% reduction in real-world data requirements, something too valuable for forward-thinking companies to ignore.

AI is taking an increasingly important role in the factory, and to meet the challenges inherent in AI implementation, manufacturers are turning to software and services providers who can make the process as efficient and cost-effective as possible. SynthAI addresses these challenges by placing the power and adaptability of synthetic data in the palm of manufacturing’s proverbial hand, allowing AI to be deployed not just faster, but better and smarter than ever before. To find out more, click here.


Siemens Digital Industries Software is driving transformation to enable a digital enterprise where engineering, manufacturing and electronics design meet tomorrow. Xcelerator, the comprehensive and integrated portfolio of software and services from Siemens Digital Industries Software, helps companies of all sizes create and leverage a comprehensive digital twin that provides organizations with new insights, opportunities and levels of automation to drive innovation.

For more information on Siemens Digital Industries Software products and services, visit siemens.com/software or follow us on LinkedIn, Twitter, Facebook and Instagram.

Siemens Digital Industries Software – Where today meets tomorrow.

This article first appeared on the Siemens Digital Industries Software blog at https://blogs.sw.siemens.com/thought-leadership/2022/05/05/synthai-synthetic-data-for-all/