What: This entry introduces synthetic data generation and how you can use it, via Blender, to train performant and robust vision models. We’ll provide an overview of the Blender setup and, for demonstration purposes, a concrete classification scenario from the fashion domain.
Why: To take advantage of Blender’s procedural capabilities and adopt a data-centric approach to obtaining better machine-learning models, with little to no need for human annotation.
Who: We will rely on Blender >3.1 and Python 3.7. The generated images can be used for any downstream task, regardless of the framework you may depend on (e.g. TensorFlow, PyTorch).
Synthetic data generation (SDG) encompasses a variety of methods that aim to programmatically generate data in support of downstream tasks. In statistics and machine learning (ML), the goal is to synthesize samples representative of a target domain distribution, to be used for model training or testing. SDG is part of the data-centric ML approach, where we actively work on the data, rather than on models, algorithms or architectures, to achieve better performance.
SDG is adopted for a number of reasons, the primary ones being:
- Reducing the need for human labeling and curation
- Easing and/or satisfying the data requirements of high-capacity models
- Dealing with issues such as generality, robustness, portability, bias
- Removing restrictions on the usage of real data (privacy and regulations)
Here we are interested in the computer-vision (CV) domain, and in the synthesis of realistic scene samples (in the most common media, such as pictures and videos). There are two major approaches to synthesizing data for this domain: generative models and computer-graphics (CG) pipelines. Hybrid approaches also exist, combining the two to different degrees depending on the target setup.
Think for example of creating images of non-existent cats to train a cat-vs-dog classifier, feeding images from a game or simulated environment to bootstrap the training of self-driving systems, or rendering an unlimited variety of CG human faces for landmark localization.
In this article we will focus on the CG approach, which relies on traditional software and tools for manipulating 3D geometry (modelling), applying materials (texturing), and synthesizing 2D images (rendering). Enter Blender.
Blender is a free and open-source 3D CG software toolset. It has seen remarkable improvements over the years, notably with version 2.8: a redesigned user interface and workspace, the real-time Eevee renderer, an optimized Cycles (path-tracing) engine, 2D animation via grease-pencil, improved shader nodes and, more recently, the extra-powerful geometry-nodes system. All this (and more), plus the Python API, makes it a popular choice for researchers and hobbyists interested in programmatic and procedural control of 3D environments, without having to rely on less user-friendly 3D tools and libraries.
The power of a programmatic setup for SDG (and beyond) can be demonstrated with a few lines of code. The following Python snippet randomizes the location of the camera in a controlled manner. If you add a Track To constraint to your camera, you are also guaranteed that it will keep pointing at the same spot while moving to different angles.
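A minimal sketch of such a snippet; the object name "Camera" and the sampling ranges are assumptions to adapt to your own scene:

```python
import math
import random

def random_ring_position(radius_range=(4.0, 6.0), z_range=(0.5, 2.0)):
    """Sample a point on a horizontal ring around the scene origin."""
    angle = random.uniform(0.0, 2.0 * math.pi)
    radius = random.uniform(*radius_range)
    return (radius * math.cos(angle),
            radius * math.sin(angle),
            random.uniform(*z_range))

def randomize_camera():
    # Assumes the default camera object, named "Camera".
    # bpy is imported lazily so the sampler above also works outside Blender.
    import bpy
    bpy.data.objects["Camera"].location = random_ring_position()
```

With a Track To constraint on the camera targeting your object, each sampled location yields a new viewpoint of the same subject.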
The following snippet instead shows how to randomize the World (i.e. the visual environment) in terms of background color and light intensity. It requires only a basic node setup, as shown here.
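A possible version of this snippet, assuming the default World datablock with its single Background node (Blender’s default names, which may differ in your file):

```python
import random

def random_world_settings(strength_range=(0.2, 2.0)):
    """Sample a random background color (RGBA) and light strength."""
    color = (random.random(), random.random(), random.random(), 1.0)
    return color, random.uniform(*strength_range)

def randomize_world():
    # Assumes the default World with its basic
    # Background -> World Output node setup.
    # bpy is imported lazily so the sampler stays usable outside Blender.
    import bpy
    color, strength = random_world_settings()
    background = bpy.data.worlds["World"].node_tree.nodes["Background"]
    background.inputs["Color"].default_value = color
    background.inputs["Strength"].default_value = strength
```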
At this point, any object placed in the scene can already be presented from different angles, under different lighting conditions and backgrounds.
The following code is then all that’s needed to render the current view to file.
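A minimal sketch, assuming the scene’s render settings (engine, resolution, output format) are already configured in Blender:

```python
import bpy

def render_still(filepath):
    # Render the current camera view and write it to `filepath`.
    bpy.context.scene.render.filepath = filepath
    bpy.ops.render.render(write_still=True)
```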
One can expand on these basic blocks and automate/randomize any aspect relevant to their needs. Rendered images can be used for any downstream task. The real-time Eevee renderer guarantees that you can generate images in seconds or less, and therefore easily scale up for data-hungry regimes. Cycles is also an option should you need higher realism, but then you are looking at longer rendering times and a strong reliance on a good GPU.
On top of that, with Blender you have access to all 3D scene and object information, and you can use render passes to split target content into separate rendered images (e.g. depth, normal map, ambient occlusion, or object/segmentation maps via pass indices).
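As an illustration, passes can be enabled programmatically on the view layer; the layer name "ViewLayer" is Blender’s default, while the object name "Dress" is hypothetical:

```python
import bpy

view_layer = bpy.context.scene.view_layers["ViewLayer"]
view_layer.use_pass_z = True                  # depth
view_layer.use_pass_normal = True             # normal map
view_layer.use_pass_ambient_occlusion = True  # ambient occlusion
view_layer.use_pass_object_index = True       # per-object index map

# Tag an object so it shows up in the object-index pass
bpy.data.objects["Dress"].pass_index = 1
```

Each enabled pass then becomes available in the compositor (or via file output nodes) alongside the regular render.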
For this entry, we will however focus on a single visual classification setting, and explore procedural materials using a simple but concrete classification example from the fashion domain: textile pattern classification.
Let’s introduce the use case: we have a visual classification task where we need to classify the pattern of a fashion item. This is a multi-class problem that relies on a set of predefined classes (e.g. plain, striped, dotted, floral, checkered, animal-print). We all have an intuitive understanding of these classes, and they apply to any fashion item, such as a dress, a pair of sneakers, or a bag. We could manually collect and curate data for this task, but what if we just generated it artificially through Blender?
We first need to get some 3D objects representative of our target data distribution. One can easily find a plethora of 3D fashion assets online, both premium and free. We could also approach modelling procedurally, but this is a topic for a separate, future entry.
Once we have our 3D objects, we can start working on the material.
We show the node trees for three sample materials here: plain, striped and floral. The former two can be achieved purely in Blender; for floral we instead rely on external images that need to be downloaded separately.
On top of such materials, we only need the following code: a naive function to generate random colors, plus a wrapping randomization function for each target class.
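A sketch of what such code could look like; the material names ("plain", "striped") and node names ("color", "color_1", "color_2", "Mapping") are hypothetical and must match whatever you defined in your Blender file:

```python
import random

def rand_color(alpha=1.0):
    """Naive random RGBA color, as expected by Blender color sockets."""
    return (random.random(), random.random(), random.random(), alpha)

def randomize_plain():
    # Assumes a material named "plain" with an RGB node named "color".
    # bpy is imported lazily so rand_color stays usable outside Blender.
    import bpy
    nodes = bpy.data.materials["plain"].node_tree.nodes
    nodes["color"].outputs["Color"].default_value = rand_color()

def randomize_striped():
    # Assumes a material "striped" with two RGB nodes and a Mapping
    # node whose Scale input controls the stripe frequency.
    import bpy
    nodes = bpy.data.materials["striped"].node_tree.nodes
    nodes["color_1"].outputs["Color"].default_value = rand_color()
    nodes["color_2"].outputs["Color"].default_value = rand_color()
    nodes["Mapping"].inputs["Scale"].default_value[0] = random.uniform(2.0, 15.0)
```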
When run, these functions randomize the respective Blender materials (the names used in the code must match the ones defined in Blender; this holds for both material node-tree names and specific node names).
If we combine the above setup with the randomization provided in the previous section, we can start rendering.
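Putting it all together, a generation loop could look like the following self-contained sketch; the output folder, class names, and camera ranges are assumptions, and the per-class material randomization described above would plug into the marked spot:

```python
import math
import random
import bpy

# Hypothetical output folder and class list for the toy task
OUTPUT_DIR = "/tmp/sdg_patterns"
CLASSES = ["plain", "striped", "floral"]
IMAGES_PER_CLASS = 50

scene = bpy.context.scene
camera = bpy.data.objects["Camera"]

for label in CLASSES:
    for i in range(IMAGES_PER_CLASS):
        # Randomize the camera position on a ring around the subject
        angle = random.uniform(0.0, 2.0 * math.pi)
        camera.location = (5.0 * math.cos(angle),
                           5.0 * math.sin(angle),
                           random.uniform(0.5, 2.0))
        # ...randomize the material for `label` here...
        scene.render.filepath = f"{OUTPUT_DIR}/{label}/{i:05d}.png"
        bpy.ops.render.render(write_still=True)
```

Storing each class in its own folder makes the output directly consumable by most image-classification data loaders.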
With the generated images you can already try to train a vision model and see how it performs on your actual, target-domain data. You don’t need to collect, scrape and curate real images; you don’t need human annotation and validation; and you have a fully programmatic setup to expand and exploit further.
While this was a toy use case, the more complex or niche your domain, or the less represented your concepts/classes are, the more the SDG approach will save you the pain of data collection and curation.
A synthetic dataset is good when it is versatile and generalizes to real data. Synthetic data can be used in a number of ways: it can complement the real data already available for training, where different ratios of synthetic vs. real data are often tested to find the best compromise for model performance; sometimes it can simply be a means to test and verify the robustness of a vision model.
One of the main limitations of SDG is the so-called domain gap: the intrinsic difference between real and synthetic images. For our Blender example, using more varied and detailed 3D objects, richer textures, or the Cycles renderer would produce more realistic-looking images, but there would still be a substantial difference, even more so when compared with real fashion images that include human models. Specific lines of research, such as domain adaptation, look at how to deal with such issues.
SDG also requires specialist domain knowledge (the same that would be needed for human annotation) and an overall quality-control setup, to avoid even wider and more damaging domain differences. There’s always the risk of injecting implicit biases into the generation process, or missing potentially extraneous (yet relevant) regions of your domain altogether. Let’s take our toy use case again: we need a clear understanding of, and guidelines for, what makes a pattern floral. Does it require realistic flower illustrations? Does it need to be colorful enough? How many levels of stylization are acceptable? Without clear answers to such questions we will end up defining the generation process based on our own biases and assumptions, and are then likely to fail to capture the requirements of our actual data.
In this entry, we gave a high-level introduction to synthetic data generation (SDG) and how it can be easily achieved through Blender.
The pattern-classification setting allowed us to demonstrate how Blender materials and Python scripting can be combined to generate synthetic data via a traditional computer-graphics pipeline.
It’s a simple yet powerful setup, which can already be a starting point for a multitude of computer-vision use-cases.
Future entries in this series are planned to expand on the introduction presented here and explore varied and more complex scenarios for SDG in the visual domain, while showing how Blender can be used as a powerful tool to tackle them.
We welcome feedback and suggestions for such future entries. Overall, we plan to cover aspects like depth, segmentation and other scene information that you get for free when relying on Blender. We also want to go beyond pure material node-trees and tackle procedural modelling, seeing for example how geometry nodes can be used to manipulate meshes for SDG.
From there we will move on to hybrid approaches to SDG, such as object/body generation and reconstruction, generative models for texture synthesis, and neural rendering, given the significant recent advances in machine learning for computer graphics.
Alex Martinelli, August. Tags: synthetic data generation, computer vision, Blender.