Open-Source Pipeline for Synthetic Data Generation in Industrial Applications
Author
Sanchis Reig, Adrian
Term
4th semester
Education
Publication year
2025
Submitted on
2025-06-04
Abstract
Vision-based deep learning models for industrial tasks depend on large, well-annotated datasets that are costly to acquire and hard to scale. This thesis develops an open-source pipeline that automates the generation and curation of synthetic training data for manufacturing use cases, in collaboration with Mercedes-Benz Group AG. Built around Blender’s Cycles renderer and Python API, the pipeline programmatically assembles photorealistic scenes, materials, lighting, and camera configurations, and exports automatic annotations for downstream model training. To reduce the sim-to-real gap, it applies Domain Randomization and Guided Domain Randomization, and complements rendering with diffusion-based augmentation using Stable Diffusion XL (including ControlNets and IP-Adapter inpainting). A filtering stage compares high- and low-level image features to select useful samples. The implementation supports deterministic seeding, keyframe and interval-based rendering, parallel and cloud execution, and random pose validation. The pipeline is configured and tested on automotive, robotics, and T-LESS datasets, and evaluated through comparisons with a previous render pipeline, studies of resolution and dataset size, augmentation and filtering effects, render-time benchmarks, ablations, and zero/one/few-shot scenarios. While specific quantitative outcomes are not reported in this excerpt, the thesis positions the pipeline as a practical, scalable tool for training industrial object detection models.
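The abstract mentions deterministic seeding and randomized scene parameters (Domain Randomization). A minimal sketch of what seeded scene sampling could look like; the parameter names and value ranges here are illustrative assumptions, not the thesis implementation, and in the real pipeline the sampled values would be passed to Blender's Python API to build each scene:

```python
import random
from dataclasses import dataclass

@dataclass
class SceneParams:
    # Hypothetical randomized scene parameters; ranges below are assumptions.
    camera_distance_m: float
    camera_azimuth_deg: float
    light_power_w: float
    background_id: int

def sample_scene(seed: int) -> SceneParams:
    """Sample one randomized scene configuration, reproducibly per seed."""
    rng = random.Random(seed)  # deterministic seeding -> repeatable datasets
    return SceneParams(
        camera_distance_m=rng.uniform(0.5, 2.0),
        camera_azimuth_deg=rng.uniform(0.0, 360.0),
        light_power_w=rng.uniform(100.0, 1000.0),
        background_id=rng.randrange(10),
    )
```

Because each scene derives from an explicit seed, a dataset can be regenerated exactly, which also makes parallel and cloud execution straightforward: each worker renders a disjoint range of seeds.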
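The filtering stage is described as comparing high- and low-level image features to select useful samples. A hedged sketch of one way such a rule could work, assuming feature vectors have already been extracted; the cosine-similarity measure, the `keep_sample` selection rule, and its thresholds are assumptions for illustration, not the thesis's actual criterion:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two precomputed feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def keep_sample(low_level_sim: float, high_level_sim: float,
                low_thresh: float = 0.3, high_thresh: float = 0.8) -> bool:
    """Illustrative selection rule (thresholds are assumptions): keep a
    synthetic image whose high-level (semantic) features resemble the real
    domain while its low-level (texture/color) features remain diverse."""
    return high_level_sim >= high_thresh and low_level_sim <= low_thresh
```

Usage would be per candidate image: compute both similarities against real reference features, then keep or discard, e.g. `keep_sample(cosine_similarity(lo_syn, lo_real), cosine_similarity(hi_syn, hi_real))`.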
[This abstract was generated with the help of AI]