IEEE Access (Jan 2025)

PCGOD: Enhancing Object Detection With Synthetic Data for Scarce and Sensitive Computer Vision Tasks

  • Walid Remmas,
  • Martin Lints,
  • Jaak Joonas Uudmae

DOI
https://doi.org/10.1109/ACCESS.2025.3572719
Journal volume & issue
Vol. 13
pp. 91325 – 91333

Abstract

Object detection models rely on large-scale, high-quality annotated datasets, which are often expensive, scarce, or restricted due to privacy concerns. Synthetic data generation has emerged as an alternative, yet existing approaches have limitations: generative models lack structured annotations and precise spatial control, while game-engine-based datasets suffer from inaccuracies due to 3D bounding box projections, limited scene diversity, and poor handling of articulated objects. We propose PCGOD, an Unreal Engine-based framework that combines photorealistic rendering with comprehensive domain randomization to bridge the synthetic-to-real (sim2real) domain gap. PCGOD employs a marker-based extremity projection method that places markers at key points on object geometries and projects only visible markers to create tight-fitting bounding boxes. For articulated objects, our approach dynamically tracks skeletal pose changes, ensuring annotations adapt to varied configurations. The framework addresses sim2real transfer through six-dimensional randomization: background environments, model textures and poses, landscape textures, weather conditions, camera perspectives, and procedural scene composition. Evaluations using YOLOv11 and Salience-DETR in an object detection task demonstrate that our marker-based approach achieves up to 41.61% improvement in annotation accuracy over conventional methods. Models trained with just 10% real data supplemented by our synthetic data achieve over 80% of the performance of models trained on 100% real data. Moreover, mixed datasets containing 25% synthetic and 75% real data outperform pure real-data training by up to 5.1%. These results confirm that our approach significantly enhances synthetic data utility for object detection, offering an effective solution for domains with limited training data availability.
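The marker-based extremity projection described in the abstract can be illustrated with a minimal sketch. It is not the authors' Unreal Engine implementation. It assumes a simple pinhole camera, markers already expressed in camera coordinates (z pointing forward), and a per-marker visibility flag supplied by some upstream occlusion test. The function name `marker_bbox` and all parameters are hypothetical, and dropping markers that fall outside the image frame is an assumption of this sketch.

```python
from typing import Optional, Sequence, Tuple

Point3D = Tuple[float, float, float]
Box = Tuple[float, float, float, float]  # (u_min, v_min, u_max, v_max) in pixels


def marker_bbox(
    markers: Sequence[Point3D],
    visible: Sequence[bool],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    width: int,
    height: int,
) -> Optional[Box]:
    """Project visible markers with a pinhole model and return their 2D extent.

    Hypothetical helper: `visible` is assumed to come from an external
    occlusion test (for example a ray cast from the camera to each marker).
    """
    us, vs = [], []
    for (x, y, z), is_visible in zip(markers, visible):
        # Skip occluded markers and markers behind the camera.
        if not is_visible or z <= 0:
            continue
        u = fx * x / z + cx
        v = fy * y / z + cy
        # Assumption: markers projected outside the frame are ignored.
        if 0 <= u < width and 0 <= v < height:
            us.append(u)
            vs.append(v)
    if not us:
        return None  # object not visible in this frame
    return (min(us), min(vs), max(us), max(vs))


if __name__ == "__main__":
    markers = [(-1.0, -1.0, 2.0), (1.0, 1.0, 2.0), (0.0, 0.0, 2.0), (3.0, 0.0, 2.0)]
    visible = [True, True, True, False]  # last marker is occluded
    print(marker_bbox(markers, visible, 100.0, 100.0, 320.0, 240.0, 640, 480))
```

Because the occluded marker at x = 3 is excluded, the box spans only the visible part of the object. Projecting all eight corners of a 3D bounding box would instead include the hidden region and produce a looser annotation.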

Keywords