Sensors (Mar 2022)

Improving Semantic Segmentation of Urban Scenes for Self-Driving Cars with Synthetic Images

  • Maksims Ivanovs,
  • Kaspars Ozols,
  • Artis Dobrajs,
  • Roberts Kadikis

DOI: https://doi.org/10.3390/s22062252
Journal volume & issue: Vol. 22, no. 6, p. 2252

Abstract

Semantic segmentation of the incoming visual stream from cameras is an essential part of the perception system of self-driving cars. State-of-the-art results in semantic segmentation have been achieved with deep neural networks (DNNs), yet training them requires large datasets, which are difficult and costly to acquire and time-consuming to label. A viable alternative to training DNNs solely on real-world data is to augment it with synthetic images, which can be generated in large numbers and modified easily. In the present study, we aim to improve the accuracy of semantic segmentation of urban scenes by augmenting the real-world Cityscapes dataset with synthetic images generated with the open-source driving simulator CARLA (Car Learning to Act). Augmentation with synthetic images of low photorealism from the MICC-SRI (Media Integration and Communication Center–Semantic Road Inpainting) dataset does not improve segmentation accuracy. However, both the MobileNetV2 and Xception DNNs used in this study achieve higher accuracy after training on the custom-made CCM (Cityscapes-CARLA Mixed) dataset, which combines real-world Cityscapes images with high-resolution synthetic images generated with CARLA, than after training on the real-world Cityscapes images alone. The accuracy does not improve proportionally to the amount of synthetic data used for augmentation, however, which indicates that augmenting with more synthetic data is not always better.
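
To make the data-mixing step concrete, the sketch below shows one way to combine real and synthetic (image, mask) pairs into a single training stream with TensorFlow's tf.data API. It is a minimal illustration under stated assumptions, not the authors' pipeline: the directory layout, the 512x1024 training resolution, the 2:1 real-to-synthetic sampling ratio, and the batch size are all hypothetical.

    # Minimal sketch (not the authors' code) of augmenting a real-world
    # segmentation dataset with synthetic images. Directory paths, image
    # size, sampling ratio, and batch size are assumptions for illustration.
    from glob import glob

    import tensorflow as tf

    IMG_SIZE = (512, 1024)  # assumed training resolution; not stated in the abstract

    def load_pair(image_path, mask_path):
        """Decode an (image, label-mask) pair and resize both to a common size."""
        image = tf.io.decode_png(tf.io.read_file(image_path), channels=3)
        mask = tf.io.decode_png(tf.io.read_file(mask_path), channels=1)
        image = tf.image.resize(image, IMG_SIZE) / 255.0          # bilinear for images
        mask = tf.image.resize(mask, IMG_SIZE, method="nearest")  # nearest keeps class IDs intact
        return image, mask

    def make_dataset(image_dir, mask_dir):
        """Build a tf.data pipeline from parallel image/mask directories (hypothetical layout)."""
        images = sorted(glob(f"{image_dir}/*.png"))
        masks = sorted(glob(f"{mask_dir}/*.png"))
        ds = tf.data.Dataset.from_tensor_slices((images, masks))
        return ds.map(load_pair, num_parallel_calls=tf.data.AUTOTUNE)

    real_ds = make_dataset("cityscapes/images", "cityscapes/masks")  # hypothetical paths
    synthetic_ds = make_dataset("carla/images", "carla/masks")

    # Interleave the two sources; the 2:1 real-to-synthetic ratio is an
    # assumption, not the mixing proportion used in the paper.
    mixed_ds = tf.data.Dataset.sample_from_datasets(
        [real_ds, synthetic_ds], weights=[2 / 3, 1 / 3]
    ).shuffle(256).batch(8).prefetch(tf.data.AUTOTUNE)

A segmentation model with a MobileNetV2 or Xception backbone, as named in the abstract, could then be trained on mixed_ds; varying the sampling weights is one way to probe the observation that more synthetic data does not always yield higher accuracy.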

Keywords