Street Object Detection from Synthesized and Processed Semantic Image: A Deep Learning Based Study

Parthaw Goswami; A. B. M. Aowlad Hossain

doi:10.1007/s44230-023-00043-1

Human-Centric Intelligent Systems (Sep 2023)

Affiliations

Parthaw Goswami: Department of Electronics and Communication Engineering, Khulna University of Engineering and Technology
A. B. M. Aowlad Hossain: Department of Electronics and Communication Engineering, Khulna University of Engineering and Technology

DOI: https://doi.org/10.1007/s44230-023-00043-1
Journal volume & issue: Vol. 3, no. 4
pp. 487 – 507

Abstract

Semantic image synthesis plays an important role in the development of Advanced Driver Assistance Systems (ADAS). Street object detection can be erroneous in rain or when images from the vehicle's camera are blurred, which can cause serious accidents. Automatic and accurate street object detection is therefore an important research problem. In this paper, a deep learning based framework is proposed and investigated for street object detection from synthesized and processed semantic images. First, a Conditional Generative Adversarial Network (CGAN) is used to create a realistic image. The brightness of the CGAN-generated image is then increased using neural style transfer. Next, image enhancement based on Enhanced Super-Resolution Generative Adversarial Networks (ESRGAN) is used to improve the resolution of the style-transferred images. The processed images exhibit better clarity and higher fidelity, which improves the performance of the object detector. Finally, the synthesized and processed images are used as input to a Faster Region-based Convolutional Neural Network (Faster R-CNN) and a MobileNet Single Shot Detector (MobileNetSSDv2) model separately for object detection. The widely used Cityscapes dataset is used to evaluate the proposed framework. The results show that the synthesized and processed input improves detector performance compared with the unprocessed counterpart. The proposed detection framework also compares favorably with related state-of-the-art techniques, reaching a mean average precision (mAP) of around 32.6%, whereas most reported mAPs for this dataset lie in the range of 20–28%.
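For readers unfamiliar with the mAP figure quoted in the abstract, the following is a minimal sketch of how average precision can be computed for one class and then averaged over classes. It is not the authors' evaluation code. The abstract does not state the IoU threshold or the interpolation scheme, so the 0.5 threshold, the all-point interpolation, the greedy matching rule, and all function names here are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical box format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    detections: List[Tuple[float, Box]],  # (confidence score, box)
    truths: List[Box],
    iou_thr: float = 0.5,  # assumed threshold, not taken from the paper
) -> float:
    """AP for one class: area under the precision-recall curve."""
    if not truths:
        return 0.0
    matched = [False] * len(truths)
    tp_flags: List[bool] = []
    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, truth in enumerate(truths):
            if matched[j]:
                continue  # simplification: each ground-truth box is used once
            overlap = iou(box, truth)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        is_tp = best_j >= 0 and best_iou >= iou_thr
        if is_tp:
            matched[best_j] = True
        tp_flags.append(is_tp)

    # Running precision and recall after each detection.
    precisions, recalls, tp = [], [], 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += int(flag)
        precisions.append(tp / k)
        recalls.append(tp / len(truths))

    # Make precision non-increasing from right to left (all-point interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum rectangle areas under the interpolated curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap: Dict[str, float]) -> float:
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(per_class_ap.values()) / len(per_class_ap)
```

In this sketch, a detector that finds both ground-truth boxes but also emits one false positive between them in the confidence ranking scores less than 1.0 for that class, which is how spurious detections lower the final mAP.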

Published in Human-Centric Intelligent Systems

ISSN: 2667-1336 (Online)
Publisher: Springer Nature
Country of publisher: Netherlands
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.springer.com/journal/44230
