IEEE Access (Jan 2020)

ESSN: Enhanced Semantic Segmentation Network by Residual Concatenation of Feature Maps

  • Dong Seop Kim,
  • Muhammad Arsalan,
  • Muhammad Owais,
  • Kang Ryoung Park

DOI
https://doi.org/10.1109/ACCESS.2020.2969442
Journal volume & issue
Vol. 8
pp. 21363 – 21379

Abstract

Semantic segmentation performs pixel-level classification of multiple classes in an input image. Previous studies on semantic segmentation have used various methods, such as multi-scale images, encoder-decoders, attention, spatial pyramid pooling, conditional random fields, and generative models. However, contexts of various sizes and types in diverse environments limit their ability to robustly detect and classify objects. To address this problem, we propose an enhanced semantic segmentation network (ESSN) that is robust to various objects, contexts, and environments. The ESSN extracts multi-scale information effectively by concatenating the residual feature maps with various receptive fields extracted from sequential convolution blocks, and it improves semantic segmentation performance without additional modules, such as loss or attention modules, during training. We performed experiments with two open databases, the Stanford background dataset (SBD) and the Cambridge-driving labeled video database (CamVid). Experimental results demonstrated a pixel accuracy of 92.74%, class accuracy of 79.66%, and mean intersection over union (mIoU) of 71.67% on CamVid, and a pixel accuracy of 87.46%, class accuracy of 81.51%, and mIoU of 71.56% on SBD, which are higher than those of existing state-of-the-art methods. In addition, the average processing times were 31.12 ms on a desktop computer and 92.46 ms on the Jetson TX2, which confirms that ESSN is applicable both to desktop computers and to the Jetson TX2, an embedded system widely used in autonomous vehicles.
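The core idea described in the abstract, keeping the intermediate feature maps of sequential convolution blocks and concatenating them channel-wise so that several receptive-field sizes reach the classifier at once, can be illustrated with a shape-level sketch. This is plain Python, not the authors' implementation; the block channel counts and receptive-field growth per block are illustrative assumptions:

```python
# Shape-level sketch of residual concatenation of feature maps
# (assumption: not the ESSN code; real blocks are convolution layers).
# A feature map is represented only by its (channels, H, W) shape plus
# the receptive field (RF) accumulated up to that block.

def conv_block(shape, out_channels, rf_growth, rf):
    """One conv block: changes channel count, grows the receptive field,
    and (in this sketch) preserves spatial size via padding."""
    c, h, w = shape
    return (out_channels, h, w), rf + rf_growth

def essn_trunk(in_shape, block_channels):
    """Run sequential blocks, keep every intermediate (residual) feature
    map, then concatenate them along the channel axis so information at
    multiple receptive fields is preserved for later layers."""
    shape, rf = in_shape, 1
    kept = []
    for out_c in block_channels:
        shape, rf = conv_block(shape, out_c, rf_growth=2, rf=rf)
        kept.append((shape, rf))  # residual feature map and its RF
    # Channel-wise concatenation: channels add up, spatial dims must match.
    total_c = sum(s[0] for s, _ in kept)
    _, h, w = kept[-1][0]
    return (total_c, h, w), [rf for _, rf in kept]

fused, rfs = essn_trunk((3, 64, 64), [32, 64, 128])
print(fused)  # (224, 64, 64): 32 + 64 + 128 channels, spatial size kept
print(rfs)    # [3, 5, 7]: each successive block sees a wider context
```

The point of the concatenation is visible in the output: the fused map carries every block's channels side by side, so small-receptive-field detail and large-receptive-field context coexist in one tensor without extra loss or attention modules.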

Keywords