IEEE Access (Jan 2020)

ESSN: Enhanced Semantic Segmentation Network by Residual Concatenation of Feature Maps

  • Dong Seop Kim,
  • Muhammad Arsalan,
  • Muhammad Owais,
  • Kang Ryoung Park

DOI
https://doi.org/10.1109/ACCESS.2020.2969442
Journal volume & issue
Vol. 8
pp. 21363 – 21379

Abstract

Semantic segmentation performs pixel-level classification of multiple classes in an input image. Previous studies on semantic segmentation have used various methods, such as multi-scale images, encoder-decoders, attention, spatial pyramid pooling, conditional random fields, and generative models. However, contexts of various sizes and types in diverse environments limit their ability to robustly detect and classify objects. To address this problem, we propose an enhanced semantic segmentation network (ESSN) that is robust to various objects, contexts, and environments. The ESSN extracts multi-scale information effectively by concatenating the residual feature maps with various receptive fields extracted from sequential convolution blocks, and it improves semantic segmentation performance without additional modules, such as loss or attention modules, during training. We performed experiments with two open databases, the Stanford background dataset (SBD) and the Cambridge-driving labeled video database (CamVid). Experimental results demonstrated a pixel accuracy of 92.74%, class accuracy of 79.66%, and mean intersection over union (mIoU) of 71.67% on CamVid, and a pixel accuracy of 87.46%, class accuracy of 81.51%, and mIoU of 71.56% on SBD, which are higher than those of existing state-of-the-art methods. In addition, the average processing times were 31.12 ms on a desktop computer and 92.46 ms on the Jetson TX2, which confirms that ESSN is applicable both to desktop computers and to the Jetson TX2, an embedded system widely used in autonomous vehicles.
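The core idea described in the abstract, keeping the intermediate feature maps of sequential convolution blocks and concatenating them channel-wise so that several receptive-field sizes reach the classifier at once, can be illustrated with a shape-level sketch. This is plain Python, not the authors' implementation; the block channel counts and receptive-field growth per block are illustrative assumptions:

```python
# Shape-level sketch of residual concatenation of feature maps
# (assumption: not the ESSN code; real blocks are convolution layers).
# A feature map is represented only by its (channels, H, W) shape plus
# the receptive field (RF) accumulated up to that block.

def conv_block(shape, out_channels, rf_growth, rf):
    """One conv block: changes channel count, grows the receptive field,
    and (in this sketch) preserves spatial size via padding."""
    c, h, w = shape
    return (out_channels, h, w), rf + rf_growth

def essn_trunk(in_shape, block_channels):
    """Run sequential blocks, keep every intermediate (residual) feature
    map, then concatenate them along the channel axis so information at
    multiple receptive fields is preserved for later layers."""
    shape, rf = in_shape, 1
    kept = []
    for out_c in block_channels:
        shape, rf = conv_block(shape, out_c, rf_growth=2, rf=rf)
        kept.append((shape, rf))  # residual feature map and its RF
    # Channel-wise concatenation: channels add up, spatial dims must match.
    total_c = sum(s[0] for s, _ in kept)
    _, h, w = kept[-1][0]
    return (total_c, h, w), [rf for _, rf in kept]

fused, rfs = essn_trunk((3, 64, 64), [32, 64, 128])
print(fused)  # (224, 64, 64): 32 + 64 + 128 channels, spatial size kept
print(rfs)    # [3, 5, 7]: each successive block sees a wider context
```

The point of the concatenation is visible in the output: the fused map carries every block's channels side by side, so small-receptive-field detail and large-receptive-field context coexist in one tensor without extra loss or attention modules.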

Keywords