Selecting Optimal Combination of Data Channels for Semantic Segmentation in City Information Modelling (CIM)

Yuanzhi Cai; Hong Huang; Kaiyang Wang; Cheng Zhang; Lei Fan; Fangyu Guo

doi:10.3390/rs13071367

Remote Sensing (Apr 2021)

Affiliations

Yuanzhi Cai: Department of Civil Engineering, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
Hong Huang: Department of Civil Engineering, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
Kaiyang Wang: Department of Civil Engineering, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
Cheng Zhang: Department of Civil Engineering, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
Lei Fan: Department of Civil Engineering, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
Fangyu Guo: Department of Civil Engineering, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China

DOI: https://doi.org/10.3390/rs13071367
Journal volume & issue: Vol. 13, no. 7, p. 1367

Abstract

Over the last decade, a 3D reconstruction technique has been developed to present the latest as-is information for various objects and to build city information models. Meanwhile, deep learning based approaches are employed to add semantic information to the models. Studies have shown that the accuracy of the model can be improved by combining multiple data channels (e.g., XYZ, Intensity, D, and RGB). Nevertheless, redundant data channels in large-scale datasets may cause high computational cost and long processing time. Few researchers have addressed the question of which combination of channels is optimal in terms of overall accuracy (OA) and mean intersection over union (mIoU). Therefore, a framework is proposed to explore an efficient data fusion approach for semantic segmentation by selecting an optimal combination of data channels. In the framework, a total of 13 channel combinations are investigated for data pre-processing, and an encoder-to-decoder structure is utilized for network permutations. A case study is carried out to investigate the efficiency of the proposed approach by adopting a city-level benchmark dataset and applying nine networks. It is found that the combination of IRGB channels provides the best OA performance, while the IRGBD channels provide the best mIoU performance.
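
The two metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the toy labels, and the choice to skip classes absent from both ground truth and prediction are assumptions made for illustration. It uses the standard definitions, where OA is the fraction of correctly labelled samples and mIoU is the per-class intersection over union averaged across classes.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Count label pairs; rows are ground truth, columns are predictions."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Encode each (true, predicted) pair as a single index, then count.
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def overall_accuracy(cm):
    """Fraction of samples whose predicted label matches the ground truth."""
    return np.trace(cm) / cm.sum()

def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    present = union > 0  # assumption: ignore classes absent everywhere
    return (tp[present] / union[present]).mean()

# Hypothetical toy labels for three classes (not data from the paper).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
oa = overall_accuracy(cm)
miou = mean_iou(cm)
print(f"OA = {oa:.3f}, mIoU = {miou:.3f}")
```

Because OA weights every sample equally while mIoU weights every class equally, the two metrics can rank channel combinations differently, which is consistent with the abstract reporting different best combinations for OA and mIoU.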

Published in Remote Sensing

ISSN: 2072-4292 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science
Website: http://www.mdpi.com/journal/remotesensing/
