Assessing the effects of convolutional neural network architectural factors on model performance for remote sensing image classification: An in-depth investigation

Feihao Chen; Jin Yeu Tsou

doi:10.1016/j.jag.2022.102865

International Journal of Applied Earth Observations and Geoinformation (Aug 2022)

Affiliations

Feihao Chen: School of Architecture, The Chinese University of Hong Kong, Hong Kong Special Administrative Region
Jin Yeu Tsou: School of Architecture, The Chinese University of Hong Kong, Hong Kong Special Administrative Region; Department of Architecture and Civil Engineering, City University of Hong Kong, Hong Kong Special Administrative Region; Corresponding author at: Department of Architecture and Civil Engineering, City University of Hong Kong, Hong Kong Special Administrative Region.

DOI: https://doi.org/10.1016/j.jag.2022.102865
Journal volume & issue: Vol. 112, p. 102865

Abstract

Although deep learning has been applied fruitfully in remote sensing (RS), systematic research that explores model performance and guides the design of new convolutional neural network (CNN) architectures is still lacking. This subject is of great concern to researchers and practitioners in the field because existing CNN structures may not be adequate for complex RS scenarios. In this study, an empirical formula of CNN model performance is presented based on a literature review. Extensive experiments are conducted on six public RS data sets to investigate the influence of three architectural factors, namely, network depth, width, and cardinality. Two types of CNN architectures, VGG and ResNet, are adopted as baselines. We monitor and visualize the data distributions and gradients of the CNNs used in order to guard against vanishing or exploding gradients. Grad-CAM is adopted to open the black box of CNNs and to illustrate the effects of adjusting the architectural factors. Our experiments indicate that (1) increasing the network depth benefits the semantic feature learning capacity of a CNN model, but excessive depth leads to a decline in overall accuracy; (2) a partial widening strategy is effective because it can improve model performance while maintaining network complexity; and (3) network cardinality has considerable potential for balancing model efficiency and accuracy. Suggestions for improving CNN model performance and developing new structures are summarized in this paper.
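As a rough illustration of why cardinality bears on the trade-off between efficiency and accuracy, the sketch below counts the weights of a single 3x3 convolution as the number of groups grows while the layer width stays fixed. It uses the standard parameter count for grouped convolutions. The function name `conv2d_params`, the width of 128 channels, and the group counts are assumptions chosen for illustration, not configurations taken from the paper.

```python
def conv2d_params(in_channels, out_channels, kernel_size, groups=1, bias=False):
    """Count the learnable parameters of a 2-D convolution with square kernels.

    With `groups` (the cardinality), each output channel only connects to
    in_channels / groups input channels, so the weight count shrinks by that factor.
    """
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    weights = (in_channels // groups) * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


if __name__ == "__main__":
    width = 128  # hypothetical layer width, for illustration only
    for cardinality in (1, 2, 4, 8, 32):
        count = conv2d_params(width, width, 3, groups=cardinality)
        print(f"cardinality={cardinality:>2}  parameters={count}")
```

At a fixed width, the weight count falls in proportion to the cardinality, which is the budget a grouped design can spend on extra width or depth. Whether that translates into higher accuracy on RS data is an empirical question of the kind the study addresses.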

Published in International Journal of Applied Earth Observations and Geoinformation

ISSN: 1569-8432 (Print); 1872-826X (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Geography. Anthropology. Recreation: Physical geography; Geography. Anthropology. Recreation: Environmental sciences
Website: https://www.journals.elsevier.com/international-journal-of-applied-earth-observation-and-geoinformation
