IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2024)

MLFMNet: A Multilevel Feature Mining Network for Semantic Segmentation on Aerial Images

  • Xinyu Wei,
  • Lei Rao,
  • Guangyu Fan,
  • Niansheng Chen

DOI
https://doi.org/10.1109/JSTARS.2024.3452250
Journal volume & issue
Vol. 17
pp. 16165 – 16179

Abstract

Semantic segmentation of aerial images is crucial in many practical applications, including traffic management, search tasks, and urban planning. However, the unique shooting angles of aerial images make accurate segmentation difficult: object scales vary widely, objects appear deformed, and small targets have unclear features. To address these challenges, we propose MLFMNet, a multilevel feature mining network based on an encoder–decoder architecture that mines and integrates multilevel feature information in aerial images to improve segmentation accuracy and robustness. MLFMNet uses skip connections to obtain hierarchical feature representations from the encoder. The proposed decoder then progressively fuses and reconstructs these features through a learnable fusion module and a feature reconstruction module, thereby achieving accurate semantic segmentation. To handle large size variations and deformations in objects, we design an irregular pyramid receptive field module, embedded at the bottom of the encoder, that captures receptive fields from multiple feature vectors and thus further mines abstract features. Moreover, to address the low segmentation and detection accuracy for small targets, a fine-grained feature mining module is embedded at the bottom of the decoder to capture spatial detail features. In particular, MLFMNet-B achieves an mIoU of 70.8%, ranking fourth on the official leaderboard of the UAVid test set.
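For readers unfamiliar with the mIoU metric quoted above, the following is a minimal sketch of how mean intersection-over-union is commonly computed from flattened label arrays. It is not the authors' code or the UAVid evaluation script; the function name `mean_iou` and its arguments are hypothetical, and benchmark evaluations typically accumulate the counts over the whole test set rather than per image.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Average per-class IoU over classes present in pred or target.

    Hypothetical helper for illustration; labels are integer class ids
    in the range [0, num_classes).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where pred == target == c
    pred_count = [0] * num_classes    # pixels predicted as class c
    target_count = [0] * num_classes  # pixels labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and label
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy 4-pixel example with two classes.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.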

Keywords