MAPM: PolSAR Image Classification with Masked Autoencoder Based on Position Prediction and Memory Tokens

Jianlong Wang; Yingying Li; Dou Quan; Beibei Hou; Zhensong Wang; Haifeng Sima; Junding Sun

doi:10.3390/rs16224280

Remote Sensing (Nov 2024)

Affiliations

Jianlong Wang: School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo 454003, China
Yingying Li: School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo 454003, China
Dou Quan: Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, School of Artificial Intelligence, Xidian University, Xi’an 710071, China
Beibei Hou: School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo 454003, China
Zhensong Wang: School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo 454003, China
Haifeng Sima: School of Software, Henan Polytechnic University, Jiaozuo 454003, China
Junding Sun: School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo 454003, China

DOI: https://doi.org/10.3390/rs16224280
Journal volume & issue: Vol. 16, no. 22, p. 4280

Abstract

Deep learning methods have shown significant advantages in polarimetric synthetic aperture radar (PolSAR) image classification. However, their performance relies on a large amount of labeled data. To alleviate this problem, this paper proposes a PolSAR image classification method with a Masked Autoencoder based on Position prediction and Memory tokens (MAPM). First, MAPM designs a transformer-based Masked Autoencoder (MAE) for pre-training, which boosts feature learning and improves classification results when only a limited number of labeled samples is available. Second, since the transformer is relatively insensitive to the order of the input tokens, a position prediction strategy is introduced in the encoder part of the MAE, which helps it capture subtle differences and discriminate complex, blurry boundaries in PolSAR images. Third, learnable memory tokens are added in the fine-tuning stage to improve classification performance. In addition, L1 loss is used for MAE optimization to enhance the robustness of the model to outliers in PolSAR data. Experimental results show the effectiveness and advantages of the proposed MAPM in PolSAR image classification. Specifically, MAPM achieves performance gains of about 1% in classification accuracy compared with existing methods.
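
The abstract's point about L1 loss and outliers can be illustrated with a small NumPy sketch. This is not the authors' implementation: the patch count, patch size, the 0.75 mask ratio, the restriction of the loss to masked patches, and the helper names `random_mask`, `masked_l1_loss`, and `masked_l2_loss` are all assumptions made for illustration. It only shows how a single extreme value in the reconstruction target inflates a squared-error loss far more than an absolute-error loss.

```python
import numpy as np

def random_mask(num_patches, mask_ratio, rng):
    """Return a boolean array where True marks a masked (hidden) patch."""
    num_masked = int(round(num_patches * mask_ratio))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.permutation(num_patches)[:num_masked]] = True
    return mask

def masked_l1_loss(pred, target, mask):
    """Mean absolute error over masked patches only (assumed loss scope)."""
    return np.abs(pred[mask] - target[mask]).mean()

def masked_l2_loss(pred, target, mask):
    """Mean squared error over masked patches only, for comparison."""
    return ((pred[mask] - target[mask]) ** 2).mean()

rng = np.random.default_rng(0)
target = rng.normal(size=(16, 8))   # 16 patches with 8 values each (hypothetical sizes)
pred = target + 0.1                 # a reconstruction that is uniformly off by 0.1
mask = random_mask(16, 0.75, rng)   # 0.75 is an assumed mask ratio

# Inject one extreme value into the first masked patch of the target.
outlier_target = target.copy()
outlier_target[mask.argmax(), 0] += 100.0

l1_clean = masked_l1_loss(pred, target, mask)
l2_clean = masked_l2_loss(pred, target, mask)
l1_outlier = masked_l1_loss(pred, outlier_target, mask)
l2_outlier = masked_l2_loss(pred, outlier_target, mask)

print("L1 increase:", l1_outlier - l1_clean)
print("L2 increase:", l2_outlier - l2_clean)
```

Because the absolute error grows linearly with the size of the outlier while the squared error grows quadratically, the L1 loss changes far less in this toy setting, which is consistent with the robustness argument made in the abstract.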

Published in Remote Sensing

ISSN: 2072-4292 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science
Website: http://www.mdpi.com/journal/remotesensing/
