SRL‐ProtoNet: Self‐supervised representation learning for few‐shot remote sensing scene classification

Bing Liu; Hongwei Zhao; Jiao Li; Yansheng Gao; Jianrong Zhang

doi:10.1049/cvi2.12304

IET Computer Vision (Oct 2024)

Affiliations

Bing Liu: Department of College of Computer Science and Technology, Jilin University, Changchun, China
Hongwei Zhao: Department of College of Computer Science and Technology, Jilin University, Changchun, China
Jiao Li: Department of Jilin University Library, Jilin University, Changchun, China
Yansheng Gao: Department of the College of Computer Science and Engineering, Changchun University of Technology, Changchun, China
Jianrong Zhang: Department of College of Computer Science and Technology, Jilin University, Changchun, China

DOI: https://doi.org/10.1049/cvi2.12304
Journal volume & issue: Vol. 18, no. 7
pp. 1034 – 1042

Abstract

Deep learning methods classify remote sensing scenes well when a large amount of labelled data is available. However, it is challenging for deep learning based methods to generalise to classification tasks with limited data. Few‐shot learning allows neural networks to classify unseen categories from only a handful of labelled examples. Currently, episodic tasks based on meta‐learning can effectively complete few‐shot classification, and training an encoder that can conduct representation learning has become an important component of few‐shot learning. An end‐to‐end few‐shot remote sensing scene classification model based on ProtoNet and self‐supervised learning is proposed. The authors design the Pre‐prototype for a more discrete feature space and better integration with self‐supervised learning, and also propose the ProtoMixer for higher quality prototypes with a global receptive field. The authors' method outperforms the existing state‐of‐the‐art self‐supervised based methods on three widely used benchmark datasets: UC‐Merced, NWPU‐RESISC45, and AID. Compared with the previous state‐of‐the‐art performance, this method improves by 1.21%, 2.36%, and 0.84% on AID, UC‐Merced, and NWPU‐RESISC45, respectively, in the one‐shot setting, and by 0.85%, 2.79%, and 0.74% on the same datasets in the five‐shot setting.
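
For readers unfamiliar with the ProtoNet baseline that the abstract builds on, the following is a minimal sketch of one few‐shot episode in the standard prototypical‐network formulation: each class prototype is the mean of its support embeddings, and a query is assigned to the nearest prototype via a softmax over negative squared Euclidean distances. It is not the authors' model: the Pre‐prototype, the ProtoMixer, and the self‐supervised objective are not reproduced here, and the function names, the two‐dimensional embeddings, and the class labels are hypothetical values chosen only for illustration.

```python
import math
from collections import defaultdict


def class_prototypes(support_embeddings, support_labels):
    """Average the support embeddings of each class into one prototype."""
    grouped = defaultdict(list)
    for vec, label in zip(support_embeddings, support_labels):
        grouped[label].append(vec)
    # zip(*vecs) iterates over embedding dimensions, so each prototype
    # is the per-dimension mean of that class's support vectors.
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query_embedding, prototypes):
    """Return the predicted label and a softmax over negative distances."""
    labels = list(prototypes)
    logits = [-squared_distance(query_embedding, prototypes[l]) for l in labels]
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    probs = {l: e / total for l, e in zip(labels, exps)}
    return max(probs, key=probs.get), probs


# Toy 2-way 2-shot episode with made-up embeddings (assumption: in practice
# these vectors would come from a trained encoder, not be written by hand).
support = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
labels = ["farmland", "farmland", "harbor", "harbor"]

protos = class_prototypes(support, labels)
pred, probs = classify([1.0, 1.0], protos)
print(protos)
print(pred, probs)
```

In the one‐shot setting reported above, each class would contribute a single support embedding, so the prototype reduces to that embedding itself; in the five‐shot setting it is the mean of five.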

Published in IET Computer Vision

ISSN: 1751-9632 (Print); 1751-9640 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://ietresearch.onlinelibrary.wiley.com/journal/17519640
