UAV Image Multi-Labeling with Data-Efficient Transformers

Laila Bashmal; Yakoub Bazi; Mohamad Mahmoud Al Rahhal; Haikel Alhichri; Naif Al Ajlan

doi:10.3390/app11093974

Applied Sciences (Apr 2021)

Affiliations

Laila Bashmal: Computer Engineering Department, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia
Yakoub Bazi: Computer Engineering Department, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia
Mohamad Mahmoud Al Rahhal: Applied Computer Science Department, College of Applied Computer Science, King Saud University, Riyadh 11543, Saudi Arabia
Haikel Alhichri: Computer Engineering Department, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia
Naif Al Ajlan: Computer Engineering Department, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia

DOI: https://doi.org/10.3390/app11093974
Journal volume & issue: Vol. 11, no. 9, p. 3974

Abstract

In this paper, we present an approach for the multi-label classification of remote sensing images based on data-efficient transformers. During the training phase, we generated a second view of each training image using data augmentation. Both the image and its augmented version were then reshaped into sequences of flattened patches and fed to the transformer encoder. The encoder extracts a compact feature representation from each image with the help of a self-attention mechanism, which can handle the global dependencies between different regions of the high-resolution aerial image. On top of the encoder, we mounted two classifiers: a token classifier and a distiller classifier. During training, we minimized a global loss consisting of two terms, each corresponding to one of the two classifiers. In the test phase, we took the average of the outputs of the two classifiers to obtain the final class labels. Experiments on two datasets acquired over the cities of Trento and Civezzano with a ground resolution of two centimeters demonstrated the effectiveness of the proposed model.
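
The three mechanical steps named in the abstract (patch flattening, a two-term loss, and averaging the two classifiers at test time) can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation, and it omits the transformer encoder and the data augmentation entirely. The abstract does not state the form of the loss or how averaged outputs become labels, so the sigmoid activation, the binary cross-entropy terms, the 0.5 threshold, and all function names here are assumptions made for illustration.

```python
import math


def image_to_patches(image, patch):
    """Split an H x W grid (list of rows) into flattened, non-overlapping patches."""
    h, w = len(image), len(image[0])
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    patches = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            # Flatten one patch row by row into a single vector.
            patches.append([image[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bce(logits, targets):
    """Mean binary cross-entropy over the labels (assumed loss form)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(logits)


def global_loss(token_logits, distill_logits, targets):
    """Two-term loss: one term per classifier (both terms assumed to be BCE)."""
    return bce(token_logits, targets) + bce(distill_logits, targets)


def predict_labels(token_logits, distill_logits, threshold=0.5):
    """Average the two classifiers' probabilities, then threshold (assumed rule)."""
    probs = [(sigmoid(a) + sigmoid(b)) / 2
             for a, b in zip(token_logits, distill_logits)]
    return [1 if p >= threshold else 0 for p in probs]


# Toy usage: a 4 x 4 "image" with 2 x 2 patches and three hypothetical labels.
image = [[4 * r + c for c in range(4)] for r in range(4)]
patches = image_to_patches(image, 2)
labels = predict_labels([2.0, -2.0, 0.5], [1.0, -1.0, -3.0])
loss = global_loss([0.0, 0.0], [0.0, 0.0], [1, 0])
```

In this toy run the 4 x 4 grid yields four patches of four values each, and the third label is rejected because the two classifiers disagree and their averaged probability falls below the assumed threshold.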

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
