Applied Sciences (Dec 2020)

Deep Unsupervised Embedding for Remote Sensing Image Retrieval Using Textual Cues

  • Mohamad M. Al Rahhal,
  • Yakoub Bazi,
  • Taghreed Abdullah,
  • Mohamed L. Mekhalfi,
  • Mansour Zuair

DOI: https://doi.org/10.3390/app10248931
Journal volume & issue: Vol. 10, no. 24, p. 8931

Abstract

Compared to image-image retrieval, text-image retrieval has been less investigated in the remote sensing community, possibly because of the complexity of appropriately tying textual data to the corresponding visual representations. Moreover, a single image may be described by multiple sentences, depending on the perception of the human labeler and the structure of the language used, which magnifies the complexity even further. In this paper, we propose an unsupervised method for text-image retrieval in remote sensing imagery. In this method, image representations are obtained via visual Big Transfer (BiT) models, while textual descriptions are encoded via a bidirectional Long Short-Term Memory (Bi-LSTM) network. The training of the proposed retrieval architecture is optimized with an unsupervised embedding loss, which aims to make the features of an image closest to those of its corresponding textual description and different from the features of other images, and vice versa. To demonstrate the performance of the proposed architecture, experiments are performed on two datasets, yielding plausible text/image retrieval outcomes.
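
The abstract outlines a dual-encoder design: a BiT backbone for images, a Bi-LSTM for sentences, and an embedding loss that pulls matched image-text pairs together while pushing mismatched ones apart. The following PyTorch sketch shows one plausible reading of that setup; the layer sizes, the mean-pooling of LSTM states, and the bidirectional triplet-style hinge loss are assumptions, since the abstract does not give the exact formulation, and the BiT backbone is stood in for by any pretrained feature extractor.

```python
# Minimal sketch of the dual-encoder retrieval idea described in the
# abstract. Dimensions, pooling, and the loss are assumptions; the
# paper's exact architecture and loss are not specified in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextEncoder(nn.Module):
    """Encodes a tokenized sentence with a bidirectional LSTM and
    projects it into a shared image-text embedding space."""
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=256, out_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        self.proj = nn.Linear(2 * hidden_dim, out_dim)

    def forward(self, tokens):                  # tokens: (B, T) int64
        h, _ = self.bilstm(self.embed(tokens))  # (B, T, 2 * hidden_dim)
        h = h.mean(dim=1)                       # mean-pool over time steps
        return F.normalize(self.proj(h), dim=-1)

class ImageHead(nn.Module):
    """Projects backbone features (e.g., from a pretrained BiT model)
    into the same shared embedding space."""
    def __init__(self, feat_dim=2048, out_dim=512):
        super().__init__()
        self.proj = nn.Linear(feat_dim, out_dim)

    def forward(self, feats):                   # feats: (B, feat_dim)
        return F.normalize(self.proj(feats), dim=-1)

def embedding_loss(img, txt, margin=0.2):
    """Bidirectional triplet-style loss: each image should be closer to
    its own caption than to any other caption in the batch, and vice
    versa. One plausible reading of the 'unsupervised embedding loss'
    named in the abstract, not the paper's exact formulation."""
    sim = img @ txt.t()                              # (B, B) cosine similarities
    pos = sim.diag().unsqueeze(1)                    # matched pairs on diagonal
    cost_txt = (margin + sim - pos).clamp(min=0)     # image -> wrong caption
    cost_img = (margin + sim - pos.t()).clamp(min=0) # caption -> wrong image
    off_diag = 1 - torch.eye(sim.size(0), device=sim.device)
    return ((cost_txt + cost_img) * off_diag).sum() / sim.size(0)
```

At retrieval time, both encoders map queries and candidates into the shared space, so ranking reduces to cosine similarity between the normalized embeddings: a textual query retrieves the images whose embeddings lie closest to it, and vice versa.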

Keywords