IEEE Access (Jan 2023)

Any-Shot Learning From Multimodal Observations (ALMO)

  • Mehmet Aktukmak,
  • Yasin Yilmaz,
  • Alfred O. Hero

DOI: https://doi.org/10.1109/ACCESS.2023.3282932
Journal volume & issue: Vol. 11, pp. 61513–61524

Abstract

In this paper, we propose a framework (ALMO) for any-shot learning from multimodal observations. Using training data containing both objects (inputs) and class attributes (side information) from multiple modalities, ALMO embeds the high-dimensional data into a common stochastic latent space using modality-specific encoders. Subsequently, a non-parametric classifier is trained to predict the class labels of the objects. We perform probabilistic data fusion to combine the modalities in the stochastic latent space and learn class-conditional distributions for improved generalization and scalability. We formulate ALMO for both few-shot and zero-shot classification tasks, demonstrating significant improvement in recognition performance on the Omniglot and CUB-200 datasets compared to state-of-the-art baselines.
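As a rough illustration of the pipeline the abstract describes (modality-specific encoders into a shared stochastic latent space, probabilistic fusion, then a non-parametric classifier), the sketch below assumes PyTorch, Gaussian latent codes, and a precision-weighted product-of-Gaussians fusion rule; the names, dimensions, and fusion rule are illustrative assumptions, not the authors' implementation.

# Minimal sketch (not the authors' code): each modality is encoded into a
# Gaussian over a shared latent space, the Gaussians are fused by a
# precision-weighted product (an assumption; the paper only says
# "probabilistic data fusion"), and a non-parametric prototype classifier
# labels the fused latent code.
import torch
import torch.nn as nn


class ModalityEncoder(nn.Module):
    """Encodes one modality into the mean/log-variance of a shared latent Gaussian."""

    def __init__(self, in_dim: int, latent_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, latent_dim)
        self.logvar = nn.Linear(hidden, latent_dim)

    def forward(self, x):
        h = self.net(x)
        return self.mu(h), self.logvar(h)


def fuse_gaussians(mus, logvars):
    """Fuse per-modality Gaussians by a precision-weighted product (assumed rule)."""
    precisions = [torch.exp(-lv) for lv in logvars]
    total_prec = torch.stack(precisions).sum(dim=0)
    fused_var = 1.0 / total_prec
    fused_mu = fused_var * torch.stack(
        [p * m for p, m in zip(precisions, mus)]
    ).sum(dim=0)
    return fused_mu, fused_var


def prototype_classify(z, class_prototypes):
    """Non-parametric classifier: assign each latent code to its nearest class prototype."""
    dists = torch.cdist(z, class_prototypes)  # (batch, n_classes)
    return dists.argmin(dim=1)


if __name__ == "__main__":
    torch.manual_seed(0)
    img_enc = ModalityEncoder(in_dim=512, latent_dim=32)  # e.g. image features
    txt_enc = ModalityEncoder(in_dim=300, latent_dim=32)  # e.g. attribute/text features

    x_img, x_txt = torch.randn(4, 512), torch.randn(4, 300)
    mu_i, lv_i = img_enc(x_img)
    mu_t, lv_t = txt_enc(x_txt)

    z, _ = fuse_gaussians([mu_i, mu_t], [lv_i, lv_t])  # fused latent means
    prototypes = torch.randn(10, 32)  # stand-in class prototypes
    print(prototype_classify(z, prototypes))

In a few-shot setting the prototypes would be built from the embedded support examples of each class, while in the zero-shot setting they would come from the embedded class attributes; both are assumptions consistent with, but not stated by, the abstract.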

Keywords