Content-Attribute Disentanglement for Generalized Zero-Shot Learning

Yoojin An; Sangyeon Kim; Yuxuan Liang; Roger Zimmermann; Dongho Kim; Jihie Kim

doi:10.1109/ACCESS.2022.3178800

IEEE Access (Jan 2022)

Affiliations

Yoojin An: Department of Artificial Intelligence, Dongguk University, Seoul, South Korea
Sangyeon Kim: NAVER WEBTOON AI, Seongnam, South Korea
Yuxuan Liang: School of Computing, National University of Singapore, Singapore
Roger Zimmermann: School of Computing, National University of Singapore, Singapore
Dongho Kim: Dongguk Institute of Convergence Education, Dongguk University, Seoul, South Korea
Jihie Kim: Department of Artificial Intelligence, Dongguk University, Seoul, South Korea

DOI: https://doi.org/10.1109/ACCESS.2022.3178800
Journal volume & issue: Vol. 10, pp. 58320–58331

Abstract

Humans can recognize or infer unseen classes of objects from descriptions of the classes' characteristics (semantic information). However, conventional deep learning models trained in a supervised manner cannot classify classes that were unseen during training. Hence, many studies have addressed generalized zero-shot learning (GZSL), which aims to produce systems that can recognize both seen and unseen classes by transferring learned knowledge from seen to unseen classes. Since seen and unseen classes share a common semantic space, extracting appropriate semantic information from images is essential for GZSL. In addition to semantic-related information (attributes), images also contain semantic-unrelated information (contents), which can degrade the classification performance of the model. Therefore, we propose a content-attribute disentanglement architecture that separates the content and attribute information of images. The proposed method comprises three major components: 1) a feature generation module for synthesizing unseen visual features; 2) a content-attribute disentanglement module for separating content and attribute codes from images; and 3) an attribute comparator module for measuring the compatibility between the attribute codes and the class prototypes, which act as the ground truth. Through extensive experiments, we show that our method achieves state-of-the-art or competitive results on four GZSL benchmark datasets. Our method also outperforms existing zero-shot learning methods on all of the datasets. Moreover, it achieves the best accuracy in a zero-shot retrieval task. Our code is available at https://github.com/anyoojin1996/CA-GZSL.
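
To make the roles of the disentanglement and comparator modules more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes fixed linear projections in place of the learned disentanglement network and swaps in plain cosine similarity for the learned attribute comparator; the feature generation module is omitted. All function names, weights, class names, and prototype values are hypothetical.

```python
import math


def linear(weights, x):
    """Apply a weight matrix (list of rows) to a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def disentangle(feature, w_content, w_attribute):
    """Split a visual feature into a content code and an attribute code.

    Assumption: two fixed linear maps stand in for the learned
    disentanglement module described in the abstract.
    """
    return linear(w_content, feature), linear(w_attribute, feature)


def cosine(u, v):
    """Cosine similarity, used here as a stand-in compatibility score."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(attribute_code, prototypes):
    """Score the attribute code against every class prototype.

    Only the attribute code is compared; the content code is ignored,
    which is the point of separating the two.
    """
    scores = {name: cosine(attribute_code, proto)
              for name, proto in prototypes.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy data: a 4-d visual feature whose last dimension is
# treated as semantic-unrelated content.
feature = [1.0, 0.0, 0.5, 2.0]
w_content = [[0.0, 0.0, 0.0, 1.0]]
w_attribute = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

# Hypothetical prototypes for one seen and one unseen class; in GZSL
# both live in the same semantic space.
prototypes = {
    "seen_class": [1.0, 0.0, 0.4],
    "unseen_class": [0.0, 1.0, 0.2],
}

content_code, attribute_code = disentangle(feature, w_content, w_attribute)
label, scores = classify(attribute_code, prototypes)
print(label, scores)
```

Because the prediction depends only on the attribute code, changing the content dimension of the toy feature leaves the predicted class unchanged in this sketch, which mirrors the motivation stated in the abstract.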

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
