Language Bias-Driven Self-Knowledge Distillation with Generalization Uncertainty for Reducing Language Bias in Visual Question Answering

Desen Yuan; Lei Wang; Qingbo Wu; Fanman Meng; King Ngi Ngan; Linfeng Xu

doi:10.3390/app12157588

Applied Sciences (Jul 2022)

Affiliations

Desen Yuan: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Xiyuan West Road 2006, Chengdu 611731, China
Lei Wang: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Xiyuan West Road 2006, Chengdu 611731, China
Qingbo Wu: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Xiyuan West Road 2006, Chengdu 611731, China
Fanman Meng: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Xiyuan West Road 2006, Chengdu 611731, China
King Ngi Ngan: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Xiyuan West Road 2006, Chengdu 611731, China
Linfeng Xu: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Xiyuan West Road 2006, Chengdu 611731, China

DOI: https://doi.org/10.3390/app12157588
Journal volume & issue: Vol. 12, no. 15, p. 7588

Abstract

To answer questions, visual question answering (VQA) systems rely on language bias and ignore the information in the images, which harms their generalization. Mainstream debiasing methods focus on removing the language prior at inference time. However, the image samples are distributed unevenly in the dataset, so the feature sets acquired by the model often cannot cover the features (views) of the tail samples, and language bias arises as a result. This paper proposes a language bias-driven self-knowledge distillation framework that implicitly learns multi-view feature sets so as to reduce language bias. Moreover, to measure the performance of student models, the authors use a generalization uncertainty index that helps student models learn unbiased visual knowledge and forces them to focus more on questions that cannot be answered from language bias alone. In addition, the authors analyze the proposed method theoretically and verify the positive correlation between generalization uncertainty and expected test error. The method's effectiveness is validated on the VQA-CP v2, VQA-CP v1 and VQA v2 datasets through extensive ablation experiments.
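
The abstract does not give the loss function, so the following is only an illustrative sketch of the general idea it describes, not the authors' formulation. It assumes a standard temperature-softened distillation term (KL divergence between teacher and student answer distributions) and a hypothetical per-question weight that shrinks when a question-only model is already confident in the correct answer. The names `bias_weight`, `weighted_distillation_loss`, `bias_logits` and the weighting rule itself are assumptions introduced here for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def bias_weight(bias_logits, answer_index):
    """Hypothetical weight (an assumption, not from the paper).

    Close to 0 when a question-only model already assigns high probability
    to the correct answer, close to 1 when it does not.
    """
    return 1.0 - softmax(bias_logits)[answer_index]


def weighted_distillation_loss(student_logits, teacher_logits, bias_logits,
                               answer_index, temperature=2.0):
    """Distillation term scaled by the hypothetical bias-based weight."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return bias_weight(bias_logits, answer_index) * kl_divergence(teacher, student)
```

Under these assumptions, a question whose answer is predictable from the question text alone contributes little to the distillation term, while a question that the language-only model gets wrong contributes close to its full KL value.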

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
