IEEE Access (Jan 2021)

Improving BERT With Self-Supervised Attention

  • Yiren Chen
  • Xiaoyu Kou
  • Jiangang Bai
  • Yunhai Tong

DOI: https://doi.org/10.1109/ACCESS.2021.3122273
Journal volume & issue: Vol. 9, pp. 144129–144139

Abstract

One of the most popular paradigms for applying large pre-trained NLP models such as BERT is to fine-tune them on a smaller dataset. However, a challenge remains: the fine-tuned model often overfits on smaller datasets. A symptom of this phenomenon is that irrelevant or misleading words in a sentence, which are easy for human beings to recognize, can substantially degrade the performance of these fine-tuned BERT models. In this paper, we propose a novel technique, called Self-Supervised Attention (SSA), to address this generalization challenge. Specifically, SSA automatically generates weak, token-level attention labels iteratively by probing the fine-tuned model from the previous iteration. We investigate two different ways of integrating SSA into BERT and propose a hybrid approach to combine their benefits. Empirically, on a variety of public datasets, we demonstrate significant performance improvements with our SSA-enhanced BERT model.
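
To illustrate the probing idea described in the abstract, the sketch below shows one plausible way to generate weak token-level attention labels from a fine-tuned classifier: mask each token in turn and mark it as important when the model's confidence in its own prediction drops noticeably. This is a minimal illustration, not the authors' released implementation; the checkpoint name, the drop threshold, and the helper `weak_attention_labels` are assumptions for the example.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # assumed checkpoint; in practice a fine-tuned BERT from the previous iteration
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

@torch.no_grad()
def weak_attention_labels(sentence: str, drop_threshold: float = 0.05):
    """Return (tokens, 0/1 labels): 1 means masking the token hurts the model's prediction."""
    enc = tokenizer(sentence, return_tensors="pt")
    probs = torch.softmax(model(**enc).logits, dim=-1)[0]
    pred = int(probs.argmax())
    base_conf = float(probs[pred])

    input_ids = enc["input_ids"][0]
    labels = []
    for i in range(len(input_ids)):
        tok = tokenizer.convert_ids_to_tokens(int(input_ids[i]))
        if tok in tokenizer.all_special_tokens:  # skip [CLS] and [SEP]
            labels.append(0)
            continue
        masked = input_ids.clone()
        masked[i] = tokenizer.mask_token_id  # probe by masking token i
        out = model(input_ids=masked.unsqueeze(0), attention_mask=enc["attention_mask"])
        conf = float(torch.softmax(out.logits, dim=-1)[0, pred])
        labels.append(1 if base_conf - conf > drop_threshold else 0)

    return tokenizer.convert_ids_to_tokens(input_ids), labels

# Tokens whose masking reduces confidence get label 1; in the paper's setup such
# weak labels would then supervise an attention signal in the next training iteration.
tokens, labels = weak_attention_labels("the movie was surprisingly good")
print(list(zip(tokens, labels)))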

Keywords