IEEE Access (Jan 2023)
RoBERTa-CoA: RoBERTa-Based Effective Finetuning Method Using Co-Attention
Abstract
In the field of natural language processing (NLP), artificial intelligence (AI) technology has been used to solve various problems, such as text classification, similarity measurement, chatbots, machine translation, and machine reading comprehension. Deep learning, in which machines learn patterns directly from data, has enabled significant advances on complex, rule-intensive NLP tasks. Machine reading comprehension is an NLP task in which a machine must understand a question and a paragraph and find the answer within the paragraph. In 2019, bidirectional encoder representations from transformers (BERT) and the robustly optimized BERT pretraining approach (RoBERTa) were introduced; both are pretrained and then fine-tuned for downstream tasks, and they led to significant advances. RoBERTa outperformed BERT in training speed and accuracy by increasing the pretraining data and batch sizes, employing dynamic masking, and eliminating the next sentence prediction task. For machine reading comprehension, RoBERTa takes the question and the paragraph together as a single input. However, this simultaneous input method suffers from the attention separate representation (ASP) problem, in which the attention between the question and the paragraph spreads widely rather than concentrating on keywords. This study proposes two methods to address the ASP problem. First, the existing question–paragraph input format is changed to three independent inputs (the question, the paragraph, and the question–paragraph pair), and the corresponding RoBERTa outputs are concatenated. Second, the concatenated matrix is transformed into two matrices, and a machine reading comprehension algorithm using co-attention is proposed. An ablation study was conducted to evaluate and analyze the model’s performance, comprehension, and design efficiency. According to the experimental results, the proposed method improved the exact match (EM) score by 0.9% and the F1 score by 1.0% compared with existing methods. Consequently, attention concentration and co-attention enhanced the learning performance, and the proposed model outperformed existing models.
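To make the co-attention step concrete, the following is a minimal sketch, not the authors' implementation: it assumes a question matrix Q and a paragraph matrix P (standing in for the two matrices obtained from the concatenated RoBERTa outputs), a hidden size of 768, and a generic scaled co-attention computed in PyTorch; the function name co_attention and all shapes are illustrative assumptions.

# Minimal co-attention sketch (illustrative; not the paper's exact architecture).
import torch
import torch.nn.functional as F

def co_attention(Q, P):
    # Q: (batch, m, d) question token representations
    # P: (batch, n, d) paragraph token representations
    d = Q.size(-1)
    # Affinity between every question token and every paragraph token.
    affinity = torch.bmm(Q, P.transpose(1, 2)) / d ** 0.5            # (batch, m, n)
    # Question-to-paragraph attention: each question token attends over the paragraph.
    q2p = torch.bmm(F.softmax(affinity, dim=-1), P)                  # (batch, m, d)
    # Paragraph-to-question attention: each paragraph token attends over the question.
    p2q = torch.bmm(F.softmax(affinity.transpose(1, 2), dim=-1), Q)  # (batch, n, d)
    return q2p, p2q

# Example usage with random tensors standing in for RoBERTa outputs.
Q = torch.randn(2, 16, 768)    # batch of 2 questions, 16 tokens each
P = torch.randn(2, 128, 768)   # batch of 2 paragraphs, 128 tokens each
q2p, p2q = co_attention(Q, P)

The two attended representations can then be combined with the original matrices and passed to an answer-span prediction head; the exact combination used in RoBERTa-CoA is described in the body of the paper.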
Keywords