Improving Machine Reading Comprehension with Multi-Task Learning and Self-Training

Jianquan Ouyang; Mengen Fu

doi:10.3390/math10030310

Mathematics (Jan 2022)

Affiliations

Jianquan Ouyang: College of Computer Cyberspace Security, Xiangtan University, Xiangtan 411105, China
Mengen Fu: College of Computer Cyberspace Security, Xiangtan University, Xiangtan 411105, China

DOI: https://doi.org/10.3390/math10030310
Journal volume & issue: Vol. 10, no. 3, p. 310

Abstract

Machine Reading Comprehension (MRC) is an AI challenge that requires machines to determine the correct answer to a question based on a given passage. Extractive MRC requires extracting an answer span from the passage, as in span extraction tasks. In contrast, non-extractive MRC infers answers from the content of the reference passage, and includes Yes/No question answering and unanswerable questions. Because the two types of MRC task differ, researchers usually work on one type in isolation, but real-life applications often require models that can handle several types of task at once. To meet this requirement, we construct a multi-task fusion training reading comprehension model based on the BERT pre-training model. The model uses BERT to obtain contextual representations, which are shared by three downstream sub-modules for span extraction, Yes/No question answering, and unanswerable questions. We then fuse the outputs of the three sub-modules into a new span extraction output and train the whole model globally with a fused cross-entropy loss function. Because our model requires a large amount of labeled training data, which is often expensive to obtain or unavailable in many tasks, we additionally use self-training in the training phase to generate pseudo-labeled training data, improving the model's accuracy and generalization performance. We evaluated the model on the SQuAD2.0 and CAIL2019 datasets. The experiments show that our model can efficiently handle different tasks. We achieved 83.2 EM and 86.7 F1 on the SQuAD2.0 dataset and 73.0 EM and 85.3 F1 on the CAIL2019 dataset.
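
The self-training step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a toy one-dimensional nearest-centroid classifier stands in for the BERT-based MRC model, and the function names, the confidence threshold, and the number of rounds are all hypothetical choices made for illustration. The sketch only shows the general loop: train on labeled data, predict on unlabeled data, keep confident predictions as pseudo-labels, and retrain.

```python
import math


def fit_centroids(examples):
    """Toy stand-in for model training: the mean feature value per label."""
    sums, counts = {}, {}
    for x, label in examples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_with_confidence(centroids, x):
    """Return (label, confidence) from normalized exp(-distance) scores."""
    scores = {label: math.exp(-abs(x - c)) for label, c in centroids.items()}
    total = sum(scores.values())
    best = max(scores, key=scores.get)
    return best, scores[best] / total


def self_train(labeled, unlabeled, threshold=0.9, rounds=3):
    """Generic self-training loop (threshold and rounds are assumptions).

    labeled:   list of (feature, label) pairs
    unlabeled: list of features without labels
    Returns the final model, the enlarged training set, and the
    unlabeled examples that never received a confident pseudo-label.
    """
    train = list(labeled)
    pool = list(unlabeled)
    model = fit_centroids(train)
    for _ in range(rounds):
        confident, remaining = [], []
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            if conf >= threshold:
                confident.append((x, label))  # accept as pseudo-label
            else:
                remaining.append(x)           # leave for a later round
        if not confident:
            break  # nothing new to learn from
        train.extend(confident)
        pool = remaining
        model = fit_centroids(train)  # retrain on labeled + pseudo-labeled
    return model, train, pool
```

For example, calling `self_train([(0.0, "no"), (10.0, "yes")], [0.5, 9.5, 5.0])` pseudo-labels the two points close to a centroid and leaves the ambiguous midpoint `5.0` unlabeled. Filtering by confidence is one common way to limit the noise that pseudo-labels introduce; whether the paper uses this exact criterion is not stated in the abstract.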

Published in Mathematics

ISSN: 2227-7390 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics
Website: http://www.mdpi.com/journal/mathematics
