Narrative Question Answering with Cutting-Edge Open-Domain QA Techniques: A Comprehensive Study

Xiangyang Mou; Chenghao Yang; Mo Yu; Bingsheng Yao; Xiaoxiao Guo; Saloni Potdar; Hui Su

doi:10.1162/tacl_a_00411

Transactions of the Association for Computational Linguistics (Jan 2021)

Affiliations

Xiangyang Mou: Rensselaer Polytechnic Institute & IBM, United States
Chenghao Yang: Rensselaer Polytechnic Institute & IBM, United States
Mo Yu: Rensselaer Polytechnic Institute & IBM, United States
Bingsheng Yao: Rensselaer Polytechnic Institute & IBM, United States
Xiaoxiao Guo: Rensselaer Polytechnic Institute & IBM, United States
Saloni Potdar: Rensselaer Polytechnic Institute & IBM, United States
Hui Su: Rensselaer Polytechnic Institute & IBM, United States

DOI: https://doi.org/10.1162/tacl_a_00411
Journal volume & issue: Vol. 9
pp. 1032 – 1046

Abstract

Recent advancements in open-domain question answering (ODQA), that is, finding answers from a large open-domain corpus such as Wikipedia, have led to human-level performance on many datasets. However, progress in QA over book stories (Book QA) lags behind, despite its similar task formulation to ODQA. This work provides a comprehensive and quantitative analysis of the difficulty of Book QA: (1) We benchmark the research on the NarrativeQA dataset through extensive experiments with cutting-edge ODQA techniques. This quantifies the challenges Book QA poses and advances the published state of the art with a ∼7% absolute improvement in ROUGE-L. (2) We further analyze the detailed challenges in Book QA through human studies. Our findings indicate that event-centric questions dominate this task, which exemplifies the inability of existing QA models to handle event-oriented scenarios.

Published in Transactions of the Association for Computational Linguistics

ISSN: 2307-387X (Online)
Publisher: The MIT Press
Country of publisher: United States
LCC subjects: Language and Literature: Philology. Linguistics: Computational linguistics. Natural language processing
Website: https://direct.mit.edu/tacl
