QA4PRF: A Question Answering Based Framework for Pseudo Relevance Feedback

Handong Ma; Jiawei Hou; Chenxu Zhu; Weinan Zhang; Ruiming Tang; Jincai Lai; Jieming Zhu; Xiuqiang He; Yong Yu

doi:10.1109/ACCESS.2021.3118600

IEEE Access (Jan 2021)

Affiliations

Handong Ma: Department of Computer Science, Shanghai Jiao Tong University, Shanghai, China
Jiawei Hou: Department of Computer Science, Shanghai Jiao Tong University, Shanghai, China
Chenxu Zhu: Department of Computer Science, Shanghai Jiao Tong University, Shanghai, China
Weinan Zhang: Department of Computer Science, Shanghai Jiao Tong University, Shanghai, China
Ruiming Tang: Recommendation and Search Project Team, Huawei Noah’s Ark Laboratory, Shenzhen, China
Jincai Lai: Recommendation and Search Project Team, Huawei Noah’s Ark Laboratory, Shenzhen, China
Jieming Zhu: Recommendation and Search Project Team, Huawei Noah’s Ark Laboratory, Shenzhen, China
Xiuqiang He: Recommendation and Search Project Team, Huawei Noah’s Ark Laboratory, Shenzhen, China
Yong Yu: Department of Computer Science, Shanghai Jiao Tong University, Shanghai, China

Journal volume & issue: Vol. 9, pp. 139303–139314

Abstract

Pseudo relevance feedback (PRF) automatically expands a query using the top-retrieved documents, so that the query better represents the user's information need and the search results improve. Previous PRF methods mainly select expansion terms that occur frequently in the top-retrieved documents or that are semantically similar to the original query. However, existing PRF methods rarely attempt to understand the content of the documents, which is essential for query expansion that reveals the user's information need. In this paper, we propose QA4PRF, a QA-based framework for PRF that exploits the contextual information in documents. In this framework, we formulate PRF as a QA task: the query and each top-retrieved document play the roles of question and context in the corresponding QA system, and the objective is to use contextual information to find suitable terms for expanding the original query, which play a role similar to answers in a QA task. In addition, an attention-based pointer network is built to understand the content of the top-retrieved documents and to select the terms that better represent the original query. We also show that incorporating traditional supervised learning methods, such as LambdaRank, to integrate PRF information further improves the performance of QA4PRF. Extensive experiments on three real-world datasets demonstrate that QA4PRF significantly outperforms the state-of-the-art methods.
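
For orientation, the frequency-based style of PRF that the abstract contrasts against can be sketched in a few lines. This is only an illustration of that baseline idea, not QA4PRF itself, which selects terms with an attention-based pointer network. The function names, the small stopword list, and the `num_terms` parameter are assumptions made for this sketch and do not come from the paper.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list used only for this sketch.
STOPWORDS = frozenset(
    {"the", "a", "an", "of", "to", "in", "and", "is", "for", "on", "with", "at"}
)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def expand_query(query, top_docs, num_terms=3):
    """Append the most frequent new terms from the top-retrieved documents.

    Terms already in the query and stopwords are skipped. Ties in frequency
    are broken alphabetically so the result is deterministic.
    """
    query_terms = tokenize(query)
    seen = set(query_terms)
    counts = Counter()
    for doc in top_docs:
        for term in tokenize(doc):
            if term not in seen and term not in STOPWORDS:
                counts[term] += 1
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    expansion = [term for term, _ in ranked[:num_terms]]
    return query_terms + expansion


if __name__ == "__main__":
    docs = [
        "The jaguar is a big cat",
        "A jaguar cat hunts at night",
        "big cat speed records",
    ]
    print(expand_query("jaguar speed", docs, num_terms=2))
```

A selector of this kind only counts tokens and ignores the context in which a term appears, which is the limitation the abstract says QA4PRF is designed to address.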

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
