Schema and content aware classification for predicting the sources containing an answer over corpus and knowledge graphs

Somayeh Asadifar; Mohsen Kahani; Saeedeh Shekarpour

doi:10.7717/peerj-cs.846

PeerJ Computer Science (Mar 2022)

Affiliations

Somayeh Asadifar: Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, Khorasan Razavi, Iran
Mohsen Kahani: Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, Khorasan Razavi, Iran
Saeedeh Shekarpour: College of Arts and Sciences: Computer Science, University of Dayton, Dayton, Ohio, United States

DOI: https://doi.org/10.7717/peerj-cs.846
Journal volume & issue: Vol. 8, p. e846

Abstract

Question answering (QA) has so far been approached in three separate settings: (1) knowledge-based (KB), (2) text-based, and (3) hybrid, which draws on both of the former to extract the answer. When QA runs over a large number of sources, predicting which sources contain the answer is essential for scalability. This paper presents a source prediction method for hybrid QA that involves several KB sources and one text source. The few hybrid methods that perform source selection handle only one KB source alongside the textual source, and they rely on prioritization or heuristics that have not been evaluated so far. Most existing source selection services are based on general metadata or triple instances, which makes them unsuitable for hybrid QA because one of the sources is unstructured. Our approach instead relies on detailed data to predict the source. In addition, unlike federated KB methods that are based on triple instances, we use the idea behind a mediated schema to ensure data integration and scalability. Evaluations that consider word-, triple-, and question-level information show that the proposed approach performs well on a few benchmarks. Comparison of the proposed method with existing approaches to hybrid and KB source prediction, and on QA tasks, shows a significant reduction in response time and an increase in accuracy.

Published in PeerJ Computer Science

ISSN: 2376-5992 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://peerj.com/computer-science/
