PeerJ Computer Science (Mar 2022)

Schema and content aware classification for predicting the sources containing an answer over corpus and knowledge graphs

  • Somayeh Asadifar,
  • Mohsen Kahani,
  • Saeedeh Shekarpour

DOI
https://doi.org/10.7717/peerj-cs.846
Journal volume & issue
Vol. 8
p. e846

Abstract

Read online Read online

Today, several attempts to manage question answering (QA) have been made in three separate areas: (1) knowledge-based (KB), (2) text-based and (3) hybrid, which takes advantage of both prior areas in extracting the response. On the other hand, in question answering on a large number of sources, source prediction to ensure scalability is very important. In this paper, a method for source prediction is presented in hybrid QA, involving several KB sources and a text source. In a few hybrid methods for source selection, including only one KB source in addition to the textual source, prioritization or heuristics have been used that have not been evaluated so far. Most methods available in source selection services are based on general metadata or triple instances. These methods are not suitable due to the unstructured source in hybrid QA. In this research, we need data details to predict the source. In addition, unlike KB federated methods that are based on triple instances, we use the behind idea of mediated schema to ensure data integration and scalability. Results from evaluations that consider word, triple, and question level information, show that the proposed approach performs well against a few benchmarks. In addition, the comparison of the proposed method with the existing approaches in hybrid and KB source prediction and also QA tasks has shown a significant reduction in response time and increased accuracy.

Keywords