IEEE Access (Jan 2020)
Web Pages Credibility Scores for Improving Accuracy of Answers in Web-Based Question Answering Systems
Abstract
Web-based question answering (QA) systems are effective in corroborating answers from multiple Web sources. However, the Web also contains false, fabricated, and biased information that can adversely affect the accuracy of answers in Web-based QA systems. Existing solutions focus primarily on finding relevant Web pages but either do not evaluate Web pages’ credibility or evaluate only two to three of the seven credibility categories. This research proposes a credibility assessment algorithm that scores credibility using seven categories (correctness, authority, currency, professionalism, popularity, impartiality, and quality), where each category consists of multiple factors. The credibility assessment module is added on top of an existing QA system to score answers based on the credibility of Web pages. The system ranks answers according to the credibility of the Web pages from which they were taken. Extensive quantitative tests were conducted on 211 factoid questions taken from the TREC QA data from 1999-2001. The findings show that the correctness, professionalism, impartiality, and quality categories significantly improved the accuracy of answers, whereas the authority, currency, and popularity categories played only a minor role. This research aims to enable researchers and experts to use the Web credibility assessment model to improve the accuracy of information systems. Credibility scores should assist Web users in selecting credible information, while also forcing content creators to focus more on publishing credible content.
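As a rough illustration of the idea described in the abstract, the following minimal Python sketch scores a Web page on the seven categories and ranks candidate answers by the credibility of their source pages. The abstract does not specify how category scores are combined or how scores from several pages are merged, so the weighted mean, the per-answer summation, and all names used here (`Candidate`, `credibility`, `rank_answers`, the equal default weights) are assumptions made for illustration only, not the authors' algorithm.

```python
from dataclasses import dataclass

# The seven credibility categories named in the abstract.
CATEGORIES = (
    "correctness", "authority", "currency", "professionalism",
    "popularity", "impartiality", "quality",
)


@dataclass
class Candidate:
    """A candidate answer and the category scores of the page it came from."""
    answer: str
    page_scores: dict  # category name -> score in [0, 1] (assumed scale)


def credibility(page_scores, weights=None):
    """Combine category scores into one page score.

    Assumption: a weighted mean with equal weights by default; missing
    categories count as 0. The paper's actual formula may differ.
    """
    weights = weights or {c: 1.0 for c in CATEGORIES}
    total_weight = sum(weights[c] for c in CATEGORIES)
    weighted = sum(weights[c] * page_scores.get(c, 0.0) for c in CATEGORIES)
    return weighted / total_weight


def rank_answers(candidates, weights=None):
    """Rank answers by the summed credibility of the pages supporting them.

    Assumption: scores of pages giving the same answer are added together.
    """
    totals = {}
    for cand in candidates:
        score = credibility(cand.page_scores, weights)
        totals[cand.answer] = totals.get(cand.answer, 0.0) + score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Passing a `weights` dictionary that lowers authority, currency, and popularity would be one way to reflect the finding that those categories played a minor role, although the specific weight values would need to come from the paper's experiments.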
Keywords