IEEE Access (Jan 2018)

Latent Dirichlet Truth Discovery: Separating Trustworthy and Untrustworthy Components in Data Sources

  • Liyan Zhang,
  • Guo-Jun Qi,
  • Dong Zhang,
  • Jinhui Tang

DOI
https://doi.org/10.1109/ACCESS.2017.2780182
Journal volume & issue
Vol. 6
pp. 1741 – 1752

Abstract

The discovery of truth is a critical step toward effective information and knowledge utilization, especially in Web services, social media networks, and sensor networks. Typically, a set of sources with varying reliability make claims about a set of objects, and the goal is to jointly discover the true fact for each object and the trustworthy degree of each source. In this paper, we propose a latent Dirichlet truth (LDT) discovery model to address this problem. It defines a random field over all possible configurations of the trustworthy degrees of sources and facts, and the most probable configuration is inferred by a maximum a posteriori criterion over the observed claims. We note that a typical source is usually made up of mixed trustworthy and untrustworthy components, since it can make true or false claims on different objects. While most existing algorithms do not attempt to separate the untrustworthy component from the trustworthy one in each source, the proposed model explicitly identifies the untrustworthy component in each source. This makes the LDT model more capable of separating the trustworthy and untrustworthy components, which in turn improves the accuracy of truth discovery. Experiments on real data sets show competitive results compared with existing algorithms.
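
To make the problem setting concrete, the following is a minimal sketch of a generic iterative truth-discovery baseline: trust-weighted voting alternated with re-estimation of each source's trust. It is not the LDT model described in the abstract; it does not define a random field or separate components within a source. The function name, the (source, object, value) claim format, the smoothing constants, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict

def iterative_truth_discovery(claims, n_iters=20, prior=0.8):
    """Generic baseline (NOT the LDT model): alternate between
    trust-weighted voting and trust re-estimation.

    claims: list of (source, object, value) triples (hypothetical format).
    Returns (truths, trust): estimated value per object, trust per source.
    """
    sources = {s for s, _, _ in claims}
    trust = {s: prior for s in sources}  # assumed initial trust for every source
    truths = {}
    for _ in range(n_iters):
        # Step 1: each object takes the value with the largest total trust.
        votes = defaultdict(lambda: defaultdict(float))
        for s, o, v in claims:
            votes[o][v] += trust[s]
        truths = {o: max(vals, key=vals.get) for o, vals in votes.items()}

        # Step 2: a source's trust is the smoothed fraction of its claims
        # that agree with the current truth estimates (add-one smoothing).
        hits, total = defaultdict(int), defaultdict(int)
        for s, o, v in claims:
            total[s] += 1
            hits[s] += int(truths[o] == v)
        trust = {s: (hits[s] + 1) / (total[s] + 2) for s in sources}
    return truths, trust

# Toy data (invented): source A agrees with the majority on all three
# objects, B on two of them, and C on one.
claims = [
    ("A", "x", "1"), ("A", "y", "2"), ("A", "z", "3"),
    ("B", "x", "1"), ("B", "y", "2"), ("B", "z", "9"),
    ("C", "x", "7"), ("C", "y", "8"), ("C", "z", "3"),
]
truths, trust = iterative_truth_discovery(claims)
```

On this toy input the baseline assigns a single trust score to each source, so a source such as B or C is simply ranked lower overall. The abstract's point is that such a source mixes trustworthy and untrustworthy claims, which a single per-source score cannot separate.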

Keywords