IEEE Access (Jan 2019)

TriTag-NFPF: Knowledge Denoising for Chinese Encyclopedia based on Triple Tag-Constructed Potential Function

  • Ting Wang,
  • Hanzhe Gu,
  • Jie Li,
  • Jingyao Xie

DOI
https://doi.org/10.1109/ACCESS.2019.2933249
Journal volume & issue
Vol. 7
pp. 107413 – 107427

Abstract

This paper proposes a novel method for denoising knowledge in large-scale Chinese online encyclopedias. First, the initial similarity of the triples is computed by a method that integrates the Edit-Distance and TongYiCiCiLin similarity algorithms. Second, a novel nuclear field-like potential function over the Infobox knowledge triples is constructed from the semantic tags of Chinese encyclopedia entries. Finally, large-scale clustering and denoising of the knowledge triples are performed with the improved potential function, in order to minimize the influence of massive repetition and ambiguity in the Chinese open encyclopedia Knowledge Base (KB). The proposed method solves the problems of semantic duplication, ambiguity, and inappropriate classification of knowledge triples that arise when constructing Chinese KBs. The experimental results indicate that the open-domain Chinese encyclopedia KBs constructed by the proposed method outperform those built by state-of-the-art methods.
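
As an illustration of the first step only, the sketch below combines a normalized edit-distance similarity with a synonym-dictionary similarity. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the two scores are integrated, so the weighted sum and the weight `alpha` are assumptions, and `TOY_SYNONYM_GROUPS` is a hypothetical stand-in for a TongYiCiCiLin lookup (the real thesaurus uses hierarchical codes that allow graded similarity, whereas this toy version is binary).

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Map edit distance to [0, 1]; identical strings score 1.0."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


# Hypothetical toy data: word -> set of synonym-group codes.
# A real TongYiCiCiLin lookup would use its hierarchical codes instead.
TOY_SYNONYM_GROUPS = {
    "出生地": {"G1"},
    "出生地点": {"G1"},
    "籍贯": {"G1"},
    "职业": {"G2"},
}


def synonym_similarity(a: str, b: str, groups=TOY_SYNONYM_GROUPS) -> float:
    """Binary score: 1.0 if the words are equal or share a synonym group."""
    if a == b:
        return 1.0
    shared = groups.get(a, set()) & groups.get(b, set())
    return 1.0 if shared else 0.0


def combined_similarity(a: str, b: str, alpha: float = 0.5) -> float:
    """Assumed integration: weighted sum of surface and synonym similarity."""
    return alpha * edit_similarity(a, b) + (1 - alpha) * synonym_similarity(a, b)


if __name__ == "__main__":
    for pair in [("出生地", "出生地点"), ("出生地", "籍贯"), ("出生地", "职业")]:
        print(pair, combined_similarity(*pair))
```

In this toy setup, attribute names that look different on the surface but share a synonym group (such as 出生地 and 籍贯) still receive a non-zero score, which is the motivation for pairing a string metric with a thesaurus-based one.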

Keywords