IEEE Access (Jan 2019)

A Multi-Level Author Name Disambiguation Algorithm

  • Siyang Zhang,
  • Xinhua E.,
  • Tian Pan

DOI
https://doi.org/10.1109/ACCESS.2019.2931592
Journal volume & issue
Vol. 7
pp. 104250 – 104257

Abstract

Read online

With the rapid development of information technology, the name ambiguity problem has become one of the primary issues in the fields of information retrieval, data mining, and scientific measurement. Name disambiguation is used to promote computer technology and big data information, which maps virtual relational networks to real social networks to solve the problem that the same name points to multiple entities. At present many literature search platforms launched their respective scholar system, name ambiguity problem will inevitably affect the precision of other information calculations, reduce the credibility of the system, and affect the information quality and content quality. Most work deals with this issue by using graph theory and clustering. However, the name disambiguation problem is still not well resolved. In this paper, we propose a multi-level name disambiguation algorithm. This algorithm is mainly based on the unsupervised algorithm, which combines hierarchical agglomerative clustering (HAC) and graph theory for disambiguating. The experimental results show that the proposed solution achieves clearly better performance (+17 ~ 25% in terms of F1-Measure) than several methods, including HAC and Graph.

Keywords