IEEE Access (Jan 2019)
A Multi-Level Author Name Disambiguation Algorithm
Abstract
With the rapid development of information technology, name ambiguity has become one of the primary issues in information retrieval, data mining, and scientific measurement. Name disambiguation draws on computing and big data techniques to map virtual relational networks onto real social networks, resolving cases in which the same name refers to multiple entities. Many literature search platforms have now launched their own scholar systems, where name ambiguity inevitably degrades the precision of other computed information, reduces the credibility of the system, and lowers information and content quality. Most existing work addresses this issue with graph theory and clustering, yet the name disambiguation problem is still not well resolved. In this paper, we propose a multi-level name disambiguation algorithm. The algorithm is mainly unsupervised and combines hierarchical agglomerative clustering (HAC) with graph theory for disambiguation. Experimental results show that the proposed solution achieves clearly better performance (+17% to 25% in F1-measure) than several other methods, including HAC and Graph.
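To make the idea of combining a graph step with HAC concrete, the following is a minimal illustrative sketch, not the algorithm proposed in the paper. It assumes each paper under an ambiguous name is a record with hypothetical `coauthors` and `keywords` sets, groups papers by connected components of a shared co-author graph, and then merges those groups with average-linkage agglomerative clustering on Jaccard keyword similarity. The function names, the similarity measure, and the threshold value are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def coauthor_components(papers):
    """Level 1 (graph step, assumed): group papers linked by a shared co-author."""
    parent = list(range(len(papers)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(papers)), 2):
        if papers[i]["coauthors"] & papers[j]["coauthors"]:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(papers)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def average_link(c1, c2, papers):
    """Mean pairwise keyword similarity between two clusters of paper indices."""
    sims = [jaccard(papers[i]["keywords"], papers[j]["keywords"])
            for i in c1 for j in c2]
    return sum(sims) / len(sims)


def hac_merge(clusters, papers, threshold):
    """Level 2 (HAC step, assumed): merge the most similar pair of clusters
    until the best average-link similarity falls below the threshold."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > 1:
        best_sim, a, b = max(
            (average_link(clusters[a], clusters[b], papers), a, b)
            for a, b in combinations(range(len(clusters)), 2)
        )
        if best_sim < threshold:
            break
        clusters[a].extend(clusters[b])
        del clusters[b]  # safe: a < b, so index a is unaffected
    return clusters


def disambiguate(papers, threshold=0.3):
    """Return clusters of paper indices, one cluster per inferred author."""
    merged = hac_merge(coauthor_components(papers), papers, threshold)
    return sorted(sorted(c) for c in merged)


# Hypothetical toy records, all published under the same ambiguous name.
sample_papers = [
    {"coauthors": {"Li"}, "keywords": {"clustering", "graphs"}},
    {"coauthors": {"Li", "Chen"}, "keywords": {"retrieval"}},
    {"coauthors": {"Zhao"}, "keywords": {"clustering", "graphs", "networks"}},
    {"coauthors": {"Kim"}, "keywords": {"proteins"}},
]
clusters = disambiguate(sample_papers)
```

In this toy input, papers 0 and 1 are joined by the graph step through the shared co-author, paper 2 is attached by the clustering step through keyword overlap, and paper 3 stays separate. Raising the threshold makes the second level more conservative.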
Keywords