Diting: An Author Disambiguation Method Based on Network Representation Learning

Liwen Peng; Siqi Shen; Jun Xu; Yongquan Fu; Dongsheng Li; Adele Lu Jia

doi:10.1109/ACCESS.2019.2942477

IEEE Access (Jan 2019)

Affiliations

Liwen Peng: School of Computer, National University of Defense Technology, Changsha, China
Siqi Shen: School of Computer, National University of Defense Technology, Changsha, China
Jun Xu: Ant Financial Services Group, Hangzhou, China
Yongquan Fu: School of Computer, National University of Defense Technology, Changsha, China
Dongsheng Li: School of Computer, National University of Defense Technology, Changsha, China
Adele Lu Jia: College of Information and Electrical Engineering, China Agricultural University, Beijing, China

DOI: https://doi.org/10.1109/ACCESS.2019.2942477
Journal volume & issue: Vol. 7
pp. 135539 – 135555

Abstract

Disambiguating the names of persons is important in many scenarios. In this work, we propose an unsupervised method, Diting, and a semi-supervised method, Diting++, for author disambiguation. In Diting, we learn a low-dimensional vector to represent each paper in networks that are formed by connecting papers through multiple types of relationships (such as co-authorship). During representation learning, we focus on maximizing the gap between positive edges and negative edges. Further, we propose a clustering algorithm that associates papers with their real-life authors. To make full use of authorship information, which is easy to obtain from authors’ homepages, we design Diting++ to improve name disambiguation performance. Diting++ uses the authorship information listed on authors’ homepages to construct label networks, and uses a network representation learning method to learn paper representations from the label networks together with the other networks. Diting++ then uses a semi-supervised clustering method to partition the learned paper representations into disjoint groups, each of which belongs to a distinct author. By making use of the label information, the clustering method places papers written by the same author in the same group and papers written by different authors in different groups. Through extensive experiments, we show that our methods perform significantly better than state-of-the-art author disambiguation methods.
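
The abstract does not state the loss function, so the following is only a rough illustration of what "maximizing the gap between positive edges and negative edges" could look like, not the authors' implementation. It assumes a dot-product edge score, a hinge (margin) loss, and uniformly sampled negative papers; the function name `train_embeddings` and all of its parameters are hypothetical.

```python
import numpy as np

def train_embeddings(num_papers, pos_edges, dim=8, margin=1.0,
                     lr=0.1, epochs=200, seed=0):
    """Learn one vector per paper with a margin-based ranking loss.

    Assumption (not from the paper): an edge (i, j) is scored by the
    dot product emb[i] . emb[j], and we minimise
    max(0, margin - score(i, j) + score(i, k)) for a sampled paper k
    that is not linked to i.
    """
    rng = np.random.default_rng(seed)
    emb = rng.normal(scale=0.1, size=(num_papers, dim))
    linked = {frozenset(e) for e in pos_edges}

    for _ in range(epochs):
        for i, j in pos_edges:
            # Sample a candidate negative endpoint; skip it if it is
            # the anchor itself or actually linked to the anchor.
            k = int(rng.integers(num_papers))
            if k == i or frozenset((i, k)) in linked:
                continue
            s_pos = emb[i] @ emb[j]
            s_neg = emb[i] @ emb[k]
            if margin - s_pos + s_neg > 0:  # margin violated
                # Gradients of the hinge term, taken before any update.
                g_i = emb[k] - emb[j]
                g_j = -emb[i].copy()
                g_k = emb[i].copy()
                emb[i] -= lr * g_i
                emb[j] -= lr * g_j
                emb[k] -= lr * g_k
    return emb

# Toy network: papers 0-2 and papers 3-5 form two co-author cliques.
pos_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
emb = train_embeddings(6, pos_edges)
```

The learned vectors could then be passed to any clustering routine to group papers by author; the clustering algorithms actually used by Diting and Diting++ are described in the full paper, not here.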

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
