Complex & Intelligent Systems (Nov 2024)
Unveiling user identity across social media: a novel unsupervised gradient semantic model for accurate and efficient user alignment
Abstract
The field of social network analysis has identified User Alignment (UA) as a crucial area of investigation. The objective of UA is to identify and connect user accounts across diverse social networks, even when there are no explicit interconnections. UA plays a pivotal role in synthesising coherent user profiles and delving into the intricacies of user behaviour across platforms. However, traditional approaches have encountered limitations. Singular embedding techniques fall short of fully capturing the semantic essence of user profile attributes. Furthermore, classification-based embedding methods lack definitive criteria for categorisation, constraining both the efficacy and applicability of these models. This paper presents a novel unsupervised Gradient Semantic Model for User Alignment (GSMUA) for identifying common user identities across social networks. GSMUA categorises user profile information into weak, sub, and strong gradients according to the semantic intensity of attributes. The different gradient semantic levels direct attention to literal features, semantic features, or a combination of both during feature extraction, thereby achieving a full semantic representation of user attributes. For strongly semantic long texts, GSMUA employs Named Entity Recognition (NER) technology to address the inefficiency of handling such texts directly. Furthermore, GSMUA compensates for missing user profile attributes by utilising profile information from user neighbours, thereby reducing the negative impact of missing attributes on model performance. Extensive experiments conducted on four pairs of real datasets demonstrate the superiority of our approach. Compared to the most effective previously developed unsupervised methods, GSMUA improves hit-precision by 5.32% to 12.17%; compared to supervised methods, the improvements range from 0.71% to 11.79%.
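To make the gradient-based idea concrete, the sketch below illustrates how profile attributes might be routed to literal, semantic, or blended comparisons according to an assigned semantic gradient. The attribute-to-gradient mapping, the blending weights, and the token-overlap stand-in for a semantic encoder are all illustrative assumptions, not the paper's actual GSMUA implementation.

```python
from difflib import SequenceMatcher

# Hypothetical gradient assignment: attribute names and gradient labels are
# illustrative only, not the paper's exact categorisation scheme.
GRADIENTS = {
    "username": "weak",   # weak semantics: compare literally
    "location": "sub",    # sub semantics: combine literal and semantic cues
    "bio": "strong",      # strong semantics: compare by meaning (the paper uses NER here)
}

def literal_sim(a: str, b: str) -> float:
    """Character-level similarity for weakly semantic attributes."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def semantic_sim(a: str, b: str) -> float:
    """Token-overlap stand-in for a real semantic encoder / NER pipeline."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def attribute_sim(name: str, a: str, b: str) -> float:
    """Route the comparison according to the attribute's semantic gradient."""
    gradient = GRADIENTS.get(name, "sub")
    if gradient == "weak":
        return literal_sim(a, b)
    if gradient == "strong":
        return semantic_sim(a, b)
    # sub gradient: blend literal and semantic signals (weights are illustrative)
    return 0.5 * literal_sim(a, b) + 0.5 * semantic_sim(a, b)

def profile_sim(p1: dict, p2: dict) -> float:
    """Average similarity over attributes present in both profiles; in GSMUA,
    missing attributes would instead be filled in from neighbour profiles."""
    shared = [k for k in p1 if k in p2 and p1[k] and p2[k]]
    if not shared:
        return 0.0
    return sum(attribute_sim(k, p1[k], p2[k]) for k in shared) / len(shared)

if __name__ == "__main__":
    u1 = {"username": "jdoe_42", "location": "New York, USA",
          "bio": "Data scientist working on social network analysis"}
    u2 = {"username": "j.doe42", "location": "NYC",
          "bio": "Researcher in social network analysis and data science"}
    print(f"profile similarity: {profile_sim(u1, u2):.3f}")
```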
Keywords