IEEE Access (Jan 2022)

User Demographic Prediction Based on the Fusion of Mobile and Survey Data

  • Xingyu Chen,
  • Ye Guo,
  • Honglei Xu,
  • Hongyan Yan,
  • Lin Lin

DOI
https://doi.org/10.1109/ACCESS.2022.3215732
Journal volume & issue
Vol. 10
pp. 111507 – 111527

Abstract

User demographic prediction is a critical step in constructing user profiles and is important for understanding users’ characteristics and attributes. Most prior work on this problem either used only single-source data or relied on hard matching to combine multi-source data. Both approaches can discard a large amount of data and information, which may reduce model accuracy and limit the scenarios in which a model can be applied. To address these problems, this paper proposes a framework for user demographic prediction based on mobile and survey data, and presents a Deep Structured Fusion Model (DSFM) that uses neural networks with attention mechanisms to fuse data by comparing user similarity across the two heterogeneous datasets. We evaluate the framework and the fusion model on a real-world mobile dataset with almost one billion users, using a survey dataset containing questionnaire results from 29,809 users as an additional information source for predicting users’ age and gender. Compared with the best baseline model, our framework increases prediction accuracy by up to 3.23% for gender and 5.21% for age.
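
To illustrate the general idea of fusing two datasets by user similarity rather than by hard matching, the following is a minimal sketch and not the DSFM architecture itself. It assumes that mobile and survey users have already been embedded in a shared feature space; the function names (`cosine`, `softmax`, `soft_match`), the `temperature` parameter, and the toy vectors are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def softmax(scores, temperature=1.0):
    """Numerically stable softmax; a lower temperature gives sharper weights."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_match(mobile_vec, survey_keys, survey_values, temperature=0.1):
    """Attach survey information to one mobile user by attention over survey users.

    Assumption: mobile_vec and every entry of survey_keys live in the same
    feature space, so that their similarity is meaningful.
    """
    scores = [cosine(mobile_vec, key) for key in survey_keys]
    weights = softmax(scores, temperature)
    dim = len(survey_values[0])
    # Similarity-weighted average of the survey answers.
    context = [sum(w * v[d] for w, v in zip(weights, survey_values))
               for d in range(dim)]
    # Fused representation: original mobile features plus survey context.
    return list(mobile_vec) + context, weights

# Toy data (hypothetical): one mobile user, two survey respondents.
mobile_user = [1.0, 0.0]
survey_keys = [[1.0, 0.0], [0.0, 1.0]]
survey_values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
fused, weights = soft_match(mobile_user, survey_keys, survey_values)
```

In this sketch every mobile user receives some survey information, weighted by similarity, whereas hard matching would keep only users that appear in both datasets.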

Keywords