IEEE Access (Jan 2021)

A Personality Mining System for German Twitter Posts With Global Vectors Word Embedding

  • Henning Usselmann,
  • Rangina Ahmad,
  • Dominik Siemon

DOI
https://doi.org/10.1109/ACCESS.2021.3130937
Journal volume & issue
Vol. 9
pp. 165576 – 165610

Abstract

People’s personality influences their behaviors, attitudes, beliefs, and feelings, so many scientific studies benefit from easy ways of measuring it. By analyzing a person’s written text, it is possible to derive their Big Five personality traits. One approach is to apply the unsupervised learning algorithm Global Vectors Word Embedding (or Representation), abbreviated GloVe, to English Twitter posts. The overall objective of our research is to show that this algorithm can also be applied to German Twitter posts. To this end, we built a framework for training and applying machine learning models for personality prediction. We tested whether a working prediction model for English Twitter users can be adapted for German users, which could reduce the effort of collecting training data. We evaluated our models against a personality survey with a sample of German users. Adapting an existing model does not perform as well as expected, but it helps prepare the framework for higher volumes of data. The final model is therefore based on the evaluation data, which results in acceptable performance. Via a web application (https://www.miping.de), anyone can easily retrieve personality scores for any public German Twitter user. Altogether, we show that GloVe is suitable for predicting personality from German-language text. The published framework and source code allow for independent improvements to and easy application of the trained model, so that scientific studies and other applications, e.g., chatbots, can easily incorporate personality data.
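
As an illustration of the general idea of predicting a trait from word embeddings, the following minimal Python sketch turns a user's posts into one feature vector by averaging word vectors and then applies a linear score. Everything in it is an assumption made for illustration: the abstract does not state how the published framework aggregates vectors or which regression model it uses, the three-dimensional `TOY_VECTORS` only stand in for pretrained GloVe embeddings, and the names `tokenize`, `embed_user`, and `predict_trait` are hypothetical.

```python
import re
from typing import Dict, List

# Toy 3-dimensional vectors standing in for pretrained GloVe embeddings.
# The values are made up for illustration only.
TOY_VECTORS: Dict[str, List[float]] = {
    "ich": [0.1, 0.3, -0.2],
    "liebe": [0.7, 0.1, 0.4],
    "musik": [0.4, -0.5, 0.6],
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep runs of letters, including German umlauts and ß."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def embed_user(tweets: List[str], vectors: Dict[str, List[float]], dim: int) -> List[float]:
    """Average the vectors of all known tokens across a user's tweets.

    Tokens without a vector are skipped. If no token is known,
    a zero vector of length `dim` is returned.
    """
    total = [0.0] * dim
    count = 0
    for tweet in tweets:
        for token in tokenize(tweet):
            vec = vectors.get(token)
            if vec is None:
                continue
            for i in range(dim):
                total[i] += vec[i]
            count += 1
    if count == 0:
        return total
    return [value / count for value in total]


def predict_trait(features: List[float], weights: List[float], bias: float) -> float:
    """Linear score for a single trait (assumed model form, not the authors' model)."""
    return bias + sum(f * w for f, w in zip(features, weights))


if __name__ == "__main__":
    features = embed_user(["Ich liebe Musik!"], TOY_VECTORS, dim=3)
    # Hypothetical weights and bias for one trait; a real model would learn these from survey data.
    print(predict_trait(features, weights=[1.0, 0.5, -0.5], bias=3.0))
```

In such a setup, one set of weights would be fitted per Big Five trait, with survey-based personality scores as the regression targets.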

Keywords