Symmetry (Nov 2020)

Identification and Prediction of Human Behavior through Mining of Unstructured Textual Data

  • Mohammad Reza Davahli,
  • Waldemar Karwowski,
  • Edgar Gutierrez,
  • Krzysztof Fiok,
  • Grzegorz Wróbel,
  • Redha Taiar,
  • Tareq Ahram

DOI
https://doi.org/10.3390/sym12111902
Journal volume & issue
Vol. 12, no. 11
p. 1902

Abstract

The identification of human behavior can provide useful information across multiple job spectra. Recent advances in applying data-based approaches to social sciences have increased the feasibility of modeling human behavior. In particular, studying human behavior by analyzing unstructured textual data has recently received considerable attention because of the abundance of textual data. The main objective of the present study was to discuss the primary methods for identifying and predicting human behavior through the mining of unstructured textual data. Of the 823 articles analyzed, 87 met the predefined inclusion criteria and were included in the literature review. Our results show that the included articles could be symmetrically classified into two groups. The first group of articles attempted to identify the leading indicators of human behavior in unstructured textual data. In this group, the data-based approaches had three main components: (1) collecting self-reported survey data, (2) collecting data from social media and extracting data features, and (3) applying correlation analysis to evaluate the relationship between two sets of data. In contrast, the second group focused on the accuracy of data-based approaches for predicting human behavior. In this group, the data-based approaches could be categorized into (1) approaches based on labeled unstructured textual data and (2) approaches based on unlabeled unstructured textual data. The review provides a comprehensive insight into unstructured textual data mining to identify and predict human behavior and personality traits.

Keywords