Scientific Data (Jun 2024)

A Data Set of Synthetic Utterances for Computational Personality Analysis

  • Yair Neuman,
  • Yochai Cohen

DOI
https://doi.org/10.1038/s41597-024-03488-6
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 12

Abstract

Read online

Abstract The computational analysis of human personality has mainly focused on the Big Five personality theory, and the psychodynamic approach is almost nonexistent despite its rich theoretical grounding and relevance to various tasks. Here, we provide a data set of 4972 synthetic utterances corresponding with five personality dimensions described by the psychodynamic approach: depressive, obsessive, paranoid, narcissistic, and anti-social psychopathic. The utterances have been generated through AI with a deep theoretical orientation that motivated the design of prompts for GPT-4. The dataset has been validated through 14 tests, and it may be relevant for the computational study of human personality and the design of authentic persona in digital domains, from gaming to the artistic generation of movie characters.