Computers in Human Behavior: Artificial Humans (May 2025)

Perception and social evaluation of cloned and recorded voices: Effects of familiarity and self-relevance

  • Victor Rosi
  • Emma Soopramanien
  • Carolyn McGettigan

DOI: https://doi.org/10.1016/j.chbah.2025.100143
Journal volume & issue: Vol. 4, p. 100143

Abstract

Modern speech technologies enable the artificial replication, or cloning, of the human voice. In the present study, we investigated whether listeners' perception and social evaluation of state-of-the-art voice clones depend on whether the clone being heard is a replica of the self, a friend, or a total stranger. We recorded and cloned the voices of pairs of adult participants who were familiar with each other. Forty-seven of these experimental participants (and 47 unfamiliar controls) rated the Trustworthiness, Attractiveness, Competence, and Dominance of cloned and recorded samples of their own voice and their friend's voice. We observed that while familiar listeners rated clones as less trustworthy, attractive, and competent than recordings (or as similar to them), unfamiliar listeners showed the opposite profile, in which clones tended to be rated higher than recordings. Within this pattern, familiar listeners tended to prefer their friend's voice to their own, although perceived similarity of both self- and friend-voice clones to the original speaker identity predicted higher ratings on all trait scales. Overall, we find that familiar listeners' impressions are sensitive to the perceived accuracy and authenticity of cloning for voices they know well, while unfamiliar listeners tend to prefer the synthetic versions of those same voice identities. The latter observation may relate to the tendency of generative voice synthesis models to homogenise speaking accents and styles, such that they more closely approximate (preferred) norms.

Keywords