Using artificial intelligence to explore sound symbolic expressions of gender in American English

Alexander Kilpatrick; Aleksandra Ćwiek

doi:10.7717/peerj-cs.1811

PeerJ Computer Science (Jan 2024)

Affiliations

Alexander Kilpatrick: International Communication, Nagoya University of Commerce and Business, Nagoya, Aichi, Japan
Aleksandra Ćwiek: Leibniz-Zentrum Allgemeine Sprachwissenschaft, Berlin, Germany

DOI: https://doi.org/10.7717/peerj-cs.1811
Journal volume & issue: Vol. 10, p. e1811

Abstract

This study investigates the extent to which gender can be inferred from the phonemes that make up given names and words in American English. Two extreme gradient boosted algorithms were constructed to classify words according to gender: one using a list of the most common given names (N∼1,000) in North America, and the other using the Glasgow Norms (N∼5,500), a corpus of nouns, verbs, adjectives, and adverbs, each of which has been assigned a psycholinguistic score reflecting how strongly it is associated with male or female behaviour. Both models report significant findings, but the model constructed using given names achieves greater accuracy despite being trained on a smaller dataset, suggesting that gender is expressed more robustly in given names than in other word classes. Feature importance was examined to determine which features contributed to the decision-making process. Feature importance scores revealed a general pattern across both models, but also showed that not all word classes express gender in the same way. Finally, the models were reconstructed and tested on the opposite dataset to determine whether they were useful in classifying samples from the other word class. The results showed that the models were less accurate when classifying opposite samples, suggesting that they are better suited to classifying words of the same class.
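
As a rough illustration of how words might be turned into phoneme-based features for a classifier of this kind, the sketch below builds per-word phoneme count vectors. The abstract does not specify the feature encoding, so the count representation, the ARPABET-style transcriptions, the toy names and labels, and all function names here are assumptions for illustration only, not the authors' pipeline.

```python
from collections import Counter

# Hypothetical toy data: ARPABET-style transcriptions with illustrative labels.
# These are NOT taken from the study's datasets.
TOY_NAMES = {
    "Emma": (["EH1", "M", "AH0"], "female"),
    "Jack": (["JH", "AE1", "K"], "male"),
    "Anna": (["AE1", "N", "AH0"], "female"),
}


def strip_stress(phone):
    """Drop trailing stress digits (0, 1, 2) from a vowel symbol."""
    return phone.rstrip("012")


def build_inventory(transcriptions):
    """Collect the sorted set of distinct phonemes across all transcriptions."""
    return sorted({strip_stress(p) for phones in transcriptions for p in phones})


def phoneme_counts(phones, inventory):
    """Return one count per inventory phoneme for a single word."""
    counts = Counter(strip_stress(p) for p in phones)
    return [counts[p] for p in inventory]


# Feature matrix X (one row per word) and binary labels y (1 = female, assumed coding).
inventory = build_inventory(phones for phones, _ in TOY_NAMES.values())
X = [phoneme_counts(phones, inventory) for phones, _ in TOY_NAMES.values()]
y = [1 if label == "female" else 0 for _, label in TOY_NAMES.values()]
```

Rows of `X` and the labels in `y` could then be passed to a gradient boosted classifier (for example, `XGBClassifier` from the xgboost package), whose feature importance scores would be indexed by the entries of `inventory`. Testing a model on the opposite dataset would require building both feature matrices from one shared inventory so that the columns line up.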

Published in PeerJ Computer Science

ISSN: 2376-5992 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://peerj.com/computer-science/
