Linguistic Typology at the Crossroads (Dec 2022)

Predicting grammatical gender in Nakh languages: Three methods compared

  • Jesse Wichers Schreur,
  • Marc Allassonnière-Tang,
  • Kate Bellamy,
  • Neige Rochant

DOI
https://doi.org/10.6092/issn.2785-0943/14545
Journal volume & issue
Vol. 2, no. 2
pp. 93 – 126

Abstract

Read online

The Nakh languages Chechen and Tsova-Tush each have a five-valued gender system: masculine, feminine, and three “neuter” genders named for their singular agreement forms: B, D and J. Gender assignment in languages is generally analysed as being dependent on both forms and semantics (e.g. Corbett, 1991), with semantics typically prevailing over form (e.g. Bellamy & Wichers Schreur, 2021, Allassonnière-Tang et al., 2021). Most previous studies have considered only binary or tripartite gender systems possessing masculine, feminine, and neuter values. The five-valued system of Nakh thus represents an innovative and insightful case study for analysing gender assignment. In this paper we build on the existing qualitative linguistic analyses of gender assignment in Tsova-Tush (Wichers Schreur, 2021) and apply three machine-learning methods to investigate the weight of form and semantics in predicting grammatical gender in Chechen and Tsova-Tush. The results show that while both form and semantics are helpful for predicting grammatical gender in Nakh, semantics is dominant, which supports findings from existing literature (Allassonnière-Tang, Brown & Fedden, 2021). However, the results also show that the coded semantic information could be further fine-grained to improve the accuracy of the predictions (see also Plaster et al., 2013). In addition, we discuss the implications of the output for our understanding of language-internal and family-internal processes of language change, including how loanwords are integrated from Russian, a three-gender language.

Keywords