Data in Brief (Oct 2021)

Nonsemantic word graphs of texts spanning ∼ 4500 years, including pre-literate Amerindian oral narratives

  • Natália Bezerra Mota,
  • Sylvia Pinheiro,
  • Antonio Guerreiro,
  • Mauro Copelli,
  • Sidarta Ribeiro

Journal volume & issue
Vol. 38
p. 107296

Abstract

Non-semantic word graphs obtained from oral reports are useful to describe cognitive decline in psychiatric conditions such as schizophrenia, as well as education-related gains in discourse structure during typical development. Here we provide non-semantic word graph attributes of texts spanning approximately 4500 years of history, as well as of pre-literate Amerindian oral narratives. The dataset comprises 707 literary texts representative of 9 different Afro-Eurasian traditions (Syro-Mesopotamian, Egyptian, Hinduist, Persian, Judeo-Christian, Greek-Roman, Medieval, Modern and Contemporary), and Amerindian narratives (N = 39) obtained either from a single ethnic group from South America (Kalapalo, N = 18) or from a mixed set of ethnic groups from South, Central and North America (non-Kalapalo, N = 21). The present article provides detailed information about each text or narrative, including measurements of four graph attributes of interest: number of nodes (lexical diversity), repeated edges (short-range recurrence), largest strongly connected component (long-range recurrence), and average shortest path (graph length).
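To make the four attributes concrete, the following is a minimal Python sketch of how they could be computed for a single text. It is an illustration under stated assumptions, not the authors' pipeline: the function name `word_graph_attributes` is hypothetical, tokenization is a plain lowercase whitespace split, each distinct word is treated as a node and each pair of consecutive words as a directed edge, repeated edges are counted as occurrences of an edge beyond the first, and the average shortest path is taken over ordered pairs of distinct nodes that are connected by a directed path. The published dataset may use different conventions on any of these points.

```python
from collections import Counter, deque


def word_graph_attributes(text):
    """Illustrative word-graph attributes (assumed definitions, see lead-in)."""
    # Assumption: naive tokenization by lowercase whitespace split.
    words = text.lower().split()
    nodes = set(words)

    # Directed edges between consecutive words, with occurrence counts.
    edges = Counter(zip(words, words[1:]))

    # Successor sets for graph traversal.
    succ = {w: set() for w in nodes}
    for a, b in edges:
        succ[a].add(b)

    def bfs(src):
        # Shortest directed path lengths from src to every reachable node.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in succ[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        return dist

    # All-pairs distances; fine for short texts, quadratic for long ones.
    dist = {w: bfs(w) for w in nodes}

    # Assumption: an edge seen k times contributes k - 1 repeated edges.
    repeated_edges = sum(count - 1 for count in edges.values())

    # Size of the largest strongly connected component: for each node u,
    # count the nodes v that u reaches and that also reach u.
    lsc = max(
        (sum(1 for v in dist[u] if u in dist[v]) for u in nodes),
        default=0,
    )

    # Assumption: average over ordered pairs of distinct, connected nodes.
    lengths = [d for u in nodes for v, d in dist[u].items() if v != u]
    asp = sum(lengths) / len(lengths) if lengths else 0.0

    return {
        "nodes": len(nodes),
        "repeated_edges": repeated_edges,
        "lsc": lsc,
        "asp": asp,
    }


if __name__ == "__main__":
    print(word_graph_attributes("the cat saw the dog and the cat ran"))
```

In the sample sentence, "the cat" occurs twice, which is what the repeated-edge count picks up (short-range recurrence), while the return to "the" after several intervening words closes cycles that enlarge the strongly connected component (long-range recurrence).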

Keywords