مطالعات زبان‌‌ها و گویش‌های غرب ایران (Mar 2024)

Dialectometry of Linguistic Varieties Common in the Distance between the South of Hamadan Province to the North of Khuzestan Province: Using Levenshtein Distance Approach

  • Shiva Piryaee,
  • Aliyeh Kord Zafaranlu Kambuziya,
  • Arsalan Golfam,
  • Sahar Bahrami-Khorshid

DOI
https://doi.org/10.22126/jlw.2022.7776.1639
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 19

Abstract

Dialectometry is a computational, quantitative, and statistical approach that examines linguistic differences within a selected geographical area using specific methods and techniques. The present study uses a novel dialectometric approach to examine the linguistic distances between the varieties spoken in the area from the south of Hamadan province to the north of Khuzestan province, together with their regional distribution. These varieties are mostly Laki and Lori. The study combines library research with fieldwork. The distances between the equivalents of 100 words in 80 locations are measured using the Levenshtein distance, as implemented in the RuG/L04 software. After the linguistic distances are analyzed, the outputs are presented as interpretable maps, diagrams, and statistical analyses. The main results of the study are (1) the clustering of the language varieties and (2) the manner of their linguistic-geographical distribution along a linguistic continuum. The phonetic-lexical differences in the collected data also confirm the continuous nature of these varieties: the Laki and Lori Lorestani varieties are located at one end of the continuum, and the Lori Bakhtiari varieties at the other.

Introduction

Language is continuous by nature, which makes it difficult to establish clear-cut boundaries between language varieties. Traditional, qualitative approaches to dialectology often relied on subjective methods; because precise linguistic demarcations cannot be drawn between varieties, modern dialectometric methods are better suited to the task. Quantitative methods in dialectology differ from traditional methods.
Traditional approaches rely on native speakers' assumptions and linguistic features to determine the distribution and spread of language varieties in a particular geographic region. Their disadvantages include the lack of consistent overlap between isoglosses and the reliance on personal preferences and opinions in choosing bundles of isoglosses. Quantitative approaches, by contrast, offer digital classification of data, automatic measurement of distances and frequencies, digital mapping of outputs, and statistical analysis of linguistic data, so that a large amount of data can be analyzed independently of the individual researcher's preferences. Accordingly, this study carries out an aggregate analysis of a large body of linguistic data through quantitative methods, specifically linguistic distance measurement.

Outside Iran, the application of quantitative approaches in dialectology, along with various dialectometric techniques and the measurement of linguistic distances, can be traced back to the works of Séguy (1973) and Goebl (1982). Over the past ten years, dialectometry has become a focal point for numerous researchers specializing in Iranian languages and dialects. These studies have covered the varieties spoken in regions such as East Azarbaijan, West Azarbaijan, Hamadan, Mazandaran, Gorgan, Yazd, Ilam, Cherdaval, and Talesh. The preservation of a society's identity and cultural heritage is closely intertwined with the study of its languages and dialects, so researchers and linguists must prioritize methodical studies in this field.
Historically, studies on Iranian language varieties have been conducted in isolation, focusing solely on specific linguistic aspects such as phonology, morphology, or syntax. Numerous studies, employing both traditional and scientific methodologies, have explored different facets of the Lori and Laki varieties spoken in the region examined here. These studies are descriptive and qualitative, and lack any comparative or quantitative element. Few studies so far have compared different linguistic varieties comprehensively with the aim of establishing a systematic correlation between them; those that have done so focused on creating a linguistic atlas and developing a meticulous, scientific, and well-organized classification of the varieties. It is therefore necessary to carry out a dialectometric investigation that applies contemporary analytical-computational techniques to the varieties prevalent in Lorestan province and the adjacent provinces of Hamadan and Khuzestan. This geographical scope was selected because both Lori and Laki are widely used there. It is thus valuable to determine the linguistic-geographical distribution of these varieties across the region, without regard to geographical limitations.

Methodology

This study is a synchronic, descriptive-analytical investigation conducted with the RuG/L04 dialectometry and cartography software package. Initially, 100 lexical entries were gathered from 80 different locations.
The research database was sourced from three national dialectology projects in Lorestan and Khuzestan provinces, as well as from the researchers' own fieldwork. The participants were men and women aged 20 to 70, with an average educational attainment of a high school diploma. After the 8000 lexical forms were transcribed, the geographical coordinates of each location were determined using Google Earth. The Levenshtein distance algorithm, one of the aggregate analysis approaches, was then applied to the data to calculate the linguistic distance index. The resulting distance, obtained from the 80 × 80 matrix, represents a quantitative index within the range of natural numbers. In the next phase, various subprograms were used to classify the varieties, and the classifications are presented as diagrams, tables, and maps.

Results

The analysis showed that the varieties under investigation form a linguistic continuum without distinct boundaries, in contrast to the picture given by traditional dialectology. The continuum runs from the Laki and Lori varieties of Lorestan in the south of Hamadan province to the Bakhtiari varieties in the north of Khuzestan province. Continuity is also evident in the phonetic differences and alternations observed among the three primary varieties: Laki, Lori Bakhtiari, and Lori Lorestani.

Conclusion

The greater linguistic distance between the Laki and Lori varieties (equal to 2.5) confirms that they belong to different language families. By contrast, because they belong to a common language family (Southwestern Iranian), the two Lori varieties show a smaller linguistic distance (equal to 1.5) and greater linguistic similarity.
The Pearson correlation coefficient obtained for the linguistic varieties under study is r = 0.88, indicating a strong and statistically significant correlation. This supports the findings of the research and shows the effectiveness of the Levenshtein-distance dialectometric approach in identifying the primary linguistic clusters of the varieties examined and in confirming the continuous nature of language.
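
The distance computation described in the Methodology (word-by-word Levenshtein distances aggregated into a site-by-site matrix) can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the study: unit costs for insertion, deletion, and substitution, and a plain mean over words as the aggregate. RuG/L04 may weight or normalize distances differently. The site names and word forms are hypothetical toy strings, not data from the study.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit costs (an assumption, not the study's settings)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i symbols of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def aggregate_distances(data: dict[str, list[str]]) -> dict[str, dict[str, float]]:
    """Mean word-level distance for every pair of sites.

    `data` maps a site name to its transcriptions, one per concept,
    with the concepts in the same order at every site.
    """
    sites = sorted(data)
    matrix = {s: {t: 0.0 for t in sites} for s in sites}
    for s, t in combinations(sites, 2):
        pairs = list(zip(data[s], data[t]))
        mean = sum(levenshtein(x, y) for x, y in pairs) / len(pairs)
        matrix[s][t] = matrix[t][s] = mean  # the matrix is symmetric
    return matrix


# Hypothetical toy input: 3 sites, 2 concepts each (the study uses 80 sites, 100 words).
toy_data = {
    "site_A": ["kitten", "book"],
    "site_B": ["sitting", "books"],
    "site_C": ["kitten", "book"],
}

matrix = aggregate_distances(toy_data)
for site, row in matrix.items():
    print(site, row)
```

With 80 sites the same procedure yields an 80 × 80 symmetric matrix with a zero diagonal, which is the kind of input that clustering and mapping steps then operate on.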

Keywords