Developing Lexico-Semantic Relations of Saraiki Nouns: A Corpus-Based Study

Musarat Nazeer; Musarrat Azher; Azhar Pervaiz; Iqra Yasmeen

University of Chitral Journal of Linguistics and Literature (Jan 2024)

Affiliations

Musarat Nazeer: M.Phil. Scholar, Department of English, University of Sargodha, Sargodha, Punjab, Pakistan
Musarrat Azher: Associate Professor, Department of Linguistics and Language Studies, University of Sargodha, Pakistan
Azhar Pervaiz: Assistant Professor, Department of Linguistics and Language Studies, University of Sargodha, Pakistan
Iqra Yasmeen: M.Phil. Scholar, Department of English, University of Sargodha, Sargodha, Pakistan

Journal volume & issue: Vol. 8, no. I
pp. 162 – 182

Abstract

Saraiki, the fourth most widely spoken language in Pakistan and also used in parts of India and Afghanistan, has significant geographical, historical, and cultural importance. However, it remains neglected in terms of proper documentation and identification of its unique linguistic features. The current study identifies the lexico-semantic categories of Saraiki nouns and then develops their hierarchical relationships (Miller et al., 1993). This quantitative research is designed to contribute to the development of a Saraiki WordNet (SWN) and is related to Natural Language Processing (NLP). A corpus of 3 million words was built from data collected across different genres of the Saraiki language, including newspapers, academic essays, literary texts, and religious books. Both the expansion and merge approaches were used to analyze the data. A wordlist of the 1,500 most frequently occurring nouns was extracted from the corpus using AntConc 3.4.4.0, followed by manual tagging in Microsoft Excel 2010. As a result, 39 most frequently occurring nouns from the wordlist were used to develop 173 related synsets, and the lexico-semantic relationships among these nouns were identified with the help of 30 hierarchies (Miller et al., 1993). The study is limited to areas such as Bahawalpur, Multan, and Muzaffarabad. It is expected to serve as a milestone for Saraiki language learners, SWN development, Saraiki lexical resources, and online Saraiki language (SL) dictionaries, and as a guide for researchers.
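
As a rough illustration of the two steps the abstract describes (building a frequency wordlist and arranging nouns into hypernym hierarchies), the following is a minimal Python sketch. It is not the authors' procedure, which relied on AntConc and manual tagging in Excel. The function names, the placeholder English tokens, and the hypernym links are all hypothetical and are not data from the study.

```python
from collections import Counter


def top_words(tokens, n):
    """Return the n most frequent tokens with their counts."""
    return Counter(tokens).most_common(n)


def hypernym_chain(word, hypernym_of):
    """Follow hypernym links from a word up to the top of its hierarchy."""
    chain = [word]
    seen = {word}
    while chain[-1] in hypernym_of:
        parent = hypernym_of[chain[-1]]
        if parent in seen:  # guard against accidental cycles in the mapping
            break
        chain.append(parent)
        seen.add(parent)
    return chain


# Hypothetical placeholder tokens and links, not data from the study.
tokens = ["city", "river", "city", "water", "river", "city"]
hypernym_of = {"city": "location", "location": "entity"}

wordlist = top_words(tokens, 2)
chain = hypernym_chain("city", hypernym_of)
```

In this sketch each noun points to a single hypernym, which keeps a hierarchy a simple chain; a fuller WordNet-style resource would also need synsets that group synonymous nouns and would allow other relation types.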

Published in University of Chitral Journal of Linguistics and Literature

ISSN: 2617-3611 (Print); 2663-1512 (Online)
Publisher: Department of English, University of Chitral
Country of publisher: Pakistan
LCC subjects: Language and Literature: English literature; Language and Literature: Philology. Linguistics: Language. Linguistic theory. Comparative grammar
Website: https://jll.uoch.edu.pk/index.php/jll/index
