RUDN Journal of Language Studies, Semiotics and Semantics (Mar 2024)

Natural Language Processing and Fiction Text: Basis for Corpus Research

  • Alexey I. Gorozhanov,
  • Innara A. Guseynova,
  • Darya V. Stepanova

DOI
https://doi.org/10.22363/2313-2299-2024-15-1-195-210
Journal volume & issue
Vol. 15, no. 1
pp. 195 – 210

Abstract

The study examines NLP procedures applied to fiction texts in German and English, which are treated as strong cultural texts. Its aim is to develop a model of a software tool for processing, analyzing and interpreting a fiction text, one that would reveal the full potential of popular NLP tools within the corpus approach. The general methods are analysis and synthesis; special methods (the descriptive method, modelling, and qualitative and quantitative analysis) are used to address specific issues. The scientific novelty lies in applying the key principles of classical theories of text interpretation with the latest methods and tools of applied linguistics. As a practical result, special software has been developed that processes SQL-based linguistic corpora built automatically with the spaCy NLP library and the Python programming language. The software can be used for interpreting a fiction text and for compiling learning materials for Home Reading. The authors assume that developing special software for strong cultural texts stimulates the search for scientific solutions and, at the same time, helps clarify the essential differences between natural and artificial intelligence.
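
The idea of an SQL-based corpus built automatically from a text can be illustrated with a minimal sketch. It is not the authors' software: the `tokens` table, its columns, and the helper names `tokenize`, `build_corpus` and `frequency` are all hypothetical, and a simple regular-expression tokenizer stands in for the spaCy pipeline so that the example runs with the Python standard library alone.

```python
import re
import sqlite3

# Hypothetical schema (an assumption): one row per token,
# addressed by sentence number and position within the sentence.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tokens (
    id          INTEGER PRIMARY KEY,
    sentence_no INTEGER NOT NULL,
    position    INTEGER NOT NULL,
    form        TEXT NOT NULL,
    lower_form  TEXT NOT NULL
)
"""


def tokenize(text):
    """Regex stand-in for an NLP pipeline: split into sentences, then words."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [re.findall(r"\w+", s) for s in sentences]


def build_corpus(text, conn):
    """Store every token of `text` as a row and return the token count."""
    conn.execute(SCHEMA)
    rows = [
        (s_no, pos, form, form.lower())
        for s_no, sentence in enumerate(tokenize(text))
        for pos, form in enumerate(sentence)
    ]
    conn.executemany(
        "INSERT INTO tokens (sentence_no, position, form, lower_form) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


def frequency(conn, limit=5):
    """Return the most frequent lower-cased word forms with their counts."""
    cur = conn.execute(
        "SELECT lower_form, COUNT(*) AS n FROM tokens "
        "GROUP BY lower_form ORDER BY n DESC, lower_form ASC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
sample = "The castle stood on the hill. The castle was silent."
n_tokens = build_corpus(sample, conn)
top = frequency(conn, 3)
print(n_tokens, top)
```

In a fuller pipeline the rows would presumably also carry the annotations an NLP library provides, such as lemmas and part-of-speech tags, which would allow SQL queries over grammatical categories as well as word forms.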

Keywords