Цифровая социология (Apr 2023)

Sociology of values: experience of building a taxonomy by using natural language analysis technology

  • M. A. Kashina,
  • S. Tkach

DOI
https://doi.org/10.26425/2658-347X-2023-6-1-48-58
Journal volume & issue
Vol. 6, no. 1
pp. 48 – 58

Abstract

Research in the sociology of science is becoming more complicated as authors' publication activity keeps growing. To track trends in a branch of sociology, researchers turn to scientometric methods, but these alone are not sufficient. The subject of this study is trends in the development of the sociology of values as a branch of sociology. The purpose of the work is to assess the possibilities of using natural language analysis methods (NLP/NLA) for thematic and theoretical clustering of research in the sociology of values. The study combined quantitative and qualitative methods and was carried out in two stages. In the first stage, 121 abstracts of scientific articles were analyzed using text mining, and the full set was then divided into clusters. In the second stage, the results of machine clustering were examined by qualitative text analysis, which made it possible to identify the limitations and capabilities of the NLP/NLA method for clustering scientific texts. Articles with a more conservative core of theoretical categories (gender studies, migration studies, the theory of globalism) proved more amenable to clustering, while theories with a loosely structured and fluid theoretical core (theories using environmental terminology, theories of inequality) proved much less amenable to explicit clustering. The results open a new direction of work with large arrays of scientific texts, based on clustering them with NLP/NLA. Building clusters enables researchers to work with all texts in a given subject area, not just the most cited ones. This, in turn, makes all scientific ideas visible, including those that have not gained popularity or notability.
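
The abstract does not say which text-mining algorithm or software was used in the first stage. Purely as an illustration of what machine clustering of abstracts can look like, the following minimal sketch assumes TF-IDF term weighting and groups texts whose cosine similarity reaches a threshold (connected components of the similarity graph). The toy abstracts, the function names, and the threshold value are all hypothetical and are not taken from the study.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and keep alphabetic tokens of three or more letters.
    return re.findall(r"[a-z]{3,}", text.lower())


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (dict: term -> weight) per document.
    counts_per_doc = [Counter(tokenize(d)) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for counts in counts_per_doc for term in counts)
    return [
        {t: tf * math.log(n_docs / doc_freq[t]) for t, tf in counts.items()}
        for counts in counts_per_doc
    ]


def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, threshold=0.1):
    # Link every pair of documents whose similarity reaches the threshold,
    # then return the connected components as clusters of document indices.
    # The threshold is an illustrative assumption, not a value from the study.
    vectors = tfidf_vectors(docs)
    parent = list(range(len(docs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Invented toy "abstracts", used only to exercise the sketch.
toy_abstracts = [
    "Gender roles and gender equality values in the family",
    "Gender identity and values of young women",
    "Migration flows and values of labor migrants",
    "Values of migrants and migration policy",
]
clusters = cluster(toy_abstracts)
print(clusters)
```

In this toy run, terms that occur in every text (such as "values") receive zero weight, so grouping is driven by the more specific vocabulary. That is in line with the abstract's observation that research areas with a stable core of categories are easier to cluster; the second, qualitative stage of reading and checking the clusters is not something the sketch attempts to model.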

Keywords