Вестник университета (Jun 2024)

Machine learning methods (tokenization) in marketing research

  • E. V. Ganebnykh,
  • N. K. Savelieva,
  • A. A. Sozinova,
  • O. V. Fokina,
  • I. G. Altsybeeva

DOI
https://doi.org/10.26425/1816-4277-2024-4-61-72
Journal volume & issue
Vol. 0, no. 4
pp. 61 – 72

Abstract

Field research is of particular interest in marketing because it often generates unique statistics. Closed-ended questions simplify data processing but significantly limit the depth of the research. Open-ended questions provide a deeper understanding of respondents' opinions, but processing natural-language responses (qualitative data) is difficult and time-consuming because it is usually done manually. Modern machine learning techniques, particularly tokenization, can be used to automate this processing. The purpose of the study is to test the application of this method to the data of the field study "Monitoring of the competition state and development in the commodity markets of the Novosibirsk Region". The following tasks were set and solved: primary information was collected and prepared for processing, and token groups were identified and formed. Based on these groups, the respondents' answers were then combined into relatively homogeneous clusters of similar answers to open-ended questions. The quality of the processing was subsequently assessed using the Precision, Recall and F-measure metrics, which showed an acceptable level of data processing quality. Information was collected through sociological surveys (questionnaire distribution) and CAWI surveys that included open-ended questions. The study shows that even extremely minor mentions were not missed. The data obtained led to the conclusion that annotated databases and token libraries need to be created for marketing research purposes.
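The pipeline outlined in the abstract (tokenize the answers, match them against token groups, cluster similar answers, then check quality with Precision, Recall and F-measure) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the token groups, the sample answers, the manual labels and all function names below are hypothetical, and the metrics are micro-averaged over label sets, which is only one of several possible conventions.

```python
import re
from collections import defaultdict

# Hypothetical token groups (group label -> tokens); not taken from the study.
TOKEN_GROUPS = {
    "price": {"price", "prices", "expensive", "cost"},
    "quality": {"quality", "defective", "poor"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def assign_groups(text, token_groups=TOKEN_GROUPS):
    """Return the labels of all token groups that share a token with the text."""
    tokens = set(tokenize(text))
    return {label for label, vocab in token_groups.items() if tokens & vocab}


def cluster_answers(answers, token_groups=TOKEN_GROUPS):
    """Group answers by the set of token-group labels assigned to them."""
    clusters = defaultdict(list)
    for answer in answers:
        key = frozenset(assign_groups(answer, token_groups))
        clusters[key].append(answer)
    return dict(clusters)


def precision_recall_f1(predicted, actual):
    """Micro-averaged precision, recall and F-measure over per-answer label sets."""
    tp = sum(len(p & a) for p, a in zip(predicted, actual))
    fp = sum(len(p - a) for p, a in zip(predicted, actual))
    fn = sum(len(a - p) for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Invented sample answers and manual annotations, for illustration only.
answers = [
    "Prices are too high",
    "Poor quality of goods",
    "Expensive and defective products",
    "Not enough shops nearby",
]
manual = [{"price"}, {"quality"}, {"price", "quality"}, {"availability"}]

predicted = [assign_groups(a) for a in answers]
clusters = cluster_answers(answers)
precision, recall, f1 = precision_recall_f1(predicted, manual)
```

In this toy setup the last answer matches no token group, so it lowers recall without affecting precision. This is the kind of gap that a larger annotated database and token library, as the abstract recommends, would be expected to close.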

Keywords