Современные информационные технологии и IT-образование (Mar 2023)

Intelligent Search System for Working with Big Data

  • Irina F. Astachova,
  • Katerina A. Makoviy,
  • Lev S. Nikitin,
  • Yuliya V. Khitskova

DOI
https://doi.org/10.25559/SITITO.019.202301.180-188
Journal volume & issue
Vol. 19, no. 1
pp. 180 – 188

Abstract

The article describes a system for modeling an information retrieval system (IRS) that operates on the Internet. The application developed for this purpose lets the IRS be configured along four parameters: the data collection model, the approach to indexing, the ranking model, and the approach to storage. Solutions in this area are developing very actively thanks to progress in artificial intelligence, cloud technologies and natural language processing. These factors have made it possible to research and develop intelligent information retrieval systems that collect information on the Internet and search the data they have found. Such search is feasible even without substantial material resources. The main problems to be solved when developing an IRS are: data collection; indexing; the choice and development of the index model; ranking; storage; and quality assessment. Search intelligence is provided by ranking with the tf-idf method, the vector model and link analysis, which make it possible to find relevant documents that do not contain direct occurrences of the query words and to sort them by how well they match the query. The application, developed in Python, is described; test runs confirmed that the system is operational; and the organization of the intelligent component is explained.
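
As an illustration of the ranking step named in the abstract, the following is a minimal sketch of tf-idf weighting combined with the vector model (cosine similarity). It is not the authors' implementation: the function names (`tokenize`, `build_index`, `cosine`, `rank`), the whitespace tokenizer and the smoothed idf formula are all assumptions made for this example, and link analysis is not covered.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real IRS would
    # also apply stemming or lemmatization and stop-word removal.
    return text.lower().split()


def build_index(docs):
    """Return (idf, vectors): idf per term and one tf-idf vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Assumption: smoothed idf, ln((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (cnt / len(toks)) * idf[t] for t, cnt in tf.items()})
    return idf, vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs):
    """Return (score, doc_index) pairs sorted from best to worst match."""
    idf, vectors = build_index(docs)
    # Query terms that never occur in the collection are ignored.
    q_tf = Counter(t for t in tokenize(query) if t in idf)
    total = sum(q_tf.values())
    q_vec = {t: (c / total) * idf[t] for t, c in q_tf.items()} if total else {}
    scored = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    return sorted(scored, key=lambda s: (-s[0], s[1]))


docs = [
    "search engine indexing and ranking",
    "cloud storage for big data",
    "ranking documents with tf idf ranking",
]
for score, doc_id in rank("ranking documents", docs):
    print(f"{score:.3f}  {docs[doc_id]}")
```

This sketch scores only documents that share terms with the query. Retrieving relevant documents without direct occurrences of the query words, as the abstract describes, would rely on the other components, such as link analysis.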

Keywords