Безопасность информационных технологий (Jun 2019)

A comparative analysis of software identifying approaches

  • Kseniya I. Salakhutdinova,
  • Vladislav V. Malkov,
  • Irina E. Krivtsova

DOI
https://doi.org/10.26583/bit.2019.2.04
Journal volume & issue
Vol. 26, no. 2
pp. 58 – 66

Abstract

The aim of the study is to test several well-known gradient boosted decision tree libraries on the software identification problem, where the training sample contains only a limited set of executable files belonging to different versions of the same program. The importance of software audit for business processes is substantiated. The paper considers tools for controlling the software installed on the personal computers of automated system users. The disadvantages of such tools are demonstrated with examples of bypassing their program identification algorithms, and the developed approach to identifying executable files is presented; it uses a machine learning algorithm, gradient boosting of decision trees, as implemented in the XGBoost, LightGBM, and CatBoost libraries. An experiment on identifying executable files with XGBoost and LightGBM is performed. Using the bicubic measure of clustering quality, these results are compared with those of the previously proposed program identification approach based on the CatBoost library and with results presented in other studies. The results show that the developed approach makes it possible to identify violations of the established security policy for information processing in automated systems.

Keywords