Вестник Дагестанского государственного технического университета: Технические науки [Herald of Dagestan State Technical University: Technical Sciences] (Apr 2024)

Anomaly detection research using Isolation Forest in Machine Learning

  • A. S. Kechedzhiev
  • O. L. Tsvetkova

DOI
https://doi.org/10.21822/2073-6185-2024-51-1-106-112
Journal volume & issue
Vol. 51, no. 1
pp. 106 – 112

Abstract

Objective. This study assesses the applicability of the Isolation Forest method to detecting anomalies in network traffic data with insufficient labeling. The main aim is to evaluate the effectiveness of Isolation Forest when labeled data are limited, and its potential in critical areas such as cybersecurity and financial analytics.

Method. The study comprises data preprocessing, training the model on the training set, and evaluating its performance on the test set using accuracy, the confusion matrix, and a classification report. The implementation uses Python, with the scikit-learn library for Isolation Forest and Pandas for data handling.

Result. Evaluating Isolation Forest on unstructured data revealed its potential for identifying anomalous patterns without the need for extensive labeling. This confirms the effectiveness of Isolation Forest in environments where access to labeled data is limited or absent.

Conclusion. The results demonstrate high anomaly detection recall despite relatively low overall accuracy, which indicates the importance of interpreting metrics in context when detecting rare events in data.
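The workflow named in the abstract (preprocessing, training on a training set, and evaluation on a test set with accuracy, a confusion matrix, and a classification report) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins for network traffic features, and the feature count, the split ratio, the scaling step, and the contamination value are all hypothetical choices. NumPy arrays are used here for brevity where the study worked with Pandas.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for preprocessed traffic features (hypothetical data,
# not the dataset used in the study): 950 normal points and 50 anomalies.
rng = np.random.default_rng(42)
X_normal = rng.normal(loc=0.0, scale=1.0, size=(950, 4))
X_anomalous = rng.uniform(low=-6.0, high=6.0, size=(50, 4))
X = np.vstack([X_normal, X_anomalous])
y = np.concatenate([np.zeros(950, dtype=int), np.ones(50, dtype=int)])  # 1 = anomaly

# Hold out a test set; labels are kept only for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Preprocessing (assumed step): scale features using training-set statistics only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Training is unsupervised: no labels are passed to fit().
# contamination=0.05 is an assumed anomaly share, not a value from the paper.
model = IsolationForest(n_estimators=100, contamination=0.05, random_state=42)
model.fit(X_train_s)

# predict() returns +1 for inliers and -1 for outliers; map to 0/1 labels.
y_pred = (model.predict(X_test_s) == -1).astype(int)

accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
report = classification_report(
    y_test, y_pred, labels=[0, 1],
    target_names=["normal", "anomaly"], zero_division=0,
)

print(f"Accuracy: {accuracy:.3f}")
print("Confusion matrix (rows = true class, columns = predicted class):")
print(cm)
print(report)
```

When reading such output, the "anomaly" row of the classification report (its recall in particular) is usually more informative than overall accuracy, because accuracy is dominated by the far more numerous normal records.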

Keywords