IEEE Access (Jan 2023)

Applying One-Class Algorithms for Data Stream-Based Insider Threat Detection

  • Rafael Bruno Peccatiello,
  • Joao Jose Costa Gondim,
  • Luis Paulo Faina Garcia

DOI
https://doi.org/10.1109/ACCESS.2023.3293825
Journal volume & issue
Vol. 11
pp. 70560 – 70573

Abstract

Read online

An insider threat is anyone who has legitimate access to a particular organization’s network and uses that access to harm that organization. Insider threats may act with or without intent, but when they have an intention, they usually also have some specific motivation. This motivation can vary, including but not limited to personal discontent, financial issues, and coercion. It is hard to face insider threats with traditional security solutions because those solutions are limited to the signature detection paradigm. To overcome this restriction, researchers have proposed using Machine Learning which can address Insider Threat issues more comprehensively. Some of them have used batch learning, and others have used stream learning. Batch approaches are simpler to implement, but the problem is how to apply them in the real world. That is because real insider threat scenarios have complex characteristics to address by batch learning. Although more complex, stream approaches are more comprehensive and feasible to implement. Some studies have also used unsupervised and supervised Machine Learning techniques, but obtaining labeled samples makes it hard to implement fully supervised solutions. This study proposes a framework that combines different data science techniques to address insider threat detection. Among them are using semi-supervised and supervised machine learning, data stream analysis, and periodic retraining procedures. The algorithms used in the implementation were Isolation Forest, Elliptic Envelop, and Local Outlier Factor. This study evaluated the results according to the values obtained by the precision, recall, and F1-Score metrics. The best results were obtained by the ISOF algorithm, with 0.78 for the positive class (malign) recall and 0.80 for the negative class (benign) recall.

Keywords