Journal of King Saud University: Computer and Information Sciences (Dec 2021)

Improving outliers detection in data streams using LiCS and voting

  • Fatima-Zahra Benjelloun,
  • Ahmed Oussous,
  • Amine Bennani,
  • Samir Belfkih,
  • Ayoub Ait Lahcen

Journal volume & issue
Vol. 33, no. 10
pp. 1177 – 1185

Abstract

Read online

Detecting outliers in real-time is increasingly important for many real-world applications such as detecting abnormal heart activity, intrusions to systems, spams or abnormal credit card transactions. However, detecting outliers in data streams rises many challenges such as high-dimensionality, dynamic data distribution and unpredictable relationships. Our simulations demonstrate that some advanced solutions still show drawbacks. In this paper, first, we improve the capacity to detect outliers of both micro-clusters based algorithms (MCOD) and distance-based algorithms (Abstract-C and Exact-Storm) known for their performance. This is by adding a layer called LiCS that classifies online the K-nearest-neighbors (Knn) of each node based on their evolutionary status. This layer aggregates the results and uses a count threshold to better classify nodes. Experiments on SpamBase datasets confirmed that our technique enhances the accuracy and the precision of such algorithm and helps to reduce the unclassified nodes.Second, we propose a hybrid solution based on iterative majority voting and our LiCS. Experiments on real data proves that it outperforms discussed algorithms in terms of accuracy, precision and sensitivity in detecting outliers. It also minimizes the issue of unclassified instances and consolidate the different outputs of algorithms.

Keywords