A comprehensive survey of anomaly detection techniques for high dimensional big data

Srikanth Thudumu; Philip Branch; Jiong Jin; Jugdutt (Jack) Singh

doi:10.1186/s40537-020-00320-x

Journal of Big Data (Jul 2020)

Affiliations

Srikanth Thudumu: School of Software & Electrical Engineering, Swinburne University of Technology
Philip Branch: School of Software & Electrical Engineering, Swinburne University of Technology
Jiong Jin: School of Software & Electrical Engineering, Swinburne University of Technology
Jugdutt (Jack) Singh: Sarawak State Government

DOI: https://doi.org/10.1186/s40537-020-00320-x
Journal volume & issue: Vol. 7, no. 1
pp. 1 – 30

Abstract

Anomaly detection in high dimensional data is becoming a fundamental research problem with various real-world applications. However, many existing anomaly detection techniques fail to retain sufficient accuracy on so-called “big data,” characterised by high-volume and high-velocity data generated by a variety of sources. The combination of both problems can be referred to as the “curse of big dimensionality,” which affects existing techniques in terms of both performance and accuracy. To address this gap and to understand the core problem, it is necessary to identify the unique challenges that arise when anomaly detection faces both high dimensionality and big data problems. Hence, this survey aims to document the state of anomaly detection in high dimensional big data by representing the unique challenges using a triangular model of vertices: the problem (big dimensionality), techniques/algorithms (anomaly detection), and tools (big data applications/frameworks). Works that fall directly into any of the vertices, or are closely related to them, are taken into consideration for review. Furthermore, the limitations of traditional approaches and current strategies for high dimensional data are discussed, along with recent techniques and applications on big data required for the optimization of anomaly detection.

Published in Journal of Big Data

ISSN: 2196-1115 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://journalofbigdata.springeropen.com
