IEEE Access (Jan 2018)
Stochastic and Information Theory Techniques to Reduce Large Datasets and Detect Cyberattacks in Ambient Intelligence Environments
Abstract
Ambient intelligence refers to a new technological paradigm in which everyday environments behave in a smart way and are sensitive to their inhabitants. To reach this objective, complex pervasive sensing platforms are deployed together with artificial intelligence solutions. In these new, complex, and highly interdependent systems, traditional security policies and defense strategies are not effective, as thousands of heterogeneous cyber and physical elements are mixed and connected. New security solutions try to learn the expected behavior of the system and its components, so that if a strange event occurs, adequate preventive, corrective, and/or reactive security actions are triggered in an intelligent way to detect and stop the potential cyberphysical attack being performed. To learn about the system and to select and apply the adequate security actions, very large datasets containing records of previous behaviors should be analyzed, sometimes very quickly. This fact enormously complicates the implementation of these new security solutions: they require a huge storage capacity, which many domestic systems do not have, and they have to work with huge datasets whose processing time prevents decisions from being made with the required speed. Therefore, in this paper, we investigate and propose a procedure to reduce large datasets, with the objective of enabling new security techniques to detect cyberattacks in a fast and efficient way. The proposed procedure is based on the calculation of small sets of samples whose statistical configuration is as similar as desired to that of the original large dataset. Stochastic models and information theory techniques and theorems are combined to define a mathematical framework that allows these equivalent reduced datasets to be obtained. We also describe and evaluate a first implementation of the proposed solution, using both a simulation scenario and a real deployment.
Keywords