Egyptian Informatics Journal (Sep 2020)

Detection outliers on internet of things using big data technology

  • Haitham Ghallab,
  • Hanan Fahmy,
  • Mona Nasr

Journal volume & issue
Vol. 21, no. 3
pp. 131 – 138

Abstract

Read online

Internet of Things (IoT) is a fundamental concept of a new technology that will be promising and significant in various fields. IoT is a vision that allows things or objects equipped with sensors, actuators, and processors to talk and communicate with each other over the internet to achieve a meaningful goal. Unfortunately, one of the major challenges that affect IoT is data quality and uncertainty, as data volume increases noise, inconsistency and redundancy increases within data and causes paramount issues for IoT technologies. And since IoT is considered to be a massive quantity of heterogeneous networked embedded devices that generate big data, then it is very complex to compute and analyze such massive data. So this paper introduces a new model named NRDD-DBSCAN based on DBSCAN algorithm and using resilient distributed datasets (RDDs) to detect outliers that affect the data quality of IoT technologies. NRDD-DBSCAN has been applied on three different datasets of N-dimensions (2-D, 3-D, and 25-D) and the results were promising. Finally, comparisons have been made between NRDD-DBSCAN and previous models such as RDD-DBSCAN model and DBSCAN algorithm, and these comparisons proved that NRDD-DBSCAN solved the low dimensionality issue of RDD-DBSCAN model and also solved the fact that DBSCAN algorithm cannot handle IoT data. So the conclusion is that NRDD-DBSCAN proposed model can detect the outliers that exist in the datasets of N-dimensions by using resilient distributed datasets (RDDs), and NRDD-DBSCAN can enhance the quality of data exists in IoT applications and technologies.

Keywords