IEEE Access (Jan 2019)

Intelligent Data Engineering for Migration to NoSQL Based Secure Environments

  • Shabana Ramzan,
  • Imran Sarwar Bajwa,
  • Bushra Ramzan,
  • Waheed Anwar

DOI
https://doi.org/10.1109/ACCESS.2019.2916912
Journal volume & issue
Vol. 7
pp. 69042 – 69057

Abstract

Read online

In an era of super computing, data is increasing exponentially requiring more proficiency from the available technologies of data storage, data processing, and analysis. Such continuous massive growth of structured and unstructured data is referred to as a “Big data”. The processing and storage of big data through a conventional technique is not possible. Due to improved proficiency of Big Data solution in handling data, such as NoSQL caused the developers in the previous decade to start preferring big data databases, such as Apache Cassandra, Oracle, and NoSQL. NoSQL is a modern database technology that is designed to provide scalability to support voluminous data, leading to the rise of NoSQL as the most viable database solution. These modern databases aim to overcome the limitations of relational databases such as unlimited scalability, high performance, data modeling, data distribution, and continuous availability. These days, the larger enterprises need to shift NoSQL databases due to their more flexible models. It is a great challenge for business organizations and enterprises to transform their existing databases to NoSQL databases considering heterogeneity and complexity in relational data. In addition, with the emergence of big data, data cleansing has become a great challenge. In this paper, we proposed an approach that has two modules: data transformation and data cleansing module. The first phase is the transformation of a relational database to Oracle NoSQL database through model transformation. The second phase provides data cleansing ability to improve data quality and prepare it for big data analytics. The experiments show the proposed approach successfully transforms the relational database to a big data database and improve data quality.

Keywords