Journal of Big Data (Jul 2018)

Privacy preserving data publishing based on sensitivity in context of Big Data using Hive

  • P. Srinivasa Rao,
  • S. Satyanarayana

DOI
https://doi.org/10.1186/s40537-018-0130-y
Journal volume & issue
Vol. 5, no. 1
pp. 1 – 20

Abstract

Privacy-preserving data publishing is a major concern today because the volume of data published over the internet grows day by day. Data at this scale is referred to as Big Data. This work addresses privacy preservation in the context of Big Data using Hive, a data warehousing solution. We implemented nearest similarity based (NSB) clustering with bottom-up generalization to achieve (v,l)-anonymity, which addresses sensitivity vulnerabilities and protects individual privacy. We also calculate sensitivity levels with a simple comparison method based on index values, classifying the data into different levels of sensitivity. Experiments were carried out in the Hive environment to verify the efficiency of the algorithms on big data. The framework also supports the execution of existing algorithms without any changes. The proposed model outperforms existing models.

Keywords