IEEE Access (Jan 2020)
Comprehensive Survey on Big Data Privacy Protection
Abstract
In recent years, the ever-mounting problem of Internet phishing has been threatening the secure propagation of sensitive data over the web, thereby resulting in either outright decline of data distribution or inaccurate data distribution from several data providers. Therefore, user privacy has evolved into a critical issue in various data mining operations. User privacy has turned out to be a foremost criterion for allowing the transfer of confidential information. The intense surge in storing the personal data of customers (i.e., big data) has resulted in a new research area, which is referred to as privacy-preserving data mining (PPDM). A key issue of PPDM is how to manipulate data using a specific approach to enable the development of a good data mining model on modified data, thereby meeting a specified privacy need with minimum loss of information for the intended data analysis task. The current review study aims to utilize the tasks of data mining operations without risking the security of individuals' sensitive information, particularly at the record level. To this end, PPDM techniques are reviewed and classified using various approaches for data modification. Furthermore, a critical comparative analysis is performed for the advantages and drawbacks of PPDM techniques. This review study also elaborates on the existing challenges and unresolved issues in PPDM.
Keywords