IEEE Access (Jan 2020)

Distributed Machine Learning Oriented Data Integrity Verification Scheme in Cloud Computing Environment

  • Xiao-Ping Zhao
  • Rui Jiang

DOI
https://doi.org/10.1109/ACCESS.2020.2971519
Journal volume & issue
Vol. 8
pp. 26372 – 26384

Abstract

Distributed Machine Learning (DML) is one of the core technologies of Artificial Intelligence (AI). However, existing distributed machine learning frameworks do not take data integrity into account. If network attackers forge, modify, or destroy the data, the training model in the distributed machine learning system will be greatly affected and the training results will be wrong. Therefore, it is crucial to guarantee data integrity in DML. In this paper, we propose a distributed machine learning oriented data integrity verification scheme (DML-DIV) to ensure the integrity of training data. Firstly, we adopt the idea of the Provable Data Possession (PDP) sampling auditing algorithm to achieve data integrity verification, so that our DML-DIV scheme can resist forgery attacks and tampering attacks. Secondly, we generate a random number, namely a blinding factor, and apply the discrete logarithm problem (DLP) to construct the proof and ensure privacy protection in the third-party auditor (TPA) verification process. Thirdly, we employ identity-based cryptography and two-step key generation technology to generate the data owner's public/private key pair, so that our DML-DIV scheme can solve the key escrow problem and reduce the cost of managing certificates. Finally, formal theoretical analysis and experimental results show the security and efficiency of our DML-DIV scheme.
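
To illustrate why a PDP-style sampling audit can catch tampering without reading every block, the following minimal sketch computes the chance that a random challenge hits at least one corrupted block. It is not the DML-DIV construction itself: the function name `detection_probability` and its parameters are hypothetical, and the sketch assumes the auditor samples blocks uniformly at random without replacement.

```python
from math import comb


def detection_probability(n_blocks: int, n_corrupted: int, n_sampled: int) -> float:
    """Chance that a uniform sample without replacement contains
    at least one corrupted block (illustrative helper, not from the paper)."""
    if not 0 <= n_corrupted <= n_blocks:
        raise ValueError("n_corrupted must be between 0 and n_blocks")
    if not 0 <= n_sampled <= n_blocks:
        raise ValueError("n_sampled must be between 0 and n_blocks")
    # The audit misses the corruption only if every sampled block is intact:
    # P(miss) = C(n - t, c) / C(n, c)
    p_miss = comb(n_blocks - n_corrupted, n_sampled) / comb(n_blocks, n_sampled)
    return 1.0 - p_miss


if __name__ == "__main__":
    # Hypothetical setting: 10,000 blocks, 1% of them corrupted.
    for sampled in (100, 300, 460):
        p = detection_probability(10_000, 100, sampled)
        print(f"sampled={sampled:4d}  detection probability={p:.4f}")
```

Under these assumptions, challenging 460 of 10,000 blocks when 1% are corrupted detects the corruption with probability above 0.99, which is why sampling audits can keep the verifier's cost nearly independent of the dataset size.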

Keywords