Mathematical Biosciences and Engineering (Jun 2021)

Guaranteed distributed machine learning: Privacy-preserving empirical risk minimization

  • Kwabena Owusu-Agyemang,
  • Zhen Qin,
  • Appiah Benjamin,
  • Hu Xiong,
  • Zhiguang Qin

DOI
https://doi.org/10.3934/mbe.2021243
Journal volume & issue
Vol. 18, no. 4
pp. 4772–4796

Abstract

Distributed learning over data from sensor-based networks has been adopted to collaboratively train models on sensitive data without leaking private information. We present a distributed learning framework that integrates secure multi-party computation and differential privacy. Within differential privacy, we explore both output perturbation and gradient perturbation, and we advance the state of the art of both techniques in the distributed learning domain. In our proposed multi-scheme output perturbation (MS-OP) algorithm, data owners combine their local classifiers within a secure multi-party computation and inject a calibrated amount of statistical noise into the model before it is revealed. In our adaptive iterative gradient perturbation (MS-GP) method, data providers collaboratively train a global model; during each iteration, the data owners aggregate their locally trained models within the secure multi-party domain. Since conversions of learning algorithms into differentially private variants are often naive, we improve on this approach by meticulously calibrating the privacy budget for each iteration. As the model parameters approach their optimal values, the gradients shrink and therefore require more accurate measurement. We therefore add a line-search capability that enables our MS-GP algorithm to decide exactly when a more accurate gradient measurement is indispensable. Validation on three real-world datasets shows that our algorithms hold a sustained competitive advantage over existing state-of-the-art privacy-preserving methods in the distributed setting.
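
As a rough illustration of the output-perturbation idea, the following Python sketch simulates the MS-OP aggregation step. The secure multi-party computation is replaced here by a plain trusted aggregator, the noise scale follows the standard Gaussian-mechanism calibration, and the names `ms_op_aggregate` and `sensitivity` are illustrative assumptions rather than the paper's notation or derivation.

```python
import numpy as np

def ms_op_aggregate(local_weights, epsilon, delta, sensitivity):
    """Average locally trained weight vectors and perturb the result.

    In the paper this averaging happens inside secure multi-party
    computation; here a trusted aggregator stands in for it.
    """
    avg = np.mean(local_weights, axis=0)  # would run inside the SMC protocol
    # Standard Gaussian-mechanism noise scale for (epsilon, delta)-DP;
    # `sensitivity` is an assumed bound on each party's contribution.
    sigma = sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / epsilon
    return avg + np.random.normal(0.0, sigma, size=avg.shape)  # only this noisy model is revealed
```

Similarly, a minimal sketch of the adaptive iterative gradient perturbation idea: a per-iteration privacy budget combined with a line-search-style trigger that re-measures the gradient at higher accuracy when a noisy step fails to decrease the loss. The uniform budget split, the doubling rule, and the helper names are assumptions for illustration, not the paper's actual calibration.

```python
import numpy as np

def noisy_gradient(grad_fn, w, epsilon, delta, clip):
    """Clip the gradient to bound sensitivity, then add Gaussian noise."""
    g = grad_fn(w)
    norm = np.linalg.norm(g)
    if norm > clip:
        g = g * (clip / norm)
    sigma = clip * np.sqrt(2 * np.log(1.25 / delta)) / epsilon
    return g + np.random.normal(0.0, sigma, size=g.shape)

def ms_gp_train(grad_fn, loss_fn, w0, total_epsilon, delta, T, clip=1.0, lr=0.1):
    """Hypothetical sketch of adaptive iterative gradient perturbation."""
    w = w0.copy()
    eps_per_iter = total_epsilon / T  # naive uniform split as a baseline
    for _ in range(T):
        eps_t = eps_per_iter
        g = noisy_gradient(grad_fn, w, eps_t, delta, clip)
        # Line-search-style check: if the noisy step does not decrease the
        # loss, re-measure the gradient with a larger slice of the budget.
        if loss_fn(w - lr * g) >= loss_fn(w):
            eps_t *= 2  # assumed doubling rule, not the paper's schedule
            g = noisy_gradient(grad_fn, w, eps_t, delta, clip)
        w = w - lr * g
    return w
```

Note that this simple doubling can overspend `total_epsilon`, and evaluating `loss_fn` on private data would itself consume budget; a faithful implementation would track both with a privacy accountant, which the paper's per-iteration calibration is designed to handle.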
