Guaranteed distributed machine learning: Privacy-preserving empirical risk minimization

Kwabena Owusu-Agyemang; Zhen Qin; Appiah Benjamin; Hu Xiong; Zhiguang Qin

doi:10.3934/mbe.2021243

Mathematical Biosciences and Engineering (Jun 2021)

Affiliations

Kwabena Owusu-Agyemang: University of Electronic Science and Technology of China, School of Information and Software Engineering, China
Zhen Qin: University of Electronic Science and Technology of China, School of Information and Software Engineering, China
Appiah Benjamin: University of Electronic Science and Technology of China, School of Information and Software Engineering, China
Hu Xiong: University of Electronic Science and Technology of China, School of Information and Software Engineering, China
Zhiguang Qin: University of Electronic Science and Technology of China, School of Information and Software Engineering, China

DOI: https://doi.org/10.3934/mbe.2021243
Journal volume & issue: Vol. 18, no. 4
pp. 4772 – 4796

Abstract

Distributed learning over data from sensor-based networks has been adopted to collaboratively train models on these sensitive data without privacy leakage. We present a distributed learning framework that integrates secure multi-party computation and differential privacy. For differential privacy, we explore both output perturbation and gradient perturbation and advance the state-of-the-art methods for both techniques in the distributed learning domain. In our proposed multi-scheme output perturbation algorithm (MS-OP), data owners combine their local classifiers within a secure multi-party computation and then inject an appreciable amount of statistical noise into the model before it is revealed. In our Adaptive Iterative gradient perturbation (MS-GP) method, data providers collaboratively train a global model. During each iteration, the data owners aggregate their locally trained models within the secure multi-party domain. Since the conversion of differentially private algorithms is often naive, we improve on the method by meticulously calibrating the privacy budget for each iteration. As the parameters of the model approach their optimal values, the gradients shrink and therefore must be measured more accurately. We therefore add a line-search capability that enables our MS-GP algorithm to decide exactly when a more accurate measurement of the gradient is needed. Validation on three real-world datasets shows that our algorithms hold a competitive advantage over existing state-of-the-art privacy-preserving approaches in the distributed setting.
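
The output perturbation idea described in the abstract (aggregate the owners' local classifiers, then add noise before release) can be illustrated with a minimal sketch. This is not the authors' MS-OP algorithm. The aggregation runs in the clear as a stand-in for the secure multi-party computation, the noise comes from the standard Laplace mechanism, and the names `aggregate_models`, `laplace_noise`, `output_perturbation`, `epsilon`, and `sensitivity` are hypothetical. The sensitivity value in particular is an assumed input, not a bound taken from the paper.

```python
import random


def aggregate_models(local_models):
    """Average the owners' weight vectors coordinate-wise.

    Assumption: this plain average stands in for the secure multi-party
    aggregation; a real protocol would never expose the local models.
    """
    n = len(local_models)
    dim = len(local_models[0])
    return [sum(model[j] for model in local_models) / n for j in range(dim)]


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def output_perturbation(local_models, epsilon, sensitivity, rng=None):
    """Release a noisy averaged model (standard Laplace mechanism).

    `sensitivity` is an assumed L1 sensitivity of the averaged model;
    `epsilon` is the privacy budget. Both are illustrative inputs.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    if rng is None:
        rng = random.Random()
    averaged = aggregate_models(local_models)
    scale = sensitivity / epsilon  # larger budget means less noise
    return [w + laplace_noise(scale, rng) for w in averaged]


if __name__ == "__main__":
    # Three hypothetical data owners, each holding a 2-dimensional classifier.
    owners = [[0.9, -0.2], [1.1, 0.0], [1.0, -0.1]]
    noisy = output_perturbation(owners, epsilon=1.0, sensitivity=0.05,
                                rng=random.Random(0))
    print(noisy)
```

In this sketch the noise scale is `sensitivity / epsilon`, so a smaller privacy budget yields a noisier released model. A gradient perturbation variant would instead add noise at every training iteration and split the budget across iterations.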

Published in Mathematical Biosciences and Engineering

ISSN: 1551-0018 (Online)
Publisher: AIMS Press
Country of publisher: United States
LCC subjects: Technology: Chemical technology: Biotechnology; Science: Mathematics
Website: https://www.aimspress.com/journal/MBE