Mathematical Biosciences and Engineering (Apr 2021)
Insuring against the perils in distributed learning: privacy-preserving empirical risk minimization
Abstract
Multiple organizations would benefit from collaborative learning models trained over aggregated datasets from various human activity recognition applications without privacy leakages. Two of the prevailing privacy-preserving protocols, secure multi-party computation and differential privacy, however, are still confronted with serious privacy leakages: lack of provision for privacy guarantee about individual data and insufficient protection against inference attacks on the resultant models. To mitigate the aforementioned shortfalls, we propose privacy-preserving architecture to explore the potential of secure multi-party computation and differential privacy. We utilize the inherent prospects of output perturbation and gradient perturbation in our differential privacy method, and progress with an innovation for both techniques in the distributed learning domain. Data owners collaboratively aggregate the locally trained models inside a secure multi-party computation domain in the output perturbation algorithm, and later inject appreciable statistical noise before exposing the classifier. We inject noise during every iterative update to collaboratively train a global model in our gradient perturbation algorithm. The utility guarantee of our gradient perturbation method is determined by an expected curvature relative to the minimum curvature. With the application of expected curvature, we theoretically justify the advantage of gradient perturbation in our proposed algorithm, therefore closing existing gap between practice and theory. Validation of our algorithm on real-world human recognition activity datasets establishes that our protocol incurs minimal computational overhead, provides substantial utility gains for typical security and privacy guarantees.
Keywords