Privacy-Preserving Distributed Linear Regression on High-Dimensional Data

Proceedings on Privacy Enhancing Technologies. 2017;2017(4):345-364 DOI 10.1515/popets-2017-0053

 

Journal Title: Proceedings on Privacy Enhancing Technologies

ISSN: 2299-0984 (Online)

Publisher: Sciendo

LCC Subject Category: Philosophy. Psychology. Religion: Ethics | Science: Mathematics: Instruments and machines: Electronic computers. Computer science

Country of publisher: Poland

Language of fulltext: English

Full-text formats available: PDF

 

AUTHORS


Gascón Adrià (The Alan Turing Institute and University of Warwick)

Schoppmann Phillipp (Humboldt University of Berlin)

Balle Borja (Amazon)

Raykova Mariana (Yale University)

Doerner Jack (Northeastern University)

Zahur Samee (Google)

Evans David (University of Virginia)

EDITORIAL INFORMATION

Double blind peer review

Time From Submission to Publication: 16 weeks

 

ABSTRACT

We propose privacy-preserving protocols for computing linear regression models, in the setting where the training dataset is vertically distributed among several parties. Our main contribution is a hybrid multi-party computation protocol that combines Yao’s garbled circuits with tailored protocols for computing inner products. Like many machine learning tasks, building a linear regression model involves solving a system of linear equations. We conduct a comprehensive evaluation and comparison of different techniques for securely performing this task, including a new Conjugate Gradient Descent (CGD) algorithm. This algorithm is suitable for secure computation because it uses an efficient fixed-point representation of real numbers while maintaining accuracy and convergence rates comparable to what can be obtained with a classical solution using floating point numbers. Our technique improves on Nikolaenko et al.’s method for privacy-preserving ridge regression (S&P 2013), and can be used as a building block in other analyses. We implement a complete system and demonstrate that our approach is highly scalable, solving data analysis problems with one million records and one hundred features in less than one hour of total running time.
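To make the underlying computation concrete, the following is a minimal plaintext sketch of the linear-algebra task the abstract describes: the features are split column-wise between parties, the system (X^T X + lambda I) theta = X^T y is assembled from inner products of columns, and it is solved with the textbook conjugate gradient method. Nothing here is secure and nothing is taken from the paper's implementation: the function names, the toy data, and the use of floating point are assumptions for illustration only. In the protocols described above, the inner products between columns held by different parties would be computed by a secure protocol, and the solver would run inside a garbled circuit on fixed-point numbers.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A: Matrix, v: Vector) -> Vector:
    """Matrix-vector product."""
    return [dot(row, v) for row in A]


def normal_equations(parts: List[Matrix], y: Vector, lam: float) -> Tuple[Matrix, Vector]:
    """Build A = X^T X + lam*I and b = X^T y from vertically partitioned data.

    Each element of `parts` is one party's block of feature columns (same rows,
    different columns). Every entry of A is an inner product of two columns;
    in a secure setting the cross-party ones would need a secure protocol.
    """
    cols = [[row[j] for row in block] for block in parts for j in range(len(block[0]))]
    d = len(cols)
    A = [[dot(cols[i], cols[j]) + (lam if i == j else 0.0) for j in range(d)]
         for i in range(d)]
    b = [dot(c, y) for c in cols]
    return A, b


def conjugate_gradient(A: Matrix, b: Vector, max_iters: int = 50, tol: float = 1e-18) -> Vector:
    """Textbook conjugate gradient for a symmetric positive-definite system A x = b."""
    x = [0.0] * len(b)
    r = b[:]            # residual b - A x, with x = 0
    p = r[:]            # initial search direction
    rs = dot(r, r)
    for _ in range(max_iters):
        if rs < tol:    # squared residual norm small enough: stop
            break
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


# Toy data (hypothetical): party 1 holds two features, party 2 holds one.
party1 = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [1.0, 1.0], [2.0, 1.0]]
party2 = [[0.0], [0.0], [1.0], [1.0], [0.0]]
# Targets generated exactly as y = 2*x1 - 1*x2 + 0.5*x3.
y = [2.0, -1.0, 0.5, 1.5, 3.0]

A, b = normal_equations([party1, party2], y, lam=0.0)
theta = conjugate_gradient(A, b)
print(theta)  # should be close to [2.0, -1.0, 0.5]
```

Conjugate gradient needs only matrix-vector products, additions, and a few divisions per iteration, which is one reason an iterative solver of this family is attractive when every arithmetic operation has to be evaluated securely. The sketch sets lam to 0 so the toy model is recovered exactly; a positive value gives ridge regression.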