IET Information Security (Jan 2024)

A Novel Differentially Private Online Learning Algorithm for Group Lasso in Big Data

  • Jinxia Li
  • Liwei Lu

DOI
https://doi.org/10.1049/2024/5553292
Journal volume & issue
Vol. 2024

Abstract

Extracting valuable information and selecting key variables from large datasets is a central problem in statistics, computational science, and data science. In the age of big data, where safeguarding personal privacy is paramount, this study presents an online learning algorithm that uses differential privacy to handle large-scale data effectively. The focus is on improving the online group lasso approach under differential privacy. The study first compares online and offline learning and classifies common online learning techniques, then explains the concept of differential privacy and its importance. By enhancing the group-follow-the-proximally-regularized-leader (GFTPRL) algorithm, we develop a new method for the online group lasso model that integrates differential privacy for binary classification with logistic regression. The algorithm's effectiveness is validated on the basis of differential privacy and online learning principles, and its performance is evaluated through simulations on both synthetic and real data. The proposed privacy-preserving algorithm is compared with traditional non-privacy-preserving counterparts, with a focus on regret bounds as the measure of performance. The findings underscore the practical benefits of the differentially private algorithm for large-scale data analysis while upholding privacy standards. This research advances the integration of big data analytics with the protection of individual privacy.
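
As a rough illustration of how group-sparse regularization and gradient perturbation can be combined in an online learner for logistic regression, consider the Python sketch below. It is not the GFTPRL-based algorithm of the paper: it substitutes a simpler proximal online gradient step, with Gaussian noise added to a clipped per-example gradient. All function names and the parameters `lam`, `eta`, `clip_norm`, and `sigma` are hypothetical choices made for this example, and the noise scale is not calibrated to any particular privacy budget.

```python
import numpy as np


def group_soft_threshold(z, groups, lam):
    """Proximal operator of lam * sum_g ||w_g||_2 (block soft-thresholding)."""
    w = np.zeros_like(z)
    for idx in groups:
        norm = np.linalg.norm(z[idx])
        if norm > lam:
            # Shrink the whole group toward zero; groups with norm <= lam stay zero.
            w[idx] = (1.0 - lam / norm) * z[idx]
    return w


def clip(v, c):
    """Rescale v so that its Euclidean norm is at most c."""
    norm = np.linalg.norm(v)
    return v if norm <= c else v * (c / norm)


def noisy_logistic_gradient(w, x, y, clip_norm, sigma, rng):
    """Clipped logistic-loss gradient for one example (y in {-1, +1}) plus Gaussian noise."""
    margin = y * x.dot(w)
    # 1 / (1 + exp(margin)) written in a numerically stable form.
    grad = -y * x * 0.5 * (1.0 - np.tanh(0.5 * margin))
    grad = clip(grad, clip_norm)
    # Assumption: noise standard deviation sigma * clip_norm, chosen for illustration only.
    return grad + rng.normal(0.0, sigma * clip_norm, size=grad.shape)


def private_online_group_lasso(X, Y, groups, lam=0.05, eta=0.1,
                               clip_norm=1.0, sigma=0.5, seed=0):
    """One pass over the stream: noisy gradient step, then group-lasso proximal step."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for x, y in zip(X, Y):
        g = noisy_logistic_gradient(w, x, y, clip_norm, sigma, rng)
        w = group_soft_threshold(w - eta * g, groups, eta * lam)
    return w
```

Here `groups` is a list of index lists that partition the features, for example `[[0, 1], [2, 3]]`. Because the proximal step acts on whole groups, a group whose combined signal stays below the threshold is set to zero as a block, which is the variable-selection behavior that group lasso is used for.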