Electronic Research Archive (Apr 2024)
A differentially private distributed collaborative XGBoost method
Abstract
With the rapid progress of artificial intelligence (AI) technology in medical scenarios, medical services increasingly adopt AI algorithms for auxiliary diagnosis and patient health care. However, medical data are often sensitive and may be held by multiple participants who are unwilling to share them. To solve this problem in the vertically partitioned setting, where participants hold different features of the same patients, a differentially private distributed collaborative XGBoost method named DP-DCXGBoost was proposed and applied to disease classification. First, a reputation-based participant selection algorithm was designed; it evaluated each participant's contribution to the global model and used that contribution to compute a reputation score for selecting suitable participants. Then, in the collaborative training phase, the proposed method used each participant's local vertical dataset to calculate feature buckets and splitting gains, from which a differentially private global XGBoost classification model was collaboratively constructed. Finally, experiments on two real disease datasets showed that the proposed method achieved good classification accuracy while preserving the privacy of participants' data.
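The abstract does not specify how noise enters the splitting-gain computation, so the following is only an illustrative sketch, not the DP-DCXGBoost algorithm itself. It uses the standard XGBoost gain formula over per-bucket gradient and Hessian sums and, as an assumption, selects a split with a report-noisy-max step using Laplace noise. The function names, the `sensitivity` bound, and the conservative noise scale `2 * sensitivity / epsilon` are all hypothetical choices made for illustration.

```python
import random


def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """Standard XGBoost split gain from gradient (g) and Hessian (h) sums."""
    def score(g, h):
        return g * g / (h + lam)

    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent) - gamma


def bucket_gains(bucket_g, bucket_h, lam=1.0, gamma=0.0):
    """Gain of every candidate split between consecutive feature buckets."""
    total_g, total_h = sum(bucket_g), sum(bucket_h)
    gains = []
    g_left = h_left = 0.0
    # The last bucket is skipped: splitting after it leaves the right side empty.
    for g, h in zip(bucket_g[:-1], bucket_h[:-1]):
        g_left += g
        h_left += h
        gains.append(
            split_gain(g_left, h_left, total_g - g_left, total_h - h_left, lam, gamma)
        )
    return gains


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, drawn as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_best_split(gains, epsilon, sensitivity, rng):
    """Report-noisy-max: perturb each gain, then return the index of the maximum.

    Assumption: `sensitivity` bounds how much one record can change any gain.
    """
    scale = 2.0 * sensitivity / epsilon
    noisy = [g + laplace_noise(scale, rng) for g in gains]
    return max(range(len(noisy)), key=noisy.__getitem__)
```

In a vertical setting, each participant would run something like `bucket_gains` on its own features, and only the perturbed comparison would be shared. A smaller `epsilon` gives stronger privacy but makes the chosen split less likely to be the truly best one.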
Keywords