BMC Medical Research Methodology (Aug 2024)

Distributed non-disclosive validation of predictive models by a modified ROC-GLM

  • Daniel Schalk,
  • Raphael Rehms,
  • Verena S. Hoffmann,
  • Bernd Bischl,
  • Ulrich Mansmann

DOI
https://doi.org/10.1186/s12874-024-02312-4
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 17

Abstract


Background

Distributed statistical analyses provide a promising approach for privacy protection when analyzing data spread over several databases. Instead of operating directly on the data, the analyst receives anonymous summary statistics, which are combined into an aggregated result. Furthermore, in the development of discrimination models (prognosis, diagnosis, etc.), it is key to evaluate a trained model with respect to its prognostic or predictive performance on new independent data. For binary classification, discrimination is quantified by the receiver operating characteristic (ROC) curve and its area under the curve (AUC) as an aggregation measure. We aim to calculate both, as well as basic indicators of calibration-in-the-large, for a binary classification task using a distributed and privacy-preserving approach.

Methods

We employ DataSHIELD as the technology to carry out distributed analyses, and we use a newly developed algorithm to validate the prediction score by conducting a distributed and privacy-preserving ROC analysis. Calibration curves are constructed from mean values over sites. The determination of the ROC curve and its AUC is based on a generalized linear model (GLM) approximation of the true ROC curve, the ROC-GLM, as well as on ideas from differential privacy (DP). DP adds noise (quantified by the $\ell_2$ sensitivity $\Delta_2(\hat{f})$) to the data and enables a global handling of placement numbers. The impact of the DP parameters was studied by simulations.

Results

In our simulation scenario, the true and distributed AUC measures differ by $\Delta\text{AUC} < 0.01$, depending heavily on the choice of the differential privacy parameters. We recommend checking the accuracy of the distributed AUC estimator in specific simulation scenarios, along with a reasonable choice of DP parameters; the accuracy of the distributed AUC estimator may be impaired by too much artificial noise added through DP.
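The DP noise addition described in the Methods can be illustrated with the standard Gaussian mechanism, which calibrates noise to the $\ell_2$ sensitivity of the released quantity. This is a generic sketch using the classic analytic calibration, not the paper's exact procedure for placement numbers; the function name and parameters are illustrative:

```python
import numpy as np

def gaussian_mechanism(value, l2_sensitivity, epsilon, delta, rng=None):
    """Release `value` with (epsilon, delta)-DP by adding Gaussian noise
    scaled to the query's l2 sensitivity (classic analytic bound)."""
    rng = np.random.default_rng() if rng is None else rng
    # Noise scale grows with sensitivity and shrinks with the privacy budget.
    sigma = l2_sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return value + rng.normal(0.0, sigma, size=np.shape(value))

# A model with larger sensitivity (or a stricter epsilon) forces more noise,
# which is the accuracy/privacy trade-off the abstract points to.
scores = np.array([0.2, 0.7, 0.9])
noisy = gaussian_mechanism(scores, l2_sensitivity=0.1, epsilon=1.0, delta=1e-5)
```

The trade-off is visible in `sigma`: doubling the sensitivity, or halving `epsilon`, doubles the noise standard deviation and thus degrades the distributed AUC approximation.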
Conclusions

The applicability of our algorithms depends on the $\ell_2$ sensitivity $\Delta_2(\hat{f})$ of the underlying statistical/predictive model. The simulations carried out show that the approximation error is acceptable for the majority of simulated cases. For models with a high $\Delta_2(\hat{f})$, the privacy parameters must be set correspondingly higher to ensure sufficient privacy protection, which in turn affects the approximation error. This work shows that complex measures, such as the AUC, can be used for validation in distributed setups while preserving an individual's privacy.
