Evaluation of Risk Prediction with Hierarchical Data: Dependency Adjusted Confidence Intervals for the AUC

Camden Bay; Robert J Glynn; Johanna M Seddon; Mei-Ling Ting Lee; Bernard Rosner

doi:10.3390/stats6020034

Stats (Apr 2023)

Affiliations

Camden Bay: Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
Robert J Glynn: Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
Johanna M Seddon: University of Massachusetts Chan Medical School, Worcester, MA 01655, USA
Mei-Ling Ting Lee: Department of Epidemiology and Biostatistics, University of Maryland School of Public Health, College Park, MD 20742, USA
Bernard Rosner: Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA

DOI: https://doi.org/10.3390/stats6020034
Journal volume & issue: Vol. 6, no. 2, pp. 526–538

Abstract

The area under the true ROC curve (AUC) is routinely used to determine how strongly a given model discriminates between the levels of a binary outcome. Standard inference with the AUC requires that outcomes be independent of each other. To overcome this limitation, a method was developed for the estimation of the variance of the AUC in the setting of two-level hierarchical data using probit-transformed prediction scores generated from generalized estimating equation models, thereby allowing for the application of inferential methods. This manuscript presents an extension of this approach so that inference for the AUC may be performed in a three-level hierarchical data setting (e.g., eyes nested within persons and persons nested within families). A method that accounts for the effect of tied prediction scores on inference is also described. The performance of 95% confidence intervals around the AUC was assessed through the simulation of three-level clustered data in multiple settings, including ones with tied data and variable cluster sizes. Across all settings, the actual 95% confidence interval coverage varied from 0.943 to 0.958, and the ratio of the theoretical variance to the empirical variance of the AUC varied from 0.920 to 1.013. The results are better than those from existing methods. Two examples of applying the proposed methodology are presented.
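To make the quantities in the abstract concrete, the following is a minimal sketch, not the authors' estimator. It computes the AUC as a Mann-Whitney statistic in which a tied case/control pair counts one half, and then forms a percentile interval by resampling whole top-level clusters (for example, families) with replacement. The cluster bootstrap is a swapped-in generic technique for respecting within-cluster dependence; the paper's variance formula based on probit-transformed scores is not reproduced here. The function names, the `(cluster_id, score, label)` record layout, and the defaults are all assumptions made for illustration.

```python
import random


def auc_with_ties(scores, labels):
    """Mann-Whitney estimate of the AUC; a tied case/control pair counts 1/2."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    total = 0.0
    for c in cases:
        for d in controls:
            if c > d:
                total += 1.0
            elif c == d:
                total += 0.5  # tie: half credit
    return total / (len(cases) * len(controls))


def cluster_bootstrap_ci(records, n_boot=500, level=0.95, seed=0):
    """Percentile interval for the AUC from resampling top-level clusters.

    records: iterable of (cluster_id, score, label) tuples (hypothetical layout).
    Whole clusters are drawn with replacement, so all observations that share
    a cluster (e.g. both eyes of every person in a family) stay together.
    """
    rng = random.Random(seed)
    by_cluster = {}
    for cid, score, label in records:
        by_cluster.setdefault(cid, []).append((score, label))
    ids = list(by_cluster)

    estimates = []
    for _ in range(n_boot):
        drawn = rng.choices(ids, k=len(ids))
        sample = [obs for cid in drawn for obs in by_cluster[cid]]
        labels = [y for _, y in sample]
        # Skip resamples that contain only cases or only controls.
        if 0 < sum(labels) < len(labels):
            estimates.append(auc_with_ties([s for s, _ in sample], labels))
    if not estimates:
        raise ValueError("no usable bootstrap resamples")

    estimates.sort()
    alpha = 1.0 - level
    lower = estimates[int((alpha / 2) * (len(estimates) - 1))]
    upper = estimates[int((1 - alpha / 2) * (len(estimates) - 1))]
    return lower, upper
```

Resampling at the top level only is a common simplification for nested data; whether it matches the coverage reported in the abstract is not something this sketch can claim.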

Published in Stats

ISSN: 2571-905X (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Social Sciences: Statistics
Website: https://www.mdpi.com/journal/stats
