Stats (Apr 2023)
Evaluation of Risk Prediction with Hierarchical Data: Dependency Adjusted Confidence Intervals for the AUC
Abstract
The area under the true ROC curve (AUC) is routinely used to determine how strongly a given model discriminates between the levels of a binary outcome. Standard inference with the AUC requires that outcomes be independent of each other. To overcome this limitation, a method was developed for the estimation of the variance of the AUC in the setting of two-level hierarchical data using probit-transformed prediction scores generated from generalized estimating equation models, thereby allowing for the application of inferential methods. This manuscript presents an extension of this approach so that inference for the AUC may be performed in a three-level hierarchical data setting (e.g., eyes nested within persons and persons nested within families). A method that accounts for the effect of tied prediction scores on inference is also described. The performance of 95% confidence intervals around the AUC was assessed through the simulation of three-level clustered data in multiple settings, including ones with tied data and variable cluster sizes. Across all settings, the actual 95% confidence interval coverage varied from 0.943 to 0.958, and the ratio of the theoretical variance to the empirical variance of the AUC varied from 0.920 to 1.013. The results are better than those from existing methods. Two examples of applying the proposed methodology are presented.
Keywords