Lupus Science and Medicine (May 2024)
Development and validation of a risk scoring system to identify patients with lupus nephritis in electronic health record data
Abstract
Objective Accurate identification of lupus nephritis (LN) cases is essential for patient management, research and public health initiatives. However, LN diagnosis codes in electronic health records (EHRs) are underused, hindering efficient identification. We investigated the current performance of International Classification of Diseases (ICD) codes, 9th and 10th editions (ICD9/10), for identifying prevalent LN, and developed scoring systems, adaptable to settings with and without LN ICD codes, to improve identification of LN.
Methods Training and test sets were derived from EHR data from a large health system. An external set comprised data from the EHR of a second large health system. Adults with ICD9/10 codes for SLE were included. LN cases were ascertained through manual chart reviews conducted by rheumatologists. Two definitions of LN were used: strict (definite LN) and inclusive (definite, potential or diagnostic uncertainty). Gradient boosting models including structured EHR fields were used for predictor selection. Two logistic regression-based scoring systems were developed (‘LN-Code’ included LN ICD codes and ‘LN-No Code’ did not), calibrated and validated using standard performance metrics.
Results A total of 4152 patients from University of California San Francisco Medical Center and 370 patients from Zuckerberg San Francisco General Hospital and Trauma Center met the eligibility criteria. Mean age was 50 years and 87% were female. LN diagnosis codes demonstrated low sensitivity (43–73%) but high specificity (92–97%). LN-Code achieved an area under the curve (AUC) of 0.93 and a sensitivity of 0.88 for identifying LN using the inclusive definition. LN-No Code reached an AUC of 0.91 and a sensitivity of 0.95 (0.97 for the strict definition). Both scoring systems had good external validity, calibration and performance across racial and ethnic groups.
Conclusions This study quantified the underutilisation of LN diagnosis codes in EHRs and introduced two adaptable scoring systems to enhance LN identification. Further validation in diverse healthcare settings is essential to ensure their broader applicability.
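The sensitivity and specificity figures reported above compare a binary flag (for example, the presence of an LN ICD code) against chart-review labels. The following is a minimal sketch of that calculation only. The function name, variable names and the eight-patient toy data are invented for illustration and are not the study's data or code.

```python
def sensitivity_specificity(flags, labels):
    """Return (sensitivity, specificity) of binary flags against reference labels.

    flags  -- 1 if the patient is flagged (e.g. has an LN ICD code), else 0
    labels -- 1 if chart review confirmed LN, else 0
    """
    pairs = list(zip(flags, labels))
    tp = sum(1 for f, y in pairs if f and y)          # flagged, true case
    fn = sum(1 for f, y in pairs if not f and y)      # missed true case
    tn = sum(1 for f, y in pairs if not f and not y)  # correctly unflagged
    fp = sum(1 for f, y in pairs if f and not y)      # flagged, not a case
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical toy data: NOT taken from the study.
has_ln_code = [1, 0, 0, 1, 0, 0, 1, 0]
chart_review_ln = [1, 1, 0, 1, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(has_ln_code, chart_review_ln)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A flag with low sensitivity and high specificity, as reported for the LN diagnosis codes, misses many true cases but rarely flags a patient without LN.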