COPD (Dec 2024)
Machine-Learning Model Identifies Patients With Alpha-1 Antitrypsin Deficiency Using Claims Records
Abstract
Identifying patients with rare diseases such as alpha-1 antitrypsin deficiency (AATD) is challenging. Machine-learning models can be trained to identify patients with rare diseases using large-scale, real-world databases, whereas electronic medical records contain few confirmed cases and are of limited use in training such models. We applied a machine-learning model to a large US claims database to identify undiagnosed symptomatic patients with AATD. Using deidentified data from the Komodo US claims database (April 26, 2016–January 31, 2023), a model was trained to identify symptomatic patients with a high probability of AATD. Claims records for 80 high-probability candidates identified by the model were independently reviewed and validated by 2 clinical experts. The experts indicated that 65 (81%) and 62 (78%) of the 80 high-probability candidate patients, respectively, should be tested for AATD. Feedback from this validation step informed model optimization. The optimized model was then applied to claims data to identify symptomatic patients with probable AATD. Eleven and 14 “features” of the claims data were informative in distinguishing patients with AATD from patients with COPD without AATD and from patients with unspecified chronic liver diseases, respectively. Moreover, patients with diagnosed AATD and patients with COPD without AATD had distinct cadences of similar medical events in their diagnostic journeys. Our work shows that a machine-learning model trained on a large US claims database can accurately identify symptomatic patients with AATD and provides useful insights into the diagnostic journey of patients with AATD.
Keywords