EClinicalMedicine (Sep 2024)
Development and validation of a mortality risk prediction model for chronic obstructive pulmonary disease: a cross-sectional study using probabilistic graphical modelling
Abstract
Summary: Background: Chronic Obstructive Pulmonary Disease (COPD) is a leading cause of mortality. Predicting mortality risk in patients with COPD can be important for disease management strategies. Although all-cause mortality predictors have been developed previously, limited research exists on factors directly affecting COPD-specific mortality. Methods: In a retrospective study, we used probabilistic graphs to analyse clinical cross-sectional data (COPDGene cohort), including demographics, spirometry, quantitative chest imaging, and symptom features, as well as gene expression data. COPDGene recruited current and former smokers, aged 45–80 years with >10 pack-years smoking history, from across the USA (Phase 1, 11/2007-4/2011) and invited them for a follow-up visit (Phase 2, 7/2013-7/2017). The ECLIPSE cohort recruited current and former smokers (COPD patients and controls from the USA and Europe), aged 45–80 years with >10 pack-years smoking history (12/2005-11/2007). We applied graphical models to multi-modal data from COPDGene Phase 1 participants to identify factors directly affecting all-cause and COPD-specific mortality (primary outcomes), and to the Phase 2 follow-up cohort to identify additional molecular and social factors affecting mortality. We used penalized Cox regression with features selected by the causal graph to build VAPORED, a mortality risk prediction model. VAPORED was compared to existing scores (BODE: BMI, airflow obstruction, dyspnoea, exercise capacity; ADO: age, dyspnoea, airflow obstruction) on the ability to rank individuals by mortality risk, using four evaluation metrics (concordance, concordance probability estimate (CPE), cumulative/dynamic (C/D) area under the receiver operating characteristic curve (AUC), and integrated C/D AUC). The results were validated in ECLIPSE. Findings: Graphical models, applied to the COPDGene Phase 1 samples (n = 8610), identified 11 and 7 variables directly linked to all-cause and COPD-specific mortality, respectively. 
Although many appear in both models, non-lung comorbidities appear only in the all-cause model, while forced vital capacity (FVC %predicted) appears only in the COPD-specific mortality model. Additionally, the graph model of Phase 2 data (n = 3182) identified internet access, CD4 T cells and platelets to be linked to lower mortality risk. Furthermore, using the 7 variables linked to COPD-specific mortality (forced expiratory volume in 1 s/forced vital capacity (FEV1/FVC) ratio, FVC %predicted, age, history of pneumonia, oxygen saturation, 6-min walk distance, dyspnoea), we developed the VAPORED mortality risk score, which we validated on the ECLIPSE cohort (3-year all-cause mortality data, n = 2312). VAPORED performed significantly better than the ADO, BODE, and updated BODE indices in predicting all-cause mortality in ECLIPSE in terms of concordance (VAPORED [0.719] vs ADO [0.693; FDR p-value 0.014], BODE [0.695; FDR p-value 0.020], and updated BODE [0.694; FDR p-value 0.021]); CPE (VAPORED [0.714] vs ADO [0.673; FDR p-value <0.0001], BODE [0.662; FDR p-value <0.0001], and updated BODE [0.646; FDR p-value <0.0001]); 3-year C/D AUC (VAPORED [0.728] vs ADO [0.702; FDR p-value 0.017], BODE [0.704; FDR p-value 0.021], and updated BODE [0.703; FDR p-value 0.024]); and integrated C/D AUC (VAPORED [0.723] vs ADO [0.698; FDR p-value 0.047], BODE [0.695; FDR p-value 0.024], and updated BODE [0.690; FDR p-value 0.021]). Finally, we developed a web tool to help clinicians calculate VAPORED mortality risk and compare it to ADO and BODE predictions. Interpretation: Our work is an important step towards improving our identification of high-risk patients and generating hypotheses of potential biological mechanisms and social factors driving mortality in patients with COPD at the population level. The main limitation of our study is that the analysed datasets consist of older people with extensive smoking history and limited racial diversity. 
Thus, the results are relevant to, and the VAPORED score is validated for, high-risk individuals or those diagnosed with COPD. Funding: This research was supported by NIH [NHLBI, NLM]. The COPDGene study is supported by the COPD Foundation, through grants from AstraZeneca, Bayer Pharmaceuticals, Boehringer Ingelheim, Genentech, GlaxoSmithKline, Novartis, Pfizer and Sunovion.
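
Illustrative note on the concordance metric: the Methods compare risk scores by how well they rank individuals by mortality risk. The sketch below shows one common way to compute such a ranking metric, Harrell's concordance index, in plain Python. The abstract does not state which concordance estimator the authors used, so the choice of Harrell's C is an assumption. The function name `concordance_index`, the handling of ties, and the toy follow-up data are all hypothetical and are not taken from COPDGene or ECLIPSE.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    times       -- follow-up time for each individual
    events      -- 1 if death was observed, 0 if censored
    risk_scores -- higher score means higher predicted mortality risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed
        # death; pairs with tied times are skipped in this sketch.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier death had the higher score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort (not study data): follow-up in years,
# death indicator, and an arbitrary risk score.
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 0, 1, 1, 0]
toy_scores = [0.9, 0.3, 0.4, 0.7, 0.1]

if __name__ == "__main__":
    print(round(concordance_index(toy_times, toy_events, toy_scores), 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the concordance values reported above (for example 0.719 for VAPORED) are read. The other three metrics (CPE, C/D AUC, integrated C/D AUC) are not covered by this sketch.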