ERJ Open Research (Mar 2023)
Prediction of persistent chronic cough in patients with chronic cough using machine learning
Abstract
Introduction The aim of this study was to develop and validate prediction models for the risk of persistent chronic cough (PCC) in patients with chronic cough (CC).
Methods In this retrospective cohort study, two cohorts of patients aged 18–85 years were identified for the years 2011–2016: a specialist cohort, which included CC patients diagnosed by specialists, and an event cohort, which comprised CC patients identified by at least three cough events. A cough event could be a cough diagnosis, a dispensing of cough medication or any indication of cough in clinical notes. PCC was defined as a CC diagnosis or any two (specialist cohort) or three (event cohort) cough events in year 2 and again in year 3 after the index date. Model training and validation were conducted using two machine-learning approaches and more than 400 features. Sensitivity analyses were also conducted.
Results In total, 8581 and 52 010 patients met the eligibility criteria for the specialist and event cohorts (mean age 60.0 and 55.5 years), respectively. Of these, 38.2% and 12.4% of patients in the specialist and event cohorts, respectively, developed PCC. The utilisation-based models were mainly based on baseline healthcare utilisation associated with CC or respiratory diseases, while the diagnosis-based models incorporated traditional parameters including age, asthma, pulmonary fibrosis, obstructive pulmonary disease, gastro-oesophageal reflux, hypertension and bronchiectasis. All final models were parsimonious (five to seven predictors) and moderately accurate (area under the curve: 0.74–0.76 for utilisation-based models and 0.71 for diagnosis-based models).
Conclusions Our risk prediction models may be used to identify patients at high risk of PCC at any stage of clinical testing or evaluation to facilitate decision making.
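The PCC outcome definition given in the Methods can be read as a simple labelling rule. The following is a minimal sketch of that rule, not the study's actual code. It assumes one plausible reading of the definition: that in each of year 2 and year 3 a patient needs either a CC diagnosis or the cohort-specific minimum number of cough events. The names `YearSummary`, `MIN_EVENTS`, `year_qualifies` and `has_pcc` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

# Minimum cough events required per follow-up year, by cohort
# (two for the specialist cohort, three for the event cohort, per the abstract).
MIN_EVENTS = {"specialist": 2, "event": 3}


@dataclass
class YearSummary:
    """Hypothetical per-patient summary of one follow-up year."""
    cc_diagnosis: bool  # any CC diagnosis recorded during the year
    cough_events: int   # number of cough events recorded during the year


def year_qualifies(summary: YearSummary, cohort: str) -> bool:
    # Assumed reading: a CC diagnosis OR enough cough events within the year.
    return summary.cc_diagnosis or summary.cough_events >= MIN_EVENTS[cohort]


def has_pcc(year2: YearSummary, year3: YearSummary, cohort: str) -> bool:
    # The criterion must be met in year 2 and again in year 3 after the index date.
    return year_qualifies(year2, cohort) and year_qualifies(year3, cohort)
```

Under this reading, a specialist-cohort patient with two cough events in year 2 and a CC diagnosis in year 3 would be labelled as having PCC, whereas an event-cohort patient with only two cough events in each year would not.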