Algorithms (Oct 2024)
A Machine Learning Approach to Identifying Risk Factors for Long COVID-19
Abstract
Long-term sequelae of coronavirus disease 2019 (COVID-19) infection are common and can be debilitating. Understanding the risk factors for Long COVID-19 is needed to drive the development of targeted yet holistic clinical and public health interventions that reduce its associated healthcare and economic burden. Because the implicated predictors are numerous and varied, spanning health-related and sociodemographic factors, machine learning is a valuable tool. This study therefore uses machine learning to produce an algorithm that predicts Long COVID-19 risk and thereby identifies key predisposing factors. Longitudinal cohort data were sourced from the UK’s “Understanding Society: COVID-19 Study” (n = 601 participants with past symptomatic COVID-19 infection confirmed by serology testing). The random forest classification algorithm demonstrated good overall performance, with high sensitivity (97.4%) and modest specificity (65.4%). Significant risk factors included acute COVID-19 infection early in the pandemic, a greater number of hours worked per week, older age and financial insecurity. Loneliness and having uncommon health conditions were associated with lower risk. Sensitivity analysis suggested that COVID-19 vaccination is also associated with lower risk, and asthma with increased risk. The results are discussed with emphasis on the value of machine learning, its potential clinical utility, and some of its benefits and limitations for health science researchers, given its availability in commonly used statistical software.
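For readers less familiar with the two performance measures reported in the abstract, the following is a minimal Python sketch of how sensitivity and specificity are derived from a binary classifier's predictions. It is not the study's code: the function name `sensitivity_specificity` and the toy labels are hypothetical and serve only to illustrate the arithmetic, with 1 denoting Long COVID-19 and 0 denoting its absence.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("Both classes must be present in y_true.")
    sensitivity = tp / (tp + fn)  # share of true cases that are detected
    specificity = tn / (tn + fp)  # share of non-cases correctly ruled out
    return sensitivity, specificity


# Hypothetical labels for illustration only; these are not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 0, 0]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

In this toy case every true case is flagged but half of the non-cases are flagged as well, which mirrors in exaggerated form the pattern the abstract describes: a classifier that rarely misses Long COVID-19 at the cost of more false positives.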
Keywords