Clinical subphenotypes in COVID-19: derivation, validation, prediction, temporal patterns, and interaction with social determinants of health

Chang Su; Yongkang Zhang; James H. Flory; Mark G. Weiner; Rainu Kaushal; Edward J. Schenck; Fei Wang

doi:10.1038/s41746-021-00481-w

npj Digital Medicine (Jul 2021)

Affiliations

Chang Su: Department of Population Health Sciences, Weill Cornell Medicine
Yongkang Zhang: Department of Population Health Sciences, Weill Cornell Medicine
James H. Flory: Memorial Sloan-Kettering Cancer Center
Mark G. Weiner: Department of Population Health Sciences, Weill Cornell Medicine
Rainu Kaushal: Department of Population Health Sciences, Weill Cornell Medicine
Edward J. Schenck: New York-Presbyterian Hospital, Weill Cornell Medicine
Fei Wang: Department of Population Health Sciences, Weill Cornell Medicine

DOI: https://doi.org/10.1038/s41746-021-00481-w
Journal volume & issue: Vol. 4, no. 1, pp. 1–13

Abstract

Coronavirus disease 2019 (COVID-19) is heterogeneous, and our understanding of the biological mechanisms of the host response to the viral infection remains limited. Identifying meaningful clinical subphenotypes may benefit pathophysiological study, clinical practice, and clinical trials. Our aim was to derive and validate COVID-19 subphenotypes using machine learning and routinely collected clinical data, to assess temporal patterns of these subphenotypes over the course of the pandemic, and to examine their interaction with social determinants of health (SDoH). We retrospectively analyzed 14418 COVID-19 patients at five major medical centers in New York City (NYC) between March 1 and June 12, 2020. Using clustering analysis, we derived four biologically distinct subphenotypes in the development cohort (N = 8199). The identified subphenotypes were highly predictive of clinical outcomes, especially 60-day mortality. Sensitivity analyses in the development cohort, together with rederivation and prediction in the internal (N = 3519) and external (N = 3519) validation cohorts, confirmed the reproducibility and usability of the subphenotypes. Further analyses showed that subphenotype prevalence varied across the peak of the outbreak in NYC. We also found that SDoH specifically influenced the mortality outcome in Subphenotype IV, which is associated with older age, worse clinical manifestations, and a high comorbidity burden. Our findings may lead to a better understanding of how COVID-19 causes disease in different populations and may benefit clinical trial development. The temporal patterns and SDoH implications of the subphenotypes may inform health policy aimed at reducing social disparities in the pandemic.

Published in npj Digital Medicine

ISSN: 2398-6352 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://www.nature.com/npjdigitalmed/
