Machine learning approaches for influenza A virus risk assessment identifies predictive correlates using ferret model in vivo data

Troy J. Kieran; Xiangjie Sun; Taronna R. Maines; Jessica A. Belser

doi:10.1038/s42003-024-06629-0

Communications Biology (Aug 2024)

Affiliations

Troy J. Kieran: Influenza Division, Centers for Disease Control and Prevention
Xiangjie Sun: Influenza Division, Centers for Disease Control and Prevention
Taronna R. Maines: Influenza Division, Centers for Disease Control and Prevention
Jessica A. Belser: Influenza Division, Centers for Disease Control and Prevention

DOI: https://doi.org/10.1038/s42003-024-06629-0
Journal volume & issue: Vol. 7, no. 1
pp. 1 – 14

Abstract

In vivo assessments of influenza A virus (IAV) pathogenicity and transmissibility in ferrets represent a crucial component of many pandemic risk assessment rubrics, but few systematic efforts have been made to identify which data from in vivo experimentation are most useful for predicting pathogenesis and transmission outcomes. To this end, we aggregated viral and molecular data from 125 contemporary IAVs (H1, H2, H3, H5, H7, and H9 subtypes) evaluated in ferrets under a consistent protocol. Models for three overarching predictive classification outcomes (lethality, morbidity, transmissibility) were constructed using machine learning (ML) techniques, employing datasets that emphasized virological and clinical parameters from inoculated ferrets, were limited to viral sequence-based information, or combined both data types. Among 11 different ML algorithms tested, gradient boosting machines and random forest algorithms yielded the highest performance, with models for lethality and transmission consistently outperforming models predicting morbidity. Comparisons of feature selection among models were performed, and the highest-performing models were validated with results from external risk assessment studies. Our findings show that ML algorithms can distill complex in vivo experimental work into succinct summaries that inform and enhance risk assessment criteria for pandemic preparedness that take in vivo data into account.
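
The kind of algorithm comparison described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline. It uses a single decision stump and a bagged ensemble of stumps with random feature subsets (a much-simplified stand-in for random forest, not gradient boosting or a full random forest), scored by k-fold cross-validated accuracy. The data are synthetic, not ferret data, and every function name here is hypothetical.

```python
import random
from statistics import mean

def make_toy_data(n=200, seed=0):
    """Synthetic stand-in data (assumption): 3 features, label driven by feature 0."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1) for _ in range(3)]
        X.append(row)
        y.append(1 if row[0] + 0.3 * rng.gauss(0, 1) > 0 else 0)
    return X, y

def fit_stump(X, y, features):
    """Return the (feature, threshold, polarity) with the fewest training errors."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            for pol in (1, 0):  # pol is the label predicted when value > threshold
                errors = sum((pol if row[f] > t else 1 - pol) != lab
                             for row, lab in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, f, t, pol)
    return best[1:]

def predict_stump(stump, row):
    f, t, pol = stump
    return pol if row[f] > t else 1 - pol

def fit_bagged_stumps(X, y, n_estimators=15, seed=0):
    """Fit stumps on bootstrap samples, each restricted to a random feature pair."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        feats = rng.sample(range(n_features), 2)    # random feature subset
        ensemble.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return ensemble

def predict_bagged(ensemble, row):
    """Majority vote across the stumps."""
    votes = sum(predict_stump(s, row) for s in ensemble)
    return 1 if 2 * votes > len(ensemble) else 0

def cross_val_accuracy(fit, predict, X, y, k=5, seed=0):
    """Mean held-out accuracy over k folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    scores = []
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(sum(predict(model, X[i]) == y[i] for i in fold) / len(fold))
    return mean(scores)

X, y = make_toy_data()
single = cross_val_accuracy(
    lambda Xt, yt: fit_stump(Xt, yt, range(len(Xt[0]))), predict_stump, X, y)
bagged = cross_val_accuracy(fit_bagged_stumps, predict_bagged, X, y)
print(f"single stump: {single:.2f}, bagged stumps: {bagged:.2f}")
```

In practice, a comparison across many algorithms would more likely rely on an established ML library and on metrics suited to imbalanced outcomes. The sketch only shows the shape of the procedure: fit each candidate on training folds, score on held-out folds, and compare the averages.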

Published in Communications Biology

ISSN: 2399-3642 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Science: Biology (General)
Website: https://www.nature.com/commsbio/