Predicting host species susceptibility to influenza viruses and coronaviruses using genome data and machine learning: a scoping review

Famke Alberts; Olaf Berke; Leilani Rocha; Sheila Keay; Grazieli Maboni; Zvonimir Poljak

doi:10.3389/fvets.2024.1358028

Frontiers in Veterinary Science (Sep 2024)

Affiliations

Famke Alberts: Department of Population Medicine, Ontario Veterinary College, University of Guelph, Guelph, ON, Canada
Olaf Berke: Department of Population Medicine, Ontario Veterinary College, University of Guelph, Guelph, ON, Canada
Olaf Berke: Centre for Public Health and Zoonoses, University of Guelph, Guelph, ON, Canada
Olaf Berke: Centre for Advancing Responsible and Ethical Artificial Intelligence, University of Guelph, Guelph, ON, Canada
Leilani Rocha: Department of Population Medicine, Ontario Veterinary College, University of Guelph, Guelph, ON, Canada
Sheila Keay: Department of Population Medicine, Ontario Veterinary College, University of Guelph, Guelph, ON, Canada
Grazieli Maboni: Athens Veterinary Diagnostic Laboratory, Department of Infectious Diseases, College of Veterinary Medicine, University of Georgia, Athens, GA, United States
Zvonimir Poljak: Department of Population Medicine, Ontario Veterinary College, University of Guelph, Guelph, ON, Canada
Zvonimir Poljak: Centre for Public Health and Zoonoses, University of Guelph, Guelph, ON, Canada

DOI: https://doi.org/10.3389/fvets.2024.1358028
Journal volume & issue: Vol. 11

Abstract

Introduction: Predicting which species are susceptible to viruses (i.e., host range) is important for understanding and developing effective strategies to control viral outbreaks in both humans and animals. The use of machine learning and bioinformatic approaches to predict viral hosts has expanded with advancements in in-silico techniques. We conducted a scoping review to identify the breadth of machine learning methods applied to influenza and coronavirus genome data for the identification of susceptible host species.

Methods: The protocol for this scoping review is available at https://hdl.handle.net/10214/26112. Five online databases were searched, yielding 1,217 citations published between January 2000 and May 2022. Citations were screened in duplicate for English language and in-silico research covering the use of machine learning to identify species susceptible to viruses.

Results: Fifty-three relevant publications were identified for data charting. The breadth of research was extensive, including 32 different machine learning algorithms used in combination with 29 different feature selection methods and 43 different genome data input formats. Authors used 20 different methods to assess accuracy. Most publications used influenza viruses (n = 31/53 publications, 58.5%); however, more recent publications focused on coronaviruses and other viruses in combination with influenza viruses (n = 22/53, 41.5%). The susceptible animal groups most often used were humans (n = 57/77 analyses, 74.0%), avian (n = 35/77, 45.4%), and swine (n = 28/77, 36.4%). In total, 53 different hosts were used, and most publications used data from multiple hosts.

Discussion: The main gaps in research were a lack of standardized reporting of methodology and the use of broad host categories for classification. Overall, approaches to viral host identification using machine learning were diverse and extensive.

Published in Frontiers in Veterinary Science

ISSN: 2297-1769 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Agriculture: Animal culture: Veterinary medicine
Website: https://www.frontiersin.org/journals/veterinary-science
