Nasopharyngeal metabolomics and machine learning approach for the diagnosis of influenza
Catherine A. Hogan,
Pranav Rajpurkar,
Hari Sowrirajan,
Nicholas A. Phillips,
Anthony T. Le,
Manhong Wu,
Natasha Garamani,
Malaya K. Sahoo,
Mona L. Wood,
ChunHong Huang,
Andrew Y. Ng,
Justin Mak,
Tina M. Cowan,
Benjamin A. Pinsky
Affiliations
Catherine A. Hogan
Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA; Clinical Virology Laboratory, Stanford Health Care, Palo Alto, CA 94304, USA; Corresponding author at: 655 W 12th Ave, Room 2054, Vancouver, BC, Canada, V6R 2M7
Pranav Rajpurkar
Stanford Computer Science Department, Stanford University, Stanford, CA 94305, USA
Hari Sowrirajan
Stanford Computer Science Department, Stanford University, Stanford, CA 94305, USA
Nicholas A. Phillips
Stanford Computer Science Department, Stanford University, Stanford, CA 94305, USA
Anthony T. Le
Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
Manhong Wu
Stanford Department of Anesthesiology, Stanford University, Stanford, CA 94305, USA
Natasha Garamani
Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
Malaya K. Sahoo
Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
Mona L. Wood
Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA; Clinical Virology Laboratory, Stanford Health Care, Palo Alto, CA 94304, USA
ChunHong Huang
Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
Andrew Y. Ng
Stanford Computer Science Department, Stanford University, Stanford, CA 94305, USA
Justin Mak
Stanford Biochemical Genetics Laboratory, Stanford Health Care, Palo Alto, CA 94304, USA
Tina M. Cowan
Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA; Stanford Biochemical Genetics Laboratory, Stanford Health Care, Palo Alto, CA 94304, USA
Benjamin A. Pinsky
Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA; Clinical Virology Laboratory, Stanford Health Care, Palo Alto, CA 94304, USA; Division of Infectious Diseases and Geographic Medicine, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
Background: Respiratory virus infections are significant causes of morbidity and mortality, and may induce host metabolite alterations by infecting respiratory epithelial cells. We investigated the use of liquid chromatography quadrupole time-of-flight mass spectrometry (LC/Q-TOF) combined with machine learning for the diagnosis of influenza infection.
Methods: We analyzed nasopharyngeal swab samples by LC/Q-TOF to identify distinct metabolic signatures for diagnosis of acute illness. Machine learning models were trained for classification, followed by Shapley additive explanation (SHAP) analysis to assess feature importance and support biomarker discovery.
Findings: A total of 236 samples were tested in the discovery phase by LC/Q-TOF, including 118 positive samples (40 influenza A 2009 H1N1, 39 influenza H3 and 39 influenza B) as well as 118 age- and sex-matched negative controls with acute respiratory illness. Analysis showed an area under the receiver operating characteristic curve (AUC) of 1.00 (95% confidence interval [95% CI] 0.99, 1.00), sensitivity of 1.00 (95% CI 0.86, 1.00) and specificity of 0.96 (95% CI 0.81, 0.99). The metabolite most strongly associated with differential classification was pyroglutamic acid. Independent validation of a biomarker signature based on the top 20 differentiating ion features was performed in a prospective cohort of 96 symptomatic individuals including 48 positive samples (24 influenza A 2009 H1N1, 5 influenza H3 and 19 influenza B) and 48 negative samples. Testing performed using a clinically applicable targeted approach, liquid chromatography triple quadrupole mass spectrometry, showed an AUC of 1.00 (95% CI 0.998, 1.00), sensitivity of 0.94 (95% CI 0.83, 0.98), and specificity of 1.00 (95% CI 0.93, 1.00). Limitations include the lack of sample suitability assessment and the need to validate these findings in additional patient populations.
Interpretation: This metabolomic approach has potential for diagnostic applications in infectious disease testing, including for other respiratory viruses, and may eventually be adapted for point-of-care testing.
Funding: None.
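The sensitivity and specificity figures in the Findings are proportions reported with 95% confidence intervals. As a rough illustration of how such intervals can be obtained, the Python sketch below computes both metrics from a 2x2 confusion table and attaches a Wilson score interval to each. The abstract does not state which interval method the authors used, so the choice of Wilson is an assumption. The counts in the usage example (45 of 48 positives detected, 48 of 48 negatives correctly classified) are also assumptions, inferred from the reported validation proportions rather than taken from the abstract. The function names are hypothetical.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%).

    Assumption: the abstract does not name its interval method; Wilson is
    used here only as one common choice.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (estimate, (lower, upper)) for sensitivity and specificity."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": (sens, wilson_interval(tp, tp + fn)),
        "specificity": (spec, wilson_interval(tn, tn + fp)),
    }


# Illustrative counts only (inferred, not stated in the abstract):
# 45/48 influenza-positive samples detected, 48/48 negatives classified correctly.
results = sensitivity_specificity(tp=45, fn=3, tn=48, fp=0)
for name, (estimate, (low, high)) in results.items():
    print(f"{name}: {estimate:.2f} (95% CI {low:.2f}, {high:.2f})")
```

Under these assumed counts, the intervals round to the values reported for the validation cohort, which is consistent with a Wilson-type interval but does not confirm the authors' method.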