Frontiers in Bioengineering and Biotechnology (Apr 2018)

A Machine Learning Application Based in Random Forest for Integrating Mass Spectrometry-Based Metabolomic Data: A Simple Screening Method for Patients With Zika Virus

  • Carlos Fernando Odir Rodrigues Melo,
  • Luiz Claudio Navarro,
  • Diogo Noin de Oliveira,
  • Tatiane Melina Guerreiro,
  • Estela de Oliveira Lima,
  • Jeany Delafiori,
  • Mohamed Ziad Dabaja,
  • Marta da Silva Ribeiro,
  • Maico de Menezes,
  • Rafael Gustavo Martins Rodrigues,
  • Karen Noda Morishita,
  • Cibele Zanardi Esteves,
  • Aline Lopes Lucas de Amorim,
  • Caroline Tiemi Aoyagui,
  • Pierina Lorencini Parise,
  • Guilherme Paier Milanez,
  • Gabriela Mansano do Nascimento,
  • André Ricardo Ribas Freitas,
  • Rodrigo Angerami,
  • Fábio Trindade Maranhão Costa,
  • Clarice Weis Arns,
  • Mariangela Ribeiro Resende,
  • Eliana Amaral,
  • Renato Passini Junior,
  • Carolina C. Ribeiro-do-Valle,
  • Helaine Milanez,
  • Maria Luiza Moretti,
  • Jose Luiz Proenca-Modena,
  • Sandra Avila,
  • Anderson Rocha,
  • Rodrigo Ramos Catharino

DOI
https://doi.org/10.3389/fbioe.2018.00031
Journal volume & issue
Vol. 6

Abstract

Recent Zika outbreaks in South America, accompanied by unexpectedly severe clinical complications, have generated considerable interest in fast and reliable screening methods for identifying Zika virus (ZIKV). Reverse-transcriptase polymerase chain reaction (RT-PCR) is currently the method of choice for detecting ZIKV in biological samples. This approach, however, demands considerable time and resources, such as kits and reagents, which in endemic areas may impose a substantial financial burden on affected individuals and health services, steering them away from RT-PCR analysis. This study presents a powerful combination of high-resolution mass spectrometry and a machine-learning prediction model for data analysis to assess the presence of ZIKV infection across a series of patients who present similar symptoms but are not necessarily infected with the virus. By feeding mass spectrometric data into the developed decision-making algorithm, we were able to identify a set of features that works as a “fingerprint” for this specific pathophysiological condition, even after the acute phase of infection. Since both mass spectrometry and machine learning are well-established and widely used tools within their respective fields, this combination of methods emerges as a distinct alternative for clinical applications, providing diagnostic screening that is faster and more accurate, with improved cost-effectiveness, when compared to existing technologies.

Keywords