Scientific Reports (Dec 2020)

Data preprocessing workflow for exhaled breath analysis by GC/MS using open sources

  • Rosa Alba Sola Martínez,
  • José María Pastor Hernández,
  • Gema Lozano Terol,
  • Julia Gallego-Jara,
  • Luis García-Marcos,
  • Manuel Cánovas Díaz,
  • Teresa de Diego Puente

DOI
https://doi.org/10.1038/s41598-020-79014-6
Journal volume & issue
Vol. 10, no. 1
pp. 1 – 11

Abstract

The noninvasive diagnosis and monitoring of high-prevalence diseases such as cardiovascular diseases, cancers and chronic respiratory diseases are currently priority objectives in the area of health. In this regard, the analysis of volatile organic compounds (VOCs) has been identified as a potential noninvasive tool for the diagnosis and surveillance of several diseases. Despite its advantages, this strategy is not yet a routine clinical tool. One obstacle is the lack of reproducible protocols for each step of the biomarker discovery phase, and specifically for the data preprocessing step. This paper therefore presents an open-source workflow for preprocessing the data obtained by analyzing exhaled breath samples using gas chromatography coupled with single quadrupole mass spectrometry (GC/MS). The workflow connects two approaches to transform raw data into a matrix suitable for statistical analysis, and it also matches compounds from breath samples against a spectral library. Three free R packages (xcms, cliqueMS and eRah) are used for this purpose. In addition, the paper presents a suitable protocol for collecting exhaled breath samples from infants under 2 years of age for GC/MS analysis.