PLoS ONE (Jan 2022)

Fast and automated biomarker detection in breath samples with machine learning

  • Angelika Skarysz,
  • Dahlia Salman,
  • Michael Eddleston,
  • Martin Sykora,
  • Eugénie Hunsicker,
  • William H. Nailon,
  • Kareen Darnley,
  • Duncan B. McLaren,
  • C. L. Paul Thomas,
  • Andrea Soltoggio

Journal volume & issue
Vol. 17, no. 4

Abstract

Volatile organic compounds (VOCs) in human breath can reveal a large spectrum of health conditions and can be used for fast, accurate and non-invasive diagnostics. Gas chromatography-mass spectrometry (GC-MS) is used to measure VOCs, but its application is limited by expert-driven data analysis that is time-consuming, subjective and error-prone. We propose a machine learning-based system for GC-MS data analysis that exploits the pattern-recognition ability of deep learning to learn and automatically detect VOCs directly from raw data, thus bypassing expert-led processing. We evaluate this new approach on clinical samples with four types of convolutional neural networks (CNNs): VGG16, VGG-like, densely connected and residual CNNs. The proposed machine learning methods outperformed the expert-led analysis, detecting a significantly higher number of VOCs in a fraction of the time while maintaining high specificity. These results suggest that the proposed approach can help the large-scale deployment of breath-based diagnosis by reducing time and cost and by increasing accuracy and consistency.