Machine Learning of Raman Spectroscopy Data for Classifying Cancers: A Review of the Recent Literature

Nathan Blake; Riana Gaifulina; Lewis D. Griffin; Ian M. Bell; Geraint M. H. Thomas

doi:10.3390/diagnostics12061491

Diagnostics (Jun 2022)

Affiliations

Nathan Blake: Department of Cell and Developmental Biology, University College London, London WC1E 6BT, UK
Riana Gaifulina: Department of Cell and Developmental Biology, University College London, London WC1E 6BT, UK
Lewis D. Griffin: Department of Computer Science, University College London, London WC1E 6BT, UK
Ian M. Bell: Spectroscopy Products Division, Renishaw plc, Wotton-under-Edge GL12 8JR, UK
Geraint M. H. Thomas: Department of Cell and Developmental Biology, University College London, London WC1E 6BT, UK

Journal volume & issue: Vol. 12, no. 6, p. 1491

Abstract

Raman spectroscopy has long been anticipated to augment clinical decision making, such as classifying oncological samples. Unfortunately, the complexity of Raman data has thus far inhibited their routine use in clinical settings. Traditional machine learning models have been used to help exploit this information, but recent advances in deep learning have the potential to improve the field. However, there are a number of potential pitfalls with both traditional and deep learning models. We conduct a literature review to ascertain the recent machine learning methods used to classify cancers using Raman spectral data. We find that while deep learning models are popular, and ostensibly outperform traditional learning models, many methodological issues may be leading to an overestimation of performance, primarily small sample sizes, which compound sub-optimal choices regarding sampling and validation strategies. Amongst several recommendations is a call to collate large benchmark Raman datasets, similar to those that have helped transform digital pathology, which researchers can use to develop and refine deep learning models.

Published in Diagnostics

ISSN: 2075-4418 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Medicine: Medicine (General)
Website: http://www.mdpi.com/journal/diagnostics
