Scientific Reports (Mar 2024)

Identification of extracellular vesicles from their Raman spectra via self-supervised learning

  • Mathias N. Jensen,
  • Eduarda M. Guerreiro,
  • Agustin Enciso-Martinez,
  • Sergei G. Kruglik,
  • Cees Otto,
  • Omri Snir,
  • Benjamin Ricaud,
  • Olav Gaute Hellesø

DOI
https://doi.org/10.1038/s41598-024-56788-7
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 14

Abstract

Extracellular vesicles (EVs) released from cells attract interest for their possible role in health and disease. The detection and characterization of EVs are challenging due to the lack of specialized methodologies. Raman spectroscopy, however, has been suggested as a novel approach for biochemical analysis of EVs. To extract information from the spectra, a novel deep learning architecture is explored as a versatile variant of autoencoders. The proposed architecture considers the frequency range separately from the intensity of the spectra. This enables the model to adapt to the frequency range of each spectrum, rather than requiring that all spectra be pre-processed to the same frequency range as the data it was trained on. It is demonstrated that the proposed architecture accepts Raman spectra of EVs and lipoproteins from 13 biological sources and from two laboratories. High reconstruction accuracy is maintained despite large variances in frequency range and noise level. It is also shown that the architecture is able to cluster the biological nanoparticles by their Raman spectra and differentiate them by their origin without pre-processing of the spectra or supervision during learning. The model performs label-free differentiation, including separating EVs from activated versus non-activated blood platelets and EVs/lipoproteins from prostate cancer patients versus non-cancer controls. The differentiation is evaluated by training a neural network classifier that takes the features extracted by the model as input and classifies the spectra according to their sample origin. The classification yields a test sensitivity of 92.2% and a selectivity of 92.3% over 769 measurements from two labs that have different measurement configurations.
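
As an illustration of how sensitivity and selectivity figures like those quoted above can be computed for a multi-class classifier, the following is a minimal sketch and not the authors' evaluation code. It assumes that "selectivity" is used in the sense of specificity (true-negative rate) and that rates are taken per class in a one-vs-rest fashion. The function name `per_class_rates` and the toy labels `"EV"` and `"LP"` are hypothetical.

```python
def per_class_rates(y_true, y_pred):
    """Return {label: (sensitivity, specificity)} from one-vs-rest counts.

    Assumption: specificity stands in for the abstract's "selectivity".
    """
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    rates = {}
    for label in labels:
        # Count the four confusion-matrix cells, treating `label` as positive.
        tp = sum(t == label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        tn = sum(t != label and p != label for t, p in pairs)
        # Guard against classes that never occur as positives or negatives.
        sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
        specificity = tn / (tn + fp) if (tn + fp) else float("nan")
        rates[label] = (sensitivity, specificity)
    return rates


# Toy data (hypothetical): four "EV" spectra and two "LP" spectra.
y_true = ["EV", "EV", "EV", "EV", "LP", "LP"]
y_pred = ["EV", "EV", "EV", "LP", "LP", "LP"]
rates = per_class_rates(y_true, y_pred)
print(rates)
```

Averaging the per-class values (a macro-average) gives a single sensitivity and a single specificity for the whole test set; whether the reported figures were aggregated this way is not stated in the abstract.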