Nature Communications (Mar 2024)
Forward-predictive SERS-based chemical taxonomy for untargeted structural elucidation of epimeric cerebrosides
Abstract
Achieving untargeted chemical identification, isomeric differentiation, and quantification is critical to most scientific and technological problems but remains challenging. Here, we demonstrate an integrated SERS-based chemical taxonomy machine learning framework for untargeted structural elucidation of 11 epimeric cerebrosides, attaining >90% accuracy and robust single-epimer and multiplex quantification with <10% errors. First, we utilize 4-mercaptophenylboronic acid to selectively capture the epimers at molecular sites of isomerism to form epimer-specific SERS fingerprints. With corroboration from in-silico experiments, we establish five spectral features, each corresponding to a structural characteristic: (1) presence/absence of epimers, (2) monosaccharide/cerebroside, (3) saturated/unsaturated cerebroside, (4) glucosyl/galactosyl, and (5) the carbon chain length of GlcCer or GalCer. Leveraging these insights, we create a fully generalizable framework to identify and quantify cerebrosides at concentrations between 10⁻⁴ and 10⁻¹⁰ M and achieve multiplex quantification of binary mixtures containing the biomarkers GlcCer24:1 and GalCer24:1 using spectra on which the models were not trained.
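As an illustration only, the five structural characteristics listed in the abstract can be read as a cascade of decisions that composes a label for one spectrum. The following is a minimal sketch under stated assumptions, not the framework reported here: the names `TaxonomyModels` and `classify` are hypothetical, each per-level predictor is a stand-in for whatever trained model would make that decision, the order of the checks is one possible arrangement, and "unsaturated" is assumed to mean a single double bond (as in the 24:1 species named in the abstract).

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Spectrum = Sequence[float]  # one SERS spectrum as a list of intensities


@dataclass
class TaxonomyModels:
    """Hypothetical per-level predictors, one per structural characteristic."""
    has_epimer: Callable[[Spectrum], bool]      # (1) presence/absence of epimers
    is_cerebroside: Callable[[Spectrum], bool]  # (2) monosaccharide vs. cerebroside
    is_unsaturated: Callable[[Spectrum], bool]  # (3) saturated vs. unsaturated
    is_glucosyl: Callable[[Spectrum], bool]     # (4) glucosyl vs. galactosyl
    chain_length: Callable[[Spectrum], int]     # (5) carbon chain length


def classify(spectrum: Spectrum, models: TaxonomyModels) -> str:
    """Walk the taxonomy level by level and compose a label (illustrative order)."""
    if not models.has_epimer(spectrum):
        return "no epimer"
    sugar = "Glc" if models.is_glucosyl(spectrum) else "Gal"
    if not models.is_cerebroside(spectrum):
        return f"{sugar} monosaccharide"
    carbons = models.chain_length(spectrum)
    # Assumption: an unsaturated chain carries exactly one double bond.
    double_bonds = 1 if models.is_unsaturated(spectrum) else 0
    return f"{sugar}Cer{carbons}:{double_bonds}"


# Usage with stub predictors standing in for trained models.
stub = TaxonomyModels(
    has_epimer=lambda s: True,
    is_cerebroside=lambda s: True,
    is_unsaturated=lambda s: True,
    is_glucosyl=lambda s: False,
    chain_length=lambda s: 24,
)
print(classify([0.0, 0.3, 0.9], stub))  # GalCer24:1
```

Splitting the decision into levels means that a spectrum can still be assigned its coarser attributes (for example, glucosyl and unsaturated) even when a finer level such as chain length is uncertain.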