Metabolites (May 2024)

Accurate Prediction of <sup>1</sup>H NMR Chemical Shifts of Small Molecules Using Machine Learning

  • Tanvir Sajed,
  • Zinat Sayeeda,
  • Brian L. Lee,
  • Mark Berjanskii,
  • Fei Wang,
  • Vasuk Gautam,
  • David S. Wishart

DOI
https://doi.org/10.3390/metabo14050290
Journal volume & issue
Vol. 14, no. 5
p. 290

Abstract

NMR is widely considered the gold standard for organic compound structure determination. As such, NMR is routinely used in organic compound identification, drug metabolite characterization, natural product discovery, and the deconvolution of metabolite mixtures in biofluids (metabolomics and exposomics). In many cases, compound identification by NMR is achieved by matching measured NMR spectra to experimentally collected NMR spectral reference libraries. Unfortunately, the number of available experimental NMR reference spectra, especially for metabolomics, medical diagnostics, or drug-related studies, is quite small. This experimental gap could be filled by predicting NMR chemical shifts for known compounds using computational methods such as machine learning (ML). Here, we describe how a deep learning algorithm that is trained on a high-quality, “solvent-aware” experimental dataset can be used to predict <sup>1</sup>H chemical shifts more accurately than any other known method. The new program, called PROSPRE (PROton Shift PREdictor), can accurately (mean absolute error of <0.10 ppm) predict <sup>1</sup>H chemical shifts in water (at neutral pH), chloroform, dimethyl sulfoxide, and methanol from a user-submitted chemical structure. PROSPRE (pronounced “prosper”) has also been used to predict <sup>1</sup>H chemical shifts for >600,000 molecules in many popular metabolomic, drug, and natural product databases.
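
The abstract reports accuracy as a mean absolute error (MAE) between predicted and measured chemical shifts. As a minimal sketch of that metric only, and not of PROSPRE's own evaluation code, the MAE over paired shifts could be computed as below. The function name and the shift values are hypothetical and chosen purely for illustration; they are not data from the paper.

```python
from typing import Sequence


def mean_absolute_error(predicted_ppm: Sequence[float],
                        observed_ppm: Sequence[float]) -> float:
    """Return the average absolute difference (in ppm) between paired shifts."""
    if len(predicted_ppm) != len(observed_ppm):
        raise ValueError("predicted and observed shifts must be paired one-to-one")
    if len(predicted_ppm) == 0:
        raise ValueError("at least one pair of shifts is required")
    total = sum(abs(p - o) for p, o in zip(predicted_ppm, observed_ppm))
    return total / len(predicted_ppm)


# Hypothetical shifts for four protons of one molecule (illustrative values only).
predicted = [1.25, 2.00, 3.50, 7.00]
observed = [1.00, 2.00, 3.75, 7.50]

mae = mean_absolute_error(predicted, observed)
print(f"MAE = {mae:.2f} ppm")
```

Each predicted shift is compared with the experimental shift assigned to the same proton, so the pairing of atoms has to be settled before the metric is meaningful.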

Keywords