Precision Chemistry (Feb 2023)

Elucidating Structures of Complex Organic Compounds Using a Machine Learning Model Based on the 13C NMR Chemical Shifts

  • Anan Wu,
  • Qing Ye,
  • Xiaowei Zhuang,
  • Qiwen Chen,
  • Jinkun Zhang,
  • Jianming Wu,
  • Xin Xu

DOI
https://doi.org/10.1021/prechem.3c00005
Journal volume & issue
Vol. 1, no. 1
pp. 57 – 68

Abstract

We present a protocol, denoted SVM-M (i.e., SVM for magnetic property), that combines a support vector machine (SVM) model with accurate 13C chemical shift calculations at the xOPBE/6-311+G(2d,p) level of theory. We show that SVM-M is a versatile tool for making structural and stereochemical assignments of complex organic compounds with high confidence. Of particular significance, by exploiting the dual role of the SVM decision values, the protocol provides an accurate yet efficient way to handle simultaneously the classification problem (i.e., “is a given structure correct or incorrect?”) and the comparison problem (i.e., “which of several candidate structures is more likely to be correct, or wrong?”). The success rate is high (∼100% on a set of 760 sample molecules with 15928 13C chemical shifts), which makes SVM-M a powerful tool for routine structural and stereochemical assignment, as well as for detecting misassignments, in complex organic compounds, including natural products.
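
The dual role of the decision values can be illustrated with a minimal sketch. It is not the published SVM-M model. It assumes a linear decision function over two hypothetical error descriptors (the mean and maximum absolute deviation between calculated and experimental 13C shifts), with invented weights, bias, and shift values. The sign of the decision value answers the classification question, and its magnitude is used to rank candidate structures.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical parameters of an already-trained linear SVM.
# These numbers are invented for illustration and are NOT taken from the paper.
WEIGHTS: Tuple[float, float] = (-1.0, -0.25)  # penalise mean and max deviation (ppm)
BIAS: float = 4.0


def error_features(calc: Sequence[float], expt: Sequence[float]) -> Tuple[float, float]:
    """Return (mean absolute error, maximum absolute error) in ppm.

    Assumed descriptors; the actual SVM-M feature set may differ.
    """
    if len(calc) != len(expt) or len(calc) == 0:
        raise ValueError("calc and expt must be non-empty and of equal length")
    abs_err = [abs(c - e) for c, e in zip(calc, expt)]
    return sum(abs_err) / len(abs_err), max(abs_err)


def decision_value(calc: Sequence[float], expt: Sequence[float]) -> float:
    """Signed score of a linear decision function f(x) = w . x + b."""
    mae, max_err = error_features(calc, expt)
    return WEIGHTS[0] * mae + WEIGHTS[1] * max_err + BIAS


def is_correct(calc: Sequence[float], expt: Sequence[float]) -> bool:
    """Classification role: a positive decision value is read as 'correct'."""
    return decision_value(calc, expt) > 0.0


def rank_candidates(candidates: Dict[str, Sequence[float]],
                    expt: Sequence[float]) -> List[str]:
    """Comparison role: order candidates from most to least likely correct."""
    return sorted(candidates,
                  key=lambda name: decision_value(candidates[name], expt),
                  reverse=True)


# Made-up experimental and calculated 13C shifts (ppm) for two candidates.
expt_shifts = [20.0, 35.0, 70.0, 130.0]
candidates = {
    "A": [21.0, 34.0, 71.0, 131.0],
    "B": [25.0, 41.0, 62.0, 138.0],
}

for name in rank_candidates(candidates, expt_shifts):
    score = decision_value(candidates[name], expt_shifts)
    label = "correct" if is_correct(candidates[name], expt_shifts) else "incorrect"
    print(f"{name}: decision value = {score:+.2f} ({label})")
```

With these illustrative numbers, candidate A receives a positive decision value and candidate B a negative one, so the same score both labels each structure and orders the two candidates.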