Molecules (Jun 2019)

Machine Learning Models Combined with Virtual Screening and Molecular Docking to Predict Human Topoisomerase I Inhibitors

  • Bingke Li,
  • Xiaokang Kang,
  • Dan Zhao,
  • Yurong Zou,
  • Xudong Huang,
  • Jiexue Wang,
  • Chenghua Zhang

DOI
https://doi.org/10.3390/molecules24112107
Journal volume & issue
Vol. 24, no. 11
p. 2107

Abstract

In this work, four machine learning methods, random forest (RF), support vector machine, k-nearest neighbor, and C4.5 decision tree, were used to build classification models that predict whether an unknown molecule is an inhibitor of human topoisomerase I (Top1). All four models performed well, with total prediction accuracies ranging from 89.70% to 97.12%. Comparative analysis showed that the RF model gave the best predictive performance, and its parameters were further optimized to obtain the best-performing RF model. In parallel, feature selection with an RF-based procedure was used to identify, from 189 molecular descriptors, the properties most relevant to Top1 inhibition. The optimal RF model was then used for ligand-based virtual screening of the Maybridge database, which yielded 596 hits. Of these, 67 molecules with relative probability scores above 0.7 were selected and docked to Top1 using AutoDock Vina. Finally, six top-ranked molecules with binding energies below −10.0 kcal/mol were identified; they share a common backbone that is entirely different from those of existing Top1 inhibitors reported in the literature.
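The two selection steps in the abstract (keeping hits with a probability score over 0.7, then keeping docked molecules with a binding energy below −10.0 kcal/mol) can be illustrated with a minimal Python sketch. This is not the authors' code. The `Candidate` class, the function names, and the four example molecules with their scores are hypothetical and invented for illustration; only the two cutoff values come from the abstract. In a real workflow the binding energies would be produced by a docking run after the first filter, whereas here they are pre-filled to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import List, Optional

# Cutoffs quoted in the abstract.
PROBABILITY_CUTOFF = 0.7   # relative probability score from the classifier
ENERGY_CUTOFF = -10.0      # binding energy in kcal/mol (more negative = stronger)


@dataclass
class Candidate:
    """Hypothetical record for one screened molecule."""
    name: str
    rf_probability: float                   # predicted probability of being a Top1 inhibitor
    binding_energy: Optional[float] = None  # kcal/mol; None if not docked yet


def select_for_docking(candidates: List[Candidate]) -> List[Candidate]:
    """Keep candidates whose probability score is strictly above the cutoff."""
    return [c for c in candidates if c.rf_probability > PROBABILITY_CUTOFF]


def select_top_ranked(docked: List[Candidate]) -> List[Candidate]:
    """Keep docked candidates below the energy cutoff, strongest binder first."""
    keep = [
        c for c in docked
        if c.binding_energy is not None and c.binding_energy < ENERGY_CUTOFF
    ]
    return sorted(keep, key=lambda c: c.binding_energy)


# Made-up example data, not taken from the paper.
hits = [
    Candidate("mol_A", 0.91, -10.8),
    Candidate("mol_B", 0.65, -11.2),  # dropped: probability not above 0.7
    Candidate("mol_C", 0.74, -9.1),   # dropped: energy not below -10.0
    Candidate("mol_D", 0.82, -10.3),
]

to_dock = select_for_docking(hits)
top = select_top_ranked(to_dock)
print([c.name for c in top])  # → ['mol_A', 'mol_D']
```

Applying the probability filter before the energy filter mirrors the order described in the abstract: the cheaper classifier score narrows the set before the more expensive docking step is run.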

Keywords