Genomics, Proteomics & Bioinformatics (Oct 2019)

DeepCPI: A Deep Learning-based Framework for Large-scale in silico Drug Screening

  • Fangping Wan
  • Yue Zhu
  • Hailin Hu
  • Antao Dai
  • Xiaoqing Cai
  • Ligong Chen
  • Haipeng Gong
  • Tian Xia
  • Dehua Yang
  • Ming-Wei Wang
  • Jianyang Zeng

Journal volume & issue
Vol. 17, no. 5
pp. 478 – 495

Abstract

Accurate in silico identification of compound–protein interactions (CPIs) can deepen our understanding of the mechanisms underlying drug action and thereby facilitate drug discovery and development. Conventional similarity- or docking-based computational methods for predicting CPIs rarely exploit latent features in the large-scale unlabeled compound and protein data now available, and their use is often limited to relatively small datasets. In the present study, we propose DeepCPI, a novel, general, and scalable computational framework that combines feature embedding (a representation learning technique) with deep learning methods to accurately predict CPIs at a large scale. DeepCPI automatically learns implicit yet expressive low-dimensional features of compounds and proteins from a massive amount of unlabeled data. Evaluations on measured CPIs from large-scale databases, such as ChEMBL and BindingDB, as well as on known drug–target interactions from DrugBank, demonstrated the superior predictive performance of DeepCPI. Furthermore, several interactions predicted by DeepCPI between small-molecule compounds and three G protein-coupled receptor targets (glucagon-like peptide-1 receptor, glucagon receptor, and vasoactive intestinal peptide receptor) were experimentally validated. The present study suggests that DeepCPI is a useful and powerful tool for drug discovery and repositioning. The source code of DeepCPI can be downloaded from https://github.com/FangpingWan/DeepCPI.

Keywords: Deep learning, Machine learning, Drug discovery, In silico drug screening, Compound–protein interaction prediction
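
The general idea summarized in the abstract, embedding compounds and proteins as low-dimensional vectors and passing the pair to a neural network that scores the interaction, can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked in the abstract for that). Everything named here is an assumption made for illustration: the token vocabulary, the 3-mer protein tokenization, the averaging step, the layer sizes, and the functions `token_vector`, `embed`, `protein_tokens`, `init_layer`, and `predict`. The embeddings are deterministic pseudo-random stand-ins for learned ones and the weights are untrained, so the score carries no biological meaning.

```python
import math
import random

EMBED_DIM = 8  # hypothetical embedding size, chosen only for this sketch


def token_vector(token, dim=EMBED_DIM):
    """Stand-in for a learned embedding: a fixed pseudo-random vector per token."""
    rng = random.Random(token)  # seeding with the token string makes it repeatable
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def embed(tokens, dim=EMBED_DIM):
    """Average the token vectors into one fixed-length feature vector."""
    if not tokens:
        return [0.0] * dim
    vectors = [token_vector(t, dim) for t in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def protein_tokens(sequence, k=3):
    """Split a protein sequence into overlapping k-mers (assumed tokenization)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def init_layer(n_in, n_out, rng):
    """Small random weight matrix with n_out rows and n_in columns (untrained)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def predict(compound_vec, protein_vec, hidden_w, out_w):
    """Concatenate both embeddings, apply one ReLU layer, return a sigmoid score."""
    x = compound_vec + protein_vec
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in hidden_w]
    logit = sum(w * h for w, h in zip(out_w, hidden))
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
hidden_w = init_layer(2 * EMBED_DIM, 4, rng)
out_w = [rng.gauss(0.0, 0.1) for _ in range(4)]

# Hypothetical substructure tokens for a compound and a made-up protein fragment.
compound_vec = embed(["C-C", "C=O", "c1ccccc1"])
protein_vec = embed(protein_tokens("MAGAPGPLRLALLL"))
score = predict(compound_vec, protein_vec, hidden_w, out_w)
print(f"interaction score (untrained, illustrative only): {score:.3f}")
```

In a real pipeline the pseudo-random vectors would be replaced by embeddings learned from unlabeled compound and protein data, and the network weights would be fitted on labeled interaction records such as those the abstract cites from ChEMBL and BindingDB.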