DeepCPI: A Deep Learning-based Framework for Large-scale in silico Drug Screening

Fangping Wan; Yue Zhu; Hailin Hu; Antao Dai; Xiaoqing Cai; Ligong Chen; Haipeng Gong; Tian Xia; Dehua Yang; Ming-Wei Wang; Jianyang Zeng

Genomics, Proteomics & Bioinformatics (Oct 2019)

Affiliations

Fangping Wan: Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
Yue Zhu: The National Center for Drug Screening and the CAS Key Laboratory of Receptor Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
Hailin Hu: School of Medicine, Tsinghua University, Beijing 100084, China
Antao Dai: The National Center for Drug Screening and the CAS Key Laboratory of Receptor Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
Xiaoqing Cai: The National Center for Drug Screening and the CAS Key Laboratory of Receptor Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
Ligong Chen: School of Pharmaceutical Sciences, Tsinghua University, Beijing 100084, China
Haipeng Gong: School of Life Science, Tsinghua University, Beijing 100084, China
Tian Xia: Department of Electronics and Information Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
Dehua Yang: The National Center for Drug Screening and the CAS Key Laboratory of Receptor Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China (corresponding author)
Ming-Wei Wang: The National Center for Drug Screening and the CAS Key Laboratory of Receptor Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China; Shanghai Medical College, Fudan University, Shanghai 200032, China (corresponding author)
Jianyang Zeng: Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China; MOE Key Laboratory of Bioinformatics, Tsinghua University, Beijing 100084, China (corresponding author)

Journal volume & issue: Vol. 17, no. 5
pp. 478 – 495

Abstract

Accurate identification of compound–protein interactions (CPIs) in silico may deepen our understanding of the underlying mechanisms of drug action and thus remarkably facilitate drug discovery and development. Conventional similarity- or docking-based computational methods for predicting CPIs rarely exploit latent features from currently available large-scale unlabeled compound and protein data, and their use is often limited to relatively small-scale datasets. In the present study, we propose DeepCPI, a novel, general, and scalable computational framework that combines effective feature embedding (a technique of representation learning) with powerful deep learning methods to accurately predict CPIs at a large scale. DeepCPI automatically learns the implicit yet expressive low-dimensional features of compounds and proteins from a massive amount of unlabeled data. Evaluations on the measured CPIs in large-scale databases, such as ChEMBL and BindingDB, as well as on the known drug–target interactions from DrugBank, demonstrated the superior predictive performance of DeepCPI. Furthermore, several interactions between small-molecule compounds and three G protein-coupled receptor targets (glucagon-like peptide-1 receptor, glucagon receptor, and vasoactive intestinal peptide receptor) predicted using DeepCPI were experimentally validated. The present study suggests that DeepCPI is a useful and powerful tool for drug discovery and repositioning. The source code of DeepCPI can be downloaded from https://github.com/FangpingWan/DeepCPI.

Keywords: Deep learning, Machine learning, Drug discovery, In silico drug screening, Compound–protein interaction prediction
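The general idea described in the abstract, embedding compounds and proteins as low-dimensional vectors and scoring each pair with a neural network, can be illustrated with a minimal sketch. This is not the DeepCPI implementation: the token vocabularies, the k-mer tokenization, the random (untrained) embeddings and weights, the layer sizes, and all function names below are assumptions made only for illustration. The published source code at the URL above is the authoritative reference.

```python
import math
import random

EMBED_DIM = 8  # hypothetical size of the low-dimensional feature space


def build_embedding_table(vocabulary, dim, rng):
    """Give each token a random vector; a real system would learn these from unlabeled data."""
    return {token: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for token in vocabulary}


def embed(tokens, table, dim):
    """Average the vectors of known tokens into one fixed-length feature vector."""
    known = [table[t] for t in tokens if t in table]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def protein_kmers(sequence, k=3):
    """Split a protein sequence into overlapping k-mers (an assumed tokenization)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def dense(inputs, weights, biases):
    """One fully connected layer without activation."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def predict_interaction(compound_vec, protein_vec, params):
    """Concatenate both embeddings and score the pair with a tiny feed-forward network."""
    hidden = [max(0.0, h) for h in dense(compound_vec + protein_vec, params["w1"], params["b1"])]
    logit = dense(hidden, params["w2"], params["b2"])[0]
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid maps the logit into (0, 1)


rng = random.Random(0)
compound_tokens = ["c1ccccc1", "C(=O)O", "CN"]  # hypothetical substructure tokens
protein_seq = "MKTAYIAKQRQISFVKSH"               # made-up sequence, not a real target

compound_table = build_embedding_table(compound_tokens, EMBED_DIM, rng)
protein_table = build_embedding_table(sorted(set(protein_kmers(protein_seq))), EMBED_DIM, rng)

HIDDEN_SIZE = 4  # hypothetical hidden-layer width
params = {
    "w1": [[rng.uniform(-0.5, 0.5) for _ in range(2 * EMBED_DIM)] for _ in range(HIDDEN_SIZE)],
    "b1": [0.0] * HIDDEN_SIZE,
    "w2": [[rng.uniform(-0.5, 0.5) for _ in range(HIDDEN_SIZE)]],
    "b2": [0.0],
}

compound_vec = embed(compound_tokens, compound_table, EMBED_DIM)
protein_vec = embed(protein_kmers(protein_seq), protein_table, EMBED_DIM)
score = predict_interaction(compound_vec, protein_vec, params)
print(f"untrained interaction score: {score:.3f}")
```

Because the weights are random, the printed score carries no predictive meaning; the sketch only shows how two separately embedded inputs can be joined and mapped to a single interaction score.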

Published in Genomics, Proteomics & Bioinformatics

ISSN: 1672-0229 (Print); 2210-3244 (Online)
Publisher: Oxford University Press
Country of publisher: United Kingdom
LCC subjects: Science: Biology (General); Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://academic.oup.com/gpb
