Optimization of binding affinities in chemical space with generative pre-trained transformer and deep reinforcement learning [version 2; peer review: 2 approved, 3 approved with reservations]

Xiaopeng Xu; Juexiao Zhou; Xin Gao; Yu Wang; Xingyu Liao; Zhongxiao Li; Ruochi Zhang; Chen Zhu; Qing Zhan

F1000Research (Feb 2024)

Affiliations

Xiaopeng Xu: Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
Juexiao Zhou: Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
Xin Gao: Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
Yu Wang: Syneron Technology, Guangzhou, China
Xingyu Liao: Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
Zhongxiao Li: Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
Ruochi Zhang: Syneron Technology, Guangzhou, China
Chen Zhu: KAUST Catalysis Center (KCC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
Qing Zhan: Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia

Journal volume & issue: Vol. 12

Abstract

Background: A key challenge in drug discovery is finding novel compounds with desirable properties. Among these properties, binding affinity to a target is a prerequisite and is usually evaluated by molecular docking or quantitative structure-activity relationship (QSAR) models.

Methods: In this study, we developed SGPT-RL, which uses a generative pre-trained transformer (GPT) as the policy network of a reinforcement learning (RL) agent to optimize binding affinity to a target. SGPT-RL was evaluated on the Moses distribution learning benchmark and on two goal-directed generation tasks, with Dopamine Receptor D2 (DRD2) and Angiotensin-Converting Enzyme 2 (ACE2) as the targets. Both a QSAR model and molecular docking were used as optimization goals in these tasks. The popular Reinvent method was used as the baseline for comparison.

Results: On the Moses benchmark, SGPT-RL learned good property distributions and generated molecules with high validity and novelty. On the two goal-directed generation tasks, both SGPT-RL and Reinvent generated valid molecules with improved target scores. SGPT-RL achieved better results than Reinvent on the ACE2 task, where molecular docking was used as the optimization goal. Further analysis showed that SGPT-RL learned conserved scaffold patterns during exploration.

Conclusions: The superior performance of SGPT-RL on the ACE2 task indicates that it can be applied to virtual screening, where molecular docking is widely used as the criterion. In addition, the scaffold patterns that SGPT-RL learned during exploration can help chemists design and discover novel lead candidates.

Published in F1000Research

ISSN: 2046-1402 (Online)
Publisher: F1000 Research Ltd
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://f1000research.com
