DLM-DTI: a dual language model for the prediction of drug-target interaction with hint-based learning

Jonghyun Lee; Dae Won Jun; Ildae Song; Yun Kim

doi:10.1186/s13321-024-00808-1

Journal of Cheminformatics (Feb 2024)

Affiliations

Jonghyun Lee: Department of Medical and Digital Engineering, Hanyang University College of Engineering
Dae Won Jun: Department of Medical and Digital Engineering, Hanyang University College of Engineering
Ildae Song: Department of Pharmaceutical Science and Technology, Kyungsung University
Yun Kim: College of Pharmacy, Daegu Catholic University

DOI: https://doi.org/10.1186/s13321-024-00808-1
Journal volume & issue: Vol. 16, no. 1
pp. 1 – 12

Abstract

The drug discovery process is demanding and time-consuming, and machine learning-based research is increasingly proposed to enhance its efficiency. A significant challenge in this field is predicting whether a drug molecule’s structure will interact with a target protein. A recent study addressed this challenge with an encoder that leverages prior knowledge of molecular and protein structures, notably improving prediction performance on the drug-target interaction task. Nonetheless, the target encoders employed in previous studies have a computational complexity that increases quadratically with the input length, which limits their practical utility. To overcome this limitation, we adopt a hint-based learning strategy to develop a compact and efficient target encoder. With an adaptation parameter, our model can blend general knowledge and target-oriented knowledge to build features of the protein sequences. This approach yielded considerable performance enhancements and improved learning efficiency on three benchmark datasets: BIOSNAP, DAVIS, and BindingDB. Furthermore, our method requires only 7.7 GB of video RAM (VRAM) during training, which is 16.24% of the amount required by the previous state-of-the-art model. This makes training and inference feasible even with constrained computational resources.
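The blending step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the convex combination below, the sigmoid that keeps the weight between 0 and 1, and the names `blend_features`, `general`, `target_oriented`, and `raw_adaptation` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def sigmoid(x: float) -> float:
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def blend_features(general, target_oriented, raw_adaptation):
    """Mix two equal-length feature vectors with one scalar weight.

    Assumed form (not taken from the paper):
        lam = sigmoid(raw_adaptation)
        out = lam * general + (1 - lam) * target_oriented
    """
    if len(general) != len(target_oriented):
        raise ValueError("feature vectors must have the same length")
    lam = sigmoid(raw_adaptation)
    return [lam * g + (1.0 - lam) * t for g, t in zip(general, target_oriented)]


# Hypothetical toy features for one protein sequence.
general = [1.0, 0.0, 2.0]          # e.g. from a large pretrained encoder
target_oriented = [0.0, 1.0, 4.0]  # e.g. from a compact task-specific encoder

# raw_adaptation = 0.0 gives lam = 0.5, an equal mix of both sources.
mixed = blend_features(general, target_oriented, raw_adaptation=0.0)
print(mixed)  # [0.5, 0.5, 3.0]
```

In a trainable model the raw weight would typically be a learned parameter, so the optimizer decides how much each source of knowledge contributes.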

Published in Journal of Cheminformatics

ISSN: 1758-2946 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Chemistry
Website: https://jcheminf.biomedcentral.com/
