Drug target prediction through deep learning functional representation of gene signatures

Hao Chen; Frederick J. King; Bin Zhou; Yu Wang; Carter J. Canedy; Joel Hayashi; Yang Zhong; Max W. Chang; Lars Pache; Julian L. Wong; Yong Jia; John Joslin; Tao Jiang; Christopher Benner; Sumit K. Chanda; Yingyao Zhou

doi:10.1038/s41467-024-46089-y

Nature Communications (Feb 2024)

Affiliations

Hao Chen: Novartis Biomedical Research
Frederick J. King: Novartis Biomedical Research
Bin Zhou: Novartis Biomedical Research
Yu Wang: Novartis Biomedical Research
Carter J. Canedy: Novartis Biomedical Research
Joel Hayashi: Novartis Biomedical Research
Yang Zhong: Novartis Biomedical Research
Max W. Chang: Department of Medicine, University of California, San Diego
Lars Pache: NCI Designated Cancer Center, Sanford Burnham Prebys Medical Discovery Institute
Julian L. Wong: Novartis Biomedical Research
Yong Jia: Novartis Biomedical Research
John Joslin: Novartis Biomedical Research
Tao Jiang: Department of Computer Science and Engineering, University of California, Riverside
Christopher Benner: Department of Medicine, University of California, San Diego
Sumit K. Chanda: Department of Immunology and Microbiology, Scripps Research
Yingyao Zhou: Novartis Biomedical Research

Journal volume & issue: Vol. 15, no. 1
pp. 1 – 15

Abstract

Many machine learning applications in bioinformatics currently rely on matching gene identities when analyzing input gene signatures and fail to take advantage of preexisting knowledge about gene functions. To further enable comparative analysis of OMICs datasets, including target deconvolution and mechanism of action studies, we develop an approach that represents gene signatures projected onto their biological functions, instead of their identities, similar to how the word2vec technique works in natural language processing. We develop the Functional Representation of Gene Signatures (FRoGS) approach by training a deep learning model and demonstrate that its application to the Broad Institute’s L1000 datasets results in more effective compound-target predictions than models based on gene identities alone. By integrating additional pharmacological activity data sources, FRoGS significantly increases the number of high-quality compound-target predictions relative to existing approaches, many of which are supported by in silico and/or experimental evidence. These results underscore the general utility of FRoGS in machine learning-based bioinformatics applications. Prediction networks pre-equipped with the knowledge of gene functions may help uncover new relationships among gene signatures acquired by large-scale OMICs studies on compounds, cell types, disease models, and patient cohorts.
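The contrast the abstract draws between matching gene identities and comparing gene functions can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the gene names, the three-dimensional vectors, and the mean-pooling step are hypothetical and are not taken from FRoGS, which learns its representation with a deep learning model. The sketch only shows why two signatures that share no genes can still look similar once each gene is replaced by a functional embedding.

```python
import math

# Hypothetical 3-dimensional "functional" embeddings. The gene names and
# values are invented for illustration; they are not FRoGS vectors.
GENE_VECTORS = {
    "GENE_A1": [0.9, 0.1, 0.0],   # genes A1-A3: assumed to share a function
    "GENE_A2": [0.8, 0.2, 0.1],
    "GENE_A3": [0.7, 0.3, 0.0],
    "GENE_B1": [0.0, 0.1, 0.9],   # genes B1-B2: assumed unrelated function
    "GENE_B2": [0.1, 0.0, 0.8],
}


def jaccard(sig_a, sig_b):
    """Identity-based similarity: fraction of genes the signatures share."""
    a, b = set(sig_a), set(sig_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def signature_vector(signature):
    """Mean-pool gene embeddings into one signature vector (a simplification)."""
    vectors = [GENE_VECTORS[gene] for gene in signature]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


sig_1 = ["GENE_A1", "GENE_A2"]
sig_2 = ["GENE_A3"]              # shares no gene with sig_1
sig_3 = ["GENE_B1", "GENE_B2"]   # shares no gene with sig_1 either

# Identity matching cannot tell sig_2 and sig_3 apart: both score zero.
identity_score = jaccard(sig_1, sig_2)
# Functional comparison separates them.
functional_score = cosine(signature_vector(sig_1), signature_vector(sig_2))
unrelated_score = cosine(signature_vector(sig_1), signature_vector(sig_3))

print(f"identity overlap (sig_1, sig_2): {identity_score:.2f}")
print(f"functional similarity (sig_1, sig_2): {functional_score:.2f}")
print(f"functional similarity (sig_1, sig_3): {unrelated_score:.2f}")
```

With these made-up vectors, identity overlap is zero for both comparisons, while the embedding-based score is high for the functionally related pair and low for the unrelated one.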

Published in Nature Communications

ISSN: 2041-1723 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Science
Website: https://www.nature.com/ncomms/
