Cross‐Species Prediction of Transcription Factor Binding by Adversarial Training of a Novel Nucleotide‐Level Deep Neural Network

Qinhu Zhang; Siguo Wang; Zhipeng Li; Yijie Pan; De‐Shuang Huang

doi:10.1002/advs.202405685

Advanced Science (Sep 2024)

Affiliations

Qinhu Zhang: Ningbo Institute of Digital Twin Eastern Institute of Technology Ningbo 315201 China
Siguo Wang: Ningbo Institute of Digital Twin Eastern Institute of Technology Ningbo 315201 China
Zhipeng Li: Ningbo Institute of Digital Twin Eastern Institute of Technology Ningbo 315201 China
Yijie Pan: Ningbo Institute of Digital Twin Eastern Institute of Technology Ningbo 315201 China
De‐Shuang Huang: Ningbo Institute of Digital Twin Eastern Institute of Technology Ningbo 315201 China

DOI: https://doi.org/10.1002/advs.202405685
Journal volume & issue: Vol. 11, no. 36

Abstract

Cross‐species prediction of transcription factor (TF) binding remains a major challenge because individual TF binding sites turn over rapidly during evolution, so cross‐species predictive performance is consistently worse than within‐species performance. In this study, a novel Nucleotide‐Level Deep Neural Network (NLDNN) is first proposed to predict TF binding within or across species. NLDNN treats TF binding prediction as a nucleotide‐level regression task: it takes DNA sequences as input and directly predicts experimental coverage values. Beyond predictive performance, the study also assesses the model by locating potential TF binding regions, discriminating TF‐specific single‐nucleotide polymorphisms (SNPs), and identifying causal disease‐associated SNPs. The experimental results show that NLDNN outperforms the competing methods in these tasks. Then, a dual‐path framework is designed for adversarial training of NLDNN to further improve cross‐species prediction performance by pulling the domain spaces of the human and mouse species closer together. Comparison and analysis show that adversarial training not only improves cross‐species prediction performance between human and mouse but also enhances the ability to locate TF binding regions and discriminate TF‐specific SNPs. Visualizing the predictions shows that the framework corrects some mispredictions by amplifying the coverage values of incorrectly predicted peaks.
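
To make the nucleotide‐level regression framing concrete, the following minimal sketch shows the shape of the data involved: a DNA sequence is one‐hot encoded, and the target is a coverage track with one value per nucleotide. This is only an illustration under stated assumptions, not the NLDNN implementation: the names `one_hot` and `mse`, the toy coverage values, and the choice of mean squared error as the loss are all hypothetical and are not taken from the article.

```python
# Illustrative sketch only; function names, toy data, and the MSE loss
# are assumptions and do not come from the article.
ALPHABET = "ACGT"


def one_hot(seq):
    """Encode a DNA string as one 4-element row per base.

    Bases outside A/C/G/T (for example N) become an all-zero row.
    """
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET]
            for base in seq.upper()]


def mse(predicted, observed):
    """Mean squared error between two per-nucleotide coverage tracks."""
    if len(predicted) != len(observed):
        raise ValueError("tracks must have one value per nucleotide")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


sequence = "ACGTN"
encoded = one_hot(sequence)                       # 5 rows, 4 channels each

# Toy coverage tracks: one value per input nucleotide.
observed_coverage = [0.0, 1.0, 3.0, 1.0, 0.0]
predicted_coverage = [0.0, 1.0, 2.0, 1.0, 0.0]    # stand-in for a model output

loss = mse(predicted_coverage, observed_coverage)
```

The point of the sketch is that the output has the same length as the input sequence, in contrast to a sequence‐level classifier that would emit a single bound/unbound label per sequence.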

Published in Advanced Science

ISSN: 2198-3844 (Online)
Publisher: Wiley
Country of publisher: Germany
LCC subjects: Science
Website: https://onlinelibrary.wiley.com/journal/21983844
