Frontiers in Genetics (Dec 2023)

DPred_3S: identifying dihydrouridine (D) modification on three species epitranscriptome based on multiple sequence-derived features

  • Jinjin Ren,
  • Jinjin Ren,
  • Xiaozhen Chen,
  • Zhengqian Zhang,
  • Haoran Shi,
  • Shuxiang Wu,
  • Shuxiang Wu

DOI
https://doi.org/10.3389/fgene.2023.1334132
Journal volume & issue
Vol. 14

Abstract

Read online

Introduction: Dihydrouridine (D) is a conserved modification of tRNA among all three life domains. D modification enhances the flexibility of a single nucleotide base in the spatial structure and is disease- and evolution-associated. Recent studies have also suggested the presence of dihydrouridine on mRNA.Methods: To identify D in epitranscriptome, we provided a prediction framework named “DPred_3S” based on the machine learning approach for three species D epitranscriptome, which used epitranscriptome sequencing data as training data for the first time.Results: The optimal features were evaluated by the F-score and integration of different features; our model achieved area under the receiver operating characteristic curve (AUROC) scores 0.955, 0.946, and 0.905 for Saccharomyces cerevisiae, Escherichia coli, and Schizosaccharomyces pombe, respectively. The performances of different machine learning algorithms were also compared in this study.Discussion: The high performances of our model suggest the D sites can be distinguished based on their surrounding sequence, but the lower performance of cross-species prediction may be limited by technique preferences.

Keywords