Genome Biology (Jun 2024)

Splice_sim: a nucleotide conversion-enabled RNA-seq simulation and evaluation framework

  • Niko Popitsch,
  • Tobias Neumann,
  • Arndt von Haeseler,
  • Stefan L. Ameres

DOI
https://doi.org/10.1186/s13059-024-03313-8
Journal volume & issue
Vol. 25, no. 1
pp. 1 – 19

Abstract

Read online

Abstract Nucleotide conversion RNA sequencing techniques interrogate chemical RNA modifications in cellular transcripts, resulting in mismatch-containing reads. Biases in mapping the resulting reads to reference genomes remain poorly understood. We present splice_sim, a splice-aware RNA-seq simulation and evaluation pipeline that introduces user-defined nucleotide conversions at set frequencies, creates mixture models of converted and unconverted reads, and calculates mapping accuracies per genomic annotation. By simulating nucleotide conversion RNA-seq datasets under realistic experimental conditions, including metabolic RNA labeling and RNA bisulfite sequencing, we measure mapping accuracies of state-of-the-art spliced-read mappers for mouse and human transcripts and derive strategies to prevent biases in the data interpretation.

Keywords