Genome Biology (Oct 2024)

Neglecting the impact of normalization in semi-synthetic RNA-seq data simulations generates artificial false positives

  • Boris P. Hejblum,
  • Kalidou Ba,
  • Rodolphe Thiébaut,
  • Denis Agniel

DOI
https://doi.org/10.1186/s13059-024-03231-9
Journal volume & issue
Vol. 25, no. 1
pp. 1 – 5

Abstract

A recent study reported exaggerated false positives by popular differential expression methods when analyzing large population samples. We reproduce the differential expression analysis simulation results and identify a caveat in the data generation process. Data not truly generated under the null hypothesis led to incorrect comparisons of benchmark methods. We provide corrected simulation results that demonstrate the good performance of dearseq and argue against the superiority of the Wilcoxon rank-sum test as suggested in the previous study.

Keywords