Genome Biology (Oct 2024)

Winsorization greatly reduces false positives by popular differential expression methods when analyzing human population samples

  • Lu Yang,
  • Xianyang Zhang,
  • Jun Chen

DOI
https://doi.org/10.1186/s13059-024-03230-w
Journal volume & issue
Vol. 25, no. 1
pp. 1 – 4

Abstract

A recent study found severely inflated type I error rates for DESeq2 and edgeR, two dominant tools used for differential expression analysis of RNA-seq data. Here, we show that by properly addressing the outliers in the RNA-seq data using winsorization, the type I error rates of DESeq2 and edgeR can be substantially reduced, and their power is comparable to that of the Wilcoxon rank-sum test for large datasets. Therefore, as an alternative to the Wilcoxon rank-sum test, they may still be applied for differential expression analysis of large RNA-seq datasets.
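
To make the idea of winsorization concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes a genes-by-samples count matrix and caps each gene's counts at an upper quantile computed across samples. The function name `winsorize_counts` and the 0.95 cutoff are hypothetical choices made for illustration only; the abstract does not state which cutoff was used.

```python
import numpy as np


def winsorize_counts(counts, upper_quantile=0.95):
    """Cap each gene's counts at that gene's upper quantile across samples.

    counts: 2-D array-like with genes in rows and samples in columns.
    upper_quantile: illustrative cutoff (an assumption, not taken from the paper).
    """
    counts = np.asarray(counts, dtype=float)
    # One cap per gene (row), computed across that gene's samples.
    caps = np.quantile(counts, upper_quantile, axis=1, keepdims=True)
    # Values above the cap are replaced by the cap; all others are unchanged.
    return np.minimum(counts, caps)


# Toy example: the first gene has one extreme sample, the second has none.
toy = np.array([[1, 2, 3, 4, 1000],
                [5, 5, 5, 5, 5]])
capped = winsorize_counts(toy)
```

In this sketch only the extreme value in the first gene is pulled down, while the second gene is left as it was. The capped values are generally not integers, so a downstream tool that expects integer counts would need them rounded first.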