BMC Bioinformatics (Jan 2010)

A benchmark for statistical microarray data analysis that preserves actual biological and technical variance

  • Gaigneaux Anthoula,
  • Bareke Eric,
  • Pierre Michael,
  • Berger Fabrice,
  • De Meulder Bertrand,
  • De Hertogh Benoît,
  • Depiereux Eric

DOI
https://doi.org/10.1186/1471-2105-11-17
Journal volume & issue
Vol. 11, no. 1
p. 17

Abstract

Background
Recent reanalysis of spike-in datasets underscored the need for new and more accurate benchmark datasets for statistical microarray analysis. We present here a fresh method using biologically relevant data to evaluate the performance of statistical methods.

Results
Our novel method ranks the probesets from a dataset composed of publicly available biological microarray data and extracts subset matrices with precise information/noise ratios. It can be used to determine how well different statistical methods estimate variance for a given number of replicates. The mean-variance and mean-fold change relationships of the matrices revealed a closer approximation of biological reality.

Conclusions
Performance analysis refined the results from benchmarks published previously. We show that the Shrinkage t test (close to Limma) was the best of the methods tested, except when two replicates were examined, where the Regularized t test and the Window t test performed slightly better.

Availability
The R scripts used for the analysis are available at http://urbm-cluster.urbm.fundp.ac.be/~bdemeulder/.