PLoS ONE (Jan 2010)
Intertwining threshold settings, biological data and database knowledge to optimize the selection of differentially expressed genes from microarray.
Abstract
BACKGROUND: Many tools have been described for analyzing microarray data obtained under different conditions. However, the integration of deregulated genes within coherent metabolic pathways is lacking. Currently, no objective selection criterion based on biological functions exists to determine the threshold at which a gene can be considered differentially expressed. METHODOLOGY/PRINCIPAL FINDINGS: To improve transcriptomic analysis of microarrays, we propose a new statistical approach that takes biological parameters into account. We present an iterative method to optimize the selection of genes differentially expressed between two experimental conditions. The stringency level of gene selection was tied simultaneously to the p-value of expression variation and to an occurrence rate parameter, which reflects the percentage of donors whose transcriptomic profiles are similar. Our method intertwines stringency level settings, biological data and a knowledge database to highlight molecular interactions using networks and pathways. The analyses performed at each iteration helped us to select the optimal threshold for the most pertinent selection of differentially expressed genes. CONCLUSIONS/SIGNIFICANCE: We applied this approach to the well-documented mechanism of the human macrophage response to lipopolysaccharide stimulation. We thus verified that our method could determine, with the highest degree of accuracy, the best threshold for selecting genes that are truly differentially expressed.
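
ILLUSTRATIVE SKETCH: The iterative selection described above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not define how the occurrence rate is computed or how the knowledge database ranks a selection, so here the occurrence rate is assumed to be the fraction of donors whose expression change points in the majority direction, and the database step is replaced by a hypothetical toy_score function that rewards membership in a made-up reference pathway. All gene names, values, grids and function names are invented for illustration.

```python
from itertools import product


def occurrence_rate(log_ratios):
    """Fraction of donors whose change goes in the majority direction.

    Assumption: the abstract does not give the exact definition.
    """
    up = sum(1 for r in log_ratios if r > 0)
    down = sum(1 for r in log_ratios if r < 0)
    return max(up, down) / len(log_ratios)


def select_genes(genes, p_max, occ_min):
    """Keep genes that pass both the p-value and occurrence-rate cut-offs."""
    return sorted(
        name
        for name, (p_value, log_ratios) in genes.items()
        if p_value <= p_max and occurrence_rate(log_ratios) >= occ_min
    )


def scan_thresholds(genes, p_grid, occ_grid, score):
    """Score the selection at every stringency level; return the best and all."""
    results = []
    for p_max, occ_min in product(p_grid, occ_grid):
        selected = select_genes(genes, p_max, occ_min)
        results.append((score(selected), p_max, occ_min, selected))
    # On ties, max() keeps the first (least stringent) setting in grid order.
    return max(results, key=lambda r: r[0]), results


# Made-up toy data: gene -> (p-value, per-donor log ratios for four donors).
genes = {
    "gene_a": (0.001, [1.2, 0.9, 1.5, 1.1]),
    "gene_b": (0.01, [0.8, 0.7, -0.1, 0.9]),
    "gene_c": (0.04, [0.5, -0.4, 0.3, -0.2]),
    "gene_d": (0.20, [0.1, 0.2, 0.1, 0.3]),
}

# Hypothetical stand-in for the knowledge database: one reference pathway.
reference_pathway = {"gene_a", "gene_b"}


def toy_score(selected):
    """Selected genes inside the reference pathway minus those outside it."""
    hits = sum(1 for g in selected if g in reference_pathway)
    return hits - (len(selected) - hits)


best, all_results = scan_thresholds(
    genes, p_grid=[0.05, 0.01, 0.001], occ_grid=[0.5, 0.75, 1.0], score=toy_score
)
print(best)
```

Scanning both cut-offs on a grid keeps the statistical criterion (p-value) and the donor-consistency criterion (occurrence rate) coupled, in the spirit of the abstract; in practice the scoring step would be replaced by whatever network or pathway analysis the knowledge database provides.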