BMC Bioinformatics (Apr 2006)
Assessment of the relationship between pre-chip and post-chip quality measures for Affymetrix GeneChip expression data
Abstract
Background: Gene expression microarray experiments are expensive to conduct and guidelines for acceptable quality control at intermediate steps before and after the samples are hybridised to chips are vague. We conducted an experiment hybridising RNA from human brain to 117 U133A Affymetrix GeneChips and used these data to explore the relationship between 4 pre-chip variables and 22 post-chip outcomes and quality control measures.

Results: We found that the pre-chip variables were significantly correlated with each other but that this correlation was strongest between measures of RNA quality and cRNA yield. Post-mortem interval was negatively correlated with these variables. Four principal components, reflecting array outliers, array adjustment, hybridisation noise and RNA integrity, explain about 75% of the total post-chip measure variability. Two significant canonical correlations existed between the pre-chip and post-chip variables, derived from MAS 5.0, dChip and the Bioconductor packages affy and affyPLM. The strongest (CANCOR 0.838, p

Conclusion: We have found that the post-chip variables having the strongest association with quantities measurable before hybridisation are those reflecting RNA integrity. Other aspects of quality, such as noise measures (reflecting the execution of the assay) or measures reflecting data quality (outlier status and array adjustment variables) are not well predicted by the variables we were able to determine ahead of time. There could be other variables measurable pre-hybridisation which may be better associated with expression data quality measures. Uncovering such connections could create savings on costly microarray experiments by eliminating poor samples before hybridisation.
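
The two multivariate techniques named in the abstract, principal component analysis of the post-chip measures and canonical correlation between the pre-chip and post-chip variable sets, can be illustrated with a minimal sketch. This is not the authors' code, and the abstract does not say which software performed these analyses. The data below are synthetic and all function and variable names are hypothetical; only the dimensions (117 arrays, 4 pre-chip variables, 22 post-chip measures) are taken from the abstract. NumPy is assumed to be available.

```python
import numpy as np


def standardize(X):
    """Centre each column and scale it to unit sample variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def pca_explained_variance(X):
    """Fraction of total variance explained by each principal component.

    The columns are standardised first, so this is PCA on the
    correlation matrix (an assumption; the abstract does not say
    which scaling the authors used).
    """
    Z = standardize(X)
    singular_values = np.linalg.svd(Z, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


def canonical_correlations(X, Y):
    """Canonical correlations between two variable sets.

    Uses the QR-based approach: the singular values of Qx^T Qy, where
    Qx and Qy are orthonormal bases of the centred column spaces.
    Assumes both centred matrices have full column rank.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)  # guard against rounding slightly above 1


# Synthetic stand-in data; NOT the study's measurements.
rng = np.random.default_rng(0)
n_arrays, n_pre, n_post = 117, 4, 22
pre = rng.normal(size=(n_arrays, n_pre))
post = rng.normal(size=(n_arrays, n_post))
# Make two post-chip measures depend on two pre-chip variables so that
# some canonical correlations are clearly non-zero.
post[:, 0] += 2.0 * pre[:, 0]
post[:, 1] += 1.5 * pre[:, 1]

ratios = pca_explained_variance(post)
cancor = canonical_correlations(pre, post)

print("Variance explained by first four PCs:", ratios[:4].sum())
print("Canonical correlations:", cancor)
```

With 4 pre-chip and 22 post-chip variables there are at most 4 canonical correlations. Deciding how many of them are significant requires a separate test (for example one based on Wilks' lambda), which this sketch does not include.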