BMC Bioinformatics (Dec 2009)
Identifying differentially methylated genes using mixed effect and generalized least square models
Abstract
Abstract Background DNA methylation plays an important role in the process of tumorigenesis. Identifying differentially methylated genes or CpG islands (CGIs) associated with genes between two tumor subtypes is thus an important biological question. The methylation status of all CGIs in the whole genome can be assayed with differential methylation hybridization (DMH) microarrays. However, patient samples or cell lines are heterogeneous, so their methylation pattern may be very different. In addition, neighboring probes at each CGI are correlated. How these factors affect the analysis of DMH data is unknown. Results We propose a new method for identifying differentially methylated (DM) genes by identifying the associated DM CGI(s). At each CGI, we implement four different mixed effect and generalized least square models to identify DM genes between two groups. We compare four models with a simple least square regression model to study the impact of incorporating random effects and correlations. Conclusions We demonstrate that the inclusion (or exclusion) of random effects and the choice of correlation structures can significantly affect the results of the data analysis. We also assess the false discovery rate of different models using CGIs associated with housekeeping genes.