Frontiers in Genetics (Feb 2016)
Estimation of cell-type composition including T and B cell subtypes for whole blood methylation microarray data
Abstract
DNA methylation levels vary markedly with the cell-type composition of a sample. Understanding these differences and estimating the cell-type composition of a sample are important aspects of studying DNA methylation. DNA from leukocytes in whole blood is simple to obtain and widely used in research. However, leukocytes comprise many distinct cell types and subtypes. We propose a two-stage model that estimates the proportions of six main cell types in whole blood (CD4+ T cells, CD8+ T cells, monocytes, B cells, granulocytes, and natural killer cells) as well as subtypes of T and B cells. Unlike previous methods, which estimate only the overall proportions of CD4+ T cells, CD8+ T cells, and B cells, our model estimates the proportions of naïve, memory, and regulatory CD4+ T cells, naïve and memory CD8+ T cells, and naïve and memory B cells. Using real and simulated data, we demonstrate that our model reliably estimates the proportions of these cell types and subtypes. In studies with DNA methylation data from Illumina’s HumanMethylation450k arrays, our estimates will be useful both for testing associations of cell type and subtype composition with phenotypes of interest and for adjusting for composition to prevent confounding in epigenetic association studies. Additionally, our method can be easily adapted for use with whole genome bisulfite sequencing data or any other genome-wide methylation data platform.
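To make the two-stage idea concrete, the following is a minimal illustrative sketch and not the estimator used in the paper, whose details the abstract does not give. It assumes a reference-based approach in which a sample's methylation at marker CpGs is a non-negative mixture of cell-type reference profiles. Stage 1 fits the main cell types. Stage 2 removes the fitted contribution of the other main types at subtype-discriminating CpGs and splits the parent proportion among its subtypes. The function names (`nnls_cd`, `two_stage_estimate`), the reduction to two main cell types instead of six, and all reference values are hypothetical and chosen only for illustration.

```python
def nnls_cd(y, refs, sweeps=300):
    """Non-negative least squares by cyclic coordinate descent.

    y    : methylation values of one sample at the marker CpGs
    refs : dict mapping cell-type name -> reference profile (same length as y)
    """
    w = {k: 0.0 for k in refs}
    resid = list(y)  # y minus the current fit; the fit is zero at the start
    for _ in range(sweeps):
        for k, p in refs.items():
            norm = sum(v * v for v in p)
            step = sum(r * v for r, v in zip(resid, p)) / norm
            new = max(0.0, w[k] + step)  # clip at zero: weights stay non-negative
            delta = new - w[k]
            resid = [r - delta * v for r, v in zip(resid, p)]
            w[k] = new
    return w


def two_stage_estimate(y_main, main_refs, y_sub, other_refs_sub, subtype_refs, parent):
    """Assumed two-stage scheme (illustrative only).

    Stage 1: estimate main cell-type proportions from main marker CpGs.
    Stage 2: at subtype marker CpGs, subtract the other main types'
             contribution, fit the subtypes of `parent`, and rescale them
             so that they sum to the stage-1 proportion of `parent`.
    """
    raw = nnls_cd(y_main, main_refs)
    total = sum(raw.values())
    main = {k: (v / total if total > 0 else 0.0) for k, v in raw.items()}

    resid = list(y_sub)
    for k, p in other_refs_sub.items():
        resid = [r - main[k] * v for r, v in zip(resid, p)]

    fit = nnls_cd(resid, subtype_refs)
    fit_total = sum(fit.values())
    sub = {k: (main[parent] * v / fit_total if fit_total > 0 else 0.0)
           for k, v in fit.items()}
    return main, sub


# Toy, noise-free data (made-up values): the sample is 40% naive CD4T,
# 20% memory CD4T and 40% monocytes.
main_refs = {"CD4T": [0.9, 0.1], "Mono": [0.1, 0.8]}        # main marker CpGs
subtype_refs = {"naive": [0.9, 0.2], "memory": [0.2, 0.9]}  # subtype marker CpGs
other_refs_sub = {"Mono": [0.5, 0.5]}                       # Mono at subtype CpGs

y_main = [0.6 * a + 0.4 * b
          for a, b in zip(main_refs["CD4T"], main_refs["Mono"])]
y_sub = [0.4 * n + 0.2 * m + 0.4 * o
         for n, m, o in zip(subtype_refs["naive"], subtype_refs["memory"],
                            other_refs_sub["Mono"])]

main, sub = two_stage_estimate(y_main, main_refs, y_sub,
                               other_refs_sub, subtype_refs, parent="CD4T")
print(main, sub)
```

Constraining the subtype estimates to sum to the parent proportion is a design choice of this sketch: it keeps the two stages mutually consistent, at the cost of propagating any stage-1 error into the subtype estimates.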
Keywords