PLoS ONE (Jan 2019)
Insights from deconvolution of cell subtype proportions enhance the interpretation of functional genomic data.
Abstract
Cell subtype proportion variability between samples contributes significantly to the variation of functional genomic properties such as gene expression or DNA methylation. Although the impact of the variation of cell subtype composition on measured genomic quantities is recognized, and some innovative tools have been developed for the analysis of heterogeneous samples, most functional genomics studies using samples with mixed cell types still ignore the influence of cell subtype proportion variation, or just deal with it as a nuisance variable to be eliminated. Here we demonstrate how harvesting information about cell subtype proportions from functional genomics data can provide insights into cellular changes associated with phenotypes. We focused on two types of mixed cell populations, human blood and mouse kidney. Cell type prediction is well developed in the former, but not currently in the latter. Estimating the cellular repertoire is easier when a reference dataset from purified samples of all cell types in the tissue is available, as is the case for blood. However, reference datasets are not available for most other tissues, such as the kidney. In this study, we showed that the proportion of alterations attributable to changes in the cellular composition varies strikingly in the two disorders (asthma and systemic lupus erythematosus), suggesting that the contribution of cell subtype proportion changes to functional genomic properties can be disease-specific. We also showed that a reference dataset from a single-cell RNA-seq study successfully estimated the cell subtype proportions in mouse kidney and allowed us to distinguish altered cell subtype differences between two different knock-out mouse models, both of which had reported a reduced number of glomeruli compared to their wild-type counterparts. These findings demonstrate that testing for changes in cell subtype proportions between conditions can yield important insights in functional genomics studies.