Scientific Reports (Feb 2024)
New generative methods for single-cell transcriptome data in bulk RNA sequence deconvolution
Abstract
Abstract Numerous methods for bulk RNA sequence deconvolution have been developed to identify cellular targets of diseases by understanding the composition of cell types in disease-related tissues. However, issues of heterogeneity in gene expression between subjects and the shortage of reference single-cell RNA sequence data remain to achieve accurate bulk deconvolution. In our study, we investigated whether a new data generative method named sc-CMGAN and benchmarking generative methods (Copula, CTGAN and TVAE) could solve these issues and improve the bulk deconvolutions. We also evaluated the robustness of sc-CMGAN using three deconvolution methods and four public datasets. In almost all conditions, the generative methods contributed to improved deconvolution. Notably, sc-CMGAN outperformed the benchmarking methods and demonstrated higher robustness. This study is the first to examine the impact of data augmentation on bulk deconvolution. The new generative method, sc-CMGAN, is expected to become one of the powerful tools for the preprocessing of bulk deconvolution.
Keywords