Nature Communications (Jul 2023)

Multi-batch single-cell comparative atlas construction by deep learning disentanglement

  • Allen W. Lynch,
  • Myles Brown,
  • Clifford A. Meyer

DOI
https://doi.org/10.1038/s41467-023-39494-2
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 22

Abstract

Cell state atlases constructed through single-cell RNA-seq and ATAC-seq analysis are powerful tools for analyzing the effects of genetic and drug treatment-induced perturbations on complex cell systems. Comparative analysis of such atlases can yield new insights into cell state and trajectory alterations. Perturbation experiments often require that single-cell assays be carried out in multiple batches, which can introduce technical distortions that confound the comparison of biological quantities between different batches. Here we propose CODAL, a variational autoencoder-based statistical model which uses a mutual information regularization technique to explicitly disentangle factors related to technical and biological effects. We demonstrate CODAL’s capacity for batch-confounded cell type discovery when applied to simulated datasets and embryonic development atlases with gene knockouts. CODAL improves the representation of RNA-seq and ATAC-seq modalities, yields interpretable modules of biological variation, and enables the generalization of other count-based generative models to multi-batched data.