PLoS Computational Biology (Oct 2022)

Variant calling enhances the identification of cancer cells in single-cell RNA sequencing data.

  • William Gasper,
  • Francesca Rossi,
  • Matteo Ligorio,
  • Dario Ghersi

DOI
https://doi.org/10.1371/journal.pcbi.1010576
Journal volume & issue
Vol. 18, no. 10
p. e1010576

Abstract

Single-cell RNA-sequencing is an invaluable research tool that allows for the investigation of gene expression in heterogeneous cancer cell populations in ways that bulk RNA-seq cannot. However, normal (i.e., non-tumor) cells in cancer samples have the potential to confound the downstream analysis of single-cell RNA-seq data. Existing methods for identifying cancer and normal cells include copy number variation inference, marker-gene expression analysis, and expression-based clustering. This work aims to extend the existing approaches for identifying cancer cells in single-cell RNA-seq samples by incorporating variant calling and the identification of putative driver alterations. We found that putative driver alterations can be detected in single-cell RNA-seq data obtained with full-length transcript technologies and observed that a subset of cells in tumor samples is enriched for putative driver alterations as compared to normal cells. Furthermore, we show that the number of putative driver alterations and inferred copy number variation are not correlated in all samples. Taken together, our findings suggest that augmenting existing cancer-cell filtering methods with variant calling and analysis can increase the number of tumor cells that can be confidently included in downstream analyses of single-cell full-length transcript RNA-seq datasets.
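
The idea of augmenting a copy-number-based filter with variant-based evidence can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Cell` record, the `flag_cancer_cells` function, the per-cell scores, and both thresholds are hypothetical and chosen only for illustration. The sketch assumes each cell already has an inferred copy number variation score and a count of putative driver alterations, and it flags a cell when either line of evidence passes its threshold.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    cell_id: str
    cnv_score: float         # hypothetical inferred copy number variation score
    driver_alterations: int  # hypothetical count of putative driver alterations


def flag_cancer_cells(cells, cnv_threshold=0.5, min_drivers=1):
    """Return {cell_id: [evidence, ...]} for cells passing at least one filter.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    flagged = {}
    for cell in cells:
        evidence = []
        if cell.cnv_score >= cnv_threshold:
            evidence.append("cnv")
        if cell.driver_alterations >= min_drivers:
            evidence.append("variant")
        if evidence:
            flagged[cell.cell_id] = evidence
    return flagged


# Toy input: made-up values, not data from the study.
cells = [
    Cell("c1", 0.8, 0),  # CNV evidence only
    Cell("c2", 0.1, 2),  # variant evidence only
    Cell("c3", 0.1, 0),  # no evidence
    Cell("c4", 0.9, 3),  # both lines of evidence
]
flagged = flag_cancer_cells(cells)
variant_only = [cid for cid, ev in flagged.items() if ev == ["variant"]]
```

In this toy input, `variant_only` holds the cells that a copy-number-only filter would have missed. Because the two signals are not guaranteed to agree, taking the union of the filters is one simple way the set of confidently labeled tumor cells could grow; a real analysis would need thresholds calibrated per sample.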