A benchmark of computational methods for correcting biases of established and unknown origin in CRISPR-Cas9 screening data

Alessandro Vinceti; Raffaele M. Iannuzzi; Isabella Boyle; Lucia Trastulla; Catarina D. Campbell; Francisca Vazquez; Joshua M. Dempster; Francesco Iorio

doi:10.1186/s13059-024-03336-1

Genome Biology (Jul 2024)

Affiliations

Alessandro Vinceti: Computational Biology Research Centre, Human Technopole
Raffaele M. Iannuzzi: Computational Biology Research Centre, Human Technopole
Isabella Boyle: Broad Institute of Harvard and MIT
Lucia Trastulla: Computational Biology Research Centre, Human Technopole
Catarina D. Campbell: Broad Institute of Harvard and MIT
Francisca Vazquez: Broad Institute of Harvard and MIT
Joshua M. Dempster: Broad Institute of Harvard and MIT
Francesco Iorio: Computational Biology Research Centre, Human Technopole

DOI: https://doi.org/10.1186/s13059-024-03336-1
Journal volume & issue: Vol. 25, no. 1, pp. 1–25

Abstract

Background: CRISPR-Cas9 dropout screens are formidable tools for investigating biology with unprecedented precision and scale. However, biases in data lead to potential confounding effects on interpretation and compromise overall quality. The activity of Cas9 is influenced by structural features of the target site, including copy number amplifications (CN bias). More worryingly, proximal targeted loci tend to generate similar gene-independent responses to CRISPR-Cas9 targeting (proximity bias), possibly due to Cas9-induced whole chromosome-arm truncations or other genomic structural features and different chromatin accessibility levels.

Results: We benchmarked eight computational methods, rigorously evaluating their ability to reduce both CN and proximity bias in the two largest publicly available cell-line-based CRISPR-Cas9 screens to date. We also evaluated the capability of each method to preserve data quality and heterogeneity by assessing the extent to which the processed data allows accurate detection of true positive essential genes, established oncogenetic addictions, and known/novel biomarkers of cancer dependency. Our analysis sheds light on the ability of each method to correct biases under different scenarios. AC-Chronos outperforms other methods in correcting both CN and proximity biases when jointly processing multiple screens of models with available CN information, whereas CRISPRcleanR is the top performing method for individual screens or when CN information is not available. In addition, Chronos and AC-Chronos yield a final dataset better able to recapitulate known sets of essential and non-essential genes.

Conclusions: Overall, our investigation provides guidance for the selection of the most appropriate bias-correction method, based on its strengths, weaknesses and experimental settings.

Published in Genome Biology

ISSN: 1474-760X (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Science: Biology (General): Genetics
Website: https://genomebiology.biomedcentral.com/
