BMC Genomics (Nov 2024)
A catalog of gene editing sites and genetic variations in editing sites in model organisms
Abstract
Background: CRISPR-Cas systems require a protospacer adjacent motif (PAM), which plays an essential role in self/non-self discrimination in their natural context, to cleave DNA for genome editing. However, common genetic variation is distributed throughout genomes and can block recognition of target sites by Cas proteins. Little information is available about the distribution of editing sites in model organisms or about how often common variation overlaps with PAM sites.
Results: We characterized the genomic editing sites of six representative Cas proteins (Cas9, Cas12a, Cas12b, Cas12i, Cas12j and Cas12l) in ten model organisms (yeast, flatworms, flies, zebrafish, mice, humans, rice, maize, Arabidopsis and tomato). These genomes contained more than 34 editing sites per kilobase on average. In each genome, 91.69–99.83% of genes had at least one unique editing site in an exon, and 95.4–99.73% had at least one in the promoter. Using publicly available genomic diversity data, we identified variations (SNPs and InDels) within editing sites in humans and rice, which indicates a risk in the application of CRISPR/Cas technology. Finally, using CCR5 and BCL11A as examples, we showed that variation at target sites is a factor that must be considered when designing sgRNAs.
Conclusions: Our findings reveal the distribution characteristics of the editing sites of six representative Cas proteins in ten model organism genomes and shed light on the adverse effect of variant sites on target site recognition. This work serves as a reminder of the risks of CRISPR application.
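To make the notions of an editing site, site density, and PAM-overlapping variation concrete, the following is a minimal illustrative sketch and not the authors' pipeline. It assumes the widely used SpCas9 convention (a 20-nt protospacer followed on its 3' side by an NGG PAM) and scans both strands of a sequence. The function names, the 0-based half-open coordinates, and the 20-nt spacer length are assumptions made for this example; the other Cas proteins named in the abstract use different PAMs and orientations and are not covered here.

```python
import re


def find_cas9_sites(seq, spacer_len=20):
    """Find candidate SpCas9 sites (assumed: 20-nt protospacer + 3' NGG PAM).

    Returns tuples (strand, proto_start, proto_end, pam_start, pam_end) in
    0-based, half-open coordinates on the forward strand.
    """
    seq = seq.upper()
    sites = []
    # Forward strand: the protospacer lies immediately upstream of NGG.
    # A zero-width lookahead lets overlapping PAMs be reported.
    for m in re.finditer(r"(?=[ACGT]GG)", seq):
        pam_start = m.start()
        if pam_start >= spacer_len:
            sites.append(("+", pam_start - spacer_len, pam_start,
                          pam_start, pam_start + 3))
    # Reverse strand: an NGG PAM appears as CCN on the forward strand,
    # with the protospacer immediately downstream of it.
    for m in re.finditer(r"(?=CC[ACGT])", seq):
        pam_start = m.start()
        proto_start = pam_start + 3
        if proto_start + spacer_len <= len(seq):
            sites.append(("-", proto_start, proto_start + spacer_len,
                          pam_start, pam_start + 3))
    return sites


def sites_per_kb(seq, spacer_len=20):
    """Density of candidate sites per 1,000 bases of input sequence."""
    if not seq:
        return 0.0
    return len(find_cas9_sites(seq, spacer_len)) / len(seq) * 1000


def variant_overlaps_pam(site, variant_pos):
    """True if a 0-based variant position falls inside the site's PAM."""
    _, _, _, pam_start, pam_end = site
    return pam_start <= variant_pos < pam_end
```

For example, `find_cas9_sites("A" * 20 + "TGG")` reports a single forward-strand site whose PAM occupies positions 20 to 22, and a SNP at position 21 would be flagged by `variant_overlaps_pam`. A genome-scale analysis would additionally need to handle ambiguous bases, site uniqueness, and InDels, none of which this sketch attempts.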
Keywords