Genome Biology (Sep 2021)

PAM-repeat associations and spacer selection preferences in single and co-occurring CRISPR-Cas systems

  • Jochem N. A. Vink,
  • Jan H. L. Baijens,
  • Stan J. J. Brouns

DOI
https://doi.org/10.1186/s13059-021-02495-9
Journal volume & issue
Vol. 22, no. 1
pp. 1 – 25

Abstract

Background: The adaptive CRISPR-Cas immune system stores sequences from past invaders as spacers in CRISPR arrays and thereby provides direct evidence that links invaders to hosts. Mapping CRISPR spacers has revealed many aspects of CRISPR-Cas biology, including target requirements such as the protospacer adjacent motif (PAM). However, studies have so far been limited by a low number of mapped spacers in the database.

Results: By using vast metagenomic sequence databases, we map approximately one-third of more than 200,000 unique CRISPR spacers from a variety of microbes and derive a catalog of more than two hundred unique PAM sequences associated with specific CRISPR-Cas subtypes. These PAMs are further used to correctly assign the orientation of CRISPR arrays, revealing conserved patterns between the last nucleotides of the CRISPR repeat and PAM. We could also deduce CRISPR-Cas subtype-specific preferences for targeting either template or coding strand of open reading frames. While some DNA-targeting systems (type I-E and type II systems) prefer the template strand and avoid mRNA, other DNA- and RNA-targeting systems (types I-A and I-B and type III systems) prefer the coding strand and mRNA. In addition, we find large-scale evidence that both CRISPR-Cas adaptation machinery and CRISPR arrays are shared between different CRISPR-Cas systems. This could lead to simultaneous DNA and RNA targeting of invaders, which may be effective at combating mobile genetic invaders.

Conclusions: This study has broad implications for our understanding of how CRISPR-Cas systems work in a wide range of organisms for which only the genome sequence is known.
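
The abstract does not say how a PAM is derived from mapped spacers, so the following is only a minimal illustrative sketch, not the authors' pipeline. It assumes the flanking nucleotides next to each mapped protospacer have already been extracted and aligned to equal length. The function names (`pam_consensus`, `information_bits`), the `min_fraction` threshold, and the toy sequences are all hypothetical.

```python
from collections import Counter
import math


def pam_consensus(flanks, min_fraction=0.6):
    """Build a simple consensus from equal-length flanking sequences.

    A position is reported as its most common base when that base reaches
    `min_fraction` of the sequences, and as 'N' otherwise.
    The threshold is an illustrative assumption, not a published value.
    """
    if not flanks:
        raise ValueError("at least one flanking sequence is required")
    length = len(flanks[0])
    if any(len(f) != length for f in flanks):
        raise ValueError("all flanking sequences must have the same length")

    consensus = []
    for i in range(length):
        counts = Counter(f[i].upper() for f in flanks)
        base, n = counts.most_common(1)[0]
        consensus.append(base if n / len(flanks) >= min_fraction else "N")
    return "".join(consensus)


def information_bits(flanks, position):
    """Information content (bits) at one position: 2 minus Shannon entropy.

    2 bits means a fully conserved base; 0 bits means all four bases are
    equally frequent. No small-sample correction is applied.
    """
    counts = Counter(f[position].upper() for f in flanks)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return 2.0 - entropy


# Toy, made-up flanks (not data from the study).
toy_conserved = ["AAG", "AAG", "AAG", "ATG", "AAG"]
toy_degenerate = ["AGG", "TGG", "CGG", "GGG"]

print(pam_consensus(toy_conserved))
print(pam_consensus(toy_degenerate))
print(information_bits(toy_degenerate, 0), information_bits(toy_degenerate, 1))
```

In practice a per-position information score of this kind is often plotted as a sequence logo. A real analysis would also need to handle which side of the protospacer the flank comes from, which the abstract notes depends on assigning the array orientation correctly.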