Genome Medicine (Aug 2023)

Sequence dependencies and mutation rates of localized mutational processes in cancer

  • Gustav Alexander Poulsgaard,
  • Simon Grund Sørensen,
  • Randi Istrup Juul,
  • Morten Muhlig Nielsen,
  • Jakob Skou Pedersen

DOI
https://doi.org/10.1186/s13073-023-01217-z
Journal volume & issue
Vol. 15, no. 1
pp. 1 – 19

Abstract

Background: Cancer mutations accumulate through replication errors and DNA damage coupled with incomplete repair. Individual mutational processes often show preferences for particular nucleotide sequences and functional regions. As a result, some sequence contexts mutate at much higher rates than others, with additional variation between functional regions. Mutational hotspots, where mutations recur across cancer samples, mark genomic positions with elevated mutation rates, often caused by highly localized mutational processes.

Methods: We count the 11-mer sequences across the genome and, using the PCAWG set of 2583 pan-cancer whole genomes, associate 11-mers with mutational signatures, hotspots of single nucleotide variants, and specific genomic regions. We evaluate the mutation rates of individual and combined sets of 11-mers and derive mutational sequence motifs.

Results: We show that hotspots generally identify highly mutable sequence contexts. Using these, we show that some mutational signatures are enriched in hotspot sequence contexts, corresponding to well-defined sequence preferences for the underlying localized mutational processes. These include signature 17b (of unknown etiology) and signatures 62 (POLE deficiency), 7a (UV), and 72 (linked to lymphomas). In some cases, the mutation rate and sequence preference increase further within certain genomic regions. For signature 62 in transcribed regions, for example, the mutation rate rises up to 9-fold over the average for the cancer type and mutational signature.

Conclusions: We summarize our findings in a catalog of localized mutational processes, their sequence preferences, and their estimated mutation rates.
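The counting step described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`count_kmers`, `kmer_mutation_rates`, `fold_over_average`) and the input layout are hypothetical, strands are not collapsed, and signatures and genomic regions are ignored. It assumes a k-mer's mutation rate is the number of single nucleotide variants at its central position divided by the number of genomic occurrences times the number of samples.

```python
from collections import Counter

def count_kmers(sequence, k=11):
    """Count each k-mer in the sequence, skipping windows with non-ACGT bases."""
    counts = Counter()
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts

def kmer_mutation_rates(sequence, mutated_positions, n_samples, k=11):
    """Per-k-mer mutation rate, with each k-mer centred on the mutated base.

    mutated_positions holds 0-based positions, one entry per observed SNV,
    so a position mutated in several samples appears several times.
    """
    flank = k // 2
    occurrences = count_kmers(sequence, k)
    mutations = Counter()
    for pos in mutated_positions:
        if pos - flank < 0 or pos + flank >= len(sequence):
            continue  # no full k-mer context this close to the sequence ends
        kmer = sequence[pos - flank:pos + flank + 1]
        if kmer in occurrences:
            mutations[kmer] += 1
    rates = {kmer: mutations[kmer] / (n * n_samples)
             for kmer, n in occurrences.items()}
    return rates, occurrences

def fold_over_average(rates, occurrences):
    """Express each k-mer rate relative to the occurrence-weighted average rate."""
    total_sites = sum(occurrences.values())
    average = sum(rates[kmer] * occurrences[kmer] for kmer in rates) / total_sites
    if average == 0:
        return {kmer: float("nan") for kmer in rates}
    return {kmer: rate / average for kmer, rate in rates.items()}

# Toy input (assumption): a 10-base sequence, 3-mers instead of 11-mers, 2 samples.
toy_rates, toy_occurrences = kmer_mutation_rates("ACGTACGTAC", [1, 5, 2], n_samples=2, k=3)
toy_folds = fold_over_average(toy_rates, toy_occurrences)
```

On real data the same ratios would be computed per cancer type, signature, and region. Fold values such as the 9-fold increase reported above are defined in the paper relative to those stratified averages, which this sketch does not reproduce.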

Keywords