A data-driven approach for constructing mutation categories for mutational signature analysis.

Gal Gilad; Mark D M Leiserson; Roded Sharan

doi:10.1371/journal.pcbi.1009542

PLoS Computational Biology (Oct 2021)



DOI: https://doi.org/10.1371/journal.pcbi.1009542
Journal volume & issue: Vol. 17, no. 10, p. e1009542

Abstract


Mutational processes shape the genomes of cancer patients, and understanding them has important applications in diagnosis and treatment. Current modeling of mutational processes by identifying their characteristic signatures views each base substitution in a limited context of a single flanking base on each side. This context definition gives rise to 96 categories of mutations that have become the standard in the field, even though wider contexts have been shown to be informative in specific cases. Here we propose a data-driven approach for constructing a mutation categorization for mutational signature analysis. Our approach is based on the assumption that tumor cells exposed to similar mutational processes show similar expression levels of the DNA damage repair genes involved in these processes. We attempt to find a categorization that maximizes the agreement between mutation and gene expression data, and show that it outperforms the standard categorization over multiple quality measures. Moreover, we show that the categorization we identify generalizes to unseen data from different cancer types, suggesting that mutation context patterns extend beyond the immediate flanking bases.
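To make the count of 96 concrete, the following minimal sketch enumerates the standard categories: six substitution types, each combined with four possible bases on either side. It assumes the common convention of reporting substitutions relative to the pyrimidine (C or T) reference base, which the abstract does not spell out. The function name `mutation_categories` and the `N[ref>alt]N` label format are illustrative choices, not taken from the paper, and the sketch does not reproduce the authors' data-driven categorization.

```python
from itertools import product

BASES = "ACGT"
# Assumption: substitutions are reported with a pyrimidine (C or T) reference base,
# which leaves six substitution types: C>A, C>G, C>T, T>A, T>C, T>G.
SUBSTITUTIONS = [(ref, alt) for ref in "CT" for alt in BASES if alt != ref]


def mutation_categories(flank=1):
    """Enumerate context categories using `flank` bases on each side (hypothetical helper)."""
    categories = []
    for ref, alt in SUBSTITUTIONS:
        for left in product(BASES, repeat=flank):
            for right in product(BASES, repeat=flank):
                categories.append(f"{''.join(left)}[{ref}>{alt}]{''.join(right)}")
    return categories


standard = mutation_categories(flank=1)
print(len(standard))                       # 6 * 4 * 4 = 96
print(len(mutation_categories(flank=2)))   # 6 * 4**4 = 1536
```

Widening the context to two flanking bases per side already yields 1536 categories, which illustrates why a wider context cannot simply be adopted wholesale and why a categorization that groups contexts is of interest.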

Published in PLoS Computational Biology

ISSN: 1553-734X (Print); 1553-7358 (Online)
Publisher: Public Library of Science (PLoS)
Country of publisher: United States
LCC subjects: Science: Biology (General)
Website: https://journals.plos.org/ploscompbiol/
