Genome Biology (Apr 2025)
Bayesian multi-study non-negative matrix factorization for mutational signatures
Abstract
Mutational signatures are typically identified from tumor genome sequencing data using non-negative matrix factorization (NMF). However, existing NMF techniques only decompose a single dataset, limiting rigorous comparisons of signatures across conditions. We propose a Bayesian NMF method that jointly decomposes multiple datasets to identify signatures and their sharing pattern across conditions. We propose a fully unsupervised “discovery-only” model and a semi-supervised “recovery-discovery” model that simultaneously estimates known and novel signatures, and extend both to estimate covariate effects. We demonstrate our approach on extensive simulations and apply our method to answer questions related to colorectal cancer and early-onset breast cancer.
Keywords