Genome Biology (Jun 2020)
Whole-genome sequencing of glioblastoma reveals enrichment of non-coding constraint mutations in known and novel genes
Abstract
Abstract Background Glioblastoma (GBM) has one of the worst 5-year survival rates of all cancers. While genomic studies of the disease have been performed, alterations in the non-coding regulatory regions of GBM have largely remained unexplored. We apply whole-genome sequencing (WGS) to identify non-coding mutations, with regulatory potential in GBM, under the hypothesis that regions of evolutionary constraint are likely to be functional, and somatic mutations are likely more damaging than in unconstrained regions. Results We validate our GBM cohort, finding similar copy number aberrations and mutated genes based on coding mutations as previous studies. Performing analysis on non-coding constraint mutations and their position relative to nearby genes, we find a significant enrichment of non-coding constraint mutations in the neighborhood of 78 genes that have previously been implicated in GBM. Among them, SEMA3C and DYNC1I1 show the highest frequencies of alterations, with multiple mutations overlapping transcription factor binding sites. We find that a non-coding constraint mutation in the SEMA3C promoter reduces the DNA binding capacity of the region. We also identify 1776 other genes enriched for non-coding constraint mutations with likely regulatory potential, providing additional candidate GBM genes. The mutations in the top four genes, DLX5, DLX6, FOXA1, and ISL1, are distributed over promoters, UTRs, and multiple transcription factor binding sites. Conclusions These results suggest that non-coding constraint mutations could play an essential role in GBM, underscoring the need to connect non-coding genomic variation to biological function and disease pathology.
Keywords