Informatics in Medicine Unlocked (Jan 2019)
Computational assessment of somatic and germline mutations of p16INK4a: Structural insights and implications in disease
Abstract
The vast amounts of genomic data available today often overshadow findings that were laboriously uncovered decades ago. A bottleneck exists between the computational study of pathogenic variants and the pathogenicity of novel variants reported in the literature before the age of computers. With the decline in fatality from infectious diseases and the spread of industrialization, cancer has become a major ailment in the modern world. We pooled a large set of mutational data for germline and somatic mutations of CDKN2A. The gene encodes p16INK4a, which we found to be one of the most frequently altered regulatory proteins in most cancers. High-frequency somatic mutations in cancer samples and common germline variants of p16 were thoroughly sorted. Using a number of reliable computational tools, we assessed the pathogenicity of these variants and evaluated the structural properties of the mutants in relation to their pathogenic nature. A total of 295 missense variants were sorted from over 5000 available variants. Using a combination of eight pathogenicity prediction tools, we created a hierarchy of pathogenic p16 missense mutants. The global incidence and geographical distribution of germline p16 missense variants were examined. Taking into consideration the frequency of recurring somatic alterations in different samples and the global ordination of variant alleles, we found 23 missense mutants to be of special interest. Structural insights and homology with other members of the protein family revealed a number of structural considerations to take into account regarding these harmful substitutions.
Keywords: p16INK4a, Databases, Bioinformatics, Mutations, Cancer, Pathogenicity
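The abstract does not say how the eight pathogenicity prediction tools were combined into a hierarchy. One simple possibility, shown here only as an illustrative sketch, is to rank each variant by the number of tools that call it deleterious. The function name `rank_by_consensus`, the tool labels, and the variant labels are all hypothetical placeholders, not the tools, variants, or scoring scheme used in the study.

```python
from typing import Dict, List, Tuple


def rank_by_consensus(calls: Dict[str, Dict[str, bool]]) -> List[Tuple[str, int]]:
    """Rank variants by how many tools call them deleterious.

    `calls` maps a variant label to a mapping of tool label -> True if that
    tool predicts the variant to be deleterious. This is an assumed scoring
    scheme (simple vote count), not the method reported by the authors.
    """
    # Count the deleterious calls per variant (True counts as 1).
    scores = {
        variant: sum(tool_calls.values())
        for variant, tool_calls in calls.items()
    }
    # Sort by descending vote count; break ties alphabetically by label.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Placeholder input: three hypothetical variants scored by three hypothetical tools.
example_calls = {
    "variant_A": {"tool_1": True, "tool_2": True, "tool_3": False},
    "variant_B": {"tool_1": True, "tool_2": True, "tool_3": True},
    "variant_C": {"tool_1": False, "tool_2": False, "tool_3": True},
}

ranking = rank_by_consensus(example_calls)
```

A vote count treats every tool as equally reliable; a weighted scheme or a rank aggregation over the tools' raw scores would be an equally plausible reading of "hierarchy".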