BMC Bioinformatics (Jul 2011)

Ranking insertion, deletion and nonsense mutations based on their effect on genetic information

  • Moses Alan M,
  • Zia Amin

DOI
https://doi.org/10.1186/1471-2105-12-299
Journal volume & issue
Vol. 12, no. 1
p. 299

Abstract

Read online

Abstract Background Genetic variations contribute to normal phenotypic differences as well as diseases, and new sequencing technologies are greatly increasing the capacity to identify these variations. Given the large number of variations now being discovered, computational methods to prioritize the functional importance of genetic variations are of growing interest. Thus far, the focus of computational tools has been mainly on the prediction of the effects of amino acid changing single nucleotide polymorphisms (SNPs) and little attention has been paid to indels or nonsense SNPs that result in premature stop codons. Results We propose computational methods to rank insertion-deletion mutations in the coding as well as non-coding regions and nonsense mutations. We rank these variations by measuring the extent of their effect on biological function, based on the assumption that evolutionary conservation reflects function. Using sequence data from budding yeast and human, we show that variations which that we predict to have larger effects segregate at significantly lower allele frequencies, and occur less frequently than expected by chance, indicating stronger purifying selection. Furthermore, we find that insertions, deletions and premature stop codons associated with disease in the human have significantly larger predicted effects than those not associated with disease. Interestingly, the large-effect mutations associated with disease show a similar distribution of predicted effects to that expected for completely random mutations. Conclusions This demonstrates that the evolutionary conservation context of the sequences that harbour insertions, deletions and nonsense mutations can be used to predict and rank the effects of the mutations.