BMC Bioinformatics (Jun 2019)

In silico analysis of missense mutations in exons 1–5 of the F9 gene that cause hemophilia B

  • Lennon Meléndez-Aranda,
  • Ana Rebeca Jaloma-Cruz,
  • Nina Pastor,
  • Marina María de Jesús Romero-Prado

DOI
https://doi.org/10.1186/s12859-019-2919-x
Journal volume & issue
Vol. 20, no. 1
pp. 1 – 13

Abstract


Background

Missense mutations in the first five exons of F9, which encodes coagulation factor IX (FIX), represent 40% of all mutations that cause hemophilia B. To address the ongoing debate regarding in silico identification of disease-causing mutations in these exons, we analyzed 215 missense mutations from www.factorix.org using six in silico prediction tools. These are the most commonly used programs for predicting the impact of mutations on protein structure and function, and they have the further advantage of using similar approaches. We developed different algorithms to integrate multiple predictions from these tools. To add a structural analysis of FIX, we modeled five selected pathogenic mutations.

Results

SIFT, PolyPhen-2 HumDiv, SNAP2, and MutationAssessor were the most successful in identifying true non-causative and causative mutations. A proposed function integrating these algorithms (wgP4) was more sensitive (90.1%), specific (22.6%), and accurate (87%) than similar functions, and identified 187 variants as deleterious. Clinical phenotype was significantly associated with predicted causative mutations in all five exons. However, PolyPhen-2 HumDiv was more successful in linking clinical severity to specific exons, while functions that integrate 4–6 predictions were more successful in linking phenotype to genotype in the light chain (exons 3–5). The most important value of integrating multiple predictions is the inclusion of scores derived from different approaches. Modeling of protein structure showed the effects of pathogenic nonsynonymous SNPs (nsSNPs) on the structure and function of FIX.

Conclusions

A simple function that integrates information from different in silico programs yields the best prediction of mutated phenotypes. However, the specificity, sensitivity, and accuracy of genotype-phenotype predictions depend on specific characteristics of the protein domain and the disease of interest, as confirmed by our structural analysis of selected pathogenic F9 mutations. The proposed integrating function (wgP4) might be useful for analyzing the impact of nsSNPs in other genes.
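The abstract does not spell out how wgP4 combines the four tools, so the following is only a minimal sketch of the general idea: a weighted vote over per-tool deleterious/neutral calls, scored with the standard definitions of sensitivity, specificity, and accuracy. The tool keys, the equal weights, the 0.5 threshold, and both function names are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical equal weights; the actual wgP4 weighting is not given in the abstract.
WEIGHTS: Dict[str, float] = {
    "SIFT": 1.0,
    "PolyPhen2_HumDiv": 1.0,
    "SNAP2": 1.0,
    "MutationAssessor": 1.0,
}


def weighted_consensus(calls: Dict[str, bool],
                       weights: Dict[str, float] = WEIGHTS,
                       threshold: float = 0.5) -> bool:
    """Return True (deleterious) if the weighted share of tools
    calling the variant deleterious reaches the threshold."""
    total = sum(weights[tool] for tool in calls)
    score = sum(weights[tool] for tool, deleterious in calls.items() if deleterious)
    return score / total >= threshold


def confusion_metrics(predicted: List[bool], actual: List[bool]) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy from paired boolean labels
    (True = causative/deleterious, False = non-causative/neutral)."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    return {
        # Guard against empty classes to avoid division by zero.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "accuracy": (tp + tn) / len(actual) if actual else float("nan"),
    }
```

With equal weights this reduces to a majority vote. Uneven weights would let a tool that performs better on a given protein domain count for more, which is one way to read the abstract's point that the value of integration comes from combining scores derived from different approaches.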

Keywords