Frontiers in Molecular Biosciences (Sep 2022)

Pathogenic variation types in human genes relate to diseases through Pfam and InterPro mapping

  • Giulia Babbi,
  • Castrense Savojardo,
  • Davide Baldazzi,
  • Pier Luigi Martelli,
  • Rita Casadio

DOI
https://doi.org/10.3389/fmolb.2022.966927
Journal volume & issue
Vol. 9

Abstract

Grouping residue variations in a protein according to their physicochemical properties allows a dimensionality reduction of all the possible substitutions in a variant with respect to the wild type. Using a large dataset of proteins with disease-related and benign variations, derived by merging Humsavar and ClinVar data, we investigate to what extent our physicochemical grouping procedure can help determine whether patterns of variation types are related to specific groups of diseases and whether they occur in Pfam and/or InterPro gene domains. We download 75,145 germline disease-related and benign variations of 3,605 genes, group them according to physicochemical categories, and map them into Pfam and InterPro gene domains. Statistically validated analysis indicates that each cluster of genes associated with Mondo anatomical system categorizations is characterized by a specific variation pattern. Patterns identify specific Pfam and InterPro domain–Mondo category associations. Our data suggest that the association of variation patterns with Mondo categories is unique and may help in associating gene variants with genetic diseases. This work corroborates, in a much larger dataset, previous observations from our group.
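
The dimensionality reduction described in the abstract can be illustrated with a minimal sketch. The abstract does not list the physicochemical categories, so the four-group assignment below is an assumption made for illustration only, and the names GROUPS, variation_type and variation_pattern are hypothetical. Under this assumed grouping, the 380 possible single-residue substitutions (20 × 19) collapse into 16 ordered pairs of groups, and a set of variations can be summarized by the frequency of each pair.

```python
from collections import Counter

# Assumed grouping, for illustration only: the abstract does not
# specify which residues belong to which physicochemical category.
GROUPS = {
    "apolar": "GAVPLIM",
    "aromatic": "FWY",
    "polar": "STCNQH",
    "charged": "DEKR",
}

# Reverse lookup: one-letter residue code -> group name.
RESIDUE_TO_GROUP = {aa: group for group, aas in GROUPS.items() for aa in aas}


def variation_type(wild_type, variant):
    """Map a substitution (wild type -> variant) to an ordered pair of groups."""
    return (RESIDUE_TO_GROUP[wild_type], RESIDUE_TO_GROUP[variant])


def variation_pattern(substitutions):
    """Return the relative frequency of each variation type.

    `substitutions` is an iterable of (wild_type, variant) one-letter pairs.
    An empty input yields an empty pattern.
    """
    counts = Counter(variation_type(wt, var) for wt, var in substitutions)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {vtype: n / total for vtype, n in counts.items()}


if __name__ == "__main__":
    # Made-up substitutions, not taken from the study's dataset.
    example = [("R", "W"), ("K", "F"), ("R", "H"), ("G", "A")]
    for vtype, freq in variation_pattern(example).items():
        print(vtype, freq)
```

Patterns computed this way for different sets of genes (for example, genes grouped by disease category, or variations falling inside a given domain) could then be compared with a statistical test of the reader's choice. The sketch covers only the grouping step and does not reproduce the statistical validation reported in the study.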

Keywords