Frontiers in Bioinformatics (Feb 2023)

Protein domains provide a new layer of information for classifying human variations in rare diseases

  • Mélanie Corcuff,
  • Marc Garibal,
  • Jean-Pierre Desvignes,
  • Céline Guien,
  • Coralie Grattepanche,
  • Gwenaëlle Collod-Béroud,
  • Estelle Ménoret,
  • David Salgado,
  • Christophe Béroud

DOI
https://doi.org/10.3389/fbinf.2023.1127341
Journal volume & issue
Vol. 3

Abstract

Introduction: When sequence variants are interpreted under the ACMG-AMP guidelines, the protein domain criterion, PM1, remains difficult to meet: it is assigned in only about 10% of cases, whereas the criteria related to variant frequency, PM2/BA1/BS1, are reported in 50% of cases. To improve the classification of human missense variants using protein domain information, we developed the DOLPHIN system (https://dolphin.mmg-gbit.eu).

Methods: We used Pfam alignments of eukaryotes to define DOLPHIN scores that identify protein domain residues and variants with a significant impact. In parallel, we enriched gnomAD variant frequencies for each domain residue. These scores and frequencies were validated using ClinVar data.

Results: We applied this method to all potential variants of human transcripts: 30.0% were assigned a PM1 label, whereas 33.2% were eligible for a new benign supporting criterion, BP8. We also showed that DOLPHIN provides an extrapolated frequency for 31.8% of the variants, whereas an original frequency is available in gnomAD for only 7.6% of them.

Discussion: Overall, DOLPHIN allows a simplified use of the PM1 criterion, an expanded application of the PM2/BS1 criteria, and the creation of a new BP8 criterion. DOLPHIN could facilitate the classification of amino acid substitutions in protein domains, which cover nearly 40% of proteins and are the sites of most pathogenic variants.

Keywords