Human Genomics (Sep 2024)

AI-derived comparative assessment of the performance of pathogenicity prediction tools on missense variants of breast cancer genes

  • Rahaf M. Ahmad,
  • Bassam R. Ali,
  • Fatma Al-Jasmi,
  • Noura Al Dhaheri,
  • Saeed Al Turki,
  • Praseetha Kizhakkedath,
  • Mohd Saberi Mohamad

DOI
https://doi.org/10.1186/s40246-024-00667-9
Journal volume & issue
Vol. 18, no. 1
pp. 1 – 16

Abstract

Read online

Abstract Single nucleotide variants (SNVs) can exert substantial and extremely variable impacts on various cellular functions, making accurate predictions of their consequences challenging, albeit crucial especially in clinical settings such as in oncology. Laboratory-based experimental methods for assessing these effects are time-consuming and often impractical, highlighting the importance of in-silico tools for variant impact prediction. However, the performance metrics of currently available tools on breast cancer missense variants from benchmarking databases have not been thoroughly investigated, creating a knowledge gap in the accurate prediction of pathogenicity. In this study, the benchmarking datasets ClinVar and HGMD were used to evaluate 21 Artificial Intelligence (AI)-derived in-silico tools. Missense variants in breast cancer genes were extracted from ClinVar and HGMD professional v2023.1. The HGMD dataset focused on pathogenic variants only, to ensure balance, benign variants for the same genes were included from the ClinVar database. Interestingly, our analysis of both datasets revealed variants across genes with varying penetrance levels like low and moderate in addition to high, reinforcing the value of disease-specific tools. The top-performing tools on ClinVar dataset identified were MutPred (Accuracy = 0.73), Meta-RNN (Accuracy = 0.72), ClinPred (Accuracy = 0.71), Meta-SVM, REVEL, and Fathmm-XF (Accuracy = 0.70). While on HGMD dataset they were ClinPred (Accuracy = 0.72), MetaRNN (Accuracy = 0.71), CADD (Accuracy = 0.69), Fathmm-MKL (Accuracy = 0.68), and Fathmm-XF (Accuracy = 0.67). These findings offer clinicians and researchers valuable insights for selecting, improving, and developing effective in-silico tools for breast cancer pathogenicity prediction. Bridging this knowledge gap contributes to advancing precision medicine and enhancing diagnostic and therapeutic approaches for breast cancer patients with potential implications for other conditions.

Keywords