Rapid discrimination between deleterious and benign missense mutations in the CAGI 6 experiment

Eshel Faraggi; Robert L. Jernigan; Andrzej Kloczkowski

doi:10.1186/s40246-024-00655-z

Human Genomics (Aug 2024)

Affiliations

Eshel Faraggi: Research and Information Systems, LLC
Robert L. Jernigan: Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University
Andrzej Kloczkowski: The Steve and Cindy Rasmussen Institute for Genomic Medicine

DOI: https://doi.org/10.1186/s40246-024-00655-z
Journal volume & issue: Vol. 18, no. 1
pp. 1 – 7

Abstract

We describe the machine learning tool that we applied in the CAGI 6 experiment to predict whether single-residue mutations in proteins are deleterious or benign. This tool was trained using only single sequences, i.e., without multiple sequence alignments or structural information. Instead, we used global characterizations of the protein sequence. Training and testing data for human gene mutations were obtained from ClinVar (ncbi.nlm.nih.gov/pub/ClinVar/), and for non-human gene mutations from UniProt (www.uniprot.org). Testing was done on post-training data from ClinVar. This testing yielded high AUC and Matthews correlation coefficient (MCC) for well-trained examples but low generalizability. For genes with either sparse or unbalanced training data, the prediction accuracy is poor. The resulting prediction server is available online at http://www.mamiris.com/Shoni.cagi6.
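
For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how MCC and AUC are commonly computed for a binary deleterious/benign classifier. It is not the authors' evaluation code. The function names `mcc` and `auc`, the label encoding (1 = deleterious, 0 = benign), and the toy data are assumptions made for illustration only.

```python
from math import sqrt


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (assumed 1 = deleterious, 0 = benign)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    # Count a win when a positive outscores a negative; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the paper.
labels = [1, 1, 0, 0]
predicted_scores = [0.9, 0.4, 0.5, 0.1]
predicted_labels = [1 if s >= 0.5 else 0 for s in predicted_scores]

print(mcc(labels, predicted_labels))
print(auc(labels, predicted_scores))
```

In this sketch a classifier that assigns every variant to one class gets an MCC of 0.0 regardless of how many variants it labels correctly, which is one reason MCC is often reported alongside AUC when the classes are unbalanced.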

Published in Human Genomics

ISSN: 1479-7364 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine; Science: Biology (General): Genetics
Website: https://humgenomics.biomedcentral.com/