Human Genomics (Mar 2023)

Allelic phenotype prediction of phenylketonuria based on the machine learning method

  • Yang Fang,
  • Jinshuang Gao,
  • Yaqing Guo,
  • Xiaole Li,
  • Enwu Yuan,
  • Erfeng Yuan,
  • Liying Song,
  • Qianqian Shi,
  • Haiyang Yu,
  • Dehua Zhao,
  • Linlin Zhang

DOI
https://doi.org/10.1186/s40246-023-00481-9
Journal volume & issue
Vol. 17, no. 1
pp. 1 – 9

Abstract

Background: Phenylketonuria (PKU) is caused by mutations in the phenylalanine hydroxylase (PAH) gene. Our study aimed to predict the phenotype from the allelic genotype.

Methods: A total of 1291 PKU patients carrying 623 distinct variants were used as the training dataset for predicting allelic phenotypes. We designed a general machine learning framework to predict the phenotype associated with each allelic genotype.

Results: We identified 235 different mutations and 623 distinct allelic genotypes. The method, which predicts the PKU phenotype from features extracted from the structure of the mutations and from graph properties of the PKU network, was named PPML (PKU phenotype predicted by machine learning). The PKU phenotype was classified into three categories: classical PKU (cPKU), mild PKU (mPKU) and mild hyperphenylalaninemia (MHP). Three hub nodes (c.728G>A for cPKU, c.721 for mPKU and c.158G>A for HPA) served as the classification centers, one per category, and 5 node attributes were extracted from the network graph as training features. The area under the ROC curve (AUC) was 0.832 for cPKU, 0.678 for mPKU and 0.874 for MHP. These results suggest that PPML is a powerful method for predicting allelic phenotypes in PKU and that it can be used for genetic counseling of PKU families.

Conclusions: The web version of PPML predicts PKU allele classification, supported by applicable real cases and prediction results. It is available as an online database for PKU phenotype prediction at http://www.bioinfogenetics.info/PPML/.
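
The abstract reports one AUC per phenotype class, which for a three-class problem is usually read as a one-vs-rest AUC: each class in turn is treated as positive and the other two as negative. The sketch below shows that calculation in plain Python using the rank (Mann-Whitney) formulation. It is only an illustration, not the authors' evaluation code: the function name `auc_one_vs_rest`, the sample labels, and the scores are all hypothetical, and the abstract does not state which AUC scheme the study used.

```python
def auc_one_vs_rest(labels, scores, positive):
    """AUC for one class against all others (rank / Mann-Whitney form).

    AUC is the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample,
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == positive]
    neg = [s for y, s in zip(labels, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only; these are NOT values from the study.
labels = ["cPKU", "cPKU", "mPKU", "MHP", "mPKU", "MHP"]

# Hypothetical classifier output: one score per class for each sample.
probs = {
    "cPKU": [0.8, 0.6, 0.3, 0.1, 0.7, 0.2],
    "mPKU": [0.1, 0.3, 0.5, 0.2, 0.2, 0.1],
    "MHP":  [0.1, 0.1, 0.2, 0.7, 0.1, 0.7],
}

for cls in ("cPKU", "mPKU", "MHP"):
    print(cls, round(auc_one_vs_rest(labels, probs[cls], cls), 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so on this reading the reported 0.678 for mPKU indicates weaker separation than the values for cPKU and MHP.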

Keywords