Scientific Reports (Mar 2022)

Discriminant analysis and binary logistic regression enable more accurate prediction of autism spectrum disorder than principal component analysis

  • Wail M. Hassan,
  • Abeer Al-Dbass,
  • Laila Al-Ayadhi,
  • Ramesa Shafi Bhat,
  • Afaf El-Ansary

DOI
https://doi.org/10.1038/s41598-022-07829-6
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 13

Abstract

Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by impaired social interaction and restricted, repetitive behavior. Multiple studies have suggested mitochondrial dysfunction, glutamate excitotoxicity, and impaired detoxification mechanisms as accepted etiological mechanisms of ASD that can be targeted for therapeutic intervention. In the current study, blood samples were collected from 40 people with autism and 40 control participants after informed consent and full approval from the Institutional Review Board of King Saud University. Sodium (Na+), potassium (K+), lactate dehydrogenase (LDH), glutathione S-transferase (GST), and mitochondrial respiratory chain complex I (MRC1) were measured in the plasma of both groups. Predictive models were established to discriminate individuals with ASD from controls. The predictive power of these five variables, individually and in combination, was compared using the area under the receiver operating characteristic (ROC) curve (AUC). We compared the performance of principal component analysis (PCA), discriminant analysis (DA), and binary logistic regression (BLR) as ways to combine single variables and create the predictive models. K+ had the highest AUC (0.801) of any single variable, followed by GST, LDH, Na+, and MRC1, in that order. Combining the five variables resulted in higher AUCs than those obtained using single variables across all models. Both DA and BLR were superior to PCA and comparable to each other. In our study, the combination of Na+, K+, LDH, GST, and MRC1 showed the highest promise in discriminating individuals with autism from controls. These results provide a platform that can potentially be used to verify the efficacy of our models with a larger sample size or to evaluate other biomarkers.
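
To illustrate the AUC metric used to rank the variables, here is a minimal sketch that computes the AUC of a single marker as the probability that a randomly chosen case scores higher than a randomly chosen control (the Mann-Whitney formulation). The function name `auc` and the toy values are hypothetical and are not taken from the study; the authors' actual software and procedure are not described in the abstract.

```python
from itertools import product


def auc(case_scores, control_scores):
    """AUC of one marker: P(case > control), counting ties as 0.5.

    Equivalent to the area under the ROC curve when higher
    scores indicate the case group.
    """
    wins = 0.0
    for case, control in product(case_scores, control_scores):
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical toy measurements, NOT data from the study.
case_values = [4.1, 4.6, 5.0, 5.2, 4.4]
control_values = [3.9, 4.2, 4.0, 4.5, 3.8]

print(auc(case_values, control_values))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. To score a combined model the same way, one would pass the model's predicted score for each participant instead of a raw measurement.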