Translational Psychiatry (Jun 2024)

A novel blood-based epigenetic biosignature in first-episode schizophrenia patients through automated machine learning

  • Makrina Karaglani,
  • Agorastos Agorastos,
  • Maria Panagopoulou,
  • Eleni Parlapani,
  • Panagiotis Athanasis,
  • Panagiotis Bitsios,
  • Konstantina Tzitzikou,
  • Theodosis Theodosiou,
  • Ioannis Iliopoulos,
  • Vasilios-Panteleimon Bozikas,
  • Ekaterini Chatzaki

DOI
https://doi.org/10.1038/s41398-024-02946-4
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 12

Abstract

Read online

Abstract Schizophrenia (SCZ) is a chronic, severe, and complex psychiatric disorder that affects all aspects of personal functioning. While SCZ has a very strong biological component, there are still no objective diagnostic tests. Lately, special attention has been given to epigenetic biomarkers in SCZ. In this study, we introduce a three-step, automated machine learning (AutoML)-based, data-driven, biomarker discovery pipeline approach, using genome-wide DNA methylation datasets and laboratory validation, to deliver a highly performing, blood-based epigenetic biosignature of diagnostic clinical value in SCZ. Publicly available blood methylomes from SCZ patients and healthy individuals were analyzed via AutoML, to identify SCZ-specific biomarkers. The methylation of the identified genes was then analyzed by targeted qMSP assays in blood gDNA of 30 first-episode drug-naïve SCZ patients and 30 healthy controls (CTRL). Finally, AutoML was used to produce an optimized disease-specific biosignature based on patient methylation data combined with demographics. AutoML identified a SCZ-specific set of novel gene methylation biomarkers including IGF2BP1, CENPI, and PSME4. Functional analysis investigated correlations with SCZ pathology. Methylation levels of IGF2BP1 and PSME4, but not CENPI were found to differ, IGF2BP1 being higher and PSME4 lower in the SCZ group as compared to the CTRL group. Additional AutoML classification analysis of our experimental patient data led to a five-feature biosignature including all three genes, as well as age and sex, that discriminated SCZ patients from healthy individuals [AUC 0.755 (0.636, 0.862) and average precision 0.758 (0.690, 0.825)]. In conclusion, this three-step pipeline enabled the discovery of three novel genes and an epigenetic biosignature bearing potential value as promising SCZ blood-based diagnostics.