International Journal of Genomics (Jan 2023)
Identification of Biomarkers Associated with Heart Failure Caused by Idiopathic Dilated Cardiomyopathy Using WGCNA and Machine Learning Algorithms
Abstract
Background. The genetic factors and pathogenesis of idiopathic dilated cardiomyopathy-induced heart failure (IDCM-HF) are not thoroughly understood, and the disease lacks specific diagnostic markers and treatments. We therefore aimed to identify its molecular mechanisms and potential molecular markers.
Methods. Gene expression profiles of IDCM-HF and non-heart failure (NF) specimens were obtained from the Gene Expression Omnibus (GEO) database. We identified differentially expressed genes (DEGs) and analyzed their functions and related pathways using Metascape. Weighted gene co-expression network analysis (WGCNA) was used to identify key module genes. Candidate genes were obtained by intersecting the WGCNA key module genes with the DEGs and were further screened using support vector machine-recursive feature elimination (SVM-RFE) and the least absolute shrinkage and selection operator (LASSO) algorithm. Finally, the diagnostic efficacy of the resulting biomarkers was evaluated by the area under the curve (AUC), and their differential expression between the IDCM-HF and NF groups was confirmed using an external database.
Results. We detected 490 genes differentially expressed between IDCM-HF and NF specimens in the GSE57338 dataset; most were enriched in biological processes and pathways related to the extracellular matrix (ECM). After screening, 13 candidate genes were identified. Aquaporin 3 (AQP3) and cytochrome P450 2J2 (CYP2J2) showed high diagnostic efficacy in the GSE57338 and GSE6406 datasets, respectively. Compared with the NF group, AQP3 was significantly down-regulated in the IDCM-HF group, while CYP2J2 was significantly up-regulated.
Conclusion. To our knowledge, this is the first study to combine WGCNA and machine learning algorithms to screen for potential biomarkers of IDCM-HF. Our findings suggest that AQP3 and CYP2J2 could serve as novel diagnostic markers and treatment targets for IDCM-HF.
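Two steps of the workflow described above, intersecting the DEGs with the key module genes and scoring a single gene by its AUC, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the gene lists and expression values are invented toy data, `GENE_A`, `GENE_B`, and `GENE_C` are hypothetical placeholders, the function names are assumptions, and the SVM-RFE and LASSO screening steps are omitted. The AUC is computed here as the probability that a randomly chosen IDCM-HF sample has higher expression than a randomly chosen NF sample, which is one standard way to define it.

```python
from itertools import product


def candidate_genes(degs, module_genes):
    """Return the genes found in both the DEG list and the key module."""
    return sorted(set(degs) & set(module_genes))


def auc(case_values, control_values):
    """Probability that a random case value exceeds a random control value.

    Ties count as 0.5. Values below 0.5 mean the gene tends to be lower
    in cases than in controls.
    """
    wins = 0.0
    for case, control in product(case_values, control_values):
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    return wins / (len(case_values) * len(control_values))


def diagnostic_auc(case_values, control_values):
    """Direction-independent AUC, so down-regulated markers also score high."""
    a = auc(case_values, control_values)
    return max(a, 1.0 - a)


# Toy inputs (invented for illustration; not data from the study).
degs = ["AQP3", "CYP2J2", "GENE_A", "GENE_B"]
module_genes = ["CYP2J2", "GENE_B", "GENE_C", "AQP3"]
candidates = candidate_genes(degs, module_genes)

expression = {
    "AQP3": {"IDCM-HF": [2.1, 2.4, 1.9, 2.6], "NF": [3.0, 3.3, 2.5, 3.1]},
    "CYP2J2": {"IDCM-HF": [5.2, 5.8, 6.1, 5.5], "NF": [4.9, 5.0, 5.3, 4.7]},
}

for gene in ("AQP3", "CYP2J2"):
    groups = expression[gene]
    raw = auc(groups["IDCM-HF"], groups["NF"])
    direction = "up" if raw > 0.5 else "down"
    score = diagnostic_auc(groups["IDCM-HF"], groups["NF"])
    print(f"{gene}: {direction}-regulated in IDCM-HF, diagnostic AUC = {score}")
```

In this toy data the raw AUC for the down-regulated gene falls below 0.5, so `diagnostic_auc` reports the larger of the AUC and its complement; a real analysis would use the full expression matrices and a validated statistics library.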