Frontiers in Genetics (Nov 2021)
Construction of a Support Vector Machine–Based Classifier for Pulmonary Arterial Hypertension Patients
Abstract
Pulmonary arterial hypertension (PAH) is a disease in which increased pulmonary arterial tension and vascular resistance lead to right heart failure and death. PAH is still not fully understood, and current treatments are limited. In this study, gene expression profiles of healthy people and PAH patients in the GSE33463 dataset were analyzed, yielding 110 differentially expressed genes (DEGs). A protein-protein interaction (PPI) network was then constructed from the DEGs and its functional modules were analyzed; the genes in the major functional modules were significantly enriched in immune-related functions. Next, four optimal feature genes (EPB42, IFIT2, FOSB, and SNF1LK) were screened from the DEGs with the support vector machine–recursive feature elimination (SVM-RFE) algorithm. The receiver operating characteristic (ROC) curve showed that an SVM classifier based on these optimal feature genes could effectively distinguish healthy people from PAH patients. Finally, the expression of the optimal feature genes was analyzed in the GSE33463 dataset and in clinical samples: EPB42 and IFIT2 were highly expressed in PAH patients, whereas FOSB and SNF1LK were expressed at low levels. In conclusion, the four optimal feature genes screened here are potential biomarkers for PAH and may be useful for its early diagnosis.
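For readers unfamiliar with SVM-RFE, the feature-screening and ROC steps summarized above can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the gene names (`gene_0` to `gene_9`) are hypothetical, the linear SVM is trained by plain stochastic subgradient descent on the hinge loss, and all hyperparameters (`lr`, `lam`, `epochs`) are assumptions chosen for illustration. Following the usual SVM-RFE criterion, each round drops the feature with the smallest squared weight.

```python
import random

def make_data(n_per_class, n_features, n_informative, rng):
    """Synthetic stand-in for expression data (assumption): only the first
    n_informative features have class-dependent means."""
    X, y = [], []
    for label in (-1, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label * 1.0 if j < n_informative else 0.0, 1.0)
                      for j in range(n_features)])
            y.append(label)
    return X, y

def train_linear_svm(X, y, lr=0.01, lam=0.01, epochs=50, seed=0):
    """Linear SVM via stochastic subgradient descent on the L2-regularized
    hinge loss. Labels must be -1 or +1. Returns (weights, bias)."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            if margin < 1:  # sample violates the margin: hinge term is active
                w = [wj - lr * (lam * wj - y[i] * xj) for wj, xj in zip(w, X[i])]
                b += lr * y[i]
            else:           # only the regularizer contributes
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def svm_rfe(X, y, names, n_keep):
    """Recursive feature elimination: retrain on the surviving features and
    drop the one with the smallest squared weight until n_keep remain."""
    active = list(range(len(names)))
    while len(active) > n_keep:
        w, _ = train_linear_svm([[row[j] for j in active] for row in X], y)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        del active[worst]
    return [names[j] for j in active]

def roc_auc(scores, y):
    """Area under the ROC curve as the probability that a positive sample
    outscores a negative one (ties count as one half)."""
    pos = [s for s, label in zip(scores, y) if label == 1]
    neg = [s for s, label in zip(scores, y) if label == -1]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

rng = random.Random(42)
names = [f"gene_{j}" for j in range(10)]          # hypothetical gene names
X_train, y_train = make_data(40, 10, 4, rng)
X_test, y_test = make_data(40, 10, 4, rng)

selected = svm_rfe(X_train, y_train, names, n_keep=4)
cols = [names.index(g) for g in selected]
w, b = train_linear_svm([[row[j] for j in cols] for row in X_train], y_train)
scores = [sum(wj * row[j] for wj, j in zip(w, cols)) + b for row in X_test]
auc = roc_auc(scores, y_test)
print(selected, round(auc, 3))
```

In this sketch the AUC is computed on a held-out synthetic set so that the selection step does not see the evaluation samples. A real analysis would more likely rely on an established SVM implementation and cross-validation.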
Keywords