Heliyon (Sep 2024)
Identification and experimental validation of key genes in osteoarthritis based on machine learning algorithms and single-cell sequencing analysis
Abstract
Purpose: Osteoarthritis (OA) is a prevalent cause of disability in older adults. Identifying diagnostic markers for OA is essential for elucidating its mechanisms and facilitating early diagnosis.
Methods: We analyzed 53 synovial tissue samples (30 OA, 23 control) from two datasets in the Gene Expression Omnibus (GEO) database. We identified differentially expressed genes (DEGs) between the groups and applied six machine learning algorithms for dimensionality reduction to pinpoint characteristic genes (key genes). We classified the OA samples into subtypes based on these key genes and explored the differences in biological functions and immune characteristics among the subtypes, as well as the roles of the key genes. Additionally, we constructed a protein-protein interaction network to predict small molecules that target these genes. We also obtained synovial tissue data from the single-cell RNA sequencing dataset GSE152805, classified the cells into types, and examined variations in gene expression and their correlation with OA progression. Key gene expression was validated in cellular experiments using qPCR.
Results: Four genes, AGMAT, MAP3K8, PER1, and XIST, were identified as characteristic genes of OA, and each independently predicted the occurrence of OA. Based on these genes, the OA samples clustered into two subtypes that differed significantly in functional pathways and immune infiltration. Analysis of the single-cell RNA data yielded eight cell types, with synovial intimal fibroblasts (SIF) accounting for the highest proportion in each sample. The key genes were overexpressed in SIF and significantly correlated with OA progression and the content of immune cells (ICs). We validated the relative expression levels of the key genes in OA and normal cartilage tissue cells; except for XIST, the expression trends were consistent with the bioinformatics results.
Conclusion: Four genes, AGMAT, MAP3K8, PER1, and XIST, are closely related to the progression of OA and serve as diagnostic and predictive markers in early OA.
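The Methods describe using six machine learning algorithms to narrow the DEGs down to a small set of key genes, but the abstract does not say how the six outputs were combined. One common approach, assumed here purely for illustration, is to keep the genes selected by every algorithm (or by at least a chosen number of them). The function name `consensus_genes`, the selector names, and the toy gene labels below are all hypothetical and are not taken from the study.

```python
from collections import Counter


def consensus_genes(selections, min_votes=None):
    """Return genes picked by at least `min_votes` selectors, sorted by name.

    `selections` maps a selector name to the genes it chose.
    By default a gene must be chosen by every selector (strict intersection).
    """
    if min_votes is None:
        min_votes = len(selections)
    # Count each gene at most once per selector.
    votes = Counter(g for genes in selections.values() for g in set(genes))
    return sorted(g for g, n in votes.items() if n >= min_votes)


# Toy outputs of three hypothetical feature-selection methods
# (placeholder labels, not results from the paper).
toy_selections = {
    "selector_a": ["GENE1", "GENE2", "GENE3"],
    "selector_b": ["GENE2", "GENE3", "GENE4"],
    "selector_c": ["GENE3", "GENE4"],
}

print(consensus_genes(toy_selections))               # strict intersection
print(consensus_genes(toy_selections, min_votes=2))  # majority-style vote
```

Requiring agreement across all selectors gives a short, conservative gene list; lowering `min_votes` trades that strictness for a larger candidate set.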