Frontiers in Genetics (Nov 2022)
Construction of the model for predicting prognosis by key genes regulating EGFR-TKI resistance
Abstract
Background: Previous studies suggest that patients with lung adenocarcinoma (LUAD) benefit significantly from epidermal growth factor receptor tyrosine kinase inhibitors (EGFR-TKIs). However, many LUAD patients develop resistance to EGFR-TKIs. This study therefore aimed to develop models that predict EGFR-TKI resistance and LUAD prognosis.
Methods: Two Gene Expression Omnibus (GEO) datasets (GSE31625 and GSE34228) served as discovery datasets to identify the differentially expressed genes (DEGs) common to EGFR-TKI-resistant LUAD profiles. The association of these common DEGs with LUAD prognosis was investigated in The Cancer Genome Atlas (TCGA) database. A risk score for predicting LUAD prognosis was then constructed by LASSO analysis, and its performance was evaluated in an independent dataset (GSE37745). A random forest (RF) model built on the risk score genes was trained in the training dataset, and its ability to distinguish EGFR-TKI-sensitive from EGFR-TKI-resistant samples was validated in the internal testing dataset and in external testing datasets (GSE122005, GSE80344, and GSE123066).
Results: From the discovery datasets, 267 common upregulated genes and 374 common downregulated genes were identified. Among these common DEGs, 59 genes were negatively associated with prognosis and 21 genes were positively associated with prognosis. Eight genes (ABCC2, ARL2BP, DKK1, FUT1, LRFN4, PYGL, SMNDC1, and SNAI2) were selected to construct the risk score signature. In both the discovery and independent validation datasets, LUAD patients with a higher risk score had a poorer prognosis. The nomogram based on the risk score performed well in prognosis prediction, with a C-index of 0.77. The expression levels of ABCC2, ARL2BP, DKK1, LRFN4, PYGL, SMNDC1, and SNAI2 were positively associated with EGFR-TKI resistance, whereas the expression level of FUT1 was positively correlated with EGFR-TKI responsiveness. The RF model distinguished EGFR-TKI-sensitive from EGFR-TKI-resistant samples well in the internal and external testing datasets, with areas under the curve (AUC) of 0.973 and 0.817, respectively.
Conclusion: Our investigation identified eight genes associated with EGFR-TKI resistance and provided models for predicting EGFR-TKI resistance and prognosis in LUAD patients.
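As an illustration of how a LASSO-derived risk score and a C-index are typically computed, the following is a minimal Python sketch and not the authors' implementation. The eight gene names come from the abstract. The coefficient values and their signs, the median cutoff, and the function names are hypothetical placeholders, because the abstract does not report the fitted coefficients or the cutoff rule.

```python
from statistics import median

# Gene names are taken from the abstract. The coefficients (values AND signs)
# are hypothetical placeholders, NOT the fitted LASSO coefficients.
HYPOTHETICAL_COEFS = {
    "ABCC2": 0.10, "ARL2BP": 0.20, "DKK1": 0.15, "FUT1": -0.12,
    "LRFN4": 0.08, "PYGL": 0.11, "SMNDC1": 0.09, "SNAI2": 0.14,
}


def risk_score(expression, coefs=HYPOTHETICAL_COEFS):
    """Linear risk score: sum of coefficient * expression over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefs.items())


def split_by_median(scores):
    """Label a patient 'high' if the score exceeds the cohort median, else 'low'.

    The median cutoff is an assumption made for this sketch.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


def concordance_index(times, events, scores):
    """Harrell's C-index for a risk score.

    A pair (i, j) is comparable when patient i has an observed event
    (events[i] is truthy) strictly before patient j's follow-up time.
    The pair is concordant when patient i also has the higher risk score;
    tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering of survival times by risk, which gives context for the reported value of 0.77.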
Keywords