Frontiers in Genetics (Oct 2024)
Transparent sparse graph pathway network for analyzing the internal relationship of lung cancer
Abstract
Finding key biomarkers and improving the accuracy of disease models is important, but understanding how those biomarkers interact is equally important. In this study, a transparent sparse graph pathway network (TSGPN) is proposed based on the structure of graph neural networks. The network simulates the action of genes in vivo, incorporates prior knowledge, and improves model accuracy. First, the graph connections were constructed from protein–protein interaction networks and competing endogenous RNA (ceRNA) networks; noisy or unimportant connections were then removed automatically using a graph attention mechanism and hard concrete estimation. This yielded a reconstructed ceRNA network representing the influence of other genes on mRNA in the disease. Next, the gene-based interpretation was transformed into a pathway-based interpretation using a pathway database, and a hidden layer was added to enable higher-dimensional analysis of the pathways. Experimental results showed that TSGPN outperformed the comparison methods in F1 score and AUC and, more importantly, that it can effectively display the role of genes. Applied to lung cancer prognosis data, the analysis identified ten pathways related to the prognosis of lung squamous cell carcinoma (LUSC), along with key biomarkers closely related to these pathways, such as HOXA10, hsa-mir-182, and LINC02544. The relationships among them were also reconstructed to better explain the internal mechanism of the disease.
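As a rough illustration of how hard concrete estimation can remove graph connections, the sketch below implements the standard hard concrete gate from the L0-regularization literature in plain Python. It is not the TSGPN implementation: the stretch limits and temperature are commonly used defaults, the function names are hypothetical, and the edge names and scores are made up for the example. Each edge carries a learnable parameter log_alpha; the gate maps it to a value in [0, 1], and edges whose gate is exactly 0 are dropped.

```python
import math
import random

# Assumed defaults (common in the L0-regularization literature, not taken from the paper).
GAMMA, ZETA, BETA = -0.1, 1.1, 2.0 / 3.0  # stretch interval (GAMMA, ZETA) and temperature


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_gate(log_alpha, rng):
    """Draw one stochastic hard concrete gate in [0, 1] (training-time behaviour)."""
    u = min(max(rng.random(), 1e-6), 1.0 - 1e-6)  # keep log() finite
    s = sigmoid((math.log(u) - math.log(1.0 - u) + log_alpha) / BETA)
    s_stretched = s * (ZETA - GAMMA) + GAMMA      # stretch to (GAMMA, ZETA)
    return min(1.0, max(0.0, s_stretched))        # clip to [0, 1]: exact 0s and 1s occur


def prob_active(log_alpha):
    """Probability that the gate is non-zero; summing this over edges gives the expected L0 penalty."""
    return sigmoid(log_alpha - BETA * math.log(-GAMMA / ZETA))


def deterministic_gate(log_alpha):
    """Noise-free gate typically used at evaluation time."""
    return min(1.0, max(0.0, sigmoid(log_alpha) * (ZETA - GAMMA) + GAMMA))


def prune_edges(edge_log_alpha):
    """Keep only edges whose deterministic gate is strictly positive."""
    gates = {edge: deterministic_gate(a) for edge, a in edge_log_alpha.items()}
    return {edge: g for edge, g in gates.items() if g > 0.0}


if __name__ == "__main__":
    # Hypothetical edges with made-up scores, purely for illustration.
    edges = {("node_a", "node_b"): 4.0, ("node_a", "node_c"): -5.0, ("node_b", "node_c"): 0.0}
    print(prune_edges(edges))
```

In a graph attention setting, the surviving gate values would typically multiply the attention coefficients of the corresponding edges, so a pruned edge contributes nothing to message passing.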
Keywords