Heliyon (Oct 2024)
Identification of novel proteins for coronary artery disease by integrating GWAS data and human plasma proteomes
Abstract
Background: Most coronary artery disease (CAD) risk loci identified by genome-wide association studies (GWAS) are located in non-coding regions, which hampers the interpretation of how they confer CAD risk. It is essential to integrate GWAS with molecular trait data to further explore the genetic basis of CAD.
Methods: We used the probabilistic Mendelian randomization (PMR) method to identify potential proteins involved in CAD by integrating CAD GWAS data (∼76,014 cases and ∼264,785 controls) and human plasma proteomes (N = 35,559). Then, Bayesian co-localization analysis and confirmatory PMR analyses using independent plasma proteome data (N = 7752) and gene expression data (N1 = 213, N2 = 670) were performed to validate candidate proteins. We further investigated the associations between candidate proteins and CAD-related traits and explored the plausibility and biological functions of candidate proteins through disease enrichment, cell type-specific, GO, and KEGG enrichment analyses.
Results: This study inferred that the plasma abundance of 30 proteins, such as PLG, IL15RA, and CSNK2A1, was causally associated with CAD (P < 0.05/4408, Bonferroni correction). PLG, PCSK9, COLEC11, ZNF180, ERP29, TCP1, FN1, CDH5, IL15RA, MGAT4B, TNFRSF6B, DNM2, and TGF1R were replicated in the confirmatory PMR (P < 0.05). The proteins encoded by PCSK9 (PP.H4 = 0.99), APOB (PP.H4 = 0.89), FN1 (PP.H4 = 0.87), and APOC1 (PP.H4 = 0.78) shared a common variant with CAD. MTAP, TCP1, APOC2, ERP29, MORF4L1, C19orf80, PCSK9, APOC1, EPOR, DNM2, TNFRSF6B, CDKN2B, and LDLR were supported by PMR at the transcriptome level in whole blood and/or coronary arteries (P < 0.05). Enrichment analysis identified multiple pathways involved in cholesterol metabolism and in the regulation of lipoprotein levels and telomerase, such as cholesterol metabolism (hsa04979, P = 2.25E-7), plasma lipoprotein particle clearance (GO:0034381, P = 5.47E-5), and regulation of telomerase activity (GO:0051972, P = 2.34E-3).
Conclusions: Our integrative analysis identified 30 candidate proteins for CAD, which may provide important leads for designing future functional studies and for identifying potential drug targets for CAD.
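Note on the significance threshold: the Results report discovery at P < 0.05/4408 (Bonferroni correction). The arithmetic behind that cutoff can be sketched in Python as below. This is a minimal illustration only; the function name, the protein labels, and the example P values are hypothetical and are not taken from the study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


ALPHA = 0.05
N_TESTS = 4408  # denominator of the threshold stated in the abstract (0.05/4408)

threshold = bonferroni_threshold(ALPHA, N_TESTS)

# Hypothetical P values, for illustration only (not results from the study).
example_p_values = {
    "protein_A": 3.0e-8,
    "protein_B": 2.0e-5,
    "protein_C": 1.0e-2,
}

# Keep only the proteins whose P value falls below the corrected threshold.
significant = [name for name, p in example_p_values.items() if p < threshold]

print(f"Bonferroni threshold: {threshold:.3e}")
print("Significant:", significant)
```

Under this correction, a protein with a nominal P value of 2.0e-5 would not pass the discovery threshold, even though it would pass the uncorrected P < 0.05 criterion used for the confirmatory analyses.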