Genome-wide association studies are enriched for interacting genes

Peter T. Nguyen; Simon G. Coetzee; Irina Silacheva; Dennis J. Hazelett

doi:10.1186/s13040-024-00421-w

BioData Mining (Jan 2025)

Affiliations

Peter T. Nguyen: The Department of Biomedical and Translational Sciences, Cedars-Sinai Medical Center
Simon G. Coetzee: The Department of Computational Biomedicine, Cedars-Sinai Medical Center
Irina Silacheva: The Department of Biomedical and Translational Sciences, Cedars-Sinai Medical Center
Dennis J. Hazelett: The Department of Computational Biomedicine, Cedars-Sinai Medical Center

DOI: https://doi.org/10.1186/s13040-024-00421-w
Journal volume & issue: Vol. 18, no. 1
pp. 1 – 18

Abstract

Background: With recent advances in single-cell technology, high-throughput methods provide unique insight into disease mechanisms and, more importantly, cell type origin. Here, we used multi-omics data to understand how genetic variants from genome-wide association studies influence the development of disease. We show in principle how to use genetic algorithms with normal, matching pairs of single-nucleus RNA- and ATAC-seq data, genome annotations, and protein-protein interaction data to describe the genes and cell types collectively and their contribution to increased risk.

Results: We used genetic algorithms to measure the fitness of gene-cell set proposals against a series of objective functions that capture data and annotations. The highest-information objective function captured protein-protein interactions. We observed significantly greater fitness scores and subgraph sizes in foreground sets than in matching sets of control variants. Furthermore, our model reliably identified known targets and ligand-receptor pairs, consistent with prior studies.

Conclusions: Our findings suggest that applying genetic algorithms to association studies can generate a coherent cellular model of risk from a set of susceptibility variants. Further, we showed, using breast cancer as an example, that such variants have a greater number of physical interactions than expected by chance.
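To make the idea of scoring gene set proposals with a genetic algorithm concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes a toy setting in which each risk locus contributes exactly one candidate gene and the only objective function is the number of protein-protein interactions among the chosen genes. The locus names, gene names, interaction list, and all function names and parameters are hypothetical, and cell type assignment is left out entirely.

```python
import random

# Hypothetical toy data: each risk locus has two candidate genes.
LOCI = {
    "locus1": ["A", "B"],
    "locus2": ["C", "D"],
    "locus3": ["E", "F"],
    "locus4": ["G", "H"],
}

# Hypothetical protein-protein interactions, stored as undirected edges.
PPI = {
    frozenset(pair)
    for pair in [("A", "C"), ("C", "E"), ("A", "E"), ("E", "G"), ("B", "D")]
}


def fitness(proposal):
    """Count PPI edges among the genes chosen in a proposal."""
    genes = list(proposal)
    return sum(
        frozenset((genes[i], genes[j])) in PPI
        for i in range(len(genes))
        for j in range(i + 1, len(genes))
    )


def random_proposal(rng):
    """Pick one candidate gene per locus at random."""
    return tuple(rng.choice(candidates) for candidates in LOCI.values())


def crossover(a, b, rng):
    """Uniform crossover: take each locus's gene from either parent."""
    return tuple(rng.choice(pair) for pair in zip(a, b))


def mutate(proposal, rng, rate=0.1):
    """Resample the gene at each locus with probability `rate`."""
    return tuple(
        rng.choice(candidates) if rng.random() < rate else gene
        for gene, candidates in zip(proposal, LOCI.values())
    )


def evolve(generations=50, pop_size=20, seed=0):
    """Keep the fitter half each generation and refill with offspring."""
    rng = random.Random(seed)
    population = [random_proposal(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print(best, fitness(best))
```

In this toy setting, a comparison like the foreground versus control comparison described above could be approximated by running `evolve` on matched control loci and comparing the resulting fitness scores, though the study's actual objective functions and statistics are not specified here.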

Published in BioData Mining

ISSN: 1756-0381 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Mathematics: Analysis
Website: https://biodatamining.biomedcentral.com/
