Unfolding the genotype-to-phenotype black box of cardiovascular diseases through cross-scale modeling
Xi Xi,
Haochen Li,
Shengquan Chen,
Tingting Lv,
Tianxing Ma,
Rui Jiang,
Ping Zhang,
Wing Hung Wong,
Xuegong Zhang
Affiliations
Xi Xi
MOE Key Laboratory of Bioinformatics and Bioinformatics Division, BNRIST / Department of Automation, Tsinghua University, Beijing 100084, China
Haochen Li
School of Medicine, Tsinghua University, Beijing 100084, China
Shengquan Chen
MOE Key Laboratory of Bioinformatics and Bioinformatics Division, BNRIST / Department of Automation, Tsinghua University, Beijing 100084, China
Tingting Lv
Department of Cardiology, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing 102218, China
Tianxing Ma
MOE Key Laboratory of Bioinformatics and Bioinformatics Division, BNRIST / Department of Automation, Tsinghua University, Beijing 100084, China
Rui Jiang
MOE Key Laboratory of Bioinformatics and Bioinformatics Division, BNRIST / Department of Automation, Tsinghua University, Beijing 100084, China
Ping Zhang
Department of Cardiology, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing 102218, China
Wing Hung Wong
Departments of Statistics and Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
Xuegong Zhang
MOE Key Laboratory of Bioinformatics and Bioinformatics Division, BNRIST / Department of Automation, Tsinghua University, Beijing 100084, China; School of Medicine, Tsinghua University, Beijing 100084, China; Corresponding author
Summary: Complex traits such as cardiovascular diseases (CVD) result from complicated processes jointly affected by genetic and environmental factors. Genome-wide association studies (GWAS) have identified genetic variants associated with diseases but usually do not reveal the underlying mechanisms. There can be many intermediate steps at the epigenetic, transcriptomic, and cellular scales inside the black box of genotype-phenotype associations. In this article, we present GRPath, a machine-learning-based cross-scale framework that deciphers putative causal paths (pcPaths) from genetic variants to disease phenotypes by integrating multiple omics data. Applying GRPath to CVD, we identified 646 and 549 pcPaths linking putative causal regions, variants, and gene expression in specific cell types for two types of heart failure, respectively. The findings suggest new understandings of coronary heart disease. Our work promotes the modeling of tissue- and cell type-specific cross-scale regulation to uncover mechanisms behind disease-associated variants, and provides new findings on the molecular mechanisms of CVD.