Frontiers in Genetics (Dec 2024)
AnchorFCI: harnessing genetic anchors for enhanced causal discovery of cardiometabolic disease pathways
Abstract
IntroductionCardiometabolic diseases, a major global health concern, stem from complex interactions of lifestyle, genetics, and biochemical markers. While extensive research has revealed strong associations between various risk factors and these diseases, latent confounding and limited causal discovery methods hinder understanding of their causal relationships, essential for mechanistic insights and developing effective prevention and intervention strategies.MethodsWe introduce anchorFCI, a novel adaptation of the conservative Really Fast Causal Inference (RFCI) algorithm, designed to enhance robustness and discovery power in causal learning by strategically selecting and integrating reliable anchor variables from a set of variables known not to be caused by the variables of interest. This approach is well-suited for studies of phenotypic, clinical, and sociodemographic data, using genetic variables that are recognized to be unaffected by these factors. We demonstrate the method’s effectiveness through simulation studies and a comprehensive causal analysis of the 2015 ISA-Nutrition dataset, featuring both anchorFCI for causal discovery and state-of-the-art effect size identification tools from Judea Pearl’s framework, showcasing a robust, fully data-driven causal inference pipeline.ResultsOur simulation studies reveal that anchorFCI effectively enhances robustness and discovery power while handles latent confounding by integrating reliable anchor variables and their non-ancestral relationships. The 2015 ISA-Nutrition dataset analysis not only supports many established causal relationships but also elucidates their interconnections, providing a clearer understanding of the complex dynamics and multifaceted nature of cardiometabolic risk.DiscussionAnchorFCI holds significant potential for reliable causal discovery in complex, multidimensional datasets. By effectively integrating non-ancestral knowledge and addressing latent confounding, it is well-suited for various applications requiring robust causal inference from observational studies, providing valuable insights in epidemiology, genetics, and public health.
Keywords