Machine Learning: Science and Technology (Jan 2025)

Large language models for causal hypothesis generation in science

  • Kai-Hendrik Cohrs,
  • Emiliano Diaz,
  • Vasileios Sitokonstantinou,
  • Gherardo Varando,
  • Gustau Camps-Valls

DOI
https://doi.org/10.1088/2632-2153/ada47f
Journal volume & issue
Vol. 6, no. 1
p. 013001

Abstract

Towards the goal of understanding the causal structure underlying complex systems—such as the Earth, the climate, or the brain—integrating large language models (LLMs) with data-driven and domain-expertise-driven approaches has the potential to become a game-changer, especially in data- and expertise-limited scenarios. Debates persist around LLMs’ causal reasoning capacities. However, rather than engaging in philosophical debates, we propose integrating LLMs into a scientific framework for causal hypothesis generation alongside expert knowledge and data. Our goals include formalizing LLMs as probabilistic imperfect experts, developing adaptive methods for causal hypothesis generation, and establishing universal benchmarks for comprehensive comparisons. Specifically, we introduce a spectrum of integration methods for experts, LLMs, and data-driven approaches. We review existing approaches for causal hypothesis generation and classify them within this spectrum. As an example, our hybrid (LLM + data) causal discovery algorithm illustrates ways for deeper integration. We emphasize characterizing imperfect experts along dimensions such as (1) reliability, (2) consistency, (3) uncertainty, and (4) content vs. reasoning as a basis for developing adaptable methods. Lastly, we stress the importance of model-agnostic benchmarks.
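The idea of treating an LLM as a probabilistic imperfect expert whose beliefs are combined with data-driven evidence can be sketched as follows. This is a minimal illustrative sketch, not the authors' algorithm: the variable names, the edge probabilities, the `reliability` parameter, and the scoring rule are all illustrative assumptions.

```python
import math

# Hypothetical LLM-elicited probabilities that X causes Y, e.g. obtained
# by querying the model about each variable pair and parsing its answer.
llm_edge_prior = {
    ("rainfall", "soil_moisture"): 0.9,
    ("soil_moisture", "rainfall"): 0.2,
    ("soil_moisture", "vegetation"): 0.8,
    ("vegetation", "soil_moisture"): 0.4,
}

# Hypothetical data-driven evidence per edge (e.g. a normalized
# dependence measure estimated from observational data), in (0, 1].
data_score = {
    ("rainfall", "soil_moisture"): 0.7,
    ("soil_moisture", "rainfall"): 0.7,
    ("soil_moisture", "vegetation"): 0.6,
    ("vegetation", "soil_moisture"): 0.6,
}

def combined_log_score(edge, reliability=0.8):
    """Blend an imperfect expert's prior with data evidence.

    `reliability` shrinks the LLM prior toward an uninformative 0.5,
    modeling the expert's imperfection before combining it with data.
    """
    p_llm = reliability * llm_edge_prior[edge] + (1 - reliability) * 0.5
    return math.log(p_llm) + math.log(data_score[edge])

# Rank candidate causal edges by the combined score.
ranked = sorted(data_score, key=combined_log_score, reverse=True)
for edge in ranked:
    print(edge, round(combined_log_score(edge), 3))
```

Note how the data evidence alone cannot orient the rainfall/soil-moisture pair (both directions score 0.7), while the shrunk LLM prior breaks the tie; this tie-breaking role in otherwise Markov-equivalent structures is one natural place for hybrid LLM + data integration.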

Keywords