Technical note: Incorporating expert domain knowledge into causal structure discovery workflows

J. Mäkelä; L. Melkas; I. Mammarella; T. Nieminen; S. Chandramouli; R. Savvides; K. Puolamäki

doi:10.5194/bg-19-2095-2022

Biogeosciences (Apr 2022)

Affiliations

J. Mäkelä: Department of Computer Science, P.O. Box 68, 00014 University of Helsinki, Helsinki, Finland
L. Melkas: Department of Computer Science, P.O. Box 68, 00014 University of Helsinki, Helsinki, Finland
I. Mammarella: Institute for Atmospheric and Earth System Research/Physics, P.O. Box 64, 00014 University of Helsinki, Helsinki, Finland
T. Nieminen: Institute for Atmospheric and Earth System Research/Physics, P.O. Box 64, 00014 University of Helsinki, Helsinki, Finland
T. Nieminen: Institute for Atmospheric and Earth System Research/Forest Sciences, P.O. Box 27, 00014 University of Helsinki, Helsinki, Finland
S. Chandramouli: Department of Computer Science, P.O. Box 68, 00014 University of Helsinki, Helsinki, Finland
R. Savvides: Department of Computer Science, P.O. Box 68, 00014 University of Helsinki, Helsinki, Finland
K. Puolamäki: Department of Computer Science, P.O. Box 68, 00014 University of Helsinki, Helsinki, Finland
K. Puolamäki: Institute for Atmospheric and Earth System Research/Physics, P.O. Box 64, 00014 University of Helsinki, Helsinki, Finland

Journal volume & issue: Vol. 19, pp. 2095–2099

Abstract

In this note, we argue that the outputs of causal discovery algorithms should not usually be considered end results but rather starting points and hypotheses for further study. The incentive to explore this topic came from a recent study by Krich et al. (2020), which gives a good introduction to estimating causal networks in biosphere–atmosphere interaction but confines the scope by investigating the outcome of a single algorithm. We aim to give a broader perspective to this study and to illustrate how not only different algorithms but also different initial states and prior information of possible causal model structures affect the outcome. We provide a proof-of-concept demonstration of how to incorporate expert domain knowledge with causal structure discovery and remark on how to detect and take into account over-fitting and concept drift.

Published in Biogeosciences

ISSN: 1726-4170 (Print); 1726-4189 (Online)
Publisher: Copernicus Publications
Country of publisher: Germany
LCC subjects: Science: Biology (General): Ecology; Science: Biology (General): Life; Science: Geology
Website: http://www.biogeosciences.net
