IEEE Access (Jan 2021)
Experiment Selection in Meta-Analytic Piecemeal Causal Discovery
Abstract
Scientists try to design experiments that will yield maximal information. For instance, given the available evidence and a limitation on the number of variables that can be observed simultaneously, it may be more informative to intervene on variable $X$ and observe the response of variable $Y$ than to intervene on $X$ and observe $Z$; in other situations, the opposite may be true. Scientists must often make these decisions without primary data. To address this problem, in previous work, we created software for annotating aggregate statistics in the literature and deriving consistent causal explanations, expressed as causal graphs. This meta-analytic pipeline is useful not only for synthesizing evidence but also for planning experiments: one can use it strategically to select experiments that could further eliminate causal graphs from consideration. In this paper, we introduce interpretable policies for selecting experiments in the context of piecemeal causal discovery, a common setting in biological sciences in which each experiment can measure not an entire system but rather a strict subset of its variables. The limits of this piecemeal approach are only beginning to be fully characterized, with crucial theoretical work published recently. With simulations, we show that our experiment-selection policies identify causal structures more efficiently than random experiment selection. Unlike methods that require primary data, our meta-analytic approach offers a flexible alternative for those seeking to incorporate qualitative domain knowledge into their search for causal mechanisms. We also present a method that categorizes hypotheses with respect to their utility for identifying a system’s causal structure. Although this categorization is usually infeasible to perform manually, it is critical for conducting research efficiently.
Keywords