Bias in O-Information Estimation

Johanna Gehlen; Jie Li; Cillian Hourican; Stavroula Tassi; Pashupati P. Mishra; Terho Lehtimäki; Mika Kähönen; Olli Raitakari; Jos A. Bosch; Rick Quax

doi:10.3390/e26100837

Entropy (Sep 2024)

Affiliations

Johanna Gehlen: Computational Science Lab, Informatics Institute, University of Amsterdam, 1098 Amsterdam, The Netherlands
Jie Li: Computational Science Lab, Informatics Institute, University of Amsterdam, 1098 Amsterdam, The Netherlands
Cillian Hourican: Computational Science Lab, Informatics Institute, University of Amsterdam, 1098 Amsterdam, The Netherlands
Stavroula Tassi: Unit of Medical Technology and Intelligent Information Systems (MEDLAB), Department of Material Science and Engineering, University of Ioannina, 45110 Ioannina, Greece
Pashupati P. Mishra: Department of Clinical Chemistry, Faculty of Medicine and Health Technology, Tampere University, 33720 Tampere, Finland
Terho Lehtimäki: Department of Clinical Chemistry, Faculty of Medicine and Health Technology, Tampere University, 33720 Tampere, Finland
Mika Kähönen: Finnish Cardiovascular Research Center Tampere, Faculty of Medicine and Health Technology, Tampere University, 33720 Tampere, Finland
Olli Raitakari: Centre for Population Health Research, University of Turku and Turku University Hospital, 20520 Turku, Finland
Jos A. Bosch: Clinical Psychology, Faculty of Social and Behavioural Sciences, University of Amsterdam, 1018 Amsterdam, The Netherlands
Rick Quax: Computational Science Lab, Informatics Institute, University of Amsterdam, 1098 Amsterdam, The Netherlands

Journal volume & issue: Vol. 26, no. 10, p. 837

Abstract

Higher-order relationships are a central concept in the science of complex systems. A popular approach to estimating the higher-order relationships of synergy and redundancy from data is the O-information, an information-theoretic measure composed of Shannon entropy terms that quantifies the balance between redundancy and synergy in a system. However, bias is not yet taken into account when the O-information of discrete variables is estimated. In this paper, we explain where this bias comes from and explore it for fully synergistic, fully redundant, and fully independent simulated systems of n=3 variables. Specifically, we explore how the sample size and the number of bins affect the bias in the O-information estimate. The main finding is that the O-information of independent systems is severely biased towards synergy if the sample size is smaller than the number of jointly possible observations. This could mean that triplets identified as highly synergistic may in fact be close to independent. A bias approximation based on the Miller–Madow method is derived for the O-information. We find that, for systems of n=3 variables, the bias approximation can partially correct for the bias. However, simulations of fully independent systems are still required as null models to provide a benchmark of the bias of the O-information.
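
To make the quantities in the abstract concrete, the following is a minimal sketch of a plug-in O-information estimate for discrete data, together with an independent null model. It assumes the standard definition, Ω = (n − 2) H(X) + Σ_i [H(X_i) − H(X_{-i})], with positive values indicating redundancy and negative values indicating synergy. The function names (`plugin_entropy`, `o_information`) and the simulation settings (5 bins, 50 samples, 200 repetitions) are illustrative choices, not taken from the paper, and the paper's Miller–Madow-based correction is not reproduced here.

```python
import numpy as np


def plugin_entropy(data):
    """Plug-in (maximum-likelihood) Shannon entropy in bits.

    Each row of `data` is one observation; a 1-D array is treated
    as a single variable.
    """
    data = np.asarray(data)
    if data.ndim == 1:
        data = data[:, None]
    # Count how often each distinct joint state was observed.
    _, counts = np.unique(data, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def o_information(data):
    """Plug-in O-information of the columns of `data` (no bias correction)."""
    data = np.asarray(data)
    n = data.shape[1]
    omega = (n - 2) * plugin_entropy(data)
    for i in range(n):
        rest = np.delete(data, i, axis=1)  # all variables except X_i
        omega += plugin_entropy(data[:, i]) - plugin_entropy(rest)
    return omega


# Hypothetical null model: three independent uniform variables with
# fewer samples (50) than jointly possible states (5**3 = 125).
rng = np.random.default_rng(0)
bins, n_samples, n_repeats = 5, 50, 200
null = [
    o_information(rng.integers(0, bins, size=(n_samples, 3)))
    for _ in range(n_repeats)
]
null_mean = float(np.mean(null))
print(f"mean plug-in O-information of independent triplets: {null_mean:.3f} bits")
```

The true O-information of the simulated triplets is zero, so the mean of `null` is an estimate of the bias at this sample size and bin count. An O-information value measured on real data with the same settings can be compared against this null distribution rather than against zero.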

Published in Entropy

ISSN: 1099-4300 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Astronomy: Astrophysics; Science: Physics
Website: http://www.mdpi.com/journal/entropy
