Human Biology and Public Health (Jul 2023)

In Python available: St. Nicolas House Algorithm (SNHA) with bootstrap support for improved performance in dense networks

  • Tim Hake,
  • Bernhard Bodenberger,
  • Detlef Groth

DOI
https://doi.org/10.52905/hbph2023.1.63
Journal volume & issue
Vol. 1

Abstract

The St. Nicolas House algorithm (SNHA) finds association chains of directly dependent variables in a data set. Dependency is measured by the correlation coefficient, and the resulting associations are visualized as an undirected graph. A bootstrap routine improves the network prediction and enables the computation of empirical p-values, which are used to evaluate the significance of the predicted edges. Synthetic data generated with the Monte Carlo method were used, first, to compare the Python package with the original R package and, second, to evaluate the predicted networks in terms of sensitivity, specificity, balanced classification rate, and the Matthews correlation coefficient (MCC). The Python implementation yields the same results as the R package, confirming that the algorithm was ported correctly. The SNHA achieves high specificity for all tested graphs. For graphs with low edge densities, the algorithm achieves high scores on all evaluation metrics. For graphs with high edge densities, the other metrics decrease because of lower sensitivity, which could be partially improved by using the bootstrap. The empirical p-values indicated that the predicted edges are indeed significant.
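
The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not taken from the SNHA package: the function name `edge_metrics` and its arguments are hypothetical, and it assumes that every unordered pair of variables counts as one classification decision (edge present or absent) in an undirected graph.

```python
import math
from itertools import combinations


def edge_metrics(n_nodes, true_edges, predicted_edges):
    """Compare a predicted undirected graph with the true one.

    Hypothetical helper for illustration only. Edges are pairs of node
    indices; (0, 1) and (1, 0) are treated as the same edge.
    """
    true_set = {frozenset(e) for e in true_edges}
    pred_set = {frozenset(e) for e in predicted_edges}

    tp = tn = fp = fn = 0
    # Each unordered node pair is one decision: edge or no edge.
    for pair in combinations(range(n_nodes), 2):
        key = frozenset(pair)
        is_true, is_pred = key in true_set, key in pred_set
        if is_true and is_pred:
            tp += 1
        elif is_true:
            fn += 1
        elif is_pred:
            fp += 1
        else:
            tn += 1

    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    bcr = (sensitivity + specificity) / 2  # balanced classification rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "bcr": bcr,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Chain 0-1-2-3 as the true graph; the prediction misses edge 2-3.
    truth = [(0, 1), (1, 2), (2, 3)]
    predicted = [(0, 1), (1, 2)]
    print(edge_metrics(4, truth, predicted))
```

In this toy example the prediction contains no false edges, so specificity stays at its maximum while the missed edge lowers sensitivity, which in turn pulls down the balanced classification rate and the MCC. This is the same pattern the abstract reports for graphs with high edge densities.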

Keywords