Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki (Jan 2016)

SYNTHESIS OF THE SECONDARY STRUCTURE OF ALGEBRAIC BAYESIAN NETWORKS: AN INCREMENTAL ALGORITHM AND STATISTICAL ESTIMATION OF ITS COMPLEXITY

  • M. A. Zotov,
  • D. G. Levenets,
  • A. L. Tulupyev,
  • A. A. Zolotin

DOI
https://doi.org/10.17586/2226-1494-2016-16-1-122-132
Journal volume & issue
Vol. 16, no. 1
pp. 122 – 132

Abstract

Read online

An improved algorithm for the synthesis of the secondary structure of algebraic Bayesian networks represented by a minimal join graph is proposed in the paper. The algorithm differs from the previously offered one so that it relies on the incremental principle, uses specially selected edges and, finally, eliminates redundant edges by a greedy algorithm. The correct operation of the incremental algorithm is mathematically proved. Comparison of the computational complexity of the new (incremental) algorithm implementation and two well-known (greedy and direct) is made by means of statistical estimates of complexity, based on the sample values of the runtime ratio of software implementations of two compared algorithms. Theoretical complexity estimates of the greedy and direct algorithms have been obtained earlier, but are not suitable for comparative analysis, as they are based on the hidden characteristics of the secondary structure, which can be calculated only when it is built. To minimize the influence of random factors at calculating the ratio average program runtime is used obtained by N launches on the same set of workloads. The sample values of ratio is formed for M sets of equal power K. According to the sample values the median is calculated, as well as the other statistics that characterize the spread: borders of the 97% confidence interval along with the first and the third quartiles. Sets of loads are stochastically generated according to the specified parameters using the algorithm described in the paper. The stochastic algorithms generating a set of loads with given power, as well as collecting the statistical data and calculating of statistical estimates of the ratio of forward and greedy algorithm to the incremental algorithm runtimes is described in the paper. A series of experiments is carried out in which N is changed in the range 1, 2 ... 9, 10, 26, 42 ... 170.They have showed that the incremental algorithm speed exceeds the forward and greedy ones, moreover in the 10-170 load sets power range this finding is statistically significant (97% level). The results of experiments are visualized using a graphs library Highcharts. The developed incremental algorithm is designed for application in problems solving of algebraic Bayesian networks machine learning.

Keywords