Journal of Statistical Software (Dec 2014)

Learning Continuous Time Bayesian Network Classifiers Using MapReduce

  • Simone Villa,
  • Marco Rossetti

DOI
https://doi.org/10.18637/jss.v062.i03
Journal volume & issue
Vol. 62, no. 1
pp. 1 – 25

Abstract

Parameter and structural learning for continuous time Bayesian network classifiers are challenging tasks when dealing with big data. This paper describes an efficient, scalable parallel algorithm for parameter and structural learning from complete data using the MapReduce framework. Two popular classifier instances are analyzed: the continuous time naive Bayes and the continuous time tree augmented naive Bayes. Details of the proposed algorithm are presented using Hadoop, an open-source implementation of a distributed file system and of the MapReduce framework for distributed data processing. Performance evaluation of the designed algorithm shows robust parallel scaling.
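
To make the idea of MapReduce-based parameter learning concrete, the following is a minimal single-machine sketch in plain Python, not the authors' Hadoop implementation. It assumes fully observed trajectories for a single feature node whose only parent is the class, and it computes the standard complete-data sufficient statistics of a continuous time Bayesian network: the residence time T[x|u] and the transition count M[x,x'|u]. The function names (`map_trajectory`, `reduce_counts`, `estimate_rates`, `learn`) and the tuple layout of a trajectory are hypothetical choices made for illustration.

```python
from collections import defaultdict


def map_trajectory(trajectory):
    """Map step: emit (key, value) pairs for one fully observed trajectory.

    Assumed layout: a time-sorted list of (time, parent_state, state) tuples,
    where the last tuple only marks the end of the observation window.
    """
    pairs = []
    for (t0, u0, x0), (t1, _u1, x1) in zip(trajectory, trajectory[1:]):
        # Time spent in state x0 while the parent is in state u0.
        pairs.append((("T", u0, x0), t1 - t0))
        if x1 != x0:
            # One observed transition x0 -> x1 under parent state u0.
            pairs.append((("M", u0, x0, x1), 1.0))
    return pairs


def reduce_counts(pairs):
    """Reduce step: sum all values that share the same key."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def estimate_rates(stats):
    """Maximum likelihood exit rates: q[x|u] = M[x|u] / T[x|u]."""
    exits = defaultdict(float)
    for key, value in stats.items():
        if key[0] == "M":
            exits[(key[1], key[2])] += value
    rates = {}
    for key, value in stats.items():
        if key[0] == "T" and value > 0:
            rates[(key[1], key[2])] = exits.get((key[1], key[2]), 0.0) / value
    return rates


def learn(trajectories):
    """Run map over every trajectory, then a single reduce, then estimate."""
    emitted = [pair for traj in trajectories for pair in map_trajectory(traj)]
    stats = reduce_counts(emitted)
    return stats, estimate_rates(stats)


if __name__ == "__main__":
    data = [
        [(0.0, "c1", "a"), (2.0, "c1", "b"), (3.0, "c1", "a"), (5.0, "c1", "a")],
        [(0.0, "c1", "a"), (4.0, "c1", "b"), (6.0, "c1", "b")],
    ]
    stats, rates = learn(data)
    print(stats)
    print(rates)
```

Because the emitted values are combined by simple addition, the map step can run independently on each trajectory and the reduce step can be applied per key, which is the property a distributed MapReduce job would rely on.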