Computational Ecology and Software (Sep 2012)

A novel approach for modeling malaria incidence using complex categorical household data: The minimum message length (MML) method applied to Indonesian data

  • Gerhard Visser,
  • Pat Dale,
  • David Dowe, et al.

Journal volume & issue
Vol. 2, no. 3
pp. 140 – 159

Abstract

We investigated the application of a Minimum Message Length (MML) modeling approach to identify the simplest model that would explain two target malaria incidence variables, incidence in the short term and average incidence over the longer term, in two areas of Indonesia, based on a range of ecological variables including environmental and socio-economic ones. The approach is suitable for dealing with a variety of problems, such as complex data and data with missing values. It can detect weak relations, is resistant to overfitting, and can show how many variables, working together, contribute to explaining malaria incidence. This last point is a major strength of the method, as it allows many variables to be analysed. Data were obtained at household level by questionnaire for villages in West Timor and Central Java. Data were collected on 26 variables in nine categories: stratum (a village-level variable based on the API/AMI categories), ecology, occupation, preventative measures taken, health care facilities, the immediate environment, household characteristics, socio-economic status and perception of malaria cause. Several models were used and the simplest (best) model, that is, the one with the minimum message length, was selected for each area. The results showed that consistent predictors of malaria included combinations of ecology (coastal), preventative (clean backyard) and environment (mosquito breeding place, garden and rice cultivation). The models also showed that most of the other variables were not good predictors, and this is discussed in the paper. We conclude that the method has potential for identifying simple predictors of malaria and that it could be used to focus malaria management on combinations of variables rather than relying on single ones that may not be consistently reliable.

Keywords