F1000Research (Jun 2020)

Ensemble machine learning modeling for the prediction of artemisinin resistance in malaria [version 5; peer review: 1 approved, 2 approved with reservations]

  • Daniel Janies,
  • Colby T. Ford

Journal volume & issue
Vol. 9

Abstract

Resistance in malaria is a growing concern affecting many areas of Sub-Saharan Africa and Southeast Asia. Since artemisinin resistance emerged in Cambodia in the late 2000s, research into the underlying mechanisms has been underway. The 2019 Malaria Challenge posed the task of developing computational models that address important problems in advancing the fight against malaria. The first goal was to accurately predict the artemisinin drug resistance levels of Plasmodium falciparum isolates, as quantified by the IC50. The second goal was to predict the parasite clearance rate of malaria parasite isolates from in vitro transcriptional profiles. In this work, we develop machine learning models using novel methods for transforming isolate data and for handling the tens of thousands of variables that result from these transformations. We demonstrate this by vectorizing the data with massively parallel processing so that it can be used in scalable machine learning. In addition, we show the utility of ensemble machine learning modeling for highly effective predictions of both goals of this challenge: multiple machine learning algorithms are combined with various scaling and normalization preprocessing steps, and a voting ensemble then combines the resulting models to generate a final model prediction.
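The ensemble idea described in the abstract (several algorithms, each paired with its own scaling step, combined by voting) can be illustrated with a minimal sketch. This is not the authors' pipeline: the one-feature toy data, the two base learners, the equal weights, and every function name below are assumptions chosen only to keep the example self-contained.

```python
from statistics import mean, pstdev

def standard_scale(column):
    """Build a z-score scaler from a training column (hypothetical helper)."""
    mu, sigma = mean(column), pstdev(column) or 1.0
    return lambda x: (x - mu) / sigma

def min_max_scale(column):
    """Build a scaler that maps the training range onto [0, 1]."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1.0
    return lambda x: (x - lo) / span

def fit_linear(xs, ys):
    """One-feature ordinary least squares; returns a predict function."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def fit_nearest_neighbor(xs, ys):
    """1-nearest-neighbor regressor; returns a predict function."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def fit_pipeline(make_scaler, fit_model, xs, ys):
    """Fit a scaler on xs, then fit a model on the scaled values."""
    scale = make_scaler(xs)
    model = fit_model([scale(x) for x in xs], ys)
    return lambda x: model(scale(x))

def voting_predict(pipelines, weights, x):
    """Weighted average of the base pipelines' predictions."""
    return sum(w * p(x) for p, w in zip(pipelines, weights)) / sum(weights)

# Toy data standing in for (feature, response) pairs; not real isolate data.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

pipelines = [
    fit_pipeline(standard_scale, fit_linear, xs, ys),
    fit_pipeline(min_max_scale, fit_nearest_neighbor, xs, ys),
]
weights = [1.0, 1.0]  # equal votes, an assumption for this sketch

prediction = voting_predict(pipelines, weights, 3.0)
print(prediction)
```

Each base pipeline carries its own preprocessing, so the ensemble can mix models that expect differently scaled inputs. In practice the weights would typically be chosen from validation performance rather than fixed by hand.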

Keywords