PLoS Computational Biology (Jan 2017)

THE REAL McCOIL: A method for the concurrent estimation of the complexity of infection and SNP allele frequency for malaria parasites.

  • Hsiao-Han Chang,
  • Colin J Worby,
  • Adoke Yeka,
  • Joaniter Nankabirwa,
  • Moses R Kamya,
  • Sarah G Staedke,
  • Grant Dorsey,
  • Maxwell Murphy,
  • Daniel E Neafsey,
  • Anna E Jeffreys,
  • Christina Hubbart,
  • Kirk A Rockett,
  • Roberto Amato,
  • Dominic P Kwiatkowski,
  • Caroline O Buckee,
  • Bryan Greenhouse

DOI
https://doi.org/10.1371/journal.pcbi.1005348
Journal volume & issue
Vol. 13, no. 1
p. e1005348

Abstract

As many malaria-endemic countries move towards elimination of Plasmodium falciparum, the most virulent human malaria parasite, effective tools for monitoring malaria epidemiology are urgent priorities. P. falciparum population genetic approaches offer promising tools for understanding transmission and spread of the disease, but a high prevalence of multi-clone or polygenomic infections can render estimation of even the most basic parameters, such as allele frequencies, challenging. A previous method, COIL, was developed to estimate complexity of infection (COI) from single nucleotide polymorphism (SNP) data, but it either relies on monogenomic infections to estimate allele frequencies or requires external allele frequency data, which may not be available. Estimates limited to monogenomic infections may not be representative, however, and when the average COI is high, they can be difficult or impossible to obtain. Therefore, we developed THE REAL McCOIL, Turning HEterozygous SNP data into Robust Estimates of ALlele frequency, via Markov chain Monte Carlo, and Complexity Of Infection using Likelihood, to incorporate polygenomic samples and simultaneously estimate allele frequency and COI. This approach was tested via simulations and then applied to SNP data from cross-sectional surveys performed in three Ugandan sites with varying malaria transmission. We show that THE REAL McCOIL consistently outperforms COIL on simulated data, particularly when most infections are polygenomic. Using field data, we show that, unlike with COIL, we can distinguish epidemiologically relevant differences in COI between and within these sites. Surprisingly, for example, we estimated high average COI in a peri-urban subregion with lower transmission intensity, suggesting that many of these cases were imported from surrounding regions with higher transmission intensity. THE REAL McCOIL therefore provides a robust tool for understanding the molecular epidemiology of malaria across transmission settings.
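
To illustrate why heterozygous SNP calls carry information about COI, the following is a minimal sketch and not the authors' implementation. It assumes each of the parasite clones in an infection independently carries the reference allele with probability p, that there is no genotyping error, and that allele frequencies are already known. THE REAL McCOIL instead estimates allele frequencies and COI jointly by Markov chain Monte Carlo. The function names and the call encoding (0 = homozygous reference, 1 = homozygous alternate, 2 = heterozygous) are hypothetical choices made for this example.

```python
import math


def genotype_probs(coi, p):
    """Return (P(hom ref), P(hom alt), P(het)) at one biallelic SNP.

    Assumes `coi` clones, each independently carrying the reference
    allele with probability `p`, and no genotyping error.
    """
    hom_ref = p ** coi
    hom_alt = (1.0 - p) ** coi
    return hom_ref, hom_alt, 1.0 - hom_ref - hom_alt


def log_likelihood(calls, freqs, coi):
    """Log-likelihood of one sample's SNP calls for a given COI.

    `calls` uses the hypothetical encoding 0 = hom ref, 1 = hom alt,
    2 = het; `freqs` holds the reference allele frequency per SNP.
    """
    total = 0.0
    for call, p in zip(calls, freqs):
        prob = genotype_probs(coi, p)[call]
        if prob <= 0.0:
            # This call is impossible under the given COI.
            return float("-inf")
        total += math.log(prob)
    return total


def ml_coi(calls, freqs, max_coi=25):
    """Grid search for the COI in 1..max_coi with the highest likelihood."""
    return max(range(1, max_coi + 1),
               key=lambda m: log_likelihood(calls, freqs, m))
```

Under these assumptions, a sample with no heterozygous calls is best explained by a COI of 1, while a larger share of heterozygous calls pushes the estimate upward. The sketch also shows the limitation the abstract describes: it needs allele frequencies as an input, which is the dependency that joint estimation removes.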