Wellcome Open Research (Apr 2022)

An open dataset of Plasmodium vivax genome variation in 1,895 worldwide samples [version 1; peer review: 2 approved]

  • Sonia Goncalves,
  • Lemu Golassa,
  • Wasif Khan,
  • Sisay Alemu,
  • Mohammad Shafiul Alam,
  • Pharath Lim,
  • Elizabeth Ashley,
  • Nicholas M Anstey,
  • Bridget E Barber,
  • Jutta Marfurt,
  • Ashenafi Assefa,
  • Dhelio Batista Pereira,
  • Alyssa Barry,
  • Nguyen Hoang Chau,
  • Jun Cao,
  • Fe Espino,
  • Cindy Chu,
  • Ivo Mueller,
  • María Fernanda Villegas,
  • Rick Fairhurst,
  • Thuy-Nhien Nguyen,
  • Yaghoob Hamedi,
  • Matthew J Grigg,
  • Rintis Noviyanti,
  • Ye Htut,
  • Tran Tinh Hien,
  • Nadira Karunaweera,
  • Kimberly J Johnson,
  • Dominic P Kwiatkowski,
  • Srivicha Krudsood,
  • Francois Nosten,
  • Benedikt Ley,
  • Marcus Lacerda,
  • Alejandro Llanos-Cuentas,
  • Yaobao Liu,
  • Tatiana Lopera-Mesa,
  • Milijaona Randrianarivelojosia,
  • Chanthap Lon,
  • Sasithon Pukrittayakamee,
  • Rezika Mohammed,
  • Pascal Michon,
  • Paul N Newton,
  • Chayadol Namaik-larp,
  • Richard D Pearson,
  • Julian C Rayner,
  • Zuleima Pava,
  • Aung P Phyo,
  • Beyene Petros,
  • Awab Ghulam Rahim,
  • Ric N Price,
  • Sasha V Siegel,
  • Angela Rumaseb,
  • Kamala Thriemer,
  • Victoria J Simpson,
  • Marcelo Urbano Ferreira,
  • Alberto Tobon-Castano,
  • Sonam Wangchuk,
  • Ivan D Vélez,
  • Nicholas J White,
  • Thomas E Wellems,
  • Maria F Yasnot,
  • Arjen M Dondorp,
  • Timothy William,
  • Daniel Yilma,
  • Sarah Auburn,
  • Hidayat Trimarsanto,
  • Abraham Aseffa,
  • Qi Gao,
  • Roberto Amato,
  • Voahangy Andrianaranjaka,
  • Ishag Adam,
  • Kesinee Chotivanich,
  • Olivo Miotto,
  • Chanaki Amaratunga,
  • Eleanor Drury,
  • Diego F Echeverry,
  • Berhanu Erko,
  • Abdul Faiz

Journal volume & issue
Vol. 7

Abstract

This report describes the MalariaGEN Pv4 dataset, a new release of curated genome variation data on 1,895 samples of Plasmodium vivax collected at 88 worldwide locations between 2001 and 2017. It includes 1,370 new samples contributed by MalariaGEN and VivaxGEN partner studies in addition to previously published samples from these and other sources. We provide genotype calls at over 4.5 million variable positions including over 3 million single nucleotide polymorphisms (SNPs), as well as short indels and tandem duplications. This enlarged dataset highlights major compartments of parasite population structure, with clear differentiation between Africa, Latin America, Oceania, Western Asia and different parts of Southeast Asia. Each sample has been classified for drug resistance to sulfadoxine, pyrimethamine and mefloquine based on known markers at the dhfr, dhps and mdr1 loci. The prevalence of all of these resistance markers was much higher in Southeast Asia and Oceania than elsewhere. This open resource of analysis-ready genome variation data from the MalariaGEN and VivaxGEN networks is driven by our collective goal to advance research into the complex biology of P. vivax and to accelerate genomic surveillance for malaria control and elimination.

Keywords