Communications Biology (May 2024)

A metagenomic analysis of the phase 2 Anopheles gambiae 1000 genomes dataset reveals a wide diversity of cobionts associated with field collected mosquitoes

  • Andrzej Pastusiak
  • Michael R. Reddy
  • Xiaoji Chen
  • Isaiah Hoyer
  • Jack Dorman
  • Mary E. Gebhardt
  • Giovanna Carpi
  • Douglas E. Norris
  • James M. Pipas
  • Ethan K. Jackson

DOI
https://doi.org/10.1038/s42003-024-06337-9
Journal volume & issue
Vol. 7, no. 1
pp. 1 – 11

Abstract

The Anopheles gambiae 1000 Genomes (Ag1000G) Consortium previously utilized deep sequencing methods to catalogue genetic diversity across African An. gambiae populations. We analyzed the complete datasets of 1142 individually sequenced mosquitoes through Microsoft Premonition’s Bayesian mixture model-based (BMM) metagenomics pipeline. All specimens were confirmed as either An. gambiae sensu stricto (s.s.) or An. coluzzii with a high degree of confidence (>98% identity to reference). Homo sapiens DNA was identified in all specimens, indicating that contamination may have occurred during specimen collection, preparation, and/or sequencing. We found evidence of vertebrate hosts in 162 specimens. Validated Plasmodium falciparum reads were found in 59 specimens. Human hepatitis B and primate erythroparvovirus-1 viral sequences were identified in fifteen and three mosquito specimens, respectively. Bacterial reads were found in 478 of the 1142 specimens, and bacteriophage-related contigs were detected in 27 specimens. This analysis demonstrates the capacity of metagenomic approaches to elucidate important vector-host-pathogen interactions of epidemiological significance.