Scientific Data (Aug 2024)

Improved high quality sand fly assemblies enabled by ultra low input long read sequencing

  • Michelle Huang,
  • Sarah Kingan,
  • Douglas Shoue,
  • Oanh Nguyen,
  • Lutz Froenicke,
  • Brendan Galvin,
  • Christine Lambert,
  • Ruqayya Khan,
  • Chirag Maheshwari,
  • David Weisz,
  • Gareth Maslen,
  • Helen Davison,
  • Erez Lieberman Aiden,
  • Jonas Korlach,
  • Olga Dudchenko,
  • Mary Ann McDowell,
  • Stephen Richards

DOI
https://doi.org/10.1038/s41597-024-03628-y
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 8

Abstract

Phlebotomine sand flies are the vectors of leishmaniasis, a neglected tropical disease. High-quality reference genomes are an important tool for understanding the biology and eco-evolutionary dynamics underpinning disease epidemiology. Previous leishmaniasis vector reference sequences were limited by the sequencing technologies available at the time and were inadequate for high-resolution genomic inquiry. Here, we present updated reference assemblies of two sand flies, Phlebotomus papatasi and Lutzomyia longipalpis. These chromosome-level assemblies were generated using an ultra-low input library protocol, PacBio HiFi long reads, and Hi-C technology. The new P. papatasi reference has a final assembly span of 351.6 Mb and contig and scaffold N50s of 926 kb and 111.8 Mb, respectively. The new Lu. longipalpis reference has a final assembly span of 147.8 Mb and contig and scaffold N50s of 1.09 Mb and 40.6 Mb, respectively. Benchmarking Universal Single-Copy Orthologue (BUSCO) assessments indicated 94.5% and 95.6% complete single-copy Insecta orthologs for P. papatasi and Lu. longipalpis, respectively. These improved assemblies will serve as an invaluable resource for future genomic work on phlebotomine sand flies.
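For readers unfamiliar with the contig and scaffold N50 figures quoted above, the following is a minimal sketch of how an N50 is conventionally computed from a list of sequence lengths. It is illustrative only and is not the authors' pipeline; the function name `n50` and the toy lengths are assumptions made for this example.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly span.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_span = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_span:
            return length


# Hypothetical toy contig lengths (not data from the paper).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

The same calculation applies to contigs and scaffolds alike; only the input lengths differ, which is why a scaffold N50 can be far larger than the contig N50 of the same assembly.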