Genome Biology (Aug 2023)

Ariadne: synthetic long read deconvolution using assembly graphs

  • Lauren Mak,
  • Dmitry Meleshko,
  • David C. Danko,
  • Waris N. Barakzai,
  • Salil Maharjan,
  • Natan Belchikov,
  • Iman Hajirasouliha

DOI
https://doi.org/10.1186/s13059-023-03033-5
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 29

Abstract

Synthetic long read sequencing techniques such as UST’s TELL-Seq and Loop Genomics’ LoopSeq combine 3′ barcoding with standard short-read sequencing to expand the range of linkage resolution from hundreds to tens of thousands of base pairs. However, the lack of a 1:1 correspondence between a long fragment and a 3′ unique molecular identifier confounds the assignment of linkage between short reads. We introduce Ariadne, a novel assembly graph-based synthetic long read deconvolution algorithm that can be used to extract single-species read-clouds from synthetic long read datasets to improve the taxonomic classification and de novo assembly of complex populations, such as metagenomes.

Keywords