Robust and scalable barcoding for massively parallel long-read sequencing

Joaquín Ezpeleta; Ignacio Garcia Labari; Gabriela Vanina Villanova; Pilar Bulacio; Sofía Lavista-Llanos; Victoria Posner; Flavia Krsticevic; Silvia Arranz; Elizabeth Tapia

doi:10.1038/s41598-022-11656-0

Scientific Reports (May 2022)

Affiliations

Joaquín Ezpeleta: Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas
Ignacio Garcia Labari: Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas
Gabriela Vanina Villanova: Consejo Nacional de Investigaciones Científicas y Técnicas
Pilar Bulacio: Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas
Sofía Lavista-Llanos: Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas
Victoria Posner: Laboratorio Mixto de Biotecnología Acuática, Facultad de Ciencias Bioquímicas y Farmacéuticas, Universidad Nacional de Rosario - Centro Científico Tecnológico y Educativo Acuario del Río Paraná
Flavia Krsticevic: Robert H Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem
Silvia Arranz: Laboratorio Mixto de Biotecnología Acuática, Facultad de Ciencias Bioquímicas y Farmacéuticas, Universidad Nacional de Rosario - Centro Científico Tecnológico y Educativo Acuario del Río Paraná
Elizabeth Tapia: Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas

Journal volume & issue: Vol. 12, no. 1
pp. 1 – 10

Abstract

Nucleic-acid barcoding is an enabling technique for many applications, but its use remains limited in emerging long-read sequencing technologies with intrinsically low raw accuracy. Here, we apply so-called NS-watermark barcodes, whose error correction capability was previously validated in silico, in a proof of concept where we synthesize 3840 NS-watermark barcodes and use them to asymmetrically tag and simultaneously sequence amplicons from two evolutionarily distant species (namely Bordetella pertussis and Drosophila mojavensis) on the ONT MinION platform. To our knowledge, this is the largest number of distinct, non-random tags ever sequenced in parallel and the first report of microarray-based synthesis as a source for large oligonucleotide pools for barcoding. We recovered the identity of more than 86% of the barcodes, with a crosstalk rate of 0.17% (i.e., one misassignment every 584 reads). This falls in the range of the index hopping rate of established, high-accuracy Illumina sequencing, despite the increased number of tags and the relatively low accuracy of both microarray-based synthesis and long-read sequencing. The robustness of NS-watermark barcodes, together with their scalable design and compatibility with low-cost massive synthesis, makes them promising for present and future sequencing applications requiring massive labeling, such as long-read single-cell RNA-Seq.
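The two headline figures in the abstract can be checked with simple arithmetic. The sketch below is illustrative only: the function names `crosstalk_rate` and `min_recovered` are hypothetical and not taken from the paper, and the only inputs are the numbers quoted in the abstract (3840 barcodes, 86%, one misassignment every 584 reads).

```python
def crosstalk_rate(reads_per_misassignment: int) -> float:
    """Fraction of reads assigned to the wrong barcode (hypothetical helper)."""
    if reads_per_misassignment <= 0:
        raise ValueError("reads_per_misassignment must be positive")
    return 1.0 / reads_per_misassignment


def min_recovered(total_barcodes: int, percent: int) -> int:
    """Whole-number floor of percent% of total_barcodes (hypothetical helper)."""
    return total_barcodes * percent // 100


# One misassignment every 584 reads, expressed as a percentage.
print(f"{crosstalk_rate(584):.2%}")  # 0.17%

# "More than 86%" of 3840 barcodes means more than this many barcodes.
print(min_recovered(3840, 86))  # 3302
```

The abstract gives only the rounded percentages, so the exact read and barcode counts behind them cannot be recovered from these figures alone.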

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/