Scientific Reports (Apr 2024)

Long-read sequencing and optical mapping generates near T2T assemblies that resolves a centromeric translocation

  • Esmee ten Berk de Boer,
  • Adam Ameur,
  • Ignas Bunikis,
  • Marlene Ek,
  • Eva-Lena Stattin,
  • Lars Feuk,
  • Jesper Eisfeldt,
  • Anna Lindstrand

DOI
https://doi.org/10.1038/s41598-024-59683-3
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 8

Abstract

Long-read genome sequencing (lrGS) is a promising method in genetic diagnostics. Here we investigate the potential of lrGS to detect a disease-associated chromosomal translocation between 17p13 and the chromosome 19 centromere. Using lrGS reads with a median coverage of 34X, we constructed two sets of phased and non-phased de novo assemblies: (i) based on lrGS only and (ii) hybrid assemblies combining lrGS with optical mapping. Variant calling detected both structural variants (SVs) and small variants, and the small variant calls were compared with those from short-read genome sequencing (srGS). The de novo and hybrid assemblies had high quality and contiguity, with an N50 of 62.85 Mb, enabling a near telomere-to-telomere assembly with fewer than 100 contigs per haplotype. Notably, we successfully identified the centromeric breakpoint of the translocation. A concordance of 92% was observed when comparing small variant calling between srGS and lrGS. In summary, our findings underscore the potential of lrGS as a comprehensive and accurate solution for the analysis of SVs and small variants. Thus, lrGS could replace the large battery of genetic tests that were used for the diagnosis of a single symptomatic translocation carrier, highlighting the potential of lrGS for digital karyotyping.
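
For readers unfamiliar with the contiguity metric quoted above, the following is a minimal sketch of how an N50 value is conventionally computed from a list of contig lengths. It is not the authors' pipeline; the function name `n50` and the toy contig lengths are assumptions made purely for illustration.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L of the contig at which, going from the longest
    contig downwards, the cumulative length first reaches at least half
    of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must be non-empty")


# Hypothetical contig lengths (arbitrary units), not data from the study.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

A higher N50 together with a lower contig count indicates a more contiguous assembly, which is why the abstract reports both figures.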