PLoS ONE (Jan 2012)

Paired-end sequencing of long-range DNA fragments for de novo assembly of large, complex Mammalian genomes by direct intra-molecule ligation.

  • Asan,
  • Chunyu Geng,
  • Yan Chen,
  • Kui Wu,
  • Qingle Cai,
  • Yu Wang,
  • Yongshan Lang,
  • Hongzhi Cao,
  • Huangming Yang,
  • Jian Wang,
  • Xiuqing Zhang

DOI
https://doi.org/10.1371/journal.pone.0046211
Journal volume & issue
Vol. 7, no. 9
p. e46211

Abstract

BACKGROUND: The relatively short read lengths from next-generation sequencing (NGS) technologies still pose a challenge for de novo assembly of complex mammalian genomes. One important solution is to use paired-end (PE) sequence information experimentally obtained from long-range DNA fragments (>1 kb). Here, we characterize and extend a long-range PE library construction method based on direct intra-molecule ligation (or molecular linker-free circularization) for NGS. RESULTS: We found that the method performs stably for PE sequencing of 2- to 5-kb DNA fragments, and can be extended to 10-20 kb (and, in extreme cases, up to ∼35 kb). We also characterized the impact of low-quality input DNA on the method, and developed a whole-genome amplification (WGA) based protocol using limited input DNA ([…] 2 Mb, which is over 100 times greater than the initial size produced with only small-insert PE reads (17 kb). In addition, we mapped two 7- to 8-kb insertions in the YH genome using the larger insert sizes of the long-range PE data. CONCLUSIONS: We demonstrate the effectiveness of this long-range PE sequencing method and its use for the de novo assembly of a large, complex genome using NGS short reads.