Nature Communications (Oct 2024)

Composite Hedges Nanopores codec system for rapid and portable DNA data readout with high INDEL-Correction

  • Xuyang Zhao,
  • Junyao Li,
  • Qingyuan Fan,
  • Jing Dai,
  • Yanping Long,
  • Ronghui Liu,
  • Jixian Zhai,
  • Qing Pan,
  • Yi Li

DOI
https://doi.org/10.1038/s41467-024-53455-3
Journal volume & issue
Vol. 15, no. 1
pp. 1 – 12

Abstract

Reading digital information from highly dense yet lightweight DNA media currently relies on time-consuming next-generation sequencing. Nanopore sequencing promises to overcome this efficiency problem, but its high indel error rates mean that large amounts of high-quality data are required for accurate readout. Here we introduce Composite Hedges Nanopores, a codec system capable of handling indel rates up to 15.9% and substitution rates up to 7.8%. The overall information density can be doubled from 0.59 to 1.17 by utilizing a degenerate eight-letter alphabet. We demonstrate that sequencing times of 20 and 120 minutes are sufficient for processing representative text and image files, respectively. Moreover, we estimate that complete data recovery requires 4× physical redundancy of composite strands for text data and 8× for image data. Our codec system excels in both molecular design and equalized dictionary usage, laying a solid foundation for approaching real-time DNA data retrieval and encoding.