Scientific Reports (Apr 2019)

High information capacity DNA-based data storage with augmented encoding characters using degenerate bases

  • Yeongjae Choi,
  • Taehoon Ryu,
  • Amos C. Lee,
  • Hansol Choi,
  • Hansaem Lee,
  • Jaejun Park,
  • Suk-Heung Song,
  • Seojoo Kim,
  • Hyeli Kim,
  • Wook Park,
  • Sunghoon Kwon

DOI
https://doi.org/10.1038/s41598-019-43105-w
Journal volume & issue
Vol. 9, no. 1
pp. 1 – 7

Abstract

DNA-based data storage has emerged as a promising method to satisfy the exponentially increasing demand for information storage. However, practical implementation remains a challenge because of the high cost of writing data through DNA synthesis. Here, we propose the use of degenerate bases as encoding characters in addition to A, C, G, and T, which increases the amount of data that can be stored per length of designed DNA sequence (information capacity) and lowers the amount of DNA synthesis required per unit of stored data. Using the proposed method, we experimentally achieved an information capacity of 3.37 bits/character. The demonstrated information capacity is more than twice the highest information capacity previously achieved. The proposed method can be integrated with synthetic technologies in the future to reduce the cost of DNA-based data storage by 50%.
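
The link between alphabet size and information capacity can be illustrated with a short calculation. The sketch below is not taken from the paper. It assumes that the encoding alphabet is the full set of 15 standard IUPAC nucleotide codes (the four bases plus eleven degenerate codes) and that every character is equally likely, so the upper bound is log2 of the alphabet size. The names `IUPAC_CODES` and `bits_per_character` are hypothetical and introduced only for illustration.

```python
import math

# Standard IUPAC nucleotide codes: each character maps to the bases it can stand for.
# Assumption: the augmented alphabet is this full 15-character set.
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def bits_per_character(alphabet_size: int) -> float:
    """Upper bound on bits stored per character for a uniform alphabet."""
    if alphabet_size < 1:
        raise ValueError("alphabet_size must be a positive integer")
    return math.log2(alphabet_size)


# Four-base alphabet (A, C, G, T) versus the assumed 15-character alphabet.
standard_bound = bits_per_character(4)
augmented_bound = bits_per_character(len(IUPAC_CODES))

print(f"A/C/G/T only:        {standard_bound:.2f} bits/character")
print(f"with degenerate set: {augmented_bound:.2f} bits/character")
```

Under these assumptions the four-base bound is 2 bits/character and the 15-character bound is about 3.91 bits/character, so the reported 3.37 bits/character lies between the two. Any practical scheme falls below the bound because of overhead such as addressing and error correction.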