IEEE Access (Jan 2023)

DNA-Based Storage of RDF Graph Data: A Futuristic Approach to Data Analytics

  • Asad Usmani,
  • Lena Wiese

DOI
https://doi.org/10.1109/ACCESS.2023.3332254
Journal volume & issue
Vol. 11
pp. 129931 – 129944

Abstract

Future data analytics will require enormous storage capacity to support data-driven decisions, necessitating alternative storage media for massive data archives. Demand for new storage solutions has persisted because of the limitations of existing media. Deoxyribonucleic acid (DNA) is an emerging storage medium suited to archiving rapidly growing digital volumes. Owing to its longevity, DNA storage technology has led to numerous applications that store and retrieve data in its entirety. In such applications, DNA synthesis and sequencing costs can be reduced by compressing the full data set before it is stored. However, prior work has not used DNA storage to retrieve partial data from complex graphs while taking advantage of cost-effective advanced analytics. In this paper, we present an efficient DNA-based query processing system that retrieves partial information from RDF graph data. Using binary search, we fetch and decode significantly fewer DNA strands to answer SPARQL queries over RDF graph data. Specifically, the experimental analysis shows that the average amount of data retrieved per query is less than 1% for RDF graphs larger than 1 MB (megabyte), which significantly reduces sequencing costs.
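
The idea of using binary search to decode only a few strands can be illustrated with a minimal sketch. It is not the authors' implementation: the 2-bit-per-base mapping, the one-triple-per-strand layout, the in-memory sorted key list, and the names `encode`, `decode`, `build_pool`, and `lookup` are all assumptions made for illustration.

```python
from bisect import bisect_left

# Assumed mapping of 2 bits per nucleotide; the paper's encoding may differ.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Map each byte to four nucleotides, most significant bit pair first."""
    return "".join(BASES[(b >> s) & 3] for b in data for s in (6, 4, 2, 0))


def decode(strand: str) -> bytes:
    """Inverse of encode: rebuild one byte from every four nucleotides."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        value = 0
        for base in strand[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


def build_pool(triples):
    """Hypothetical layout: one strand per triple, ordered by subject."""
    ordered = sorted(triples)
    keys = [t[0] for t in ordered]
    strands = [encode("\t".join(t).encode("utf-8")) for t in ordered]
    return keys, strands


def lookup(keys, strands, subject):
    """Binary-search the sorted keys, then decode only the matching strands."""
    i = bisect_left(keys, subject)
    results = []
    while i < len(keys) and keys[i] == subject:
        results.append(tuple(decode(strands[i]).decode("utf-8").split("\t")))
        i += 1
    return results, len(results)


triples = [
    ("ex:b", "ex:p", "1"),
    ("ex:a", "ex:p", "2"),
    ("ex:c", "ex:q", "3"),
    ("ex:a", "ex:q", "4"),
]
keys, strands = build_pool(triples)
matches, decoded_count = lookup(keys, strands, "ex:a")
```

In this toy pool, a query for subject `ex:a` decodes two of the four strands. The sketch only shows why a sorted layout lets a lookup skip most strands; it does not model synthesis, sequencing, or SPARQL evaluation.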

Keywords