Algorithms (Apr 2012)

An Online Algorithm for Lightweight Grammar-Based Compression

  • Masayuki Takeda,
  • Hiroshi Sakamoto,
  • Shirou Maruyama

DOI
https://doi.org/10.3390/a5020214
Journal volume & issue
Vol. 5, no. 2
pp. 214 – 235

Abstract

Grammar-based compression is a well-studied technique that constructs a context-free grammar (CFG) deriving a given text uniquely. In this work, we propose an online algorithm for grammar-based compression. Our algorithm guarantees an O(log^2 n)-approximation ratio for the minimum grammar size, where n is the input size, and it runs in time linear in the input size and in space linear in the output size. In addition, we propose a practical encoding that transforms a restricted CFG into a more compact representation. Experimental comparisons with standard compressors demonstrate that our algorithm is especially effective for highly repetitive texts.
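
As a rough illustration of what "a CFG deriving a given text uniquely" means, the sketch below expands a small hand-written grammar in which every nonterminal has exactly one rule. The grammar, the names X1 to X3, and the helper expand are hypothetical and chosen only for this example; this is not the online algorithm or the encoding proposed in the paper.

```python
# Illustrative sketch only: a tiny grammar in which each nonterminal has
# exactly one rule, so the start symbol derives exactly one string.
# The grammar and the names below are assumptions made for this example.

def expand(symbol, rules):
    """Return the unique string derived from `symbol`."""
    if symbol not in rules:  # anything without a rule is a terminal character
        return symbol
    return "".join(expand(s, rules) for s in rules[symbol])

# Hypothetical grammar for a repetitive text:
#   X1 -> a b,  X2 -> X1 X1,  X3 -> X2 X2
rules = {
    "X1": ["a", "b"],
    "X2": ["X1", "X1"],
    "X3": ["X2", "X2"],
}

text = expand("X3", rules)

# One common size measure: total number of symbols on the right-hand sides.
grammar_size = sum(len(rhs) for rhs in rules.values())
```

Here the start symbol X3 derives a text of 8 characters from a grammar of size 6. Each further doubling rule adds two symbols to the grammar while doubling the text length, which gives some intuition for why grammar-based methods suit highly repetitive text.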

Keywords