Lempel-Ziv Parsing for Sequences of Blocks

Dmitry Kosolobov; Daniel Valenzuela

doi:10.3390/a14120359

Algorithms (Dec 2021)

Affiliations

Dmitry Kosolobov: Department of Physical and Mathematical Sciences, Ural Federal University, 620000 Ekaterinburg, Russia
Daniel Valenzuela: Department of Computer Science, University of Helsinki, FI-00014 Helsinki, Finland

Journal volume & issue: Vol. 14, no. 12, p. 359

Abstract

The Lempel-Ziv parsing (LZ77) is a widely popular construction lying at the heart of many compression algorithms. These algorithms usually treat the data as a sequence of bytes, i.e., blocks of fixed length 8. Another common option is to view the data as a sequence of bits. We investigate the following natural question: what is the relationship between the LZ77 parsings of the same data interpreted as a sequence of fixed-length blocks and as a sequence of bits (or other “elementary” letters)? In this paper, we prove that, for any integer b > 1, the number z of phrases in the LZ77 parsing of a string of length n and the number z_b of phrases in the LZ77 parsing of the same string in which blocks of length b are interpreted as separate letters (e.g., b = 8 in case of bytes) are related as z_b = O(bz log(n/z)). The bound holds for both “overlapping” and “non-overlapping” versions of LZ77. Further, we establish a tight bound z_b = O(bz) for the special case when each phrase in the LZ77 parsing of the string has a “phrase-aligned” earlier occurrence (an occurrence equal to the concatenation of consecutive phrases). The latter is an important particular case of parsing produced, for instance, by grammar-based compression methods.
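
To make the quantities z and z_b in the abstract concrete, the following is a minimal sketch that counts phrases in a greedy LZ77 parsing, first over elementary letters and then over blocks of length b. It is a naive brute-force illustration, not the authors' construction. The function names `lz77_phrase_count` and `to_blocks` are hypothetical, and the sketch assumes that a phrase is either a letter with no earlier occurrence or the longest prefix of the remaining text that occurs earlier (with the earlier occurrence allowed to overlap the phrase only in the "overlapping" version).

```python
def lz77_phrase_count(seq, overlapping=True):
    """Count phrases of a greedy LZ77 parsing of seq (naive brute-force search).

    Assumed definition: each phrase is either a single letter with no earlier
    occurrence, or the longest prefix of the remaining text that also starts
    at some earlier position j < i. In the non-overlapping version the earlier
    occurrence must end before position i.
    """
    n = len(seq)
    i = 0
    phrases = 0
    while i < n:
        longest = 0
        for j in range(i):  # candidate start of an earlier occurrence
            k = 0
            while (i + k < n
                   and (overlapping or j + k < i)
                   and seq[j + k] == seq[i + k]):
                k += 1
            longest = max(longest, k)
        i += max(longest, 1)  # a new letter forms a phrase of length 1
        phrases += 1
    return phrases


def to_blocks(seq, b):
    """Reinterpret seq as a sequence of letters that are blocks of length b.

    If len(seq) is not a multiple of b, the last block is shorter.
    """
    return [tuple(seq[p:p + b]) for p in range(0, len(seq), b)]


s = "ab" * 8
z = lz77_phrase_count(s)                  # parsing over single characters
z_b = lz77_phrase_count(to_blocks(s, 2))  # parsing over blocks of length 2
print(z, z_b)
```

For this highly periodic input, the overlapping parsing over characters is a, b, followed by one long phrase, and the block parsing consists of the block ab followed by one long phrase. Inputs whose repetitions are not aligned to block boundaries are where z_b can exceed z, which is the situation the bounds above address.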

Published in Algorithms

ISSN: 1999-4893 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.mdpi.com/journal/algorithms
