# The Smallest Grammar Problem Revisited

Hideo Bannai, Momoko Hirayama, Danny Hucke, Shunsuke Inenaga, Artur Jez, Markus Lohrey, Carl Philipp Reh

## Abstract

In a seminal paper, Charikar et al. derive upper and lower bounds on the approximation ratios for several grammar-based compressors, but in all cases there is a gap between the lower and upper bound. Here the gaps for LZ78 and BISECTION are closed by showing that the approximation ratio of LZ78 is $\Theta((n/\log n)^{2/3})$, whereas the approximation ratio of BISECTION is $\Theta(\sqrt{n/\log n})$. In addition, the lower bound for RePair is improved from $\Omega(\sqrt{\log n})$ to $\Omega(\log n/\log \log n)$. Finally, results of Arpe and Reischuk relating grammar-based compression for arbitrary alphabets and binary alphabets are improved.
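For readers unfamiliar with LZ78, the following is a minimal sketch of the textbook LZ78 factorization, which is the parsing that underlies the LZ78 compressor named above. It is not taken from the paper: the function name `lz78_factorize` is an assumption made for illustration, and the step that turns the factorization into a grammar is not shown.

```python
def lz78_factorize(text):
    """Split text into LZ78 factors.

    Each factor is the shortest prefix of the remaining input that has not
    already occurred as a factor. Only the last factor may repeat an
    earlier one, when the input ends before a new factor is completed.
    """
    seen = set()      # factors produced so far
    factors = []
    current = ""      # factor currently being extended
    for ch in text:
        current += ch
        if current not in seen:
            # current is new: close it and start the next factor
            seen.add(current)
            factors.append(current)
            current = ""
    if current:
        # leftover suffix that repeats an earlier factor
        factors.append(current)
    return factors


if __name__ == "__main__":
    print(lz78_factorize("aaaaaa"))    # ['a', 'aa', 'aaa']
    print(lz78_factorize("abababab"))  # ['a', 'b', 'ab', 'aba', 'b']
```

The number of factors is what matters for compression: on a unary string of length $n$ the factors have lengths 1, 2, 3, and so on, so there are about $\sqrt{2n}$ of them.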

Original language: English. Article number: 9259056. Pages: 317-328 (12 pages). Journal: IEEE Transactions on Information Theory, volume 67, issue 1. DOI: https://doi.org/10.1109/TIT.2020.3038147. Published: Jan 2021.

