TY - JOUR
T1 - The Smallest Grammar Problem Revisited
AU - Bannai, Hideo
AU - Hirayama, Momoko
AU - Hucke, Danny
AU - Inenaga, Shunsuke
AU - Jeż, Artur
AU - Lohrey, Markus
AU - Reh, Carl Philipp
N1 - Funding Information:
Manuscript received January 9, 2018; revised February 14, 2020; accepted September 10, 2020. Date of publication November 16, 2020; date of current version December 21, 2020. The work of Hideo Bannai was supported by JSPS KAKENHI under Grant JP16H02783 and Grant JP20H04141. The work of Shunsuke Inenaga was supported in part by JSPS KAKENHI under Grant JP17H01697 and in part by JST PRESTO under Grant JPMJPR1922. The work of Artur Jeż was supported by the National Science Centre, Poland, under Project 2017/26/E/ST6/00191. The work of Markus Lohrey was supported by the DFG research project LO 748/10-1 (QUANT-KOMP). (Corresponding author: Carl Philipp Reh.) Hideo Bannai is with the Department of Data Science Algorithm Design and Analysis, Tokyo Medical and Dental University, Tokyo 113-8510, Japan.
Publisher Copyright:
© 1963-2012 IEEE.
PY - 2021/1
Y1 - 2021/1
N2 - In a seminal paper, Charikar et al. derive upper and lower bounds on the approximation ratios for several grammar-based compressors, but in all cases there is a gap between the lower and upper bound. Here the gaps for LZ78 and BISECTION are closed by showing that the approximation ratio of LZ78 is $\Theta((n/\log n)^{2/3})$, whereas the approximation ratio of BISECTION is $\Theta(\sqrt{n/\log n})$. In addition, the lower bound for RePair is improved from $\Omega(\sqrt{\log n})$ to $\Omega(\log n/\log\log n)$. Finally, results of Arpe and Reischuk relating grammar-based compression for arbitrary alphabets and binary alphabets are improved.
AB - In a seminal paper, Charikar et al. derive upper and lower bounds on the approximation ratios for several grammar-based compressors, but in all cases there is a gap between the lower and upper bound. Here the gaps for LZ78 and BISECTION are closed by showing that the approximation ratio of LZ78 is $\Theta((n/\log n)^{2/3})$, whereas the approximation ratio of BISECTION is $\Theta(\sqrt{n/\log n})$. In addition, the lower bound for RePair is improved from $\Omega(\sqrt{\log n})$ to $\Omega(\log n/\log\log n)$. Finally, results of Arpe and Reischuk relating grammar-based compression for arbitrary alphabets and binary alphabets are improved.
UR - http://www.scopus.com/inward/record.url?scp=85097139422&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85097139422&partnerID=8YFLogxK
U2 - 10.1109/TIT.2020.3038147
DO - 10.1109/TIT.2020.3038147
M3 - Article
AN - SCOPUS:85097139422
SN - 0018-9448
VL - 67
SP - 317
EP - 328
JO - IEEE Transactions on Information Theory
JF - IEEE Transactions on Information Theory
IS - 1
M1 - 9259056
ER -