TY - GEN
T1 - An online algorithm for lightweight grammar-based compression
AU - Maruyama, Shirou
AU - Takeda, Masayuki
AU - Nakahara, Masaya
AU - Sakamoto, Hiroshi
PY - 2011
Y1 - 2011
AB - Grammar-based compression is a well-studied technique for constructing a small context-free grammar (CFG) that uniquely derives a given text. In this paper, we present an online algorithm for lightweight grammar-based compression. Our algorithm is based on the LCA algorithm [Sakamoto et al. 2004], which guarantees a nearly optimal compression ratio and space. LCA, however, is an offline algorithm and requires external storage to reduce its main-memory consumption. Therefore, we present an online version that inherits most characteristics of the original LCA. Our algorithm guarantees an O(log^2 n)-approximation ratio to the optimum grammar size, and all work is carried out in main memory, within space bounded by the output size. In addition, we propose a more practical encoding based on the parentheses representation of a binary tree. Experimental results on repetitive texts demonstrate that our algorithm achieves effective compression compared to other practical compressors, and that its space consumption is smaller than the input text size.
UR - http://www.scopus.com/inward/record.url?scp=81255164860&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=81255164860&partnerID=8YFLogxK
U2 - 10.1109/CCP.2011.40
DO - 10.1109/CCP.2011.40
M3 - Conference contribution
AN - SCOPUS:81255164860
SN - 9780769545288
T3 - Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011
SP - 19
EP - 28
BT - Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011
T2 - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011
Y2 - 21 June 2011 through 24 June 2011
ER -