An online algorithm for lightweight grammar-based compression

Shirou Maruyama, Masayuki Takeda, Masaya Nakahara, Hiroshi Sakamoto

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Grammar-based compression is a well-studied technique for constructing a small context-free grammar (CFG) that uniquely derives a given text. In this paper, we present an online algorithm for lightweight grammar-based compression. Our algorithm is based on the LCA algorithm [Sakamoto et al. 2004], which guarantees a nearly optimal compression ratio and space. LCA, however, is an offline algorithm and requires external storage to reduce its main-memory consumption. We therefore present an online version that inherits most characteristics of the original LCA. Our algorithm guarantees an O(log^2 n)-approximation ratio for the optimum grammar size, and all work is carried out in main memory, whose usage is bounded by the output size. In addition, we propose a more practical encoding based on the parentheses representation of a binary tree. Experimental results on repetitive texts demonstrate that our algorithm achieves effective compression compared with other practical compressors and that its space consumption is smaller than the input text size.
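As a rough illustration of two ideas named in the abstract, the Python sketch below shows a small CFG that derives exactly one text, and a balanced-parentheses encoding of the shape of its binary derivation tree. It is not the LCA algorithm or the paper's encoding. The grammar `RULES`, the helper names `expand` and `tree_shape`, and the convention that a leaf is written `()` are all assumptions made for this example.

```python
from typing import Dict, Tuple

# Hypothetical toy grammar (not taken from the paper): every nonterminal
# rewrites to exactly two symbols, so each derivation tree is binary.
Rules = Dict[str, Tuple[str, str]]

RULES: Rules = {
    "X1": ("a", "b"),
    "X2": ("X1", "X1"),
    "X3": ("X2", "X2"),
}


def expand(symbol: str, rules: Rules) -> str:
    """Return the unique text derived from `symbol`."""
    if symbol not in rules:  # anything without a rule is a terminal
        return symbol
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)


def tree_shape(symbol: str, rules: Rules) -> str:
    """Encode the shape of the derivation tree as balanced parentheses.

    Convention assumed here: a leaf is "()", and an internal node is
    "(" + left subtree + right subtree + ")".
    """
    if symbol not in rules:
        return "()"
    left, right = rules[symbol]
    return "(" + tree_shape(left, rules) + tree_shape(right, rules) + ")"


if __name__ == "__main__":
    print(expand("X3", RULES))      # text derived from the start symbol
    print(tree_shape("X1", RULES))  # shape of the smallest subtree
```

In this toy case three rules are enough to derive the 8-character text "abababab", which is why grammar-based compression works well on repetitive inputs. The tree shape costs two parenthesis characters per node under the assumed convention.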

Original language: English
Title of host publication: Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011
Pages: 19-28
Number of pages: 10
DOI: https://doi.org/10.1109/CCP.2011.40
Publication status: Published - Nov 21, 2011
Event: 1st International Conference on Data Compression, Communication, and Processing, CCP 2011 - Palinuro, Cilento Coast, Italy
Duration: Jun 21, 2011 to Jun 24, 2011

Publication series

Name: Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011

Other

Other: 1st International Conference on Data Compression, Communication, and Processing, CCP 2011
Country: Italy
City: Palinuro, Cilento Coast
Period: 6/21/11 to 6/24/11

Fingerprint

Context free grammars
Binary trees
Compressors
Data storage equipment

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems

Cite this

Maruyama, S., Takeda, M., Nakahara, M., & Sakamoto, H. (2011). An online algorithm for lightweight grammar-based compression. In Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011 (pp. 19-28). [6061023] (Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011). https://doi.org/10.1109/CCP.2011.40

An online algorithm for lightweight grammar-based compression. / Maruyama, Shirou; Takeda, Masayuki; Nakahara, Masaya; Sakamoto, Hiroshi.

Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011. 2011. p. 19-28 6061023 (Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011).

Maruyama, S, Takeda, M, Nakahara, M & Sakamoto, H 2011, An online algorithm for lightweight grammar-based compression. in Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011., 6061023, Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011, pp. 19-28, 1st International Conference on Data Compression, Communication, and Processing, CCP 2011, Palinuro, Cilento Coast, Italy, 6/21/11. https://doi.org/10.1109/CCP.2011.40
Maruyama S, Takeda M, Nakahara M, Sakamoto H. An online algorithm for lightweight grammar-based compression. In Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011. 2011. p. 19-28. 6061023. (Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011). https://doi.org/10.1109/CCP.2011.40
Maruyama, Shirou ; Takeda, Masayuki ; Nakahara, Masaya ; Sakamoto, Hiroshi. / An online algorithm for lightweight grammar-based compression. Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011. 2011. pp. 19-28 (Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011).
@inproceedings{1be51445b8744aad9b27e95261cd050e,
title = "An online algorithm for lightweight grammar-based compression",
abstract = "Grammar-based compression is a well-studied technique for constructing a small context-free grammar (CFG) that uniquely derives a given text. In this paper, we present an online algorithm for lightweight grammar-based compression. Our algorithm is based on the LCA algorithm [Sakamoto et al. 2004], which guarantees a nearly optimal compression ratio and space. LCA, however, is an offline algorithm and requires external storage to reduce its main-memory consumption. We therefore present an online version that inherits most characteristics of the original LCA. Our algorithm guarantees an $O(\log^2 n)$-approximation ratio for the optimum grammar size, and all work is carried out in main memory, whose usage is bounded by the output size. In addition, we propose a more practical encoding based on the parentheses representation of a binary tree. Experimental results on repetitive texts demonstrate that our algorithm achieves effective compression compared with other practical compressors and that its space consumption is smaller than the input text size.",
author = "Shirou Maruyama and Masayuki Takeda and Masaya Nakahara and Hiroshi Sakamoto",
year = "2011",
month = "11",
day = "21",
doi = "10.1109/CCP.2011.40",
language = "English",
isbn = "9780769545288",
series = "Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011",
pages = "19--28",
booktitle = "Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011",

}

TY - GEN

T1 - An online algorithm for lightweight grammar-based compression

AU - Maruyama, Shirou

AU - Takeda, Masayuki

AU - Nakahara, Masaya

AU - Sakamoto, Hiroshi

PY - 2011/11/21

Y1 - 2011/11/21

N2 - Grammar-based compression is a well-studied technique for constructing a small context-free grammar (CFG) that uniquely derives a given text. In this paper, we present an online algorithm for lightweight grammar-based compression. Our algorithm is based on the LCA algorithm [Sakamoto et al. 2004], which guarantees a nearly optimal compression ratio and space. LCA, however, is an offline algorithm and requires external storage to reduce its main-memory consumption. We therefore present an online version that inherits most characteristics of the original LCA. Our algorithm guarantees an O(log^2 n)-approximation ratio for the optimum grammar size, and all work is carried out in main memory, whose usage is bounded by the output size. In addition, we propose a more practical encoding based on the parentheses representation of a binary tree. Experimental results on repetitive texts demonstrate that our algorithm achieves effective compression compared with other practical compressors and that its space consumption is smaller than the input text size.

AB - Grammar-based compression is a well-studied technique for constructing a small context-free grammar (CFG) that uniquely derives a given text. In this paper, we present an online algorithm for lightweight grammar-based compression. Our algorithm is based on the LCA algorithm [Sakamoto et al. 2004], which guarantees a nearly optimal compression ratio and space. LCA, however, is an offline algorithm and requires external storage to reduce its main-memory consumption. We therefore present an online version that inherits most characteristics of the original LCA. Our algorithm guarantees an O(log^2 n)-approximation ratio for the optimum grammar size, and all work is carried out in main memory, whose usage is bounded by the output size. In addition, we propose a more practical encoding based on the parentheses representation of a binary tree. Experimental results on repetitive texts demonstrate that our algorithm achieves effective compression compared with other practical compressors and that its space consumption is smaller than the input text size.

UR - http://www.scopus.com/inward/record.url?scp=81255164860&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=81255164860&partnerID=8YFLogxK

U2 - 10.1109/CCP.2011.40

DO - 10.1109/CCP.2011.40

M3 - Conference contribution

AN - SCOPUS:81255164860

SN - 9780769545288

T3 - Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011

SP - 19

EP - 28

BT - Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011

ER -