An online algorithm for lightweight grammar-based compression

Shirou Maruyama, Masayuki Takeda, Masaya Nakahara, Hiroshi Sakamoto

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Grammar-based compression is a well-studied technique for constructing a small context-free grammar (CFG) that uniquely derives a given text. In this paper, we present an online algorithm for lightweight grammar-based compression. Our algorithm is based on the LCA algorithm [Sakamoto et al. 2004], which guarantees a nearly optimal compression ratio and space usage. LCA, however, is an offline algorithm and relies on external storage to keep its main-memory consumption low. We therefore present an online version that inherits most characteristics of the original LCA. Our algorithm guarantees an O(log^2 n)-approximation ratio with respect to the optimal grammar size, and all work is carried out in main memory, within space bounded by the output size. In addition, we propose a more practical encoding based on the parentheses representation of a binary tree. Experimental results on repetitive texts demonstrate that our algorithm achieves effective compression compared with other practical compressors, and that its space consumption is smaller than the input text size.
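To make the two ideas in the abstract concrete, the following is a minimal sketch, not the paper's LCA-based algorithm or its actual encoding. It assumes a hand-written toy grammar (`RULES`, with the hypothetical nonterminals `X1` to `X3`) in which every rule has exactly two symbols on its right-hand side, shows that such a grammar derives exactly one text, and writes the shape of its binary derivation tree as a balanced-parentheses string. The helper names `expand` and `tree_shape` are illustrative only.

```python
# Toy straight-line grammar (an assumption for illustration, not from the paper).
# Each nonterminal maps to exactly two symbols, so the derivation tree is binary.
RULES = {
    "X1": ("a", "b"),    # X1 derives "ab"
    "X2": ("X1", "X1"),  # X2 derives "abab"
    "X3": ("X2", "X1"),  # X3 derives "ababab"
}


def expand(symbol, rules):
    """Return the single string derived from `symbol`."""
    if symbol not in rules:  # a terminal character derives itself
        return symbol
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)


def tree_shape(symbol, rules):
    """Encode the derivation tree of `symbol` as balanced parentheses.

    A leaf is written "()", and an internal node wraps the encodings
    of its left and right children in one extra pair of parentheses.
    """
    if symbol not in rules:
        return "()"
    left, right = rules[symbol]
    return "(" + tree_shape(left, rules) + tree_shape(right, rules) + ")"


if __name__ == "__main__":
    print(expand("X3", RULES))      # the text the grammar derives
    print(tree_shape("X3", RULES))  # two parentheses per tree node
```

In this sketch the grammar stores three rules for a six-character text. The parentheses string spends two symbols per tree node, which hints at why tree-shaped encodings can be compact, although the encoding proposed in the paper may differ in its details.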

Original language: English
Title of host publication: Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011
Pages: 19-28
Number of pages: 10
Publication status: Published - 2011
Event: 1st International Conference on Data Compression, Communication, and Processing, CCP 2011 - Palinuro, Cilento Coast, Italy
Duration: Jun 21 2011 to Jun 24 2011

Publication series

Name: Proceedings - 1st International Conference on Data Compression, Communication, and Processing, CCP 2011

Other

Other: 1st International Conference on Data Compression, Communication, and Processing, CCP 2011
Country/Territory: Italy
City: Palinuro, Cilento Coast
Period: 6/21/11 to 6/24/11

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
