MR-RePair: Grammar Compression Based on Maximal Repeats

Isamu Furuya, Takuya Takagi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Takuya Kida

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Citations (Scopus)

Abstract

We analyze the grammar generation algorithm of the RePair compression algorithm and show the relation between a grammar generated by RePair and maximal repeats. We reveal that RePair replaces step by step the most frequent pairs within the corresponding most frequent maximal repeats. Then, we design a novel variant of RePair, called MR-RePair, which substitutes the most frequent maximal repeats at once instead of substituting the most frequent pairs consecutively. We implemented MR-RePair and compared the size of the grammar generated by MR-RePair to that by RePair on several text corpora. Our experiments show that MR-RePair generates more compact grammars than RePair does, especially for highly repetitive texts.
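The pair-by-pair behaviour described in the abstract can be illustrated with a minimal sketch of plain RePair. This is a naive, quadratic-time illustration and not the authors' implementation: the function names, the nonterminal names `X1`, `X2`, ..., the first-found tie-breaking, and the size measure (length of the final sequence plus the total length of all rule right-hand sides) are assumptions made for this example.

```python
from collections import Counter


def count_pairs(seq):
    """Count non-overlapping occurrences of each adjacent pair, scanning left to right."""
    counts = Counter()
    last_end = {}  # pair -> index just past its last counted occurrence
    for i in range(len(seq) - 1):
        pair = (seq[i], seq[i + 1])
        if last_end.get(pair, -1) > i:
            continue  # overlaps the previously counted occurrence (e.g. "aaa")
        counts[pair] += 1
        last_end[pair] = i + 2
    return counts


def replace_pair(seq, pair, symbol):
    """Replace every non-overlapping occurrence of `pair` with `symbol`, left to right."""
    out = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(symbol)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def repair(text):
    """Naive RePair: repeatedly replace the most frequent pair until none occurs twice."""
    seq = list(text)
    rules = {}  # nonterminal -> (left symbol, right symbol)
    while True:
        counts = count_pairs(seq)
        if not counts:
            break
        pair, freq = counts.most_common(1)[0]
        if freq < 2:
            break
        symbol = f"X{len(rules) + 1}"  # hypothetical naming scheme for nonterminals
        rules[symbol] = pair
        seq = replace_pair(seq, pair, symbol)
    return seq, rules


def grammar_size(seq, rules):
    """Size measure assumed here: final sequence length plus all right-hand-side lengths."""
    return len(seq) + sum(len(rhs) for rhs in rules.values())


def expand(seq, rules):
    """Expand a sequence of symbols back into the original text."""
    return "".join(expand(rules[s], rules) if s in rules else s for s in seq)


if __name__ == "__main__":
    seq, rules = repair("abcdeabcde")
    print(seq, rules, grammar_size(seq, rules))
```

On the input `abcdeabcde`, this sketch needs four pair rules to cover the repeat `abcde`, for a size of 10 under the assumed measure. A hand-written grammar with a single rule for that maximal repeat (one rule with right-hand side `abcde` and a start sequence of two nonterminals) would have size 7, which is the kind of saving the abstract attributes to replacing a maximal repeat at once.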

Original language: English
Title of host publication: Proceedings - DCC 2019
Subtitle of host publication: 2019 Data Compression Conference
Editors: Joan Serra-Sagrista, Ali Bilgin, Michael W. Marcellin, James A. Storer
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 508-517
Number of pages: 10
ISBN (Electronic): 9781728106571
DOI
Publication status: Published - May 10, 2019
Event: 2019 Data Compression Conference, DCC 2019 - Snowbird, United States
Duration: Mar 26, 2019 to Mar 29, 2019

Publication series

Name: Data Compression Conference Proceedings
Volume: 2019-March
ISSN (Print): 1068-0314

Conference

Conference: 2019 Data Compression Conference, DCC 2019
Country: United States
City: Snowbird
Period: 3/26/19 to 3/29/19

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
