TY - GEN
T1 - MR-RePair
T2 - 2019 Data Compression Conference, DCC 2019
AU - Furuya, Isamu
AU - Takagi, Takuya
AU - Nakashima, Yuto
AU - Inenaga, Shunsuke
AU - Bannai, Hideo
AU - Kida, Takuya
PY - 2019/5/10
Y1 - 2019/5/10
N2 - We analyze the grammar generation algorithm of the RePair compression algorithm and show the relation between a grammar generated by RePair and maximal repeats. We reveal that RePair replaces step by step the most frequent pairs within the corresponding most frequent maximal repeats. Then, we design a novel variant of RePair, called MR-RePair, which substitutes the most frequent maximal repeats at once instead of substituting the most frequent pairs consecutively. We implemented MR-RePair and compared the size of the grammar generated by MR-RePair to that by RePair on several text corpora. Our experiments show that MR-RePair generates more compact grammars than RePair does, especially for highly repetitive texts.
UR - http://www.scopus.com/inward/record.url?scp=85066340305&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85066340305&partnerID=8YFLogxK
U2 - 10.1109/DCC.2019.00059
DO - 10.1109/DCC.2019.00059
M3 - Conference contribution
AN - SCOPUS:85066340305
T3 - Data Compression Conference Proceedings
SP - 508
EP - 517
BT - Proceedings - DCC 2019
A2 - Serra-Sagrista, Joan
A2 - Bilgin, Ali
A2 - Marcellin, Michael W.
A2 - Storer, James A.
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 26 March 2019 through 29 March 2019
ER -