Linear-Time text compression by longest-first substitution

Ryosuke Nakamura, Shunsuke Inenaga, Hideo Bannai, Takashi Funamoto, Masayuki Takeda, Ayumi Shinohara

Research output: Contribution to journal article

9 citations (Scopus)

Abstract

We consider grammar-based text compression with longest first substitution (LFS), where non-overlapping occurrences of a longest repeating factor of the input text are replaced by a new non-terminal symbol. We present the first linear-time algorithm for LFS. Our algorithm employs a new data structure called sparse lazy suffix trees. We also deal with a more sophisticated version of LFS, called LFS2, that allows better compression. The first linear-time algorithm for LFS2 is also presented.
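
The substitution step described in the abstract can be illustrated with a naive sketch. This is not the authors' linear-time algorithm and does not use sparse lazy suffix trees; it is a brute-force illustration under stated assumptions. The function names (`longest_repeat`, `lfs_compress`, `expand`), the use of integers as non-terminal symbols, the left-to-right greedy choice of occurrences, and the treatment of earlier non-terminals as ordinary symbols in later rounds are all assumptions made for the example, and the paper's exact definitions of LFS and LFS2 may differ.

```python
def longest_repeat(seq):
    """Return the longest factor (as a tuple) that has at least two
    non-overlapping occurrences in seq, or None if no factor of
    length >= 2 qualifies. Brute force, far from linear time."""
    n = len(seq)
    for length in range(n // 2, 1, -1):
        first_seen = {}  # factor -> leftmost starting position
        for i in range(n - length + 1):
            key = tuple(seq[i:i + length])
            j = first_seen.setdefault(key, i)
            if i - j >= length:  # does not overlap the leftmost occurrence
                return key
    return None


def lfs_compress(text):
    """Naive longest-first substitution (illustrative assumption, not the
    paper's method). Returns (compressed sequence, production rules)."""
    seq = list(text)
    rules = {}  # non-terminal (int) -> right-hand side (list of symbols)
    while True:
        factor = longest_repeat(seq)
        if factor is None:
            return seq, rules
        nt = len(rules)  # fresh non-terminal; ints never clash with characters
        rules[nt] = list(factor)
        k = len(factor)
        out, i = [], 0
        while i < len(seq):  # greedy left-to-right, non-overlapping replacement
            if tuple(seq[i:i + k]) == factor:
                out.append(nt)
                i += k
            else:
                out.append(seq[i])
                i += 1
        seq = out


def expand(seq, rules):
    """Recover the original text by expanding non-terminals recursively."""
    out = []
    for sym in seq:
        out.append(expand(rules[sym], rules) if isinstance(sym, int) else sym)
    return "".join(out)
```

For instance, `lfs_compress("abcabcabc")` replaces the three non-overlapping occurrences of `abc` with non-terminal `0`, and `expand` reverses the substitution. The sketch re-scans the whole sequence in every round, which is exactly the cost a linear-time method has to avoid.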

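Original language: English
Pages (from-to): 1429-1448
Number of pages: 20
Journal: Algorithms
Volume: 2
Issue number: 4
DOI: 10.3390/a2041429
Publication status: Published - Dec 1, 2009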

Fingerprint

Text Compression
Substitution
Linear Time
Linear-time Algorithm
Suffix Tree
Grammar
Data structures
Compression

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Numerical Analysis
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

Linear-Time text compression by longest-first substitution. / Nakamura, Ryosuke; Inenaga, Shunsuke; Bannai, Hideo; Funamoto, Takashi; Takeda, Masayuki; Shinohara, Ayumi.

In: Algorithms, Vol. 2, No. 4, 01.12.2009, p. 1429-1448.

Nakamura, Ryosuke ; Inenaga, Shunsuke ; Bannai, Hideo ; Funamoto, Takashi ; Takeda, Masayuki ; Shinohara, Ayumi. / Linear-Time text compression by longest-first substitution. In: Algorithms. 2009 ; Vol. 2, No. 4. pp. 1429-1448.
@article{d92f77e4b8484f7e8cd2dab0477d418d,
title = "Linear-Time text compression by longest-first substitution",
abstract = "We consider grammar-based text compression with longest first substitution (LFS), where non-overlapping occurrences of a longest repeating factor of the input text are replaced by a new non-terminal symbol. We present the first linear-time algorithm for LFS. Our algorithm employs a new data structure called sparse lazy suffix trees. We also deal with a more sophisticated version of LFS, called LFS2, that allows better compression. The first linear-time algorithm for LFS2 is also presented.",
author = "Ryosuke Nakamura and Shunsuke Inenaga and Hideo Bannai and Takashi Funamoto and Masayuki Takeda and Ayumi Shinohara",
year = "2009",
month = "12",
day = "1",
doi = "10.3390/a2041429",
language = "English",
volume = "2",
pages = "1429--1448",
journal = "Algorithms",
issn = "1999-4893",
publisher = "MDPI AG",
number = "4",

}

TY - JOUR
T1 - Linear-Time text compression by longest-first substitution
AU - Nakamura, Ryosuke
AU - Inenaga, Shunsuke
AU - Bannai, Hideo
AU - Funamoto, Takashi
AU - Takeda, Masayuki
AU - Shinohara, Ayumi
PY - 2009/12/1
Y1 - 2009/12/1
N2 - We consider grammar-based text compression with longest first substitution (LFS), where non-overlapping occurrences of a longest repeating factor of the input text are replaced by a new non-terminal symbol. We present the first linear-time algorithm for LFS. Our algorithm employs a new data structure called sparse lazy suffix trees. We also deal with a more sophisticated version of LFS, called LFS2, that allows better compression. The first linear-time algorithm for LFS2 is also presented.
AB - We consider grammar-based text compression with longest first substitution (LFS), where non-overlapping occurrences of a longest repeating factor of the input text are replaced by a new non-terminal symbol. We present the first linear-time algorithm for LFS. Our algorithm employs a new data structure called sparse lazy suffix trees. We also deal with a more sophisticated version of LFS, called LFS2, that allows better compression. The first linear-time algorithm for LFS2 is also presented.
UR - http://www.scopus.com/inward/record.url?scp=77953038533&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77953038533&partnerID=8YFLogxK
U2 - 10.3390/a2041429
DO - 10.3390/a2041429
M3 - Article
VL - 2
SP - 1429
EP - 1448
JO - Algorithms
JF - Algorithms
SN - 1999-4893
IS - 4
ER -