Fully dynamic data structure for LCE queries in compressed space

Takaaki Nishimoto, I. Tomohiro, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

A Longest Common Extension (LCE) query on a text T of length N asks for the length of the longest common prefix of the suffixes starting at two given positions. We show that the signature encoding G of T, of size w = O(min(z log N log* M, N)) [Mehlhorn et al., Algorithmica 17(2):183-198, 1997], which can be seen as a compressed representation of T, supports LCE queries in O(log N + log ℓ log* M) time, where ℓ is the answer to the query, z is the size of the Lempel-Ziv77 (LZ77) factorization of T, and M ≥ 4N is an integer that can be handled in constant time in the word RAM model. In compressed space, this is the fastest deterministic LCE data structure in many cases. Moreover, G can be enhanced to support efficient update operations: after processing G in O(w f_A) time, we can insert/delete any (sub)string of length y into/from an arbitrary position of T in O((y + log N log* M) f_A) time, where f_A = O(min{(log log M log log w) / log log log M, √(log w / log log w)}). This yields the first fully dynamic LCE data structure working in compressed space. We also present efficient construction algorithms for various types of input: we can construct G in O(N f_A) time from an uncompressed string T; in O(n log log(n log* M) log N log* M) time from a grammar-compressed string T represented by a straight-line program of size n; and in O(z f_A log N log* M) time from an LZ77-compressed string T with z factors. In addition to these contributions, we show several applications of our data structures that improve the previous best known results on grammar-compressed string processing.
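
To make the query definition concrete, here is a minimal sketch of a naive LCE computation on an uncompressed Python string. It is only a baseline illustrating what an LCE query returns; it is not the signature-encoding structure described in the abstract, it runs in O(ℓ) time per query, and the function name `lce` and the 0-based indexing are assumptions made for this illustration.

```python
def lce(text: str, i: int, j: int) -> int:
    """Return the length of the longest common prefix of text[i:] and text[j:].

    Naive character-by-character scan: O(answer) time, no preprocessing.
    Positions are 0-based (an assumption of this sketch).
    """
    n = len(text)
    if not (0 <= i <= n and 0 <= j <= n):
        raise IndexError("positions must lie in [0, len(text)]")
    length = 0
    # Extend the match while both suffixes still have characters that agree.
    while (i + length < n and j + length < n
           and text[i + length] == text[j + length]):
        length += 1
    return length


if __name__ == "__main__":
    # Suffixes "abracadabra" and "abra" share the prefix "abra".
    print(lce("abracadabra", 0, 7))  # 4
```

The compressed structure in the paper answers the same query without scanning the text; the sketch above is useful mainly as a reference implementation to check faster structures against.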

Original language: English
Title of host publication: 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016
Publisher: Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing
Volume: 58
ISBN (electronic): 9783959770163
DOI: https://doi.org/10.4230/LIPIcs.MFCS.2016.72
Publication status: Published - Aug 1, 2016
Event: 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016 - Krakow, Poland
Duration: Aug 22, 2016 to Aug 26, 2016

Other

Other: 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016
Country: Poland
City: Krakow
Period: 8/22/16 to 8/26/16

Fingerprint

Data structures
Random access storage
Processing
Factorization

All Science Journal Classification (ASJC) codes

  • Software

Cite this

Nishimoto, T., Tomohiro, I., Inenaga, S., Bannai, H., & Takeda, M. (2016). Fully dynamic data structure for LCE queries in compressed space. In 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016 (Vol. 58). [72] Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing. https://doi.org/10.4230/LIPIcs.MFCS.2016.72

Fully dynamic data structure for LCE queries in compressed space. / Nishimoto, Takaaki; Tomohiro, I.; Inenaga, Shunsuke; Bannai, Hideo; Takeda, Masayuki.

41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016. Vol. 58 Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing, 2016. 72.

Nishimoto, T, Tomohiro, I, Inenaga, S, Bannai, H & Takeda, M 2016, Fully dynamic data structure for LCE queries in compressed space. in 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016. vol. 58, 72, Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing, 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016, Krakow, Poland, 8/22/16. https://doi.org/10.4230/LIPIcs.MFCS.2016.72
Nishimoto T, Tomohiro I, Inenaga S, Bannai H, Takeda M. Fully dynamic data structure for LCE queries in compressed space. In 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016. Vol. 58. Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing. 2016. 72 https://doi.org/10.4230/LIPIcs.MFCS.2016.72
Nishimoto, Takaaki ; Tomohiro, I. ; Inenaga, Shunsuke ; Bannai, Hideo ; Takeda, Masayuki. / Fully dynamic data structure for LCE queries in compressed space. 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016. Vol. 58 Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing, 2016.
@inproceedings{21f2ad028e7941d9a6a16ff005e87860,
title = "Fully dynamic data structure for LCE queries in compressed space",
author = "Takaaki Nishimoto and I. Tomohiro and Shunsuke Inenaga and Hideo Bannai and Masayuki Takeda",
year = "2016",
month = "8",
day = "1",
doi = "10.4230/LIPIcs.MFCS.2016.72",
language = "English",
volume = "58",
booktitle = "41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016",
publisher = "Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing",

}

TY - GEN

T1 - Fully dynamic data structure for LCE queries in compressed space

AU - Nishimoto, Takaaki

AU - Tomohiro, I.

AU - Inenaga, Shunsuke

AU - Bannai, Hideo

AU - Takeda, Masayuki

PY - 2016/8/1

Y1 - 2016/8/1

UR - http://www.scopus.com/inward/record.url?scp=85012885943&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85012885943&partnerID=8YFLogxK

U2 - 10.4230/LIPIcs.MFCS.2016.72

DO - 10.4230/LIPIcs.MFCS.2016.72

M3 - Conference contribution

VL - 58

BT - 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016

PB - Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing

ER -