Small-space LCE data structure with constant-time queries

Yuka Tanimura, Takaaki Nishimoto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

3 Citations (Scopus)

Abstract

The longest common extension (LCE) problem is to preprocess a given string w of length n so that the length of the longest common prefix between the suffixes of w starting at any two given positions can be answered quickly. In this paper, we present a data structure of O(zτ² + n/τ) words of space which answers LCE queries in O(1) time and can be built in O(n log σ) time, where 1 ≤ τ ≤ √n is a parameter, z is the size of the Lempel-Ziv 77 factorization of w, and σ is the alphabet size. The proposed LCE data structure does not access the input string w when answering queries, and thus w can be deleted after preprocessing. On top of this main result, we obtain further results using (variants of) our LCE data structure, which include the following:

  • For highly repetitive strings where the zτ² term is dominated by n/τ, we obtain a constant-time and sub-linear space LCE query data structure.
  • Even when the input string is not well compressible via Lempel-Ziv 77 factorization, we can still obtain a constant-time and sub-linear space LCE data structure for suitable τ and for σ ≤ 2^{o(log n)}.
  • The time-space trade-off lower bounds for the LCE problem by Bille et al. [J. Discrete Algorithms, 25:42-50, 2014] and by Kosolobov [CoRR, abs/1611.02891, 2016] do not apply in some cases with our LCE data structure.
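To make the two quantities in the abstract concrete, the following is a minimal Python sketch of what an LCE query returns and of how the number of Lempel-Ziv 77 factors can be counted. It is a naive reference only and not the data structure proposed in the paper: the query scans the string, so it takes linear rather than constant time and needs the input to be kept. The function names `lce` and `lz77_factors`, the 0-based positions, and the choice of the self-referencing LZ77 variant without a trailing character are assumptions made for illustration.

```python
def lce(w: str, i: int, j: int) -> int:
    """Naive LCE query: length of the longest common prefix of w[i:] and w[j:].

    Positions are 0-based. Runs in time proportional to the answer.
    """
    n = len(w)
    k = 0
    while i + k < n and j + k < n and w[i + k] == w[j + k]:
        k += 1
    return k


def lz77_factors(w: str) -> list[str]:
    """Greedy LZ77 factorization (assumed variant: self-referencing,
    no trailing character).

    Each factor is either a single fresh character or the longest prefix
    of the remaining suffix that also occurs starting at an earlier
    position (the earlier occurrence may overlap the factor itself).
    Brute force, intended only for small inputs.
    """
    factors = []
    i, n = 0, len(w)
    while i < n:
        best = 0
        for j in range(i):  # every earlier starting position
            best = max(best, lce(w, j, i))
        length = max(best, 1)  # fall back to one fresh character
        factors.append(w[i:i + length])
        i += length
    return factors


if __name__ == "__main__":
    w = "abababab"
    print(lce(w, 0, 2))          # common prefix of "abababab" and "ababab"
    print(lz77_factors(w))       # the factors; z is the length of this list
```

On a highly repetitive input such as `"abababab"`, the factor list is short compared with the string length, which is the regime in which the abstract's space bound is described as sub-linear.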

Original language: English
Title of host publication: 42nd International Symposium on Mathematical Foundations of Computer Science, MFCS 2017
Editors: Kim G. Larsen, Jean-Francois Raskin, Hans L. Bodlaender
Publisher: Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing
ISBN (Electronic): 9783959770460
DOI: https://doi.org/10.4230/LIPIcs.MFCS.2017.10
Publication status: Published - Nov 1 2017
Event: 42nd International Symposium on Mathematical Foundations of Computer Science, MFCS 2017 - Aalborg, Denmark
Duration: Aug 21 2017 to Aug 25 2017

Publication series

Name: Leibniz International Proceedings in Informatics, LIPIcs
Volume: 83
ISSN (Print): 1868-8969

Other

Other: 42nd International Symposium on Mathematical Foundations of Computer Science, MFCS 2017
Country: Denmark
City: Aalborg
Period: 8/21/17 to 8/25/17

Fingerprint

Data structures
Factorization

All Science Journal Classification (ASJC) codes

  • Software

Cite this

Tanimura, Y., Nishimoto, T., Bannai, H., Inenaga, S., & Takeda, M. (2017). Small-space LCE data structure with constant-time queries. In K. G. Larsen, J-F. Raskin, & H. L. Bodlaender (Eds.), 42nd International Symposium on Mathematical Foundations of Computer Science, MFCS 2017 (Leibniz International Proceedings in Informatics, LIPIcs; Vol. 83). Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing. https://doi.org/10.4230/LIPIcs.MFCS.2017.10

@inproceedings{ae06c3e765184e1caef47db6138d38c6,
title = "Small-space LCE data structure with constant-time queries",
abstract = "The longest common extension (LCE) problem is to preprocess a given string w of length n so that the length of the longest common prefix between the suffixes of w starting at any two given positions can be answered quickly. In this paper, we present a data structure of O(zτ² + n/τ) words of space which answers LCE queries in O(1) time and can be built in O(n log σ) time, where 1 ≤ τ ≤ √n is a parameter, z is the size of the Lempel-Ziv 77 factorization of w, and σ is the alphabet size. The proposed LCE data structure does not access the input string w when answering queries, and thus w can be deleted after preprocessing. On top of this main result, we obtain further results using (variants of) our LCE data structure, which include the following: For highly repetitive strings where the zτ² term is dominated by n/τ, we obtain a constant-time and sub-linear space LCE query data structure. Even when the input string is not well compressible via Lempel-Ziv 77 factorization, we can still obtain a constant-time and sub-linear space LCE data structure for suitable τ and for σ ≤ 2^{o(log n)}. The time-space trade-off lower bounds for the LCE problem by Bille et al. [J. Discrete Algorithms, 25:42-50, 2014] and by Kosolobov [CoRR, abs/1611.02891, 2016] do not apply in some cases with our LCE data structure.",
author = "Yuka Tanimura and Takaaki Nishimoto and Hideo Bannai and Shunsuke Inenaga and Masayuki Takeda",
year = "2017",
month = "11",
day = "1",
doi = "10.4230/LIPIcs.MFCS.2017.10",
language = "English",
series = "Leibniz International Proceedings in Informatics, LIPIcs",
publisher = "Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing",
editor = "Larsen, {Kim G.} and Jean-Francois Raskin and Bodlaender, {Hans L.}",
booktitle = "42nd International Symposium on Mathematical Foundations of Computer Science, MFCS 2017",

}

TY - GEN

T1 - Small-space LCE data structure with constant-time queries

AU - Tanimura, Yuka

AU - Nishimoto, Takaaki

AU - Bannai, Hideo

AU - Inenaga, Shunsuke

AU - Takeda, Masayuki

PY - 2017/11/1

Y1 - 2017/11/1

AB - The longest common extension (LCE) problem is to preprocess a given string w of length n so that the length of the longest common prefix between the suffixes of w starting at any two given positions can be answered quickly. In this paper, we present a data structure of O(zτ² + n/τ) words of space which answers LCE queries in O(1) time and can be built in O(n log σ) time, where 1 ≤ τ ≤ √n is a parameter, z is the size of the Lempel-Ziv 77 factorization of w, and σ is the alphabet size. The proposed LCE data structure does not access the input string w when answering queries, and thus w can be deleted after preprocessing. On top of this main result, we obtain further results using (variants of) our LCE data structure, which include the following: For highly repetitive strings where the zτ² term is dominated by n/τ, we obtain a constant-time and sub-linear space LCE query data structure. Even when the input string is not well compressible via Lempel-Ziv 77 factorization, we can still obtain a constant-time and sub-linear space LCE data structure for suitable τ and for σ ≤ 2^{o(log n)}. The time-space trade-off lower bounds for the LCE problem by Bille et al. [J. Discrete Algorithms, 25:42-50, 2014] and by Kosolobov [CoRR, abs/1611.02891, 2016] do not apply in some cases with our LCE data structure.

UR - http://www.scopus.com/inward/record.url?scp=85038430729&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85038430729&partnerID=8YFLogxK

U2 - 10.4230/LIPIcs.MFCS.2017.10

DO - 10.4230/LIPIcs.MFCS.2017.10

M3 - Conference contribution

T3 - Leibniz International Proceedings in Informatics, LIPIcs

BT - 42nd International Symposium on Mathematical Foundations of Computer Science, MFCS 2017

A2 - Larsen, Kim G.

A2 - Raskin, Jean-Francois

A2 - Bodlaender, Hans L.

PB - Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing

ER -