TY - GEN

T1 - Shortest unique substring queries on run-length encoded strings

AU - Mieno, Takuya

AU - Inenaga, Shunsuke

AU - Bannai, Hideo

AU - Takeda, Masayuki

N1 - Publisher Copyright:
© Takuya Mieno, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda.
Copyright:
Copyright 2017 Elsevier B.V., All rights reserved.

PY - 2016/8/1

Y1 - 2016/8/1

N2 - We consider the problem of answering shortest unique substring (SUS) queries on run-length encoded strings. For a string S, a unique substring u = S[i..j] is said to be a shortest unique substring (SUS) of S containing an interval [s, t] (i ≤ s ≤ t ≤ j) if for any i' ≤ s ≤ t ≤ j' with j - i > j' - i', S[i'..j'] occurs at least twice in S. Given a run-length encoding of size m of a string of length N, we show that we can construct a data structure of size O(m + π_s(N, m)) in O(m log m + π_c(N, m)) time such that queries can be answered in O(π_q(N, m) + k) time, where k is the size of the output (the number of SUSs), and π_s(N, m), π_c(N, m), π_q(N, m) are, respectively, the size, construction time, and query time for a predecessor/successor query data structure of m elements for the universe of [1, N]. Using the data structure by Beame and Fich (JCSS 2002), this results in a data structure of O(m) space that is constructed in O(m log m) time, and answers queries in O(√(log m / log log m) + k) time.

AB - We consider the problem of answering shortest unique substring (SUS) queries on run-length encoded strings. For a string S, a unique substring u = S[i..j] is said to be a shortest unique substring (SUS) of S containing an interval [s, t] (i ≤ s ≤ t ≤ j) if for any i' ≤ s ≤ t ≤ j' with j - i > j' - i', S[i'..j'] occurs at least twice in S. Given a run-length encoding of size m of a string of length N, we show that we can construct a data structure of size O(m + π_s(N, m)) in O(m log m + π_c(N, m)) time such that queries can be answered in O(π_q(N, m) + k) time, where k is the size of the output (the number of SUSs), and π_s(N, m), π_c(N, m), π_q(N, m) are, respectively, the size, construction time, and query time for a predecessor/successor query data structure of m elements for the universe of [1, N]. Using the data structure by Beame and Fich (JCSS 2002), this results in a data structure of O(m) space that is constructed in O(m log m) time, and answers queries in O(√(log m / log log m) + k) time.

UR - http://www.scopus.com/inward/record.url?scp=85012892239&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85012892239&partnerID=8YFLogxK

U2 - 10.4230/LIPIcs.MFCS.2016.69

DO - 10.4230/LIPIcs.MFCS.2016.69

M3 - Conference contribution

AN - SCOPUS:85012892239

T3 - Leibniz International Proceedings in Informatics, LIPIcs

BT - 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016

A2 - Muscholl, Anca

A2 - Faliszewski, Piotr

A2 - Niedermeier, Rolf

PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing

T2 - 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016

Y2 - 22 August 2016 through 26 August 2016

ER -