TY - JOUR
T1 - Fully-Online Suffix Tree and Directed Acyclic Word Graph Construction for Multiple Texts
AU - Takagi, Takuya
AU - Inenaga, Shunsuke
AU - Arimura, Hiroki
AU - Breslauer, Dany
AU - Hendrian, Diptarama
N1 - Publisher Copyright: © 2019, Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2020/5/1
Y1 - 2020/5/1
N2 - We consider the construction of the suffix tree and the directed acyclic word graph (DAWG) indexing data structures for a collection T of texts, where a new symbol may be appended to any text in T = {T1, …, TK} at any time. This fully-online scenario, which arises in dynamically indexing multi-sensor data, is a natural generalization of the long-solved semi-online text indexing problem, where texts T1, …, Tk are permanently fixed before the next text Tk+1 is processed for each k (1 ≤ k < K). We first show that a direct application of Weiner’s right-to-left online construction for the suffix tree of a single text to fully-online multiple texts requires superlinear time. This also means that Blumer et al.’s left-to-right online construction for the DAWG of a single text requires superlinear time in the fully-online setting. We then present our fully-online versions of these algorithms that run in O(N log σ) time and O(N) space, where N is the total length of the texts in T and σ is their alphabet size. We then show how to extend Ukkonen’s left-to-right online suffix tree construction to fully-online multiple strings, with the aid of Weiner’s suffix tree for the reversed texts.
AB - We consider the construction of the suffix tree and the directed acyclic word graph (DAWG) indexing data structures for a collection T of texts, where a new symbol may be appended to any text in T = {T1, …, TK} at any time. This fully-online scenario, which arises in dynamically indexing multi-sensor data, is a natural generalization of the long-solved semi-online text indexing problem, where texts T1, …, Tk are permanently fixed before the next text Tk+1 is processed for each k (1 ≤ k < K). We first show that a direct application of Weiner’s right-to-left online construction for the suffix tree of a single text to fully-online multiple texts requires superlinear time. This also means that Blumer et al.’s left-to-right online construction for the DAWG of a single text requires superlinear time in the fully-online setting. We then present our fully-online versions of these algorithms that run in O(N log σ) time and O(N) space, where N is the total length of the texts in T and σ is their alphabet size. We then show how to extend Ukkonen’s left-to-right online suffix tree construction to fully-online multiple strings, with the aid of Weiner’s suffix tree for the reversed texts.
UR - http://www.scopus.com/inward/record.url?scp=85074691026&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85074691026&partnerID=8YFLogxK
U2 - 10.1007/s00453-019-00646-w
DO - 10.1007/s00453-019-00646-w
M3 - Article
AN - SCOPUS:85074691026
SN - 0178-4617
VL - 82
SP - 1346
EP - 1377
JO - Algorithmica
JF - Algorithmica
IS - 5
ER -