TY - JOUR
T1 - Compressed automata for dictionary matching
AU - I, Tomohiro
AU - Nishimoto, Takaaki
AU - Inenaga, Shunsuke
AU - Bannai, Hideo
AU - Takeda, Masayuki
N1 - Funding Information:
The research of Shunsuke Inenaga was in part supported by Grant-in-Aid of Inamori Foundation and by KAKENHI 23700022. Hideo Bannai was supported by KAKENHI 25280086. Masayuki Takeda was supported by KAKENHI 25240003.
Publisher Copyright:
© 2015 Elsevier B.V.
PY - 2015/5/1
Y1 - 2015/5/1
N2 - We address a variant of the dictionary matching problem where the dictionary is represented by a straight line program (SLP). For a given SLP-compressed dictionary D of size n and height h representing m patterns of total length N, we present an O(n^2 log N)-size representation of the Aho-Corasick automaton which recognizes all occurrences of the patterns in D in amortized O(h+m) running time per character. We also propose an algorithm to construct this compressed Aho-Corasick automaton in O(n^3 log n log N) time and O(n^2 log N) space. In a special case where D represents only a single pattern, we present an O(n log N)-size representation of the Morris-Pratt automaton which permits us to find all occurrences of the pattern in amortized O(h) running time per character, and we show how to construct this representation in O(n^3 log n log N) time with O(n^2 log N) working space.
AB - We address a variant of the dictionary matching problem where the dictionary is represented by a straight line program (SLP). For a given SLP-compressed dictionary D of size n and height h representing m patterns of total length N, we present an O(n^2 log N)-size representation of the Aho-Corasick automaton which recognizes all occurrences of the patterns in D in amortized O(h+m) running time per character. We also propose an algorithm to construct this compressed Aho-Corasick automaton in O(n^3 log n log N) time and O(n^2 log N) space. In a special case where D represents only a single pattern, we present an O(n log N)-size representation of the Morris-Pratt automaton which permits us to find all occurrences of the pattern in amortized O(h) running time per character, and we show how to construct this representation in O(n^3 log n log N) time with O(n^2 log N) working space.
UR - http://www.scopus.com/inward/record.url?scp=84951866541&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84951866541&partnerID=8YFLogxK
U2 - 10.1016/j.tcs.2015.01.019
DO - 10.1016/j.tcs.2015.01.019
M3 - Article
AN - SCOPUS:84951866541
SN - 0304-3975
VL - 578
SP - 30
EP - 41
JO - Theoretical Computer Science
JF - Theoretical Computer Science
ER -