TY - GEN
T1 - Computing convolution on grammar-compressed text
AU - Tanaka, Toshiya
AU - I, Tomohiro
AU - Inenaga, Shunsuke
AU - Bannai, Hideo
AU - Takeda, Masayuki
PY - 2013
Y1 - 2013
AB - The convolution between a text string S of length N and a pattern string P of length m can be computed in O(N log m) time by FFT. It is known that various types of approximate string matching problems are reducible to convolution. In this paper, we assume that the input text string is given in compressed form, as a straight-line program (SLP), which is a context-free grammar in Chomsky normal form that derives a single string. Given an SLP S of size n describing a text S of length N, and an uncompressed pattern P of length m, we present a simple O(nm log m)-time algorithm to compute the convolution between S and P. We then show that this can be improved to O(min{nm, N - α} log m) time, where α ≥ 0 is a value that represents the amount of redundancy that the SLP captures with respect to the length-m substrings. The key to the improvement is our new algorithm that computes the convolution between a trie of size r and a pattern string P of length m in O(r log m) time.
UR - http://www.scopus.com/inward/record.url?scp=84881048177&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84881048177&partnerID=8YFLogxK
U2 - 10.1109/DCC.2013.53
DO - 10.1109/DCC.2013.53
M3 - Conference contribution
AN - SCOPUS:84881048177
SN - 9780769549651
T3 - Data Compression Conference Proceedings
SP - 451
EP - 460
BT - Proceedings - DCC 2013
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2013 Data Compression Conference, DCC 2013
Y2 - 20 March 2013 through 22 March 2013
ER -