A Boyer-Moore type algorithm for compressed pattern matching

Yusuke Shibata, Tetsuya Matsumoto, Masayuki Takeda, Ayumi Shinohara, Setsuo Arikawa

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

45 Citations (Scopus)

Abstract

We apply the Boyer-Moore technique to compressed pattern matching for text strings described in terms of a collage system, a formal framework that captures various dictionary-based compression methods. For the subclass of collage systems that contain no truncation, our new algorithm runs in O(‖D‖ + n·m + m^2 + r) time using O(‖D‖ + m^2) space, where ‖D‖ is the size of the dictionary D, n is the compressed text length, m is the pattern length, and r is the number of pattern occurrences. For a general collage system, the time complexity is O(height(D)·(‖D‖ + n) + n·m + m^2 + r), where height(D) is the maximum dependency of tokens in D. We showed that the algorithm specialized for the so-called byte pair encoding (BPE) is very fast in practice. In fact, it runs about 1.2 to 3.0 times faster than the exact match routine of the software package agrep, known as the fastest pattern matching tool.
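For orientation only, the Boyer-Moore idea that the abstract builds on can be sketched on ordinary, uncompressed text. The block below is a simplified Horspool-style variant (bad-character shifts only) and is not the compressed-matching algorithm of the paper: it does not handle collage systems, dictionaries, or BPE tokens. The function name `horspool_search` and its interface are assumptions made for this illustration.

```python
def horspool_search(text: str, pattern: str) -> list:
    """Return the start positions of all occurrences of pattern in text.

    Illustrative Horspool-style simplification of Boyer-Moore on
    uncompressed text; hypothetical helper, not the paper's algorithm.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Bad-character shift: for each character of the pattern except the
    # last one, the distance from its last occurrence to the pattern end.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    hits = []
    pos = 0
    while pos <= n - m:
        # Compare right to left, as in Boyer-Moore.
        j = m - 1
        while j >= 0 and text[pos + j] == pattern[j]:
            j -= 1
        if j < 0:
            hits.append(pos)
        # Shift according to the text character aligned with the
        # pattern's last position; unseen characters skip m positions.
        pos += shift.get(text[pos + m - 1], m)
    return hits
```

Calling `horspool_search("abracadabra", "abra")` returns `[0, 7]`. The skips that make this fast on plain text are the effect the paper aims to carry over to compressed text.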

Original language: English
Title of host publication: Combinatorial Pattern Matching - 11th Annual Symposium, CPM 2000, Proceedings
Editors: Raffaele Giancarlo, David Sankoff
Publisher: Springer Verlag
Pages: 181-194
Number of pages: 14
ISBN (Electronic): 3540676333, 9783540676331
Publication status: Published - 2000
Event: 11th Annual Symposium on Combinatorial Pattern Matching, CPM 2000 - Montreal, Canada
Duration: 21 June 2000 to 23 June 2000

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1848
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 11th Annual Symposium on Combinatorial Pattern Matching, CPM 2000
Country/Territory: Canada
City: Montreal
Period: 6/21/00 to 6/23/00

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
