Context-sensitive grammar transform: Compression and pattern matching

Shirou Maruyama, Yohei Tanaka, Hiroshi Sakamoto, Masayuki Takeda

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

A framework of context-sensitive grammar transform is proposed. A greedy compression algorithm based on the transform model is presented, together with a Knuth-Morris-Pratt (KMP)-type compressed pattern matching (CPM) algorithm. The compression performance is comparable to that of gzip and Re-Pair. Our CPM algorithm searches almost twice as fast as the KMP-type CPM algorithm on Byte-Pair-Encoding by Shibata et al. (2000) and, for short patterns, faster than the Boyer-Moore-Horspool algorithm with the stopper encoding by Rautio et al. (2002), which is regarded as one of the best combinations for practically fast search.
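For orientation, "KMP-type" refers to matching driven by a failure function, so the text is scanned left to right without ever backing up. The sketch below is the classical KMP algorithm on uncompressed text only. It is not the compressed pattern matching algorithm of the paper, which the abstract does not describe in detail. The function names `build_failure` and `kmp_search` are illustrative choices, not taken from the paper.

```python
def build_failure(pattern):
    """Return the KMP failure table: fail[i] is the length of the longest
    proper prefix of pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        # Fall back to shorter borders until the next character fits.
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    fail = build_failure(pattern)
    hits = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        # On a mismatch, reuse the failure table instead of rescanning text.
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]  # continue, allowing overlapping occurrences
    return hits
```

Each text character is read once, which is the property a KMP-type matcher is designed to preserve when the input is a compressed sequence rather than raw text.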

Original language: English
Title of host publication: String Processing and Information Retrieval - 15th International Symposium, SPIRE 2008, Proceedings
Editors: Andrew Turpin, Alistair Moffat, Amihood Amir
Publisher: Springer Verlag
Pages: 27-38
Number of pages: 12
ISBN (Print): 9783540890966
DOIs: https://doi.org/10.1007/978-3-540-89097-3_5
Publication status: Published - Jan 1 2008
Event: 15th International Symposium on String Processing and Information Retrieval, SPIRE 2008 - Melbourne, VIC, Australia
Duration: Nov 10 2008 → Nov 12 2008

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5280 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th International Symposium on String Processing and Information Retrieval, SPIRE 2008
Country: Australia
City: Melbourne, VIC
Period: 11/10/08 → 11/12/08

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

  • Cite this

    Maruyama, S., Tanaka, Y., Sakamoto, H., & Takeda, M. (2008). Context-sensitive grammar transform: Compression and pattern matching. In A. Turpin, A. Moffat, & A. Amir (Eds.), String Processing and Information Retrieval - 15th International Symposium, SPIRE 2008, Proceedings (pp. 27-38). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5280 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-540-89097-3_5