Compressed pattern matching for Sequitur

S. Mitarai, M. Hirao, T. Matsumoto, A. Shinohara, Masayuki Takeda, S. Arikawa

Research output: Contribution to journal › Article

11 Citations (Scopus)

Abstract

Sequitur, due to Nevill-Manning and Witten [19], is a powerful program that infers a phrase hierarchy from the input text and also provides extremely effective compression of large quantities of semi-structured text [18]. In this paper, we address the problem of searching Sequitur-compressed text directly. We present a compressed pattern matching algorithm that finds a pattern in compressed text without explicit decompression. We show that our algorithm is approximately 1.27 times faster than decompression followed by an ordinary search.
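To illustrate what searching a grammar-compressed text without decompression can look like, here is a minimal Python sketch. It is not the algorithm of the paper, whose details this record does not give. It assumes the compressed text is available as a set of grammar rules of the kind Sequitur infers, and it uses a generic technique: for each grammar symbol, tabulate how a KMP automaton for the pattern would move through that symbol's expansion, then compose the tables. The rule encoding (integer nonterminals, single-character terminals) and the names `build_dfa`, `count_occurrences`, and `expand` are all hypothetical.

```python
def build_dfa(pattern):
    """KMP automaton: delta[q][c] is the state reached from q on character c."""
    m = len(pattern)
    # pi[i] = length of the longest proper border of pattern[:i + 1]
    pi = [0] * m
    k = 0
    for i in range(1, m):
        while k > 0 and pattern[i] != pattern[k]:
            k = pi[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        pi[i] = k
    delta = [dict() for _ in range(m + 1)]
    for q in range(m + 1):
        for c in set(pattern):
            if q < m and pattern[q] == c:
                delta[q][c] = q + 1
            elif q == 0:
                delta[q][c] = 0
            else:
                # fall back to the border state, which is already filled in
                delta[q][c] = delta[pi[q - 1]][c]
    return delta


def count_occurrences(rules, start, pattern):
    """Count (possibly overlapping) occurrences of pattern in the text
    generated by `start`, without expanding the grammar.

    Assumed encoding: `rules` maps an integer nonterminal to a list of
    symbols; any symbol that is not a key of `rules` is a terminal character.
    """
    assert pattern, "pattern must be non-empty"
    m = len(pattern)
    delta = build_dfa(pattern)
    memo = {}

    def summary(sym):
        # For every start state q: (end state, matches seen) over sym's expansion.
        if sym in memo:
            return memo[sym]
        if sym not in rules:  # terminal character
            table = []
            for q in range(m + 1):
                nxt = delta[q].get(sym, 0)  # characters outside the pattern reset to 0
                table.append((nxt, 1 if nxt == m else 0))
        else:
            table = [(q, 0) for q in range(m + 1)]
            for child in rules[sym]:
                ct = summary(child)
                table = [(ct[s][0], n + ct[s][1]) for (s, n) in table]
        memo[sym] = table
        return table

    return summary(start)[0][1]


def expand(rules, sym):
    """Full decompression, included only to cross-check the sketch."""
    if sym not in rules:
        return sym
    return "".join(expand(rules, child) for child in rules[sym])


# Hypothetical grammar for the text "abcabdabcabd":
#   0 -> 1 1,   1 -> 2 c 2 d,   2 -> a b
rules = {0: [1, 1], 1: [2, "c", 2, "d"], 2: ["a", "b"]}
print(count_occurrences(rules, 0, "ab"))   # 4
print(count_occurrences(rules, 0, "dab"))  # 1 (spans the boundary between the two copies of rule 1)
```

Each symbol is summarized once and reused wherever it recurs, so the work in this sketch grows with the size of the grammar times the pattern length rather than with the length of the decompressed text. The speed-up reported in the abstract refers to the authors' own algorithm, not to this sketch.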

Original language: English
Pages (from-to): 469-478
Number of pages: 10
Journal: Unknown Journal
Publication status: Published - 2001

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Electrical and Electronic Engineering

Cite this

Mitarai, S., Hirao, M., Matsumoto, T., Shinohara, A., Takeda, M., & Arikawa, S. (2001). Compressed pattern matching for Sequitur. Unknown Journal, 469-478.