On-line linear-time construction of word suffix trees

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

16 Citations (Scopus)

Abstract

Suffix trees are the key data structure for text string matching and are used in a wide range of application areas, such as bioinformatics and data compression. Sparse suffix trees are a kind of suffix tree that represents only a subset of the suffixes of the input string. In this paper we study word suffix trees, which are one variation of sparse suffix trees. Let D be a dictionary of words and let w be a string in D⁺; that is, w is a sequence w₁ ⋯ wₖ of k words in D. The word suffix tree of w with respect to D is a path-compressed trie that represents only the k suffixes of the form wᵢ ⋯ wₖ. A typical application is word- and phrase-level search on natural language documents. Andersson et al. proposed an algorithm that builds word suffix trees in O(n) expected time and O(k) space, where n is the length of the input string. In this paper we present a new word suffix tree construction algorithm that runs in O(n) time and O(k) space in the worst case. Our algorithm is on-line, meaning that it processes the characters of the input sequentially, one by one, from left to right.
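To make the definition concrete, the following is a minimal sketch of a word-level suffix trie and a phrase lookup on it. It is not the algorithm of the paper: it is a naive, uncompressed trie that takes time and space proportional to the total length of the k word suffixes, whereas the paper's construction is on-line, runs in O(n) time, and keeps only O(k) space. The names `Node`, `build_naive_word_suffix_trie`, and `find_phrase`, and the convention that every word is terminated by a separator character `#`, are assumptions made for this illustration.

```python
class Node:
    """One trie node; `starts` lists the word indices i whose suffix
    w_i ... w_k passes through this node."""

    def __init__(self):
        self.children = {}
        self.starts = []


def encode(words, sep="#"):
    """Concatenate words, terminating each with `sep` (an assumed delimiter)."""
    return "".join(word + sep for word in words)


def build_naive_word_suffix_trie(words, sep="#"):
    """Insert only the k suffixes that begin at a word boundary.

    Naive illustration: no path compression, not linear time.
    """
    root = Node()
    for i in range(len(words)):
        node = root
        for ch in encode(words[i:], sep):
            node = node.children.setdefault(ch, Node())
            node.starts.append(i)
    return root


def find_phrase(root, phrase_words, sep="#"):
    """Return the word indices at which the phrase occurs, word-aligned."""
    node = root
    for ch in encode(phrase_words, sep):
        node = node.children.get(ch)
        if node is None:
            return []
    return node.starts


if __name__ == "__main__":
    text = ["to", "be", "or", "not", "to", "be"]
    trie = build_naive_word_suffix_trie(text)
    print(find_phrase(trie, ["to", "be"]))  # [0, 4]
    print(find_phrase(trie, ["o"]))         # [] because "o" is not a whole word
```

Because only suffixes starting at word boundaries are inserted, a query such as "o" does not match inside "to" or "or", which is the word- and phrase-level behaviour the abstract describes.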

Original language: English
Title of host publication: Combinatorial Pattern Matching - 17th Annual Symposium, CPM 2006, Proceedings
Publisher: Springer Verlag
Pages: 60-71
Number of pages: 12
ISBN (Print): 3540354557, 9783540354550
Publication status: Published - Jan 1 2006
Event: 17th Annual Symposium on Combinatorial Pattern Matching, CPM 2006 - Barcelona, Spain
Duration: Jul 5 2006 – Jul 7 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4009 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 17th Annual Symposium on Combinatorial Pattern Matching, CPM 2006
Country/Territory: Spain
City: Barcelona
Period: 7/5/06 – 7/7/06

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)
