Ternary directed acyclic word graphs

Satoru Miyamoto, Shunsuke Inenaga, Masayuki Takeda, Ayumi Shinohara

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Given a set S of strings, a DFA accepting S offers a very time-efficient solution to the pattern matching problem over S. The key is how to implement such a DFA in the trade-off between time and space, and especially the choice of how to implement the transitions of each state is critical. Bentley and Sedgewick proposed an effective tree structure called ternary trees. The idea of ternary trees is to ‘implant’ the process of binary search for transitions into the structure of the trees themselves. This way the process of binary search becomes visible, and the implementation of the trees becomes quite easy. The directed acyclic word graph (DAWG) of a string w is the smallest DFA that accepts all suffixes of w, and requires only linear space. We apply the scheme of ternary trees to DAWGs, introducing a new data structure named ternary DAWGs (TDAWGs). We perform some experiments that show the efficiency of TDAWGs, compared to DAWGs in which transitions are implemented by tables and linked lists.
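
The 'implanted binary search' idea in the abstract can be illustrated with a minimal sketch of a plain ternary search tree in the style of Bentley and Sedgewick. This is not the TDAWG construction of the paper: it only shows how each node compares one character and branches three ways (smaller, equal, larger), so that looking up a transition becomes a walk through the tree itself. The names `Node`, `insert`, `contains`, and the fields `lo`, `eq`, `hi`, `accept` are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One node of a ternary search tree (hypothetical layout for illustration)."""
    ch: str                       # character tested at this node
    lo: Optional["Node"] = None   # subtree for characters smaller than ch
    eq: Optional["Node"] = None   # subtree for the next character of the key
    hi: Optional["Node"] = None   # subtree for characters larger than ch
    accept: bool = False          # True if a stored string ends at this node


def insert(root: Optional[Node], key: str) -> Optional[Node]:
    """Insert a non-empty key and return the (possibly new) root."""
    if not key:
        return root
    if root is None:
        root = Node(key[0])
    node, i = root, 0
    while True:
        c = key[i]
        if c < node.ch:            # binary-search step: go left
            if node.lo is None:
                node.lo = Node(c)
            node = node.lo
        elif c > node.ch:          # binary-search step: go right
            if node.hi is None:
                node.hi = Node(c)
            node = node.hi
        else:                      # character matched: advance in the key
            i += 1
            if i == len(key):
                node.accept = True
                return root
            if node.eq is None:
                node.eq = Node(key[i])
            node = node.eq


def contains(root: Optional[Node], key: str) -> bool:
    """Return True if key was inserted as a whole string."""
    if not key:
        return False
    node, i = root, 0
    while node is not None:
        c = key[i]
        if c < node.ch:
            node = node.lo
        elif c > node.ch:
            node = node.hi
        else:
            i += 1
            if i == len(key):
                return node.accept
            node = node.eq
    return False


# Usage: store every suffix of a string w, the language a DAWG of w accepts.
w = "abcab"
root = None
for start in range(len(w)):
    root = insert(root, w[start:])
```

Storing all suffixes this way takes space quadratic in the length of w in the worst case. The DAWG avoids that by merging equivalent states, which is why the paper applies the ternary scheme to the DAWG rather than to a tree of suffixes.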

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Editors: Oscar H. Ibarra, Zhe Dang
Publisher: Springer Verlag
Pages: 108-120
Number of pages: 13
ISBN (Print): 3540405615
Publication status: Published - Jan 1 2003
Event: 8th International Conference on Implementation and Application of Automata, CIAA 2003 - Santa Barbara, United States
Duration: Jul 16 2003 → Jul 18 2003

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2759
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 8th International Conference on Implementation and Application of Automata, CIAA 2003
Country: United States
City: Santa Barbara
Period: 7/16/03 → 7/18/03

Fingerprint

Pattern matching
Ternary
Data structures
Binary search
Graph in graph theory
Experiments
Strings
Suffix
Implant
Matching Problem
Tree Structure
Efficient Solution
Linear Space
Tables
Trade-offs

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science(all)

Cite this

Miyamoto, S., Inenaga, S., Takeda, M., & Shinohara, A. (2003). Ternary directed acyclic word graphs. In O. H. Ibarra, & Z. Dang (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 108-120). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 2759). Springer Verlag.

@inproceedings{b723dd6924d94efc94e0fc1cdf13580e,
title = "Ternary directed acyclic word graphs",
abstract = "Given a set S of strings, a DFA accepting S offers a very time-efficient solution to the pattern matching problem over S. The key is how to implement such a DFA in the trade-off between time and space, and especially the choice of how to implement the transitions of each state is critical. Bentley and Sedgewick proposed an effective tree structure called ternary trees. The idea of ternary trees is to ‘implant’ the process of binary search for transitions into the structure of the trees themselves. This way the process of binary search becomes visible, and the implementation of the trees becomes quite easy. The directed acyclic word graph (DAWG) of a string w is the smallest DFA that accepts all suffixes of w, and requires only linear space. We apply the scheme of ternary trees to DAWGs, introducing a new data structure named ternary DAWGs (TDAWGs). We perform some experiments that show the efficiency of TDAWGs, compared to DAWGs in which transitions are implemented by tables and linked lists.",
author = "Satoru Miyamoto and Shunsuke Inenaga and Masayuki Takeda and Ayumi Shinohara",
year = "2003",
month = "1",
day = "1",
language = "English",
isbn = "3540405615",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "108--120",
editor = "Ibarra, {Oscar H.} and Zhe Dang",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
address = "Germany",

}

TY - GEN

T1 - Ternary directed acyclic word graphs

AU - Miyamoto, Satoru

AU - Inenaga, Shunsuke

AU - Takeda, Masayuki

AU - Shinohara, Ayumi

PY - 2003/1/1

Y1 - 2003/1/1

N2 - Given a set S of strings, a DFA accepting S offers a very time-efficient solution to the pattern matching problem over S. The key is how to implement such a DFA in the trade-off between time and space, and especially the choice of how to implement the transitions of each state is critical. Bentley and Sedgewick proposed an effective tree structure called ternary trees. The idea of ternary trees is to ‘implant’ the process of binary search for transitions into the structure of the trees themselves. This way the process of binary search becomes visible, and the implementation of the trees becomes quite easy. The directed acyclic word graph (DAWG) of a string w is the smallest DFA that accepts all suffixes of w, and requires only linear space. We apply the scheme of ternary trees to DAWGs, introducing a new data structure named ternary DAWGs (TDAWGs). We perform some experiments that show the efficiency of TDAWGs, compared to DAWGs in which transitions are implemented by tables and linked lists.

UR - http://www.scopus.com/inward/record.url?scp=84944328633&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84944328633&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84944328633

SN - 3540405615

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 108

EP - 120

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

A2 - Ibarra, Oscar H.

A2 - Dang, Zhe

PB - Springer Verlag

ER -