Computing DAWGs and minimal absent words in linear time for integer alphabets

Yuta Fujishige, Yuki Tsujimaru, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

The directed acyclic word graph (DAWG) of a string y is the smallest (partial) DFA which recognizes all suffixes of y and has only O(n) nodes and edges. We present the first O(n)-time algorithm for computing the DAWG of a given string y of length n over an integer alphabet of polynomial size in n. We also show that a straightforward modification to our DAWG construction algorithm leads to the first O(n)-time algorithm for constructing the affix tree of a given string y over an integer alphabet. Affix trees are a text indexing structure supporting bidirectional pattern searches. As an application of our O(n)-time DAWG construction algorithm, we show that the set MAW(y) of all minimal absent words of y can be computed in optimal O(n + |MAW(y)|) time and O(n) working space for integer alphabets.
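To make the two objects in the abstract concrete, the following is a minimal Python sketch, not the paper's algorithm. `build_dawg` uses the textbook online suffix-automaton construction with hash-map transitions, which does not give the deterministic O(n) bound for integer alphabets that the paper claims. `minimal_absent_words` is a brute-force check of the definition (a word is a minimal absent word of y if it does not occur in y but all of its proper substrings do). All function names and the explicit `alphabet` parameter are assumptions made for illustration.

```python
def build_dawg(y):
    """Textbook online DAWG (suffix automaton) construction.

    Illustrative only: transitions are Python dicts, so this is not the
    linear-time integer-alphabet method described in the abstract.
    Returns (transitions, accepting_states); node 0 is the source.
    """
    nxt = [{}]      # nxt[v][c] = target of the edge labelled c leaving v
    link = [-1]     # suffix links
    length = [0]    # length of the longest string reaching each node
    last = 0
    for c in y:
        cur = len(nxt)
        nxt.append({})
        link.append(-1)
        length.append(length[last] + 1)
        p = last
        while p != -1 and c not in nxt[p]:
            nxt[p][c] = cur
            p = link[p]
        if p == -1:
            link[cur] = 0
        else:
            q = nxt[p][c]
            if length[p] + 1 == length[q]:
                link[cur] = q
            else:
                # Split q: the clone keeps the shorter strings of q's class.
                clone = len(nxt)
                nxt.append(dict(nxt[q]))
                link.append(link[q])
                length.append(length[p] + 1)
                while p != -1 and nxt[p].get(c) == q:
                    nxt[p][c] = clone
                    p = link[p]
                link[q] = clone
                link[cur] = clone
        last = cur
    # Nodes on the suffix-link path from `last` correspond to suffixes of y.
    accepting = set()
    p = last
    while p != -1:
        accepting.add(p)
        p = link[p]
    return nxt, accepting


def accepts(dawg, w):
    """Return True if w is a suffix of the string the DAWG was built from."""
    nxt, accepting = dawg
    v = 0
    for c in w:
        if c not in nxt[v]:
            return False
        v = nxt[v][c]
    return v in accepting


def minimal_absent_words(y, alphabet):
    """Brute-force MAW(y) over the given alphabet (quadratically many substrings)."""
    subs = {y[i:j] for i in range(len(y)) for j in range(i, len(y) + 1)}
    subs.add("")  # the empty word always occurs
    maws = set()
    for u in subs:            # u = w[:-1] occurs in y by construction
        for b in alphabet:
            w = u + b
            if w not in subs and w[1:] in subs:
                maws.add(w)
    return maws
```

For example, `minimal_absent_words("abaab", "ab")` yields the set containing `aaa`, `aaba`, `bab` and `bb`, and the DAWG built from `"abaab"` accepts `"aab"` but rejects `"aba"`, which is a substring but not a suffix.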

Original language: English
Title of host publication: 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016
Publisher: Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing
Volume: 58
ISBN (Electronic): 9783959770163
DOI: https://doi.org/10.4230/LIPIcs.MFCS.2016.38
Publication status: Published - Aug 1, 2016
Event: 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016 - Krakow, Poland
Duration: Aug 22, 2016 to Aug 26, 2016

Other

Other: 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016
Country: Poland
City: Krakow
Period: 8/22/16 to 8/26/16

Fingerprint

Polynomials

All Science Journal Classification (ASJC) codes

  • Software

Cite this

Fujishige, Y., Tsujimaru, Y., Inenaga, S., Bannai, H., & Takeda, M. (2016). Computing DAWGs and minimal absent words in linear time for integer alphabets. In 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016 (Vol. 58). [38] Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing. https://doi.org/10.4230/LIPIcs.MFCS.2016.38

Computing DAWGs and minimal absent words in linear time for integer alphabets. / Fujishige, Yuta; Tsujimaru, Yuki; Inenaga, Shunsuke; Bannai, Hideo; Takeda, Masayuki.

41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016. Vol. 58 Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing, 2016. 38.

Fujishige, Y, Tsujimaru, Y, Inenaga, S, Bannai, H & Takeda, M 2016, Computing DAWGs and minimal absent words in linear time for integer alphabets. in 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016. vol. 58, 38, Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing, 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016, Krakow, Poland, 8/22/16. https://doi.org/10.4230/LIPIcs.MFCS.2016.38
Fujishige Y, Tsujimaru Y, Inenaga S, Bannai H, Takeda M. Computing DAWGs and minimal absent words in linear time for integer alphabets. In 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016. Vol. 58. Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing. 2016. 38 https://doi.org/10.4230/LIPIcs.MFCS.2016.38
Fujishige, Yuta ; Tsujimaru, Yuki ; Inenaga, Shunsuke ; Bannai, Hideo ; Takeda, Masayuki. / Computing DAWGs and minimal absent words in linear time for integer alphabets. 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016. Vol. 58 Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing, 2016.
@inproceedings{108cb1326cde4de1bec57fe0ff262195,
title = "Computing DAWGs and minimal absent words in linear time for integer alphabets",
abstract = "The directed acyclic word graph (DAWG) of a string y is the smallest (partial) DFA which recognizes all suffixes of y and has only O(n) nodes and edges. We present the first O(n)-time algorithm for computing the DAWG of a given string y of length n over an integer alphabet of polynomial size in n. We also show that a straightforward modification to our DAWG construction algorithm leads to the first O(n)-time algorithm for constructing the affix tree of a given string y over an integer alphabet. Affix trees are a text indexing structure supporting bidirectional pattern searches. As an application of our O(n)-time DAWG construction algorithm, we show that the set MAW(y) of all minimal absent words of y can be computed in optimal O(n + |MAW(y)|) time and O(n) working space for integer alphabets.",
author = "Yuta Fujishige and Yuki Tsujimaru and Shunsuke Inenaga and Hideo Bannai and Masayuki Takeda",
year = "2016",
month = "8",
day = "1",
doi = "10.4230/LIPIcs.MFCS.2016.38",
language = "English",
volume = "58",
booktitle = "41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016",
publisher = "Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing",

}

TY - GEN

T1 - Computing DAWGs and minimal absent words in linear time for integer alphabets

AU - Fujishige, Yuta

AU - Tsujimaru, Yuki

AU - Inenaga, Shunsuke

AU - Bannai, Hideo

AU - Takeda, Masayuki

PY - 2016/8/1

Y1 - 2016/8/1

AB - The directed acyclic word graph (DAWG) of a string y is the smallest (partial) DFA which recognizes all suffixes of y and has only O(n) nodes and edges. We present the first O(n)-time algorithm for computing the DAWG of a given string y of length n over an integer alphabet of polynomial size in n. We also show that a straightforward modification to our DAWG construction algorithm leads to the first O(n)-time algorithm for constructing the affix tree of a given string y over an integer alphabet. Affix trees are a text indexing structure supporting bidirectional pattern searches. As an application of our O(n)-time DAWG construction algorithm, we show that the set MAW(y) of all minimal absent words of y can be computed in optimal O(n + |MAW(y)|) time and O(n) working space for integer alphabets.

UR - http://www.scopus.com/inward/record.url?scp=85012877585&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85012877585&partnerID=8YFLogxK

U2 - 10.4230/LIPIcs.MFCS.2016.38

DO - 10.4230/LIPIcs.MFCS.2016.38

M3 - Conference contribution

VL - 58

BT - 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016

PB - Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing

ER -