Computing DAWGs and minimal absent words in linear time for integer alphabets

Yuta Fujishige, Yuki Tsujimaru, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Citations (Scopus)

Abstract

The directed acyclic word graph (DAWG) of a string y of length n is the smallest (partial) DFA that recognizes all suffixes of y, and it has only O(n) nodes and edges. We present the first O(n)-time algorithm for computing the DAWG of a given string y of length n over an integer alphabet of polynomial size in n. We also show that a straightforward modification of our DAWG construction algorithm leads to the first O(n)-time algorithm for constructing the affix tree of a given string y over an integer alphabet. Affix trees are a text indexing structure supporting bidirectional pattern searches. As an application of our O(n)-time DAWG construction algorithm, we show that the set MAW(y) of all minimal absent words of y can be computed in optimal O(n + |MAW(y)|) time and O(n) working space for integer alphabets.
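To make the notion of MAW(y) concrete, the following is a minimal brute-force sketch in Python. It is not the linear-time algorithm of the paper and does not build a DAWG. It assumes the standard definition, which the abstract does not spell out: a word w is a minimal absent word of y if w does not occur in y but every proper substring of w does. The function names `substrings` and `minimal_absent_words`, and the `alphabet` parameter, are hypothetical and chosen only for illustration.

```python
def substrings(y):
    """Return the set of all non-empty substrings (factors) of y."""
    return {y[i:j] for i in range(len(y)) for j in range(i + 1, len(y) + 1)}


def minimal_absent_words(y, alphabet=None):
    """Brute-force MAW(y); polynomial time, for illustration only.

    Assumed definition: w is a minimal absent word of y if w does not
    occur in y while all proper substrings of w do occur in y.
    """
    if alphabet is None:
        alphabet = set(y)
    factors = substrings(y) | {""}
    maws = set()

    # Length 1: letters of the alphabet that never occur in y.
    for c in alphabet:
        if c not in factors:
            maws.add(c)

    # Length >= 2: w = a + u + b, where a + u and u + b occur in y
    # but w itself does not. Here `left` plays the role of a + u.
    for left in factors:
        if not left:
            continue
        u = left[1:]
        for b in alphabet:
            w = left + b
            if w not in factors and (u + b) in factors:
                maws.add(w)
    return maws


if __name__ == "__main__":
    print(sorted(minimal_absent_words("abaab", "ab")))
```

For y = abaab over the alphabet {a, b}, this sketch yields the four words aaa, aaba, bab, and bb. The output size |MAW(y)| is why the bound in the abstract is stated as O(n + |MAW(y)|) rather than O(n).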

Original language: English
Title of host publication: 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016
Editors: Anca Muscholl, Piotr Faliszewski, Rolf Niedermeier
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
ISBN (Electronic): 9783959770163
DOIs: https://doi.org/10.4230/LIPIcs.MFCS.2016.38
Publication status: Published - Aug 1 2016
Event: 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016 - Krakow, Poland
Duration: Aug 22 2016 to Aug 26 2016

Publication series

Name: Leibniz International Proceedings in Informatics, LIPIcs
Volume: 58
ISSN (Print): 1868-8969

Other

Other: 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016
Country: Poland
City: Krakow
Period: 8/22/16 to 8/26/16

All Science Journal Classification (ASJC) codes

  • Software


  • Cite this

    Fujishige, Y., Tsujimaru, Y., Inenaga, S., Bannai, H., & Takeda, M. (2016). Computing DAWGs and minimal absent words in linear time for integer alphabets. In A. Muscholl, P. Faliszewski, & R. Niedermeier (Eds.), 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016 [38] (Leibniz International Proceedings in Informatics, LIPIcs; Vol. 58). Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing. https://doi.org/10.4230/LIPIcs.MFCS.2016.38