### Abstract

The directed acyclic word graph (DAWG) of a string y is the smallest (partial) DFA which recognizes all suffixes of y and has only O(n) nodes and edges. We present the first O(n)-time algorithm for computing the DAWG of a given string y of length n over an integer alphabet of polynomial size in n. We also show that a straightforward modification to our DAWG construction algorithm leads to the first O(n)-time algorithm for constructing the affix tree of a given string y over an integer alphabet. Affix trees are a text indexing structure supporting bidirectional pattern searches. As an application of our O(n)-time DAWG construction algorithm, we show that the set MAW(y) of all minimal absent words of y can be computed in optimal O(n + |MAW(y)|) time and O(n) working space for integer alphabets.
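To make the two central objects concrete, here is a minimal Python sketch. It builds a DAWG with the classic online suffix-automaton construction, which is not the O(n)-time integer-alphabet algorithm the abstract announces, and then enumerates minimal absent words by brute force straight from the definition: a word is a minimal absent word of y if it does not occur in y but all of its proper substrings do. The names `State`, `build_dawg`, `walk`, `is_suffix`, `occurs`, and `minimal_absent_words` are hypothetical, chosen for illustration only.

```python
class State:
    """One DAWG node: outgoing edges, suffix link, longest-string length."""

    def __init__(self, length=0, link=-1):
        self.next = {}          # character -> index of target state
        self.link = link        # suffix link (-1 for the initial state)
        self.length = length    # length of the longest string reaching this state
        self.terminal = False   # True if some suffix of y ends here


def build_dawg(y):
    """Classic online construction (an assumption; NOT the paper's algorithm)."""
    states = [State()]
    last = 0
    for ch in y:
        cur = len(states)
        states.append(State(states[last].length + 1))
        p = last
        # Add the new edge along the suffix-link chain where it is missing.
        while p != -1 and ch not in states[p].next:
            states[p].next[ch] = cur
            p = states[p].link
        if p == -1:
            states[cur].link = 0
        else:
            q = states[p].next[ch]
            if states[p].length + 1 == states[q].length:
                states[cur].link = q
            else:
                # Split q: the clone takes over the shorter strings.
                clone = len(states)
                copy = State(states[p].length + 1, states[q].link)
                copy.next = dict(states[q].next)
                states.append(copy)
                while p != -1 and states[p].next.get(ch) == q:
                    states[p].next[ch] = clone
                    p = states[p].link
                states[q].link = clone
                states[cur].link = clone
        last = cur
    # States on the suffix-link chain from the last state accept suffixes of y.
    p = last
    while p != -1:
        states[p].terminal = True
        p = states[p].link
    return states


def walk(states, w):
    """Return the state reached by reading w from the initial state, or None."""
    s = 0
    for ch in w:
        s = states[s].next.get(ch)
        if s is None:
            return None
    return s


def is_suffix(states, w):
    s = walk(states, w)
    return s is not None and states[s].terminal


def occurs(states, w):
    return walk(states, w) is not None


def minimal_absent_words(y, alphabet):
    """Brute-force enumeration from the definition; far from linear time."""
    states = build_dawg(y)
    # A single letter is a minimal absent word if it never occurs in y.
    maws = {a for a in alphabet if not occurs(states, a)}
    # Longer ones have the form a+u+b with a+u and u+b occurring, a+u+b absent.
    factors = {y[i:j] for i in range(len(y) + 1) for j in range(i, len(y) + 1)}
    for u in factors:
        for a in alphabet:
            for b in alphabet:
                if (occurs(states, a + u) and occurs(states, u + b)
                        and not occurs(states, a + u + b)):
                    maws.add(a + u + b)
    return maws
```

For y = "abaab" over the alphabet {a, b}, `minimal_absent_words("abaab", "ab")` yields the four words aaa, aaba, bab, and bb, and the automaton accepts "ab" (a suffix) but not "aba" (a factor that is not a suffix). The dictionary-based edges keep the sketch short; they do not give the integer-alphabet guarantees the abstract refers to.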

Original language | English |
---|---|
Title of host publication | 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016 |
Editors | Anca Muscholl, Piotr Faliszewski, Rolf Niedermeier |
Publisher | Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing |
ISBN (Electronic) | 9783959770163 |
DOIs | https://doi.org/10.4230/LIPIcs.MFCS.2016.38 |
Publication status | Published - Aug 1 2016 |
Event | 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016 - Krakow, Poland. Duration: Aug 22 2016 → Aug 26 2016 |

### Publication series

Name | Leibniz International Proceedings in Informatics, LIPIcs |
---|---|
Volume | 58 |
ISSN (Print) | 1868-8969 |

### Other

Other | 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016 |
---|---|
Country | Poland |
City | Krakow |
Period | 8/22/16 → 8/26/16 |

### All Science Journal Classification (ASJC) codes

- Software

### Cite this

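Fujishige, Y., Tsujimaru, Y., Inenaga, S., Bannai, H., & Takeda, M. (2016). Computing DAWGs and minimal absent words in linear time for integer alphabets. In A. Muscholl, P. Faliszewski, & R. Niedermeier (Eds.), *41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016* [38] (Leibniz International Proceedings in Informatics, LIPIcs; Vol. 58). Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing. https://doi.org/10.4230/LIPIcs.MFCS.2016.38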

**Computing DAWGs and minimal absent words in linear time for integer alphabets.** / Fujishige, Yuta; Tsujimaru, Yuki; Inenaga, Shunsuke; Bannai, Hideo; Takeda, Masayuki.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

TY - GEN

T1 - Computing DAWGs and minimal absent words in linear time for integer alphabets

AU - Fujishige, Yuta

AU - Tsujimaru, Yuki

AU - Inenaga, Shunsuke

AU - Bannai, Hideo

AU - Takeda, Masayuki

PY - 2016/8/1

Y1 - 2016/8/1

UR - http://www.scopus.com/inward/record.url?scp=85012877585&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85012877585&partnerID=8YFLogxK

U2 - 10.4230/LIPIcs.MFCS.2016.38

DO - 10.4230/LIPIcs.MFCS.2016.38

M3 - Conference contribution

AN - SCOPUS:85012877585

T3 - Leibniz International Proceedings in Informatics, LIPIcs

BT - 41st International Symposium on Mathematical Foundations of Computer Science, MFCS 2016

A2 - Muscholl, Anca

A2 - Faliszewski, Piotr

A2 - Niedermeier, Rolf

PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing

ER -