### Abstract

A substring Q of a string S is called a shortest unique substring (SUS) for interval [s, t] in S, if Q occurs exactly once in S, this occurrence of Q contains interval [s, t], and every substring of S which contains interval [s, t] and is shorter than Q occurs at least twice in S. The SUS problem is, given a string S, to preprocess S so that for any subsequent query interval [s, t] all the SUSs for interval [s, t] can be answered quickly. When s = t, we call the SUSs for [s, t] as point SUSs, and when s ≤ t, we call the SUSs for [s, t] as interval SUSs. There exist optimal O(n)-time preprocessing scheme which answers queries in optimal O(k) time for both point and interval SUSs, where n is the length of S and k is the number of outputs for a given query. In this paper, we reveal structural, combinatorial properties underlying the SUS problem: Namely, we show that the number of intervals in S that correspond to point SUSs for all query positions in S is less than 1.5n, and show that this is a matching upper and lower bound. Also, we consider the maximum number of intervals in S that correspond to interval SUSs for all query intervals in S.

Original language | English |
---|---|

Title of host publication | 28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017 |

Publisher | Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing |

Volume | 78 |

ISBN (Electronic) | 9783959770392 |

DOIs | |

Publication status | Published - Jul 1 2017 |

Event | 28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017 - Warsaw, Poland Duration: Jul 4 2017 → Jul 6 2017 |

### Other

Other | 28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017 |
---|---|

Country | Poland |

City | Warsaw |

Period | 7/4/17 → 7/6/17 |

### All Science Journal Classification (ASJC) codes

- Software

## Cite this

*28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017*(Vol. 78). [24] Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing. https://doi.org/10.4230/LIPIcs.CPM.2017.24