Given a string T of length n, a substring u=T[i..j] of T is called a shortest unique substring (SUS) for an interval [s,t] if (a) u occurs exactly once in T, (b) u contains the interval [s,t] (i.e. i≤s≤t≤j), and (c) every substring v of T with |v|<|u| containing [s,t] occurs at least twice in T. Given a query interval [s,t]⊂[1,n], the interval SUS problem is to output all the SUSs for the interval [s,t]. In this article, we propose a 4n+o(n) bits data structure answering an interval SUS query in output-sensitive O(occ) time, where occ is the number of returned SUSs. Additionally, we focus on the point SUS problem, which is the interval SUS problem for s=t. Here, we propose a ⌈(log23+1)n⌉+o(n) bits data structure answering a point SUS query in the same output-sensitive time. We also propose space-efficient algorithms for computing the minimal unique substrings of T.
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Computer Science(all)