### Abstract

We present a new text indexing structure based on the run length encoding (RLE) of a text string T which, given the RLE of a query pattern P, reports all the occ occurrences of P in T in O(m + occ + log n) time, where n and m are the sizes of the RLEs of T and P, respectively. The data structure requires n(2 log N + log n + log σ) + O(n) bits of space, where N is the length of the uncompressed text string T and σ is the alphabet size. Moreover, using n(3 log N + log n + log σ) + 2σ log N/σ + O(n log log n) bits of total space, our data structure can be enhanced to answer the beginning position of the lexicographically i-th smallest suffix of T for a given rank i in O(log^{2} n) time. All these data structures can be constructed in O(n log n) time using O(n log N) bits of extra space.
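To make the query setting concrete, the following is a minimal sketch of pattern matching carried out directly on run length encodings. It is not the index described in the abstract: it is a naive scan over the runs of T that takes O(nm) time, shown only to illustrate what an occurrence of an RLE pattern looks like. The function names `rle` and `rle_occurrences` are hypothetical and chosen for illustration. The rule it applies: the first and last runs of P may match a suffix and a prefix of a longer run of T with the same character, while every interior run of P must match a run of T exactly.

```python
from itertools import groupby
from typing import List, Tuple

Run = Tuple[str, int]  # (character, run length)


def rle(s: str) -> List[Run]:
    """Run length encode a string, e.g. "aabbb" -> [("a", 2), ("b", 3)]."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


def rle_occurrences(text: List[Run], pattern: List[Run]) -> List[int]:
    """Return 0-based start positions (in the uncompressed text) of the pattern.

    Naive scan over runs; an illustration only, not the indexed query.
    """
    m = len(pattern)
    if m == 0:
        return []

    # Start position of each run of the text in the uncompressed string.
    starts = []
    pos = 0
    for _, length in text:
        starts.append(pos)
        pos += length

    occ: List[int] = []
    if m == 1:
        # A single-run pattern c^e occurs (length - e + 1) times inside
        # every run of the same character with length >= e.
        c, e = pattern[0]
        for i, (tc, te) in enumerate(text):
            if tc == c and te >= e:
                occ.extend(range(starts[i], starts[i] + te - e + 1))
        return occ

    (c_first, e_first), (c_last, e_last) = pattern[0], pattern[-1]
    inner = pattern[1:-1]
    for i in range(len(text) - m + 1):
        tc, te = text[i]
        lc, le = text[i + m - 1]
        if (tc == c_first and te >= e_first          # first run: suffix of a text run
                and lc == c_last and le >= e_last    # last run: prefix of a text run
                and text[i + 1:i + m - 1] == inner): # interior runs: exact match
            occ.append(starts[i] + te - e_first)
    return occ


if __name__ == "__main__":
    T = rle("aabbbaabbbbc")
    print(rle_occurrences(T, rle("abbb")))
```

In this sketch, n and m correspond to `len(text)` and `len(pattern)`, and N to the sum of the run lengths of the text.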

Original language | English |
---|---|
Title of host publication | Algorithms and Complexity - 9th International Conference, CIAC 2015, Proceedings |
Editors | Peter Widmayer, Vangelis Th. Paschos |
Publisher | Springer Verlag |
Pages | 390-402 |
Number of pages | 13 |
ISBN (Print) | 9783319181721 |
DOIs | https://doi.org/10.1007/978-3-319-18173-8_29 |
Publication status | Published - Jan 1 2015 |
Event | 9th International Conference on Algorithms and Complexity, CIAC 2015 - Paris, France (May 20 2015 → May 22 2015) |

### Publication series

Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 9079 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |

### Other

Other | 9th International Conference on Algorithms and Complexity, CIAC 2015 |
---|---|
Country | France |
City | Paris |
Period | 5/20/15 → 5/22/15 |

### All Science Journal Classification (ASJC) codes

- Theoretical Computer Science
- Computer Science (all)

### Cite this

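Tamakoshi, Y., Goto, K., Inenaga, S., Bannai, H., & Takeda, M. (2015). An opportunistic text indexing structure based on run length encoding. In P. Widmayer, & V. T. Paschos (Eds.), *Algorithms and Complexity - 9th International Conference, CIAC 2015, Proceedings* (pp. 390-402). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 9079). Springer Verlag. https://doi.org/10.1007/978-3-319-18173-8_29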

**An opportunistic text indexing structure based on run length encoding.** / Tamakoshi, Yuya; Goto, Keisuke; Inenaga, Shunsuke; Bannai, Hideo; Takeda, Masayuki.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

*Algorithms and Complexity - 9th International Conference, CIAC 2015, Proceedings.* Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 9079, Springer Verlag, pp. 390-402, 9th International Conference on Algorithms and Complexity, CIAC 2015, Paris, France, 5/20/15. https://doi.org/10.1007/978-3-319-18173-8_29

TY - GEN

T1 - An opportunistic text indexing structure based on run length encoding

AU - Tamakoshi, Yuya

AU - Goto, Keisuke

AU - Inenaga, Shunsuke

AU - Bannai, Hideo

AU - Takeda, Masayuki

PY - 2015/1/1

Y1 - 2015/1/1

N2 - We present a new text indexing structure based on the run length encoding (RLE) of a text string T which, given the RLE of a query pattern P, reports all the occ occurrences of P in T in O(m + occ + log n) time, where n and m are the sizes of the RLEs of T and P, respectively. The data structure requires n(2 log N + log n + log σ) + O(n) bits of space, where N is the length of the uncompressed text string T and σ is the alphabet size. Moreover, using n(3 log N + log n + log σ) + 2σ log N/σ + O(n log log n) bits of total space, our data structure can be enhanced to answer the beginning position of the lexicographically i-th smallest suffix of T for a given rank i in O(log^{2} n) time. All these data structures can be constructed in O(n log n) time using O(n log N) bits of extra space.

UR - http://www.scopus.com/inward/record.url?scp=84944731108&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84944731108&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-18173-8_29

DO - 10.1007/978-3-319-18173-8_29

M3 - Conference contribution

AN - SCOPUS:84944731108

SN - 9783319181721

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 390

EP - 402

BT - Algorithms and Complexity - 9th International Conference, CIAC 2015, Proceedings

A2 - Widmayer, Peter

A2 - Paschos, Vangelis Th.

PB - Springer Verlag

ER -