An opportunistic text indexing structure based on run length encoding

Yuya Tamakoshi, Keisuke Goto, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

We present a new text indexing structure based on the run length encoding (RLE) of a text string T which, given the RLE of a query pattern P, reports all the occ occurrences of P in T in O(m + occ + log n) time, where n and m are the sizes of the RLEs of T and P, respectively. The data structure requires n(2 log N + log n + log σ) + O(n) bits of space, where N is the length of the uncompressed text string T and σ is the alphabet size. Moreover, using n(3 log N + log n + log σ) + 2σ log(N/σ) + O(n log log n) bits of total space, our data structure can be enhanced to answer the beginning position of the lexicographically i-th smallest suffix of T for a given rank i in O(log² n) time. All these data structures can be constructed in O(n log n) time using O(n log N) bits of extra space.
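To make the quantities in the abstract concrete, the following is a minimal sketch of run length encoding and of what "an occurrence of P in T, given both RLEs" means. It is not the index described in the paper: `find_occurrences` is a naive linear scan over the runs of T, written only to illustrate the problem statement, and the function names are hypothetical. `space_bits` evaluates just the leading term n(2 log N + log n + log σ) of the first space bound, assuming base-2 logarithms and ignoring the O(n) term.

```python
from itertools import groupby
from math import log2


def rle(s):
    """Run-length encode s as a list of (character, run_length) pairs."""
    return [(c, len(list(g))) for c, g in groupby(s)]


def find_occurrences(t_rle, p_rle):
    """Naive scan (NOT the paper's index): return the 0-based start positions,
    in the uncompressed text, of every occurrence of the pattern."""
    m = len(p_rle)
    if m == 0:
        return []
    # Start offset of each text run in the uncompressed text.
    starts, pos = [], 0
    for _, length in t_rle:
        starts.append(pos)
        pos += length
    occ = []
    if m == 1:
        # A single-run pattern can start anywhere inside a long enough run.
        c, k = p_rle[0]
        for (tc, tl), s in zip(t_rle, starts):
            if tc == c and tl >= k:
                occ.extend(range(s, s + tl - k + 1))
        return occ
    (fc, fk), (lc, lk) = p_rle[0], p_rle[-1]
    middle = p_rle[1:-1]
    for i in range(len(t_rle) - m + 1):
        tc, tl = t_rle[i]
        ec, el = t_rle[i + m - 1]
        # First pattern run must be a suffix of a text run, last pattern run
        # a prefix of a text run, and all middle runs must match exactly.
        if (tc == fc and tl >= fk and ec == lc and el >= lk
                and t_rle[i + 1:i + m - 1] == middle):
            occ.append(starts[i] + tl - fk)
    return occ


def space_bits(n, N, sigma):
    """Leading term n(2 log N + log n + log sigma), base-2 logs assumed."""
    return n * (2 * log2(N) + log2(n) + log2(sigma))


T = "aaabbbaab"                                # N = 9 characters, n = 4 runs
print(rle(T))                                  # [('a', 3), ('b', 3), ('a', 2), ('b', 1)]
print(find_occurrences(rle(T), rle("ab")))     # [2, 7]
```

The scan takes time proportional to the number of runs of T for each query, whereas the structure in the abstract answers the same query in O(m + occ + log n) time after preprocessing.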

Original language: English
Title of host publication: Algorithms and Complexity - 9th International Conference, CIAC 2015, Proceedings
Editors: Peter Widmayer, Vangelis Th. Paschos
Publisher: Springer Verlag
Pages: 390-402
Number of pages: 13
ISBN (Print): 9783319181721
DOIs: https://doi.org/10.1007/978-3-319-18173-8_29
Publication status: Published - Jan 1 2015
Event: 9th International Conference on Algorithms and Complexity, CIAC 2015 - Paris, France
Duration: May 20 2015 to May 22 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9079
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 9th International Conference on Algorithms and Complexity, CIAC 2015
Country: France
City: Paris
Period: 5/20/15 to 5/22/15

Fingerprint

Run-length Encoding
Text Indexing
Data structures
Strings
Suffix
Query
Text

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Tamakoshi, Y., Goto, K., Inenaga, S., Bannai, H., & Takeda, M. (2015). An opportunistic text indexing structure based on run length encoding. In P. Widmayer, & V. T. Paschos (Eds.), Algorithms and Complexity - 9th International Conference, CIAC 2015, Proceedings (pp. 390-402). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 9079). Springer Verlag. https://doi.org/10.1007/978-3-319-18173-8_29

@inproceedings{9380a98653494b5f9e062eafa494bb1d,
title = "An opportunistic text indexing structure based on run length encoding",
abstract = "We present a new text indexing structure based on the run length encoding (RLE) of a text string T which, given the RLE of a query pattern P, reports all the occ occurrences of P in T in O(m + occ + log n) time, where n and m are the sizes of the RLEs of T and P, respectively. The data structure requires n(2 log N + log n + log σ) + O(n) bits of space, where N is the length of the uncompressed text string T and σ is the alphabet size. Moreover, using n(3 log N + log n + log σ) + 2σ log(N/σ) + O(n log log n) bits of total space, our data structure can be enhanced to answer the beginning position of the lexicographically i-th smallest suffix of T for a given rank i in O(log² n) time. All these data structures can be constructed in O(n log n) time using O(n log N) bits of extra space.",
author = "Yuya Tamakoshi and Keisuke Goto and Shunsuke Inenaga and Hideo Bannai and Masayuki Takeda",
year = "2015",
month = "1",
day = "1",
doi = "10.1007/978-3-319-18173-8_29",
language = "English",
isbn = "9783319181721",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "390--402",
editor = "Peter Widmayer and Paschos, {Vangelis Th.}",
booktitle = "Algorithms and Complexity - 9th International Conference, CIAC 2015, Proceedings",
address = "Germany",

}

TY - GEN
T1 - An opportunistic text indexing structure based on run length encoding
AU - Tamakoshi, Yuya
AU - Goto, Keisuke
AU - Inenaga, Shunsuke
AU - Bannai, Hideo
AU - Takeda, Masayuki
PY - 2015/1/1
Y1 - 2015/1/1
AB - We present a new text indexing structure based on the run length encoding (RLE) of a text string T which, given the RLE of a query pattern P, reports all the occ occurrences of P in T in O(m + occ + log n) time, where n and m are the sizes of the RLEs of T and P, respectively. The data structure requires n(2 log N + log n + log σ) + O(n) bits of space, where N is the length of the uncompressed text string T and σ is the alphabet size. Moreover, using n(3 log N + log n + log σ) + 2σ log(N/σ) + O(n log log n) bits of total space, our data structure can be enhanced to answer the beginning position of the lexicographically i-th smallest suffix of T for a given rank i in O(log² n) time. All these data structures can be constructed in O(n log n) time using O(n log N) bits of extra space.
UR - http://www.scopus.com/inward/record.url?scp=84944731108&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84944731108&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-18173-8_29
DO - 10.1007/978-3-319-18173-8_29
M3 - Conference contribution
AN - SCOPUS:84944731108
SN - 9783319181721
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 390
EP - 402
BT - Algorithms and Complexity - 9th International Conference, CIAC 2015, Proceedings
A2 - Widmayer, Peter
A2 - Paschos, Vangelis Th.
PB - Springer Verlag
ER -