The "runs" theorem

研究成果: ジャーナルへの寄稿記事

22 引用 (Scopus)

抄録

We give a new characterization of maximal repetitions (or runs) in strings based on Lyndon words. The characterization leads to a proof of what was known as the "runs" conjecture [R. M. Kolpakov andG.Kucherov, Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), IEEE Computer Society, Los Alamitos, CA, 1999, pp. 596-604]), which states that the maximum number of runs ρ(n) in a string of length n is less than n. The proof is remarkably simple, considering the numerous endeavors to tackle this problem in the last 15 years, and significantly improves our understanding of how runs can occur in strings. In addition, we obtain an upper bound of 3n for the maximum sum of exponents σ(n) of runs in a string of length n, improving on the best known bound of 4.1n by Crochemore et al. [J. Discrete Algorithms, 14 (2012), pp. 29-36], as well as other improved bounds on related problems. The characterization also gives rise to a new, conceptually simple linear-time algorithm for computing all the runs in a string. A notable characteristic of our algorithm is that, unlike all existing linear-time algorithms, it does not utilize the Lempel-Ziv factorization of the string. We also establish a relationship between runs and nodes of the Lyndon tree, which gives a simple optimal solution to the 2-period query problem that was recently solved by Kociumaka et al. [Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, (SODA) 2015, San Diego, CA, SIAM, Philadelphia, 2015, pp. 532-551].

元の言語英語
ページ(範囲)1501-1514
ページ数14
ジャーナルSIAM Journal on Computing
46
発行部数5
DOI
出版物ステータス出版済み - 1 1 2017

Fingerprint

Strings
Theorem
Linear-time Algorithm
Lyndon Words
Factorization
Computer science
Annual
Computer Science
Optimal Solution
Exponent
Query
Upper bound
Computing
Vertex of a graph

All Science Journal Classification (ASJC) codes

  • Computer Science(all)
  • Mathematics(all)

これを引用

The "runs" theorem. / Bannai, Hideo; I, Tomohiro; Inenaga, Shunsuke; Nakashima, Yuto; Takeda, Masayuki; Tsuruta, Kazuya.

:: SIAM Journal on Computing, 巻 46, 番号 5, 01.01.2017, p. 1501-1514.

研究成果: ジャーナルへの寄稿記事

Bannai, Hideo ; I, Tomohiro ; Inenaga, Shunsuke ; Nakashima, Yuto ; Takeda, Masayuki ; Tsuruta, Kazuya. / The "runs" theorem. :: SIAM Journal on Computing. 2017 ; 巻 46, 番号 5. pp. 1501-1514.
@article{8ec771897ff149e4943cce757697b74c,
title = "The {"}runs{"} theorem",
abstract = "We give a new characterization of maximal repetitions (or runs) in strings based on Lyndon words. The characterization leads to a proof of what was known as the {"}runs{"} conjecture [R. M. Kolpakov andG.Kucherov, Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), IEEE Computer Society, Los Alamitos, CA, 1999, pp. 596-604]), which states that the maximum number of runs ρ(n) in a string of length n is less than n. The proof is remarkably simple, considering the numerous endeavors to tackle this problem in the last 15 years, and significantly improves our understanding of how runs can occur in strings. In addition, we obtain an upper bound of 3n for the maximum sum of exponents σ(n) of runs in a string of length n, improving on the best known bound of 4.1n by Crochemore et al. [J. Discrete Algorithms, 14 (2012), pp. 29-36], as well as other improved bounds on related problems. The characterization also gives rise to a new, conceptually simple linear-time algorithm for computing all the runs in a string. A notable characteristic of our algorithm is that, unlike all existing linear-time algorithms, it does not utilize the Lempel-Ziv factorization of the string. We also establish a relationship between runs and nodes of the Lyndon tree, which gives a simple optimal solution to the 2-period query problem that was recently solved by Kociumaka et al. [Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, (SODA) 2015, San Diego, CA, SIAM, Philadelphia, 2015, pp. 532-551].",
author = "Hideo Bannai and Tomohiro I and Shunsuke Inenaga and Yuto Nakashima and Masayuki Takeda and Kazuya Tsuruta",
year = "2017",
month = "1",
day = "1",
doi = "10.1137/15M1011032",
language = "English",
volume = "46",
pages = "1501--1514",
journal = "SIAM Journal on Computing",
issn = "0097-5397",
publisher = "Society for Industrial and Applied Mathematics Publications",
number = "5",

}

TY - JOUR

T1 - The "runs" theorem

AU - Bannai, Hideo

AU - I, Tomohiro

AU - Inenaga, Shunsuke

AU - Nakashima, Yuto

AU - Takeda, Masayuki

AU - Tsuruta, Kazuya

PY - 2017/1/1

Y1 - 2017/1/1

N2 - We give a new characterization of maximal repetitions (or runs) in strings based on Lyndon words. The characterization leads to a proof of what was known as the "runs" conjecture [R. M. Kolpakov andG.Kucherov, Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), IEEE Computer Society, Los Alamitos, CA, 1999, pp. 596-604]), which states that the maximum number of runs ρ(n) in a string of length n is less than n. The proof is remarkably simple, considering the numerous endeavors to tackle this problem in the last 15 years, and significantly improves our understanding of how runs can occur in strings. In addition, we obtain an upper bound of 3n for the maximum sum of exponents σ(n) of runs in a string of length n, improving on the best known bound of 4.1n by Crochemore et al. [J. Discrete Algorithms, 14 (2012), pp. 29-36], as well as other improved bounds on related problems. The characterization also gives rise to a new, conceptually simple linear-time algorithm for computing all the runs in a string. A notable characteristic of our algorithm is that, unlike all existing linear-time algorithms, it does not utilize the Lempel-Ziv factorization of the string. We also establish a relationship between runs and nodes of the Lyndon tree, which gives a simple optimal solution to the 2-period query problem that was recently solved by Kociumaka et al. [Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, (SODA) 2015, San Diego, CA, SIAM, Philadelphia, 2015, pp. 532-551].

AB - We give a new characterization of maximal repetitions (or runs) in strings based on Lyndon words. The characterization leads to a proof of what was known as the "runs" conjecture [R. M. Kolpakov andG.Kucherov, Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), IEEE Computer Society, Los Alamitos, CA, 1999, pp. 596-604]), which states that the maximum number of runs ρ(n) in a string of length n is less than n. The proof is remarkably simple, considering the numerous endeavors to tackle this problem in the last 15 years, and significantly improves our understanding of how runs can occur in strings. In addition, we obtain an upper bound of 3n for the maximum sum of exponents σ(n) of runs in a string of length n, improving on the best known bound of 4.1n by Crochemore et al. [J. Discrete Algorithms, 14 (2012), pp. 29-36], as well as other improved bounds on related problems. The characterization also gives rise to a new, conceptually simple linear-time algorithm for computing all the runs in a string. A notable characteristic of our algorithm is that, unlike all existing linear-time algorithms, it does not utilize the Lempel-Ziv factorization of the string. We also establish a relationship between runs and nodes of the Lyndon tree, which gives a simple optimal solution to the 2-period query problem that was recently solved by Kociumaka et al. [Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, (SODA) 2015, San Diego, CA, SIAM, Philadelphia, 2015, pp. 532-551].

UR - http://www.scopus.com/inward/record.url?scp=85032917335&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85032917335&partnerID=8YFLogxK

U2 - 10.1137/15M1011032

DO - 10.1137/15M1011032

M3 - Article

AN - SCOPUS:85032917335

VL - 46

SP - 1501

EP - 1514

JO - SIAM Journal on Computing

JF - SIAM Journal on Computing

SN - 0097-5397

IS - 5

ER -