TY - GEN

T1 - Inferring strings from Lyndon factorization

AU - Nakashima, Yuto

AU - Okabe, Takashi

AU - I, Tomohiro

AU - Inenaga, Shunsuke

AU - Bannai, Hideo

AU - Takeda, Masayuki

PY - 2014

Y1 - 2014

N2 - The Lyndon factorization of a string w is a unique factorization ℓ_1^{p_1}, ⋯, ℓ_m^{p_m} of w s.t. ℓ_1, ⋯, ℓ_m is a sequence of Lyndon words that is monotonically decreasing in lexicographic order. In this paper, we consider the reverse-engineering problem on Lyndon factorization: Given a sequence S = ((s_1, p_1), ⋯, (s_m, p_m)) of ordered pairs of positive integers, find a string w whose Lyndon factorization corresponds to the input sequence S, i.e., the Lyndon factorization of w is in a form of ℓ_1^{p_1}, ⋯, ℓ_m^{p_m} with |ℓ_i| = s_i for all 1 ≤ i ≤ m. Firstly, we show that there exists a simple O(n)-time algorithm if the size of the alphabet is unbounded, where n is the length of the output string. Secondly, we present an O(n)-time algorithm to compute a string over an alphabet of the smallest size. Thirdly, we show how to compute only the size of the smallest alphabet in O(m) time. Fourthly, we give an O(m)-time algorithm to compute an O(m)-size representation of a string over an alphabet of the smallest size. Finally, we propose an efficient algorithm to enumerate all strings whose Lyndon factorizations correspond to S.

AB - The Lyndon factorization of a string w is a unique factorization ℓ_1^{p_1}, ⋯, ℓ_m^{p_m} of w s.t. ℓ_1, ⋯, ℓ_m is a sequence of Lyndon words that is monotonically decreasing in lexicographic order. In this paper, we consider the reverse-engineering problem on Lyndon factorization: Given a sequence S = ((s_1, p_1), ⋯, (s_m, p_m)) of ordered pairs of positive integers, find a string w whose Lyndon factorization corresponds to the input sequence S, i.e., the Lyndon factorization of w is in a form of ℓ_1^{p_1}, ⋯, ℓ_m^{p_m} with |ℓ_i| = s_i for all 1 ≤ i ≤ m. Firstly, we show that there exists a simple O(n)-time algorithm if the size of the alphabet is unbounded, where n is the length of the output string. Secondly, we present an O(n)-time algorithm to compute a string over an alphabet of the smallest size. Thirdly, we show how to compute only the size of the smallest alphabet in O(m) time. Fourthly, we give an O(m)-time algorithm to compute an O(m)-size representation of a string over an alphabet of the smallest size. Finally, we propose an efficient algorithm to enumerate all strings whose Lyndon factorizations correspond to S.

UR - http://www.scopus.com/inward/record.url?scp=84906230308&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84906230308&partnerID=8YFLogxK

U2 - 10.1007/978-3-662-44465-8_48

DO - 10.1007/978-3-662-44465-8_48

M3 - Conference contribution

AN - SCOPUS:84906230308

SN - 9783662444641

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 565

EP - 576

BT - Mathematical Foundations of Computer Science 2014 - 39th International Symposium, MFCS 2014, Proceedings

PB - Springer Verlag

T2 - 39th International Symposium on Mathematical Foundations of Computer Science, MFCS 2014

Y2 - 25 August 2014 through 29 August 2014

ER -