### Abstract

The Lyndon factorization of a string $w$ is the unique factorization $\ell_1^{p_1}, \ldots, \ell_m^{p_m}$ of $w$ such that $\ell_1, \ldots, \ell_m$ is a sequence of Lyndon words that is monotonically decreasing in lexicographic order. In this paper, we consider the reverse-engineering problem on Lyndon factorization: given a sequence $S = ((s_1, p_1), \ldots, (s_m, p_m))$ of ordered pairs of positive integers, find a string $w$ whose Lyndon factorization corresponds to the input sequence $S$, i.e., the Lyndon factorization of $w$ has the form $\ell_1^{p_1}, \ldots, \ell_m^{p_m}$ with $|\ell_i| = s_i$ for all $1 \le i \le m$. Firstly, we show that there exists a simple $O(n)$-time algorithm if the size of the alphabet is unbounded, where $n$ is the length of the output string. Secondly, we present an $O(n)$-time algorithm to compute a string over an alphabet of the smallest size. Thirdly, we show how to compute only the size of the smallest alphabet in $O(m)$ time. Fourthly, we give an $O(m)$-time algorithm to compute an $O(m)$-size representation of a string over an alphabet of the smallest size. Finally, we propose an efficient algorithm to enumerate all strings whose Lyndon factorizations correspond to $S$.
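
To make the problem statement concrete, the following Python sketch may help. It is not the paper's algorithm: `lyndon_factorization` is the standard Duval algorithm, while `corresponds_to`, `build_unbounded`, and `enumerate_bruteforce` are hypothetical helper names introduced here for illustration. `build_unbounded` assumes an alphabet of $m+1$ integer letters and gives each factor the form "small head letter followed by copies of the largest letter", which is one simple way to obtain strictly decreasing Lyndon words of the required lengths. `enumerate_bruteforce` takes exponential time and is only meant for checking tiny inputs.

```python
from itertools import groupby, product


def lyndon_factorization(w):
    """Duval's algorithm: return [(factor, exponent), ...] for a sequence w."""
    factors = []
    i, n = 0, len(w)
    while i < n:
        j, k = i + 1, i
        while j < n and w[k] <= w[j]:
            k = i if w[k] < w[j] else k + 1
            j += 1
        # w[i:j] is a power of a Lyndon word of length j - k (plus a prefix).
        while i <= k:
            factors.append(tuple(w[i:i + j - k]))
            i += j - k
    # Equal factors are always adjacent, so groupby yields the exponents.
    return [(f, len(list(g))) for f, g in groupby(factors)]


def corresponds_to(w, S):
    """True if the Lyndon factorization of w has the lengths and exponents in S."""
    return [(len(f), p) for f, p in lyndon_factorization(w)] == list(S)


def build_unbounded(S):
    """Illustrative construction over m + 1 integer letters (an assumption,
    not the paper's method): factor i is [head_i] + [top] * (s_i - 1)."""
    m = len(S)
    top = m  # largest letter, strictly greater than every head
    w = []
    for i, (s, p) in enumerate(S):
        head = m - 1 - i  # heads strictly decrease, so factors do too
        w.extend(([head] + [top] * (s - 1)) * p)
    return w


def enumerate_bruteforce(S, sigma):
    """Exponential-time check: all strings over {0..sigma-1} matching S."""
    n = sum(s * p for s, p in S)
    return [w for w in product(range(sigma), repeat=n) if corresponds_to(w, S)]
```

For instance, "banana" factorizes as b, (an)^2, a, so it corresponds to S = ((1,1), (2,2), (1,1)). Calling `enumerate_bruteforce` with increasing `sigma` until the result is non-empty gives a naive way to find the smallest alphabet size for very small inputs.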

Original language | English |
---|---|
Pages (from-to) | 147-156 |
Number of pages | 10 |
Journal | Theoretical Computer Science |
Volume | 689 |
DOIs | https://doi.org/10.1016/j.tcs.2017.05.038 |
Publication status | Published - Aug 15 2017 |

### All Science Journal Classification (ASJC) codes

- Theoretical Computer Science
- Computer Science(all)

### Cite this

**Inferring strings from Lyndon factorization.** / Nakashima, Yuto; Okabe, Takashi; I, Tomohiro; Inenaga, Shunsuke; Bannai, Hideo; Takeda, Masayuki.

Research output: Contribution to journal › Article

*Theoretical Computer Science*, vol. 689, pp. 147-156. https://doi.org/10.1016/j.tcs.2017.05.038

TY - JOUR

T1 - Inferring strings from Lyndon factorization

AU - Nakashima, Yuto

AU - Okabe, Takashi

AU - I, Tomohiro

AU - Inenaga, Shunsuke

AU - Bannai, Hideo

AU - Takeda, Masayuki

PY - 2017/8/15

Y1 - 2017/8/15

UR - http://www.scopus.com/inward/record.url?scp=85020825934&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85020825934&partnerID=8YFLogxK

U2 - 10.1016/j.tcs.2017.05.038

DO - 10.1016/j.tcs.2017.05.038

M3 - Article

AN - SCOPUS:85020825934

VL - 689

SP - 147

EP - 156

JO - Theoretical Computer Science

JF - Theoretical Computer Science

SN - 0304-3975

ER -