Dynamic RLE-Compressed Edit Distance Tables under General Weighted Cost Functions

Heikki Hyyrö, Shunsuke Inenaga

研究成果: ジャーナルへの寄稿記事

抄録

Kim and Park [A dynamic edit distance table, J. Disc. Algo., 2:302-312, 2004] proposed a method (KP) based on a "dynamic edit distance table" that allows one to efficiently maintain unit cost edit distance information between two strings A of length m and B of length n when the strings can be modified by single-character edits to their left or right ends. This type of computation is useful e.g. in cyclic string comparison. KP uses linear time, O(m + n), to update the distance representation after each single edit. Recently Hyyrö et al. [Incremental string comparison, J. Disc. Algo., 34:2-17, 2015] presented an efficient method for maintaining the dynamic edit distance table under general weighted edit distance, running in O(c(m + n)) time per single edit, where c is the maximum weight of the cost function. The work noted that the Θ(mn) space requirement, and not the running time, may be the main bottleneck in using the dynamic edit distance table. In this paper we take the first steps towards reducing the space usage of the dynamic edit distance table by RLE compressing A and B. Let M and N be the lengths of RLE compressed versions of A and B, respectively. We propose how to store the dynamic edit distance table using Θ(mN + Mn) space while maintaining the same time complexity as the previous methods for uncompressed strings.

元の言語英語
ページ(範囲)623-645
ページ数23
ジャーナルInternational Journal of Foundations of Computer Science
29
発行部数4
DOI
出版物ステータス出版済み - 6 1 2018

Fingerprint

Cost functions
Costs

All Science Journal Classification (ASJC) codes

  • Computer Science (miscellaneous)

これを引用

Dynamic RLE-Compressed Edit Distance Tables under General Weighted Cost Functions. / Hyyrö, Heikki; Inenaga, Shunsuke.

:: International Journal of Foundations of Computer Science, 巻 29, 番号 4, 01.06.2018, p. 623-645.

研究成果: ジャーナルへの寄稿記事

@article{15688e4a57a3459d922b81078dfa80df,
title = "Dynamic RLE-Compressed Edit Distance Tables under General Weighted Cost Functions",
abstract = "Kim and Park [A dynamic edit distance table, J. Disc. Algo., 2:302-312, 2004] proposed a method (KP) based on a {"}dynamic edit distance table{"} that allows one to efficiently maintain unit cost edit distance information between two strings A of length m and B of length n when the strings can be modified by single-character edits to their left or right ends. This type of computation is useful e.g. in cyclic string comparison. KP uses linear time, O(m + n), to update the distance representation after each single edit. Recently Hyyr{\"o} et al. [Incremental string comparison, J. Disc. Algo., 34:2-17, 2015] presented an efficient method for maintaining the dynamic edit distance table under general weighted edit distance, running in O(c(m + n)) time per single edit, where c is the maximum weight of the cost function. The work noted that the Θ(mn) space requirement, and not the running time, may be the main bottleneck in using the dynamic edit distance table. In this paper we take the first steps towards reducing the space usage of the dynamic edit distance table by RLE compressing A and B. Let M and N be the lengths of RLE compressed versions of A and B, respectively. We propose how to store the dynamic edit distance table using Θ(mN + Mn) space while maintaining the same time complexity as the previous methods for uncompressed strings.",
author = "Heikki Hyyr{\"o} and Shunsuke Inenaga",
year = "2018",
month = "6",
day = "1",
doi = "10.1142/S0129054118410083",
language = "English",
volume = "29",
pages = "623--645",
journal = "International Journal of Foundations of Computer Science",
issn = "0129-0541",
publisher = "World Scientific Publishing Co. Pte Ltd",
number = "4",

}

TY - JOUR

T1 - Dynamic RLE-Compressed Edit Distance Tables under General Weighted Cost Functions

AU - Hyyrö, Heikki

AU - Inenaga, Shunsuke

PY - 2018/6/1

Y1 - 2018/6/1

N2 - Kim and Park [A dynamic edit distance table, J. Disc. Algo., 2:302-312, 2004] proposed a method (KP) based on a "dynamic edit distance table" that allows one to efficiently maintain unit cost edit distance information between two strings A of length m and B of length n when the strings can be modified by single-character edits to their left or right ends. This type of computation is useful e.g. in cyclic string comparison. KP uses linear time, O(m + n), to update the distance representation after each single edit. Recently Hyyrö et al. [Incremental string comparison, J. Disc. Algo., 34:2-17, 2015] presented an efficient method for maintaining the dynamic edit distance table under general weighted edit distance, running in O(c(m + n)) time per single edit, where c is the maximum weight of the cost function. The work noted that the Θ(mn) space requirement, and not the running time, may be the main bottleneck in using the dynamic edit distance table. In this paper we take the first steps towards reducing the space usage of the dynamic edit distance table by RLE compressing A and B. Let M and N be the lengths of RLE compressed versions of A and B, respectively. We propose how to store the dynamic edit distance table using Θ(mN + Mn) space while maintaining the same time complexity as the previous methods for uncompressed strings.

AB - Kim and Park [A dynamic edit distance table, J. Disc. Algo., 2:302-312, 2004] proposed a method (KP) based on a "dynamic edit distance table" that allows one to efficiently maintain unit cost edit distance information between two strings A of length m and B of length n when the strings can be modified by single-character edits to their left or right ends. This type of computation is useful e.g. in cyclic string comparison. KP uses linear time, O(m + n), to update the distance representation after each single edit. Recently Hyyrö et al. [Incremental string comparison, J. Disc. Algo., 34:2-17, 2015] presented an efficient method for maintaining the dynamic edit distance table under general weighted edit distance, running in O(c(m + n)) time per single edit, where c is the maximum weight of the cost function. The work noted that the Θ(mn) space requirement, and not the running time, may be the main bottleneck in using the dynamic edit distance table. In this paper we take the first steps towards reducing the space usage of the dynamic edit distance table by RLE compressing A and B. Let M and N be the lengths of RLE compressed versions of A and B, respectively. We propose how to store the dynamic edit distance table using Θ(mN + Mn) space while maintaining the same time complexity as the previous methods for uncompressed strings.

UR - http://www.scopus.com/inward/record.url?scp=85049332787&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85049332787&partnerID=8YFLogxK

U2 - 10.1142/S0129054118410083

DO - 10.1142/S0129054118410083

M3 - Article

VL - 29

SP - 623

EP - 645

JO - International Journal of Foundations of Computer Science

JF - International Journal of Foundations of Computer Science

SN - 0129-0541

IS - 4

ER -