Dynamic edit distance table under a general weighted cost function

Heikki Hyyrö, Kazuyuki Narisawa, Shunsuke Inenaga

Research output: Contribution to journal, Article

3 citations (Scopus)

Abstract

We discuss the problem of edit distance computation under a dynamic setting, where one of the two compared strings may be modified by single-character edit operations and we wish to keep the edit distance information between the strings up-to-date. A previous algorithm by Kim and Park (2004) [6] solves a more limited problem where modifications can be done only at the ends of the strings (so-called decremental or incremental edits) and the edit operations have (essentially) unit costs. If the lengths of the two strings are m and n, their algorithm requires O(m+n) time per modification. We propose a simple and practical algorithm that (1) allows arbitrary non-negative costs for the edit operations and (2) allows the modifications to be done at arbitrary positions. If the latter string is modified at position j∗, our algorithm requires O(min {rc(m+n),mn}) time, where r=min {j∗,n-j∗+1} and c is the maximum edit operation cost. This equals O(m+n) in the simple decremental/incremental unit cost case. Our experiments indicate that the algorithm performs much faster than the theoretical worst-case time limit O(mn) in the general case with arbitrary edit costs and modification positions. The main practical limitation of the algorithm is its Θ(mn) memory requirement for storing the edit distance information.
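
The setting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it builds the full weighted edit distance table and, after a single-character substitution in the second string, recomputes columns from the modified position onward, stopping as soon as a recomputed column is identical to its previous contents (later columns then cannot change). Insertions and deletions, which shift columns, are not handled. The names `build_table`, `fill_column`, `substitute`, and the cost parameters are hypothetical and chosen only for this illustration.

```python
from typing import Callable, List

def unit_sub(x: str, y: str) -> int:
    """Assumed default substitution cost: 0 for a match, 1 otherwise."""
    return 0 if x == y else 1

def fill_column(d: List[List[float]], a: str, ch: str, j: int,
                ins_cost: float, del_cost: float,
                sub_cost: Callable[[str, str], float]) -> bool:
    """Recompute column j (rows 1..m) for character ch of the second string.

    Returns True if any cell in the column changed.
    """
    changed = False
    for i in range(1, len(a) + 1):
        v = min(d[i - 1][j] + del_cost,                    # delete a[i-1]
                d[i][j - 1] + ins_cost,                    # insert ch
                d[i - 1][j - 1] + sub_cost(a[i - 1], ch))  # substitute or match
        if v != d[i][j]:
            d[i][j] = v
            changed = True
    return changed

def build_table(a: str, b: List[str], ins_cost: float = 1, del_cost: float = 1,
                sub_cost: Callable[[str, str], float] = unit_sub) -> List[List[float]]:
    """Full table: d[i][j] is the cost of editing a[:i] into b[:j]."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + del_cost
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + ins_cost
        fill_column(d, a, b[j - 1], j, ins_cost, del_cost, sub_cost)
    return d

def substitute(d: List[List[float]], a: str, b: List[str], pos: int, new_ch: str,
               ins_cost: float = 1, del_cost: float = 1,
               sub_cost: Callable[[str, str], float] = unit_sub) -> int:
    """Replace b[pos-1] (pos is 1-based) and update d in place.

    Returns the number of columns that were recomputed.
    """
    b[pos - 1] = new_ch
    recomputed = 0
    for j in range(pos, len(b) + 1):
        recomputed += 1
        if not fill_column(d, a, b[j - 1], j, ins_cost, del_cost, sub_cost):
            break  # column unchanged, so every later column is unchanged too
    return recomputed

a = "kitten"
b = list("sitting")
table = build_table(a, b)
substitute(table, a, b, 1, "k")  # second string becomes "kitting"
```

This naive update still costs O(mn) time in the worst case and keeps the whole Θ(mn) table in memory, which matches the memory limitation noted in the abstract; the bound O(min{rc(m+n), mn}) stated there belongs to the authors' algorithm, not to this sketch.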

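Original language: English
Pages (from-to): 2-17
Number of pages: 16
Journal: Journal of Discrete Algorithms
Volume: 34
DOI: 10.1016/j.jda.2015.05.007
Publication status: Published - Sep 1 2015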

Fingerprint

Edit Distance
Cost functions
Cost Function
Table
Strings
Costs
Arbitrary
Unit
Non-negative
Data storage equipment
Requirements
Experiment
Experiments

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Discrete Mathematics and Combinatorics
  • Computational Theory and Mathematics

Cite this

Dynamic edit distance table under a general weighted cost function. / Hyyrö, Heikki; Narisawa, Kazuyuki; Inenaga, Shunsuke.

In: Journal of Discrete Algorithms, Vol. 34, 01.09.2015, p. 2-17.

@article{68f7743e076642fca21799f8881aa9ca,
title = "Dynamic edit distance table under a general weighted cost function",
author = "Heikki Hyyr{\"o} and Kazuyuki Narisawa and Shunsuke Inenaga",
year = "2015",
month = "9",
day = "1",
doi = "10.1016/j.jda.2015.05.007",
language = "English",
volume = "34",
pages = "2--17",
journal = "Journal of Discrete Algorithms",
issn = "1570-8667",
publisher = "Elsevier",

}

TY - JOUR

T1 - Dynamic edit distance table under a general weighted cost function

AU - Hyyrö, Heikki

AU - Narisawa, Kazuyuki

AU - Inenaga, Shunsuke

PY - 2015/9/1

Y1 - 2015/9/1

UR - http://www.scopus.com/inward/record.url?scp=84939565936&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84939565936&partnerID=8YFLogxK

U2 - 10.1016/j.jda.2015.05.007

DO - 10.1016/j.jda.2015.05.007

M3 - Article

VL - 34

SP - 2

EP - 17

JO - Journal of Discrete Algorithms

JF - Journal of Discrete Algorithms

SN - 1570-8667

ER -