### Abstract

Finding a pattern that separates two sets is a critical task in discovery. Given two sets of strings, consider the problem of finding a subsequence that is common to one set but never appears in the other. This problem is known to be NP-complete. An episode pattern generalizes a subsequence pattern by bounding the length of the substring that contains the subsequence. We generalize these problems to optimization problems and give practical algorithms that solve them exactly. Our algorithms use pruning heuristics based on the combinatorial properties of strings, together with efficient data structures that recognize subsequence and episode patterns.
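To make the two pattern classes concrete, here is a minimal Python sketch. It is not the authors' algorithm or data structure: the function names, the scoring rule (positives matched plus negatives avoided), and the simple upper-bound pruning are illustrative assumptions chosen only to show what an exact search over subsequence patterns could look like on tiny inputs.

```python
def is_subsequence(pattern: str, text: str) -> bool:
    """True if the characters of pattern occur in text in order (gaps allowed)."""
    it = iter(text)
    # `ch in it` consumes the iterator up to and including the first match.
    return all(ch in it for ch in pattern)


def is_episode(pattern: str, k: int, text: str) -> bool:
    """True if some substring of text of length at most k contains pattern as a subsequence."""
    if not pattern:
        return True
    for start in range(len(text)):
        # Only windows that begin with the first pattern character need checking.
        if text[start] == pattern[0] and is_subsequence(pattern, text[start:start + k]):
            return True
    return False


def best_subsequence_pattern(pos, neg, max_len):
    """Exhaustive search for a pattern maximizing an assumed score:
    (#strings in pos that match) + (#strings in neg that do not match)."""
    alphabet = sorted(set("".join(pos) + "".join(neg)))

    def counts(p):
        return (sum(is_subsequence(p, s) for s in pos),
                sum(is_subsequence(p, s) for s in neg))

    best_pattern, best_score = "", len(pos)  # the empty pattern matches every string
    stack = [""]
    while stack:
        p = stack.pop()
        for ch in alphabet:
            q = p + ch
            hits_pos, hits_neg = counts(q)
            score = hits_pos + (len(neg) - hits_neg)
            if score > best_score:
                best_pattern, best_score = q, score
            # Any extension of q matches a subset of what q matches, so its
            # score is at most hits_pos + len(neg); skip q if that cannot win.
            if len(q) < max_len and hits_pos + len(neg) > best_score:
                stack.append(q)
    return best_pattern, best_score


if __name__ == "__main__":
    pos = ["acb", "xaxbx"]
    neg = ["ba", "bxa", "aa"]
    print(best_subsequence_pattern(pos, neg, max_len=3))
    print(is_episode("ab", 2, "xaxbx"), is_episode("ab", 3, "xaxbx"))
```

The pruning step relies only on monotonicity: lengthening a pattern can never make it match more strings. The search remains exponential in the worst case, which is consistent with the NP-completeness noted in the abstract.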

Original language | English |
---|---|
Title of host publication | Progress in Discovery Science |
Pages | 307-317 |
Number of pages | 11 |
Volume | 2281 |
Publication status | Published - 2002 |

### Publication series

Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 2281 |
ISSN (Print) | 03029743 |
ISSN (Electronic) | 16113349 |

### All Science Journal Classification (ASJC) codes

- Computer Science(all)
- Theoretical Computer Science

### Cite this

*Progress in Discovery Science* (Vol. 2281, pp. 307-317). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 2281).

**Finding best patterns practically.** / Shinohara, Ayumi; Takeda, Masayuki; Arikawa, Setsuo; Hirao, Masahiro; Hoshino, Hiromasa; Inenaga, Shunsuke.

Research output: Chapter in Book/Report/Conference proceeding › Chapter


TY - CHAP

T1 - Finding best patterns practically

AU - Shinohara, Ayumi

AU - Takeda, Masayuki

AU - Arikawa, Setsuo

AU - Hirao, Masahiro

AU - Hoshino, Hiromasa

AU - Inenaga, Shunsuke

PY - 2002

Y1 - 2002

UR - http://www.scopus.com/inward/record.url?scp=23044534938&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=23044534938&partnerID=8YFLogxK

M3 - Chapter

AN - SCOPUS:23044534938

SN - 3540433384

SN - 9783540433385

VL - 2281

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 307

EP - 317

BT - Progress in Discovery Science

ER -