Given two sets of strings, consider the problem to find a subsequence that is common to one set but never appears in the other set. We regard it to find a subsequence pattern which separates these two sets. The problem is known to be NP-complete. We naturally generalize it to an optimization problem, where we try to find a subsequence pattern which maximally separates these two sets. We provide a practical algorithm to solve it exactly. Our algorithm uses two pruning heuristics based on the properties of subsequence languages, and utilizes the data structure called subsequence automata. We report some experimental results, which show these heuristics and the data structure contribute to reduce the search time.
All Science Journal Classification (ASJC) codes
- コンピュータ サイエンス（全般）