### Abstract

We define the best consensus motif (BCM) problem, motivated by the problem of extracting motifs from nucleic acid and amino acid sequences. A type over an alphabet $\Sigma$ is a family $\Omega$ of subsets of $\Sigma^{*}$. A motif $\pi$ of type $\Omega$ is a string $\pi = \pi_{1} \cdots \pi_{n}$ of motif components, each of which stands for an element of $\Omega$. The BCM problem for $\Omega$ is, given a yes-no sample $S = \{(\alpha^{(1)}, \beta^{(1)}), \ldots, (\alpha^{(m)}, \beta^{(m)})\}$ of pairs of strings in $\Sigma^{*}$ with $\alpha^{(i)} \neq \beta^{(i)}$ for $1 \le i \le m$, to find a motif $\pi$ of type $\Omega$ that maximizes the number of good pairs in $S$, where $(\alpha^{(i)}, \beta^{(i)})$ is good for $\pi$ if $\pi$ accepts $\alpha^{(i)}$ and rejects $\beta^{(i)}$. We prove that the BCM problem is NP-complete even for a very simple type (Formula presented), which is used in practice for describing protein motifs in the PROSITE database. We also show that the problem remains NP-complete for the type $\Omega_{\infty} = \Omega_{1} \cup \{\Sigma^{+}\} \cup \{\Sigma^{[i,j]} \mid 1 \le i \le j\}$, where $\Sigma^{[i,j]}$ is the set of strings over $\Sigma$ of length between $i$ and $j$. Furthermore, for the BCM problem for $\Omega_{1}$, we provide a polynomial-time greedy algorithm based on the probabilistic method. Its performance analysis yields an explicit approximation ratio for the algorithm.
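To make the objective concrete, the following minimal sketch counts the good pairs of a yes-no sample for a given motif. It is an illustration under stated assumptions, not the paper's algorithm. The abstract elides the definition of the simple type, so the sketch assumes each motif component is a nonempty set of single characters (in the spirit of PROSITE patterns), so that a motif accepts exactly the strings of the same length whose characters fall in the corresponding sets. The names `accepts`, `is_good`, and `count_good_pairs` are hypothetical.

```python
from typing import FrozenSet, List, Sequence, Tuple

# Assumption: a motif component is a nonempty set of single characters.
Motif = Sequence[FrozenSet[str]]
Sample = List[Tuple[str, str]]  # pairs (alpha, beta) with alpha != beta


def accepts(motif: Motif, s: str) -> bool:
    """True if s has the motif's length and each character lies in its component."""
    return len(s) == len(motif) and all(ch in comp for ch, comp in zip(s, motif))


def is_good(motif: Motif, alpha: str, beta: str) -> bool:
    """A pair is good if the motif accepts alpha and rejects beta."""
    return accepts(motif, alpha) and not accepts(motif, beta)


def count_good_pairs(motif: Motif, sample: Sample) -> int:
    """The quantity the BCM problem asks to maximize over motifs."""
    return sum(is_good(motif, alpha, beta) for alpha, beta in sample)


# Hypothetical toy data over the nucleotide alphabet.
motif = [frozenset("AC"), frozenset("ACGT"), frozenset("G")]
sample = [("ATG", "ATT"), ("CGG", "GGG"), ("TTG", "ATG")]
print(count_good_pairs(motif, sample))  # 2: the first two pairs are good
```

Evaluating a fixed motif this way is straightforward; the hardness stated in the abstract concerns searching over all motifs for one that maximizes this count.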

Original language | English |
---|---|
Title of host publication | STACS 1996 - 13th Annual Symposium on Theoretical Aspects of Computer Science, Proceedings |
Editors | Claude Puech, Rudiger Reischuk |
Publisher | Springer Verlag |
Pages | 219-230 |
Number of pages | 12 |
ISBN (Print) | 9783540609223 |
DOIs | https://doi.org/10.1007/3-540-60922-9_19 |
Publication status | Published - 1996 |
Event | 13th Annual Symposium on Theoretical Aspects of Computer Science, STACS 1996 - Grenoble, France; Duration: Feb 22 1996 → Feb 24 1996 |

### Publication series

Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 1046 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |

### Other

Other | 13th Annual Symposium on Theoretical Aspects of Computer Science, STACS 1996 |
---|---|
Country | France |
City | Grenoble |
Period | 2/22/96 → 2/24/96 |

### All Science Journal Classification (ASJC) codes

- Theoretical Computer Science
- Computer Science (all)

## Fingerprint

Research topics of 'Extracting best consensus motifs from positive and negative examples'.

## Cite this

*STACS 1996 - 13th Annual Symposium on Theoretical Aspects of Computer Science, Proceedings* (pp. 219-230). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 1046). Springer Verlag. https://doi.org/10.1007/3-540-60922-9_19