Extracting best consensus motifs from positive and negative examples

Erika Tateishi, Osamu Maruyama, Satoru Miyano

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

We define the best consensus motif (BCM) problem, motivated by the problem of extracting motifs from nucleic acid and amino acid sequences. A type over an alphabet Σ is a family Ω of subsets of Σ*. A motif π of type Ω is a string π = π1 ... πn of motif components, each of which stands for an element of Ω. The BCM problem for Ω is, given a yes-no sample S = {(α(1), β(1)), ..., (α(m), β(m))} of pairs of strings in Σ* with α(i) ≠ β(i) for 1 ≤ i ≤ m, to find a motif π of type Ω that maximizes the number of good pairs in S, where (α(i), β(i)) is good for π if π accepts α(i) and rejects β(i). We prove that the BCM problem is NP-complete even for a very simple type (Formula presented), which is used, in practice, for describing protein motifs in the PROSITE database. We also show that the NP-completeness of the problem does not change for the type Ω1 ∪ {Σ+} ∪ {Σ[i,j] | 1 ≤ i ≤ j}, where Σ[i,j] is the set of strings over Σ of length between i and j. Furthermore, for the BCM problem for Ω1 we provide a polynomial-time greedy algorithm based on the probabilistic method. Its performance analysis shows an explicit approximation ratio of the algorithm.
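To make the objective concrete, the following is a minimal sketch of how the number of good pairs for a fixed motif could be counted. It is not the authors' algorithm and does not search for a best motif. It assumes, purely for illustration, that each motif component is written as a regular-expression fragment denoting a subset of Σ*, and that a motif accepts a string when the string lies in the concatenation of its components' languages. The names `accepts` and `count_good_pairs`, the DNA alphabet, and the sample data are all hypothetical.

```python
import re
from typing import Sequence, Tuple


def accepts(motif: Sequence[str], s: str) -> bool:
    """Return True if s is in the concatenation of the component languages.

    Assumption: each component is a regex fragment standing for a set of strings.
    """
    pattern = "".join(f"(?:{component})" for component in motif)
    return re.fullmatch(pattern, s) is not None


def count_good_pairs(motif: Sequence[str], sample: Sequence[Tuple[str, str]]) -> int:
    """Count pairs (alpha, beta) where the motif accepts alpha and rejects beta."""
    return sum(
        1 for alpha, beta in sample
        if accepts(motif, alpha) and not accepts(motif, beta)
    )


# Hypothetical motif over the alphabet {A, C, G, T}:
# a nonempty string, then the letter A, then a string of length 1 or 2.
motif = ["[ACGT]+", "A", "[ACGT]{1,2}"]

# Hypothetical yes-no sample; alpha != beta in every pair.
sample = [("GGAT", "GGCT"), ("CATT", "CA"), ("TTT", "GAC")]

print(count_good_pairs(motif, sample))  # prints 2
```

The first two pairs are good for this motif; the third is not, because the motif rejects "TTT". The BCM problem asks for a motif of the given type that maximizes this count, which is the part the abstract shows to be NP-complete.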

Original language: English
Title of host publication: STACS 1996 - 13th Annual Symposium on Theoretical Aspects of Computer Science, Proceedings
Editors: Claude Puech, Rudiger Reischuk
Publisher: Springer Verlag
Pages: 219-230
Number of pages: 12
ISBN (Print): 9783540609223
DOI: https://doi.org/10.1007/3-540-60922-9_19
Publication status: Published - 1996
Event: 13th Annual Symposium on Theoretical Aspects of Computer Science, STACS 1996 - Grenoble, France
Duration: Feb 22 1996 → Feb 24 1996

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1046
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 13th Annual Symposium on Theoretical Aspects of Computer Science, STACS 1996
Country: France
City: Grenoble
Period: 2/22/96 → 2/24/96

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

  • Cite this

    Tateishi, E., Maruyama, O., & Miyano, S. (1996). Extracting best consensus motifs from positive and negative examples. In C. Puech, & R. Reischuk (Eds.), STACS 1996 - 13th Annual Symposium on Theoretical Aspects of Computer Science, Proceedings (pp. 219-230). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 1046). Springer Verlag. https://doi.org/10.1007/3-540-60922-9_19