Composite pattern discovery for PCR application

Stanislav Angelov, Shunsuke Inenaga

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

We consider the problem of finding pairs of short patterns such that, in a given input sequence of length n, the distance between each pair's patterns is at least α. The problem was introduced in [1] and is motivated by the optimization of multiplexed nested PCR. We study algorithms for two cases: the special case in which the two patterns in a pair are required to have the same length, and the more general case in which the patterns can have different lengths. For the first case we present an O(αn log log n) time and O(n) space algorithm, and for the general case we give an O(αn log n) time and O(n) space algorithm. The algorithms work for any alphabet size and use asymptotically less space than the algorithms presented in [1]. For alphabets of constant size we also give an O(n√n log² n) time algorithm for the general case. We demonstrate that the algorithms perform well in practice and present our findings for the human genome. In addition, we study an extended version of the problem in which the patterns in a pair occur at certain positions at a distance of at most α but do not occur α-close anywhere else in the input sequence.
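To make the problem statement concrete, here is a minimal brute-force sketch of the equal-length case. It is not the authors' algorithm and makes no attempt to match the time or space bounds quoted in the abstract. The function name `far_apart_pairs`, the parameter `k` (the common pattern length), and the choice to measure distance as the difference between occurrence start positions are all assumptions made for illustration; the paper may define these details differently.

```python
from itertools import combinations


def far_apart_pairs(text: str, k: int, alpha: int) -> set[tuple[str, str]]:
    """Return unordered pairs of distinct length-k substrings of `text`
    whose occurrences never start fewer than `alpha` positions apart.

    Assumption (not from the source): distance is the absolute difference
    between the start positions of the two occurrences.
    """
    n_kmers = len(text) - k + 1
    if k <= 0 or n_kmers <= 0:
        return set()

    # All length-k substrings, indexed by start position.
    kmers = [text[i:i + k] for i in range(n_kmers)]

    # Mark every pair of distinct k-mers that occurs at distance < alpha.
    close = set()
    for i in range(n_kmers):
        for j in range(i + 1, min(i + alpha, n_kmers)):
            if kmers[i] != kmers[j]:
                close.add(tuple(sorted((kmers[i], kmers[j]))))

    # Every remaining pair of distinct k-mers is at distance >= alpha.
    distinct = sorted(set(kmers))
    return {pair for pair in combinations(distinct, 2) if pair not in close}


if __name__ == "__main__":
    print(sorted(far_apart_pairs("ACGTAC", k=2, alpha=2)))
```

The marking loop performs about αn comparisons, but the final enumeration is quadratic in the number of distinct substrings, so this sketch is only suitable for small inputs.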

Original language: English
Title of host publication: String Processing and Information Retrieval - 12th International Conference, SPIRE 2005, Proceedings
Pages: 167-178
Number of pages: 12
DOIs: https://doi.org/10.1007/11575832_19
Publication status: Published - Dec 1 2005
Event: 12th International Conference on String Processing and Information Retrieval, SPIRE 2005 - Buenos Aires, Argentina
Duration: Nov 2 2005 → Nov 4 2005

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3772 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 12th International Conference on String Processing and Information Retrieval, SPIRE 2005
Country: Argentina
City: Buenos Aires
Period: 11/2/05 → 11/4/05

Fingerprint

Pattern Discovery
Composite
Composite materials
Genome
Genes
Optimization
Demonstrate

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Angelov, S., & Inenaga, S. (2005). Composite pattern discovery for PCR application. In String Processing and Information Retrieval - 12th International Conference, SPIRE 2005, Proceedings (pp. 167-178). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3772 LNCS). https://doi.org/10.1007/11575832_19

@inproceedings{4ff1a9ee43e449cd867eabd8f3d50f37,
title = "Composite pattern discovery for PCR application",
author = "Stanislav Angelov and Shunsuke Inenaga",
year = "2005",
month = "12",
day = "1",
doi = "10.1007/11575832_19",
language = "English",
isbn = "3540297405",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "167--178",
booktitle = "String Processing and Information Retrieval - 12th International Conference, SPIRE 2005, Proceedings",

}

TY - GEN

T1 - Composite pattern discovery for PCR application

AU - Angelov, Stanislav

AU - Inenaga, Shunsuke

PY - 2005/12/1

Y1 - 2005/12/1

UR - http://www.scopus.com/inward/record.url?scp=33646754225&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33646754225&partnerID=8YFLogxK

U2 - 10.1007/11575832_19

DO - 10.1007/11575832_19

M3 - Conference contribution

SN - 3540297405

SN - 9783540297406

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 167

EP - 178

BT - String Processing and Information Retrieval - 12th International Conference, SPIRE 2005, Proceedings

ER -