Order-preserving pattern matching indeterminate strings

Rui Henriques, Alexandre P. Francisco, Luís M.S. Russo, Hideo Bannai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Given an indeterminate string pattern p and an indeterminate string text t, the problem of order-preserving pattern matching with character uncertainties (μOPPM) is to find all substrings of t that satisfy one of the possible orderings defined by p. When the text and pattern are determinate strings, we are in the presence of the well-studied exact order-preserving pattern matching (OPPM) problem with diverse applications on time series analysis. Despite its relevance, the exact OPPM problem suffers from two major drawbacks: 1) the inability to deal with indetermination in the text, thus preventing the analysis of noisy time series; and 2) the inability to deal with indetermination in the pattern, thus imposing the strict satisfaction of the orders among all pattern positions. In this paper, we provide the first polynomial algorithms to answer the μOPPM problem when: 1) indetermination is observed on the pattern or text; and 2) indetermination is observed on both the pattern and the text and given by uncertainties between pairs of characters. First, given two strings with the same length m and O(r) uncertain characters per string position, we show that the μOPPM problem can be solved in O(mr lg r) time when one string is indeterminate and r ∈ N+ and in O(m^2) time when both strings are indeterminate and r=2. Second, given an indeterminate text string of length n, we show that μOPPM can be efficiently solved in polynomial time and linear space.
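
To make the problem statement concrete, here is a minimal Python sketch of a brute-force reference check. It is not the polynomial algorithm described in the paper: it enumerates every assignment of the uncertain positions and is therefore exponential in the pattern length. The function names `order_preserving` and `uoppm_brute_force` are hypothetical, and the sketch assumes that each string position is given as a tuple of candidate values, that ties must be preserved, and that a text window matches when at least one assignment of the pattern and one assignment of the window share the same relative order.

```python
from itertools import product


def order_preserving(a, b):
    """Return True if equal-length sequences a and b have the same relative order."""
    if len(a) != len(b):
        return False
    n = len(a)
    # Every pair of positions must agree on "<" and on "=" (hence also on ">").
    for i in range(n):
        for j in range(i + 1, n):
            if (a[i] < a[j]) != (b[i] < b[j]):
                return False
            if (a[i] == a[j]) != (b[i] == b[j]):
                return False
    return True


def uoppm_brute_force(pattern, text):
    """Return start indices of text windows that match the pattern.

    pattern and text are lists of tuples; each tuple lists the candidate
    values of one (possibly indeterminate) position. Assumption of this
    sketch: a window matches if some assignment of the pattern is
    order-isomorphic to some assignment of the window.
    """
    m = len(pattern)
    hits = []
    for start in range(len(text) - m + 1):
        window = text[start:start + m]
        # Exhaustive search over all assignments: exponential in m.
        if any(
            order_preserving(p, w)
            for p in product(*pattern)
            for w in product(*window)
        ):
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Determinate pattern "low, high, middle" against a determinate text.
    pattern = [(1,), (3,), (2,)]
    text = [(10,), (30,), (20,), (50,), (40,)]
    print(uoppm_brute_force(pattern, text))  # → [0, 2]
```

A checker of this kind is only practical for very short strings, but it can serve as a test oracle when validating a faster implementation on small inputs.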

Original language: English
Title of host publication: 29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018
Editors: Binhai Zhu, Gonzalo Navarro, David Sankoff
Publisher: Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing
Pages: 21-215
Number of pages: 195
ISBN (Electronic): 9783959770743
DOIs: https://doi.org/10.4230/LIPIcs.CPM.2018.2
Publication status: Published - May 1 2018
Event: 29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018 - Qingdao, China
Duration: Jul 2 2018 → Jul 4 2018

Publication series

Name: Leibniz International Proceedings in Informatics, LIPIcs
Volume: 105
ISSN (Print): 1868-8969

Other

Other: 29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018
Country: China
City: Qingdao
Period: 7/2/18 → 7/4/18

Fingerprint

Pattern matching
Polynomials
Time series analysis
Time series

All Science Journal Classification (ASJC) codes

  • Software

Cite this

Henriques, R., Francisco, A. P., Russo, L. M. S., & Bannai, H. (2018). Order-preserving pattern matching indeterminate strings. In B. Zhu, G. Navarro, & D. Sankoff (Eds.), 29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018 (pp. 21-215). (Leibniz International Proceedings in Informatics, LIPIcs; Vol. 105). Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing. https://doi.org/10.4230/LIPIcs.CPM.2018.2

@inproceedings{66f60bc425b54e36b646ffdcb264d9f7,
title = "Order-preserving pattern matching indeterminate strings",
abstract = "Given an indeterminate string pattern p and an indeterminate string text t, the problem of order-preserving pattern matching with character uncertainties (μOPPM) is to find all substrings of t that satisfy one of the possible orderings defined by p. When the text and pattern are determinate strings, we are in the presence of the well-studied exact order-preserving pattern matching (OPPM) problem with diverse applications on time series analysis. Despite its relevance, the exact OPPM problem suffers from two major drawbacks: 1) the inability to deal with indetermination in the text, thus preventing the analysis of noisy time series; and 2) the inability to deal with indetermination in the pattern, thus imposing the strict satisfaction of the orders among all pattern positions. In this paper, we provide the first polynomial algorithms to answer the μOPPM problem when: 1) indetermination is observed on the pattern or text; and 2) indetermination is observed on both the pattern and the text and given by uncertainties between pairs of characters. First, given two strings with the same length m and O(r) uncertain characters per string position, we show that the μOPPM problem can be solved in O(mr lg r) time when one string is indeterminate and r ∈ N+ and in O(m^2) time when both strings are indeterminate and r=2. Second, given an indeterminate text string of length n, we show that μOPPM can be efficiently solved in polynomial time and linear space.",
author = "Rui Henriques and Francisco, {Alexandre P.} and Russo, {Lu{\'i}s M.S.} and Hideo Bannai",
year = "2018",
month = "5",
day = "1",
doi = "10.4230/LIPIcs.CPM.2018.2",
language = "English",
series = "Leibniz International Proceedings in Informatics, LIPIcs",
publisher = "Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing",
pages = "21--215",
editor = "Binhai Zhu and Gonzalo Navarro and David Sankoff",
booktitle = "29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018",

}

TY - GEN

T1 - Order-preserving pattern matching indeterminate strings

AU - Henriques, Rui

AU - Francisco, Alexandre P.

AU - Russo, Luís M.S.

AU - Bannai, Hideo

PY - 2018/5/1

Y1 - 2018/5/1

AB - Given an indeterminate string pattern p and an indeterminate string text t, the problem of order-preserving pattern matching with character uncertainties (μOPPM) is to find all substrings of t that satisfy one of the possible orderings defined by p. When the text and pattern are determinate strings, we are in the presence of the well-studied exact order-preserving pattern matching (OPPM) problem with diverse applications on time series analysis. Despite its relevance, the exact OPPM problem suffers from two major drawbacks: 1) the inability to deal with indetermination in the text, thus preventing the analysis of noisy time series; and 2) the inability to deal with indetermination in the pattern, thus imposing the strict satisfaction of the orders among all pattern positions. In this paper, we provide the first polynomial algorithms to answer the μOPPM problem when: 1) indetermination is observed on the pattern or text; and 2) indetermination is observed on both the pattern and the text and given by uncertainties between pairs of characters. First, given two strings with the same length m and O(r) uncertain characters per string position, we show that the μOPPM problem can be solved in O(mr lg r) time when one string is indeterminate and r ∈ N+ and in O(m^2) time when both strings are indeterminate and r=2. Second, given an indeterminate text string of length n, we show that μOPPM can be efficiently solved in polynomial time and linear space.

UR - http://www.scopus.com/inward/record.url?scp=85048250361&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85048250361&partnerID=8YFLogxK

U2 - 10.4230/LIPIcs.CPM.2018.2

DO - 10.4230/LIPIcs.CPM.2018.2

M3 - Conference contribution

AN - SCOPUS:85048250361

T3 - Leibniz International Proceedings in Informatics, LIPIcs

SP - 21

EP - 215

BT - 29th Annual Symposium on Combinatorial Pattern Matching, CPM 2018

A2 - Zhu, Binhai

A2 - Navarro, Gonzalo

A2 - Sankoff, David

PB - Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing

ER -