Fast clustering for time-series data with average-time-sequence-vector generation based on dynamic time warping

Kazuki Nakamoto, Yuu Yamada, Einoshin Suzuki

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

This paper proposes a fast clustering method for time-series data based on an average time sequence vector. A clustering procedure based on exhaustive search is time-consuming, although its results are typically of high quality. BIRCH, which reduces the number of examples by data squashing based on a data structure called the CF (Clustering Feature) tree, is an effective solution for such a method when the data set consists of numerical attributes only. For time-series data, however, a straightforward application of BIRCH based on the Euclidean distance between a pair of sequences fails, since such a distance typically differs from human perception. A dissimilarity measure based on DTW (Dynamic Time Warping) is desirable, but to the best of our knowledge no methods have been proposed for time-series data in the context of data squashing. To circumvent this problem, we propose the DTWS (Dynamic Time Warping Squashed) tree, which employs a dissimilarity measure based on DTW and compresses time sequences into an average time sequence vector. An average time sequence vector is obtained by a novel procedure that estimates the correct shrinkage of a DTW result. Experiments using the Australian sign language data demonstrate the superiority of the proposed method in terms of clustering correctness, while its degradation of time efficiency is negligible.
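
The contrast the abstract draws between Euclidean distance and a DTW-based dissimilarity can be illustrated with a minimal sketch. This is the textbook dynamic-programming form of DTW with an absolute-difference local cost, not the DTWS tree or the averaging procedure from the paper; the function names and the two toy sequences are assumptions made for illustration.

```python
from math import inf, sqrt


def dtw_distance(a, b):
    """Textbook DTW: minimal accumulated |a[i] - b[j]| over monotone alignments."""
    n, m = len(a), len(b)
    # cost[i][j] holds the best accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


def euclidean_distance(a, b):
    """Point-by-point Euclidean distance; requires equal-length sequences."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two hypothetical sequences with the same bump, shifted by one time step.
x = [0, 0, 1, 2, 1, 0, 0]
y = [0, 1, 2, 1, 0, 0, 0]

print(euclidean_distance(x, y))  # penalises the shift
print(dtw_distance(x, y))        # the warping absorbs the shift
```

In this toy case the Euclidean distance is 2.0 while the DTW dissimilarity is 0, because the warping path aligns the two bumps. Note that DTW is quadratic in sequence length, which is one reason a squashing step that reduces the number of sequences is attractive.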

Original language: English
Pages (from-to): 144-152
Number of pages: 9
Journal: Transactions of the Japanese Society for Artificial Intelligence
Volume: 18
Issue number: 3
DOI: 10.1527/tjsai.18.144
Publication status: Published - Dec 1 2003
Externally published: Yes

Fingerprint

Time series
Data structures
Degradation
Experiments

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence

Cite this

Fast clustering for time-series data with average-time-sequence-vector generation based on dynamic time warping. / Nakamoto, Kazuki; Yamada, Yuu; Suzuki, Einoshin.

In: Transactions of the Japanese Society for Artificial Intelligence, Vol. 18, No. 3, 01.12.2003, p. 144-152.

@article{6b6c0f6dd7384e959f6ae8cb36e4a78e,
title = "Fast clustering for time-series data with average-time-sequence-vector generation based on dynamic time warping",
abstract = "This paper proposes a fast clustering method for time-series data based on an average time sequence vector. A clustering procedure based on exhaustive search is time-consuming, although its results are typically of high quality. BIRCH, which reduces the number of examples by data squashing based on a data structure called the CF (Clustering Feature) tree, is an effective solution for such a method when the data set consists of numerical attributes only. For time-series data, however, a straightforward application of BIRCH based on the Euclidean distance between a pair of sequences fails, since such a distance typically differs from human perception. A dissimilarity measure based on DTW (Dynamic Time Warping) is desirable, but to the best of our knowledge no methods have been proposed for time-series data in the context of data squashing. To circumvent this problem, we propose the DTWS (Dynamic Time Warping Squashed) tree, which employs a dissimilarity measure based on DTW and compresses time sequences into an average time sequence vector. An average time sequence vector is obtained by a novel procedure that estimates the correct shrinkage of a DTW result. Experiments using the Australian sign language data demonstrate the superiority of the proposed method in terms of clustering correctness, while its degradation of time efficiency is negligible.",
author = "Kazuki Nakamoto and Yuu Yamada and Einoshin Suzuki",
year = "2003",
month = "12",
day = "1",
doi = "10.1527/tjsai.18.144",
language = "English",
volume = "18",
pages = "144--152",
journal = "Transactions of the Japanese Society for Artificial Intelligence",
issn = "1346-0714",
publisher = "Japanese Society for Artificial Intelligence",
number = "3",
}

TY - JOUR

T1 - Fast clustering for time-series data with average-time-sequence-vector generation based on dynamic time warping

AU - Nakamoto, Kazuki

AU - Yamada, Yuu

AU - Suzuki, Einoshin

PY - 2003/12/1

Y1 - 2003/12/1

N2 - This paper proposes a fast clustering method for time-series data based on an average time sequence vector. A clustering procedure based on exhaustive search is time-consuming, although its results are typically of high quality. BIRCH, which reduces the number of examples by data squashing based on a data structure called the CF (Clustering Feature) tree, is an effective solution for such a method when the data set consists of numerical attributes only. For time-series data, however, a straightforward application of BIRCH based on the Euclidean distance between a pair of sequences fails, since such a distance typically differs from human perception. A dissimilarity measure based on DTW (Dynamic Time Warping) is desirable, but to the best of our knowledge no methods have been proposed for time-series data in the context of data squashing. To circumvent this problem, we propose the DTWS (Dynamic Time Warping Squashed) tree, which employs a dissimilarity measure based on DTW and compresses time sequences into an average time sequence vector. An average time sequence vector is obtained by a novel procedure that estimates the correct shrinkage of a DTW result. Experiments using the Australian sign language data demonstrate the superiority of the proposed method in terms of clustering correctness, while its degradation of time efficiency is negligible.

UR - http://www.scopus.com/inward/record.url?scp=18444402778&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=18444402778&partnerID=8YFLogxK

U2 - 10.1527/tjsai.18.144

DO - 10.1527/tjsai.18.144

M3 - Article

VL - 18

SP - 144

EP - 152

JO - Transactions of the Japanese Society for Artificial Intelligence

JF - Transactions of the Japanese Society for Artificial Intelligence

SN - 1346-0714

IS - 3

ER -