  Iterative incremental clustering of time series (2004) [10 citations — 0 self]

by Jessica Lin, Michail Vlachos, Eamonn Keogh, Dimitrios Gunopulos
In EDBT
http://www.cs.ucr.edu/~mvlachos/pubs/edbt04.pdf

Abstract:

We present a novel anytime version of partitional clustering algorithms, such as k-Means and EM, for time series. The algorithm works by leveraging the multi-resolution property of wavelets. The dilemma of choosing the initial centers is mitigated by initializing the centers at each approximation level with the final centers returned by the next coarser representation. In addition to casting the clustering algorithms as anytime algorithms, this approach has two other very desirable properties. First, by working at lower dimensionalities we can efficiently avoid local minima, so the quality of the clustering is usually better than that of the batch algorithm. Second, even if the algorithm is run to completion, our approach is much faster than its batch counterpart. We explain and empirically demonstrate these surprising and desirable properties with comprehensive experiments on several publicly available real data sets. We further demonstrate that our approach can be generalized to a framework covering a much broader range of algorithms and data mining problems.
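
The coarse-to-fine idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes series whose length is a power of two, it stands in for the Haar wavelet approximation with equal-width segment means (which carry the same information as the Haar approximation coefficients up to scaling), and the function names `block_means`, `kmeans`, and `multires_kmeans` are hypothetical. Each level runs ordinary k-means, then its final centers are stretched to the next finer resolution and reused as the starting centers there.

```python
import numpy as np

def block_means(series, dims):
    """Approximate each row by `dims` equal-width segment means
    (assumed stand-in for a Haar approximation at that resolution)."""
    n_series, length = series.shape
    return series.reshape(n_series, dims, length // dims).mean(axis=2)

def assign(data, centers):
    """Index of the nearest center (squared Euclidean) for every row."""
    dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans(data, centers, max_iter=100):
    """Plain Lloyd iterations starting from the given centers."""
    centers = centers.copy()
    for _ in range(max_iter):
        labels = assign(data, centers)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = data[labels == j]
            if len(members):  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    return centers, assign(data, centers)

def multires_kmeans(series, k, seed=0):
    """Yield (dims, centers, labels) from the coarsest to the full resolution.

    Because a usable clustering is yielded after every level, the caller can
    stop early, which is the "anytime" behaviour. Only the coarsest level is
    seeded randomly; every finer level starts from the previous level's result.
    """
    rng = np.random.default_rng(seed)
    length = series.shape[1]  # assumption: a power of two, at least 2
    dims = 2
    coarse = block_means(series, dims)
    centers = coarse[rng.choice(len(series), size=k, replace=False)]
    while True:
        data = block_means(series, dims)
        centers, labels = kmeans(data, centers)
        yield dims, centers, labels
        if dims == length:
            break
        # Stretch each center to twice the resolution by repeating values.
        centers = np.repeat(centers, 2, axis=1)
        dims *= 2
```

A caller would iterate, for example `for dims, centers, labels in multires_kmeans(data, k=3): ...`, and break out of the loop as soon as the intermediate clustering is good enough or the time budget is spent.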

Citations

4344 Maximum likelihood from incomplete data via the EM algorithm – Dempster, Laird, et al. - 1977
311 Efficient similarity search in sequence databases – Agrawal, Faloutsos, et al. - 1993
309 Fast subsequence matching in time-series databases – Faloutsos, et al. - 1994
180 Scaling Clustering Algorithms to Large Databases – Bradley, Fayyad, et al. - 1998
130 Locally Adaptive Dimensionality Reduction for Indexing Large Time Series Databases – Keogh, Chakrabarti, et al. - 2001
127 Efficient time-series matching by wavelets – Chan, Fu - 1999
104 On the need for time series data mining benchmarks: A survey and empirical demonstration – Keogh, Kasetty
101 An enhanced representation of time series which allows fast and accurate classification, clustering and relevance feedback – Keogh, Pazzani - 1998
96 Some methods for classification and analysis of multivariate observations – McQueen
94 Fast time sequence indexing for arbitrary Lp norms – Yi, Faloutsos - 2000
74 Efficiently supporting ad hoc queries in large datasets of time sequences – Korn, Jagadish, et al. - 1997
39 Similarity search over time-series data using wavelets – Popivanov, Miller - 2002
38 Efficient retrieval of similar time sequences using DFT – Rafiei, Mendelzon - 1998
37 A comparison of DFT and DWT based similarity search in time-series databases – Wu, Agrawal, et al. - 2000
36 Adaptive dimension reduction for clustering high dimensional data – Ding, He, et al. - 2002
27 The UCR time series data mining archive. http://www.cs.ucr.edu/~eamonn/TSDMA/index.html – Keogh, Folias - 2002
18 Iterative deepening dynamic time warping for time series – Chu, Hart, et al. - 2002
16 The Haar Wavelet Transform in the Time Series Similarity Paradigm – Struzik - 1999
13 Anytime algorithm development tools – Grass, Zilberstein - 1996
10 Anytime Exploratory data analysis for massive data sets – Smyth, Wolpert - 1997
10 A wavelet-based anytime algorithm for k-means clustering of time series – Vlachos, Lin, et al. - 2003
5 Ten Lectures on Wavelets – Daubechies - 1992
4 Initialization of Iterative Refinement Clustering Algorithms – unknown authors - 1998
3 An Expectation Maximization (EM) Algorithm for the Identification and – Lawrence - 1990
3 TSA-tree: a Wavelet Based Approach to Improve the Efficiency of Multi-Level Surprise and Trend Queries – Shahabi, Tian - 2000