
## Probabilistic discovery of time series motifs (2003)


Citations: 178 (24 self)

### Citations

1182 | Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids - Durbin, Eddy, et al. - 1998 |

693 | Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment - Lawrence, Altschul, et al. - 1993 |
Citation Context: ...uitive results to be obtained. We note that the utility of allowing don’t care sections in time series has been documented before [1, 22], and it is a cornerstone of text and Biosequences data mining [3, 24, 25, 28, 30, 34]. The previous example illustrates the dangers of mining in the presence of noise. Indeed, this single spike might be best taken care of with a simple smoothing algorithm. More generally, however, we ...

617 | Similarity search in high dimensions via hashing - Gionis, Indyk, et al. - 1999 |

431 | Introduction to algorithms. 2nd Edition - Cormen, Leiserson, et al. - 2001 |

427 | Identifying DNA and protein patterns with statistically significant alignments of multiple sequences - Hertz, Stormo - 1999 |

311 | On the need for time series data mining benchmarks: a survey and empirical demonstration - Keogh, Kasetty - 2002 |
Citation Context: ...ints would not make much difference. However, even small amounts of noise can dominate distance measures, including the most commonly used data mining distance measures, such as the Euclidean distance [6, 7, 8, 21, 36]. Figure 3 shows that the spike can cause one of our candidate motifs to appear to be much more similar to an artificial sequence which just happens to have a spike in the same place. ...

283 | Finding motifs using random projections - Buhler, Tompa - 2001 |
Citation Context: ...uitive results to be obtained. We note that the utility of allowing don’t care sections in time series has been documented before [1, 22], and it is a cornerstone of text and Biosequences data mining [3, 24, 25, 28, 30, 34]. The previous example illustrates the dangers of mining in the presence of noise. Indeed, this single spike might be best taken care of with a simple smoothing algorithm. More generally, however, we ...

279 | Efficient time series matching by wavelets - Chan, Fu - 1999 |
Citation Context: ... and noisy industrial dataset. Below) a zoom-in reveals just how similar the three occurrences are to each other. There exists a vast body of work on efficiently locating known patterns in time series [1, 6, 12, 23, 35, 36, 37]. Here, however, we must be able to discover motifs without any prior knowledge about the regularities of the data under study. The obvious, nested-loop, brute force approach to motif discovery would ...

275 | Unsupervised learning of multiple motifs in biopolymers using expectation maximization - Bailey, Elkan - 1995 |
Citation Context: ...uitive results to be obtained. We note that the utility of allowing don’t care sections in time series has been documented before [1, 22], and it is a cornerstone of text and Biosequences data mining [3, 24, 25, 28, 30, 34]. The previous example illustrates the dangers of mining in the presence of noise. Indeed, this single spike might be best taken care of with a simple smoothing algorithm. More generally, however, we ...

272 | Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies - Helden, André, et al. - 1998 |

252 | Combinatorial Approaches to Finding Subtle Signals in DNA Sequences - Pevzner, Sze - 2000 |

251 | Discovering similar multidimensional trajectories - Vlachos, Kollios, et al. - 2002 |

235 | Dimensionality reduction for fast similarity search in large time series databases - Keogh, Chakrabarti, et al. - 2000 |
Citation Context: ... and noisy industrial dataset. Below) a zoom-in reveals just how similar the three occurrences are to each other. There exists a vast body of work on efficiently locating known patterns in time series [1, 6, 12, 23, 35, 36, 37]. Here, however, we must be able to discover motifs without any prior knowledge about the regularities of the data under study. The obvious, nested-loop, brute force approach to motif discovery would ...

234 | Fast similarity search in the presence of noise, scaling, and translation in time-series databases - Agrawal, Lin, et al. - 1995 |
Citation Context: ... and noisy industrial dataset. Below) a zoom-in reveals just how similar the three occurrences are to each other. There exists a vast body of work on efficiently locating known patterns in time series [1, 6, 12, 23, 35, 36, 37]. Here, however, we must be able to discover motifs without any prior knowledge about the regularities of the data under study. The obvious, nested-loop, brute force approach to motif discovery would ...

226 | Combinatorial pattern discovery in biological sequences: The TEIRESIAS algorithm - Rigoutsos, Floratos - 1998 |

214 | Efficient retrieval of similar time sequences under time warping - Yi, Jagadish, et al. - 1998 |

208 | Reduction techniques for instance-based learning algorithms - Wilson, Martinez - 2000 |

180 | Fast time sequence indexing for arbitrary Lp norms - Yi, Faloutsos - 2000 |

177 | Rule discovery from time series - Das, Lin, et al. - 1998 |
Citation Context: ...rnia - Riverside Riverside, CA 92521 {bill, eamonn, stelo}@cs.ucr.edu • Mining association rules in time series requires the discovery of motifs. These are referred to as primitive shapes in [7] and frequent patterns in [18]. • Several time series classification algorithms work by constructing typical prototypes of each class [22, 15]. These prototypes may be considered motifs. • Many time s...

170 | An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences - Lawrence, Reilly - 1990 |

160 | An enhanced representation of time-series which allows fast and accurate classification, clustering and relevance feedback - Keogh, Pazzani - 1998 |

159 | Efficient mining of partial periodic patterns in time series database - Han, Dong, et al. - 1999 |
Citation Context: ...nsors. We see these “experiences” as motifs. • Much of the work on finding approximate periodic patterns in time series can be viewed as an attempt to discover motifs that occur at constrained intervals [14]. For example, the astute reader may have noticed that the motif in Figure 1 appears at approximately equal intervals, suggesting an unexpected regularity. In addition to the application domains menti...

133 | Novelty detection in time series data using ideas from immunology - Dasgupta, Forrest - 1996 |
Citation Context: ...ess detection algorithms essentially consist of modeling normal behavior with a set of typical shapes (which we see as motifs), and detecting future patterns that are dissimilar to all typical shapes [8]. • In robotics, Oates et al. [27], have introduced a method to allow an autonomous agent to generalize from a set of qualitatively different experiences gleaned from sensors. We see these “experience...

108 | Finding motifs in time series - Lin, Keogh, et al. - 2002 |
Citation Context: ... finding approximately repeated subsequences in a longer time series. In an earlier work, we formalized the idea of approximately repeated subsequences by introducing the notion of time series motifs [26]. We will define motifs more formally later in this work. In the meantime a simple graphic example will serve to develop the reader’s intuition. Figure 1 illustrates an example of a motif discovered i...

103 | Probabilistic and statistical properties of words: an overview - Reinert, Schbath, et al. - 2000 |

91 | Efficient large-scale sequence comparison by locality-sensitive hashing - Buhler |

77 | Identifying representative trends in massive time series datasets using sketches - Indyk, Koudas, et al. - 2000 |
Citation Context: ...spond to an upward trend or a downward trend of arbitrary angles. These “degenerate motifs” are unlikely to be of interest to anyone, and in any case, are trivial to enumerate with a simple algorithm [19]. We will therefore exclude them from further consideration. This can easily be achieved at the feature extraction stage, when using sliding windows to extract the subsequences. As the window is moved...

70 | Deformable Markov model templates for time-series pattern matching - Ge, Smyth - 2000 |

55 | Discovery of temporal patterns – learning rules about the qualitative behaviour of time series - Höppner - 2001 |
Citation Context: ...A 92521 {bill, eamonn, stelo}@cs.ucr.edu • Mining association rules in time series requires the discovery of motifs. These are referred to as primitive shapes in [7] and frequent patterns in [18]. • Several time series classification algorithms work by constructing typical prototypes of each class [22, 15]. These prototypes may be considered motifs. • Many time series anomaly/interestingness ...

55 | LB_Keogh Supports Exact Indexing of Shapes Under Rotation Invariance with Arbitrary Representations and Distance Measures - Keogh, Wei, et al. - 2006 |

51 | Methods for discovering novel motif in nucleic acid sequences - Staden - 1989 |

51 | iSAX: indexing and mining terabyte sized time series - Shieh, Keogh - 2008 |

43 | Monotony of Surprise in Large-Scale Quest for Unusual Words - Apostolico, Bock, et al. |
Citation Context: ... the challenge by Pevzner and Sze [28] (see below). We mention, in no particular order and without pretending to be exhaustive, TEIRESIAS [30], GIBBSSAMPLER [24], MEME [3], WINNOWER [28], VERBUMCULUS [2], PROJECTION [34], among others. Of particular interest is the PROJECTION algorithm by Buhler and Tompa [34]. They applied random projection in their paper to find motifs in nucleotide sequences. The m...

42 | Discovery of time-series motif from multidimensional data based on MDL principle - Tanaka, Iwamoto, et al. - 2005 |

40 | 80 million tiny images: a large database for non-parametric object and scene recognition - Torralba, Fergus, et al. - 2008 |

37 | A method for clustering the experiences of a mobile robot that accords with human judgments - Oates, Schmill, et al. - 2000 |
Citation Context: ...lly consist of modeling normal behavior with a set of typical shapes (which we see as motifs), and detecting future patterns that are dissimilar to all typical shapes [8]. • In robotics, Oates et al. [27], have introduced a method to allow an autonomous agent to generalize from a set of qualitatively different experiences gleaned from sensors. We see these “experiences” as motifs. • Much of the work o...

33 | Effective Proximity Retrieval by Ordering Permutations - Gonzalez, Figueroa, et al. |

26 | A Bibliography of Temporal - Roddick, Spiliopoulou - 1999 |
Citation Context: ...roper context we will briefly consider related work. To date the majority of work in time series data mining has focused on indexing time series, the efficient discovery of known patterns in time series [1, 6, 12, 21, 22, 23, 31, 35, 36, 37]. The innovative work of Oates et al. considers the problem of learning “qualitatively different experiences” (which we see as motifs), but the authors are working with relatively small datasets, and ...

24 | Discovering multivariate motifs using subsequence density estimation and greedy mixture learning - Minnen, Isbell, et al. - 2007 |

23 | Anytime classification using the nearest neighbor algorithm with applications to stream mining - Ueno, Xi, et al. |

18 | High Performance Data Mining Using the Nearest Neighbor Join - Böhm, Krebs - 2002 |

17 | Efficient color histogram indexing for quadratic form distance functions - Hafner, et al. - 1995 |

16 | Knowledge construction from time series data using a collaborative exploration system - Guyet, Garbay, et al. - 2007 |

14 | Efficiently Finding Arbitrarily Scaled Patterns - Keogh |

12 | Animated people textures - Celly, Zordan - 2004 |

12 | Mining Motifs from Human Motion - Meng, Yuan, et al. |

11 | Symbolic analysis of experimental data, Review of Scientific Instruments - Daw, Finney, et al. - 2001 |

11 | Unsupervised activity discovery and characterization from event- streams - Hamid, Maddi, et al. - 2005 |

11 | Discovering representative models in large time series databases - Rombo, Terracina - 2004 |

7 | Selecting maximally informative genes to enable temporal expression profiling analysis - Androulakis, et al. |

7 | Declarative Querying For Biological Sequences - Tata - 2007 |

6 | Functional uncoupling of hemodynamic from neuronal response by inhibition of neuronal nitric oxide synthase. J Cereb Blood Flow Metab - Stefanovic, Schwindt, et al. |

5 | Implementing an integrated time-series data mining environment based on temporal pattern extraction methods: A case study of an interferon therapy risk mining for chronic hepatitis - Abe, Ohsaki, et al. - 2006 |

5 | Learning recurrent behaviors from heterogeneous multivariate time-series. Artificial intelligence in medicine - Duchêne, Garbay, et al. |

5 | Understanding the formation of tornadoes through data mining - McGovern, Rosendahl, et al. - 2007 |

5 | Atlas of EEG patterns - Stern, Engel - 2005 |

4 | Frequent motion pattern extraction for motion recognition in real-time human proxy - Arita, Yoshimatsu, et al. - 2005 |

4 | Protecting high-yielding sugarbeet varieties from loss to curly top, volume 1. November 2000. 156 - Kaffka, Wintermantel, et al. - 2001 |

4 | A study of extraction method of motion patterns observed frequently from time-series posture data - Murakami, Doki, et al. - 2005 |

4 | Characterization and correlation of DC electrical penetration graph waveforms with feeding behavior of beet leafhopper, Circulifer tenellus. Entomologia Experimentalis et Applicata - Stafford, Walker |

3 | Mining the MACHO dataset - Hegland, Clarke, et al. - 2002 |
Citation Context: ...covery of motifs. These are referred to as primitive shapes in [7] and frequent patterns in [18]. • Several time series classification algorithms work by constructing typical prototypes of each class [22, 15]. These prototypes may be considered motifs. • Many time series anomaly/interestingness detection algorithms essentially consist of modeling normal behavior with a set of typical shapes (which we see ...

3 | Bayesian blocks, a new method to analyze structure in photon counting data. Astrophys - Jackson, Scargle, et al. - 1998 |
Citation Context: ...interest, time series: Definition 1. Time Series: A time series $T = t_1, \ldots, t_m$ is an ordered set of $m$ real-valued variables. Time series can be very long, sometimes containing trillions of observations [12, 32]. We are typically not interested in any of the global properties of a time series; rather, we are interested in subsections of the time series, which are called subsequences. Definition 2. Subsequenc...

3 | Disturbance patterns in sleep - Loomis, Harvey, et al. - 1938 |

2 | Hypothesis generation strategies for adaptive problem solving - Engelhardt, Chien, et al. - 2000 |
Citation Context: ... suggesting an unexpected regularity. In addition to the application domains mentioned above, motif discovery can be very useful in its own right as an exploratory tool to allow hypothesis generation [11]. [Figure 1 plot: Winding Dataset (the angular speed of reel 2), with motif occurrences labeled A, B, C] Figure 1: Above) An example of a mot...

1 | Discovering similar patterns in time series - unknown authors - 2000 |

1 | Locality-preserving hashing in multidimensional spaces - Raghavan - 1997 |

1 | An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences - A - 1990 |