Agrawal, R., Lin, K. I., Sawhney, H. S., and Shim, K. (1995). Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proceedings 21st VLDB.
....commercial DBMS. Therefore, the information infrastructure of most enterprises is based on products such as Oracle or Informix. In recent years, an increasing number of applications has emerged processing large amounts of complex, application-specific data objects [Jag 91, GM 93, FBF 94, FRM 94, ALSS 95, KSF 96]. In application domains such as multimedia, medical imaging, molecular biology, computer-aided design, marketing and purchasing assistance, etc., a high efficiency of query processing is crucial due to the immense and even increasing size of current databases. The search in such ....
....that match the query person with a probability of at least 10 . determine the person that matches the query person with maximum probability. Technical Analysis of Share Price. One of the classical applications of similarity search and data mining is clearly the analysis of time sequences [ALSS 95] such as share price analysis. Various similarity measures have been proposed. For practical analysis, however, quite different concepts are used, such as indicators, i.e. mathematical formulas derived from the time sequence that generate trading signals (buy, sell). Another concept for the ....
Agrawal R., Lin K., Sawhney H., Shim K.: Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases, Proc. of the 21st Int. Conf. on Very Large Databases, 1995, pp. 490-501.
....Figure 1: (Above) An example of a motif that occurs three times in a complex and noisy industrial dataset. (Below) A zoom-in reveals just how similar the three occurrences are to each other. There exists a vast body of work on efficiently locating known patterns in time series [1, 6, 12, 23, 35, 36, 37]. Here, however, we must be able to discover motifs without any prior knowledge about the regularities of the data under study. The obvious, nested-loop, brute-force approach to motif discovery would require a number of comparisons quadratic in the length of the database. Optimizations based on ....
....mine noisy datasets. Figure 3 also shows that allowing small "don't care" subsections (that is, sections which are ignored by the distance function) allows much more intuitive results to be obtained. We note that the utility of allowing "don't care" sections in time series has been documented before [1, 22], and it is a cornerstone of text and biosequence data mining [3, 24, 25, 28, 30, 34]. The previous example illustrates the dangers of mining in the presence of noise. Indeed, this single spike might be best taken care of with a simple smoothing algorithm. More generally, however, we may have a ....
Agrawal, R., Lin, K. I., Sawhney, H. S. & Shim, K. (1995). Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proceedings of the 21st Int'l Conference on Very Large Databases. Zurich, Switzerland, Sept. pp. 490-501.
....patterns, to process user queries in a fast and an accurate manner, and to compute statistics on data streams in real time. 1.1 Related work. There has been a substantial body of work on similarity search in sequence databases. Various high-dimensional index structures have been proposed in [2, 3, 11, 16, 27, 29, 33, 37] to achieve fast query response time and a good quality of answers. Theoretical methods have been developed for comparing data streams under various Lp distances [12], for clustering and computing the k-median [22, 32], and for computing aggregates over data streams [14, 18]. Various ....
R. Agrawal, K. Lin, H. S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In VLDB, pages 490-501, 1995.
....a given sequence. What exactly "similar" means depends on the application, and the definition of similarity may vary. This type of query can be very useful when analysing financial or scientific data, for example for prediction and clustering purposes. Some examples of such queries are [1], [2], [8]: 1. Determine products with similar selling patterns. 2. Find stocks whose stock prices move similarly. 3. Find cases in the past that resemble last year's sales pattern of a certain product. 4. Find portions of seismic waves that are not similar to spot geological irregularities. The ....
....that accounts for scaling and shifting is presented in [11], where it is proposed that the time sequences be first normalized before applying the distance metric. Several other, more flexible definitions of similarity have been proposed in order to account, for example, for the presence of noise [2] and time warping [31]. In [24] a general landmark similarity is introduced, which is invariant to shifting, uniform and non-uniform amplitude scaling, uniform time scaling and time warping. Figure 2.1: Transformations on time sequences: (a) Amplitude shifting, (b) Uniform amplitude ....
R. Agrawal, K.-I. Lin, H.S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In The VLDB Journal, 1995.
....sequences. Mannila, Toivonen, and Verkamo did work in the same field, but concentrated more on finding frequent episodes (sub-sequences) in specified time windows within sequences [15]. They proposed an algorithm to find similar sub-sequences between sequences of data in time series databases [2]. In this model, the sequences in question can be scaled or translated before similar sub-sequences are found. Sequential data mining can have many applications, especially in the financial area, where it is usually used to find similar growth patterns in companies, stocks, or product sales. 2.1 ....
R. Agrawal, K. Lin, H. Sawhney and K. Shim. Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases. In Proc. 21st VLDB, pages 490-501, 1995.
....are all evaluated on small datasets residing in main memory, and it is unclear if they can be made to scale to large databases. Further, the systems are evaluated without considering precision and recall, thus we can say little or nothing about the quality of the returned answer set. The work of [3, 36, 45, 25, 26] differs from the above in that they focus on providing a more flexible query language and not on performance issues. 2.2 Exact techniques for similarity searching. A time series $C = c_1, \ldots, c_n$ with n datapoints can be considered as a point in n-dimensional space. This immediately suggests that ....
....$(l_1, \ldots, l_N, h_1, \ldots, h_N)$ APCA space; $L = \{l_1, \ldots, l_N\}$ and $H = \{h_1, \ldots, h_N\}$ denote the lower and higher endpoints of the major diagonal of R. APCA rectangle corresponding to APCA point C: $C = (\{cmin_1, cr_1, \ldots, cmin_M, cr_M\}, \{cmax_1, cr_1, \ldots, cmax_M, cr_M\})$. $G_i^R = \{G_i^R[1], G_i^R[2], G_i^R[3], G_i^R[4]\}$: i-th region associated with R; $G_i^R[1]$ and $G_i^R[3]$ are low and high bounds along the value axis; $G_i^R[2]$ and $G_i^R[4]$ are those along the time axis. MINDIST(Q, R): minimum distance of MBR R from query time series Q. MINDIST(Q, R, t): minimum distance of MBR R from Q at time instant t ....
Agrawal, R., Lin, K. I., Sawhney, H. S., & Shim, K. (1995). Fast similarity search in the presence of noise, scaling, and translation in time-series databases. Proceedings of 21st International Conference on Very Large Data Bases. Zurich. pp. 490-501.
....polygons. Stock market, video, text, genome data are examples of sequence data. The distance/similarity measure for these data varies based on the application and the data type. Some of the distance measures currently in use are Euclidean distance [1, 12, 16, 26], other vector norms like $L_1$ and $L_\infty$ [2], shift/scale-invariant distance measures [13, 27], edit distance [8, 25, 35], and score-matrix-based similarity [21]. The sizes of spatial and sequence datasets are growing rapidly. Excessive amounts of data make similarity search challenging, incurring a large amount of disk I/O and CPU cost. ....
R. Agrawal, K. Lin, H.S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In VLDB, Zurich, Switzerland, September 1995.
....interest in mining time series databases. As with most computer science problems, representation of the data is the key to efficient and effective solutions. Several high-level representations of time series have been proposed, including Fourier Transforms [1, 13], Wavelets [4], Symbolic Mappings [2, 5, 24], and Piecewise Linear Representation (PLR). In this work, we confine our attention to PLR, perhaps the most frequently used representation [8, 10, 12, 14, 15, 16, 17, 18, 20, 21, 22, 25, 27, 28, 30, 31]. Intuitively, Piecewise Linear Representation refers to the approximation of a time series T, ....
.... Since linear regression minimizes the sum of squares error, it also minimizes the Euclidean distance (the Euclidean distance is just the square root of the sum of squares). Euclidean distance, or some measure derived from it, is by far the most common metric used in data mining of time series [1, 2, 4, 5, 13, 14, 15, 16, 25, 31]. The linear interpolation versions of the algorithms, by definition, will always have a greater sum of squares error. We immediately encounter a problem when attempting to compare the algorithms. We cannot compare them for fixed values of K, since Sliding Windows does not allow one to specify ....
Agrawal, R., Lin, K. I., Sawhney, H. S., & Shim, K. (1995). Fast similarity search in the presence of noise, scaling, and translation in time-series databases. Proceedings of 21st International Conference on Very Large Data Bases. pp. 490-501.
....user-defined distance from the queried object. All-pair similarity query (i.e. spatial join), where the objective is to find all the pairs of elements that are within a user-specified distance from each other. Significant progress has recently been made in sequence matching for temporal databases [1, 5, 28, 29, 54, 57] and for speech recognition techniques such as dynamic time warping [81]. Two types of similarity queries for temporal data have emerged thus far: whole matching [1], in which the target sequence and the sequences in the database have the same length; subsequence matching [29], in which the target ....
....set of features is used for matching. This process is iterated until all of the features are exhausted. Compared to the method proposed in [29], HierarchyScan performs a hierarchical scan instead of using a tree structure for indexing. Different transformations were considered in [54]. In [5], another approach is introduced to determine all similar sequences in a set of sequences. It is also applicable to find all subsequences similar to a target sequence. The similarity measure considered is the Euclidean distance between the sequences and the matching is performed in the time ....
R. Agrawal, K.-I. Lin, H.S. Sawhney, and K. Shim. Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases. Proceedings of the 21st International Conference on Very Large Data Bases, pages 490-501, September 1995.
....presents the experimental results and some discussions. Finally, Section 6 offers some concluding remarks. 2. RELATED WORK. The simplest similarity measurements are Euclidean distance (L2) and other Lp norms. A similarity model that uses non-overlapping ordered similar subsequences was proposed in [5]. Clearly, the above methods use the whole information. The most meaningful series behaviors are likely to be influenced, or even flooded, by the overall series movement. Therefore it's difficult to discover really useful associations between time series by using the whole information. In [1][13] ....
....and Computers (Hardware). This information was then used as the standard classification in our experiments. There is a broad consensus that similarity measurement with proper preprocessing could give better results. For example, time series can have different baselines and scaling factors [5][12]. From human interpretation, if two time series have similar behaviors running at different levels or with different scales, they should be treated as similar. In order to compare our approach with the previous measurements accurately, we first preprocessed the data using methods that were ....
R. Agrawal, K.I. Lin, H. S. Sawhney, K. Shim, Fast similarity search in the presence of noise, scaling and translation in time-series databases. The 21st Intl. Conf. on Very Large Data Bases, 1995.
....class generalization-based knowledge discovery techniques can also be used for mining other kinds of knowledge, including data evolution regularities, data deviation rules, and data clustering. Methods for mining these kinds of rules are briefly outlined in this subsection. A data evolution rule [53, 2] reflects the general evolution (or changing) behavior of a set of data, e.g. the rule which describes the major factors which influence the fluctuations of certain stock values. The discovery of data evolution regularities is to mine the general characteristics of the set of data changing with ....
....finding the general trend or behavior of data in the database, to which the methods discussed in the previous sections and paragraphs may apply; and (2) mining the deviation data and characterizing their behaviors. To perform the second task, similarity measurements should be defined and applied [2], which may specify to what extent the different data may still be considered as similar but to what extent it should be categorized as deviations. Fuzzy logics or uncertainty measurements may be associated with the deviated data to quantitatively characterize the behaviors of the data deviations. ....
R. Agrawal, K.-I. Lin, H.S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proc. 21st Int. Conf. Very Large Data Bases, pages 490-501, Zurich, Switzerland, Sept. 1995.
....interesting applications. For example, it can be used to help marketers discover distinct groups in their customer bases and develop targeted marketing programs. Data clustering has been studied in statistics, machine learning, image processing, and data mining with different methods and emphases [2, 32]. A data cube-based clustering analyzer must effectively deal with large amounts and high dimensionality of data and find interesting clusters. Moreover, most of the existing data clustering methods can only handle numeric data or cannot produce good quality results in the case where categorical ....
....time series analysis is to find similar time-related patterns (trends, segments, etc.) in a large time series database, such as a stock market database. Traditional trend analysis techniques, such as Fourier transformation, are adopted in most previous analyses of similarity-based time series [2]. With the popular adoption of wavelet transformation and analysis methods, we examine the wavelet transformation-based similarity mining methods for discovery of trends and/or similar curves or curve segments [30]. Template segments can be specified by users based on the given curve segments or ....
R. Agrawal, K.-I. Lin, H.S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proc. 21st Int. Conf. Very Large Data Bases, pages 490-501, Zurich, Switzerland, Sept. 1995.
....time series analysis is to find similar time-related patterns (trends, segments, etc.) in a large time series database, such as a stock market database. Traditional trend analysis techniques, such as Fourier transformation, are adopted in most previous analyses of similarity-based time series [2]. With the popular adoption of wavelet transformation and analysis methods, we examine the wavelet transformation-based similarity mining methods for discovery of trends and/or similar curves or curve segments [30]. Template segments can be specified by users based on the given curve segments or ....
R. Agrawal, K.-I. Lin, H.S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proc. 21st Int. Conf. Very Large Data Bases, pages 490-501, Zurich, Switzerland, Sept. 1995.
....to allow stretching in time in order to get a better distance. Recently, there have been approaches to make this measure more scalable [18, 21]. A similar technique is to find the longest common subsequence (LCSS) of two sequences and then define the distance using the length of this subsequence [4, 7, 10, 8]. The LCSS shows how well the two sequences can match one another if we are allowed to stretch them but we cannot rearrange the sequence of values. Other techniques to define time series similarity are based on extracting certain features (landmarks [22] or signatures [11]) from each time series ....
R. Agrawal, K. Lin, H. S. Sawhney, and K. Shim. Fast Similarity Search in the Presence of Noise, Scaling and Translation in Time-Series Databases. In Proc of VLDB, pages 490--501, Sept. 1995.
....of two time sequences. One approach is to define the distance between two sequences to be the Euclidean distance in an appropriate multidimensional space [1, 4, 6, 11, 19, 25]. Non-Euclidean metrics have also been used to compute the similarity for time sequences. Agrawal, Lin, Sawhney, and Shim [2] use as the distance metric. The Landmark model by Perng, Wang, and Zhang [17] chooses only a subset of values from a time sequence, which are peak points, and uses them to represent the corresponding sequence. The authors define distance between two time sequences as a tuple of values, one ....
R. Agrawal, K. Lin, H.S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In VLDB, Zurich, Switzerland, September 1995.
....be used to discover better clusters. 3. REPRESENTATION OF TIME SERIES. Time series data differs from other data representations in that a data point in time series is represented by a sequence typically measured at equal time intervals. Various time series representations have been proposed in [1, 5] for data with no errors. In this section we present a time series representation that models errors associated with data. In our model a time series sampled at T points is represented by a sequence of T distributions. We assume that each of these T samples is independent of the others and ....
Rakesh Agrawal, King-Ip Lin, Harpreet S. Sawhney, Kyuseok Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. VLDB, 490-501, 1995.
....element values along the time axis. Rafiei et al. [7] proposed a class of sequence transformations that can be used in a query language to express similarity with an R-tree [11] index. The proposed transformations handle moving average and global time scaling, but not time warping. Agrawal et al. [2] proposed a new model of similarity that captures the intuitive notion that two sequences should be considered similar if they have enough non-overlapping time-ordered pairs of similar subsequences. More recent approaches permit the matching of sequences of different lengths. Bozcaya et al. [12] ....
R. Agrawal, K. Lin, H. S. Sawhney, K. Shim, Fast similarity search in the presence of noise, scaling, and translation in time-series databases, in: Proc. Int'l Conf. on Very Large Data Bases (VLDB), 1995, pp. 490--501.
....market, video, text, genome data are examples of sequence data. The distance/similarity measure for these data varies based on the application and the data type. Some of the distance measures currently in use are Euclidean distance [1, 13, 17, 26, 28, 39, 43], other vector norms like $L_1$ and $L_\infty$ [2], shift/scale-invariant distance measures [14, 27], edit distance [9, 25, 35], and score-matrix-based similarity [22]. The sizes of spatial and sequence datasets are growing rapidly. Excessive amounts of data make similarity search challenging, incurring a large amount of disk I/O and CPU cost. An ....
R. Agrawal, K. Lin, H.S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In VLDB, Zurich, Switzerland, September 1995.
....periodicities. That is, periodicities where a small number of occurrences are not 100% punctual. Early work in time series data mining addresses the pattern matching problem. Agrawal et al. in the early 90's developed algorithms for pattern matching and similarity search in time series databases [1, 2, 3]. Mannila et al. [4] introduce an efficient solution to the discovery of frequent patterns in a sequence database. Chan et al. [5] study the use of wavelets in time series matching, and Faloutsos et al. in [6] and Keogh et al. in [7] propose indexing methods for fast sequence matching using R ....
R. Agrawal, K. Lin, H. S. Sawhney, and K. Shim. Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases. In Proc. of the 21st Int. Conf. on Very Large Databases, Zurich, Switzerland, September 1995.
....the result size, since more random IOs have to be performed to retrieve the candidate sets and subsequently remove them from the actual result reported to the user. 7 Related Work. Similarity queries have been studied in time series databases and have attracted lots of research interest [AFS93, ALSS95, FRM94]. The unordered nature of sets and the freedom to represent categorical attributes in sets make the techniques developed for the time series domain inapplicable. Manber [Udi94] considered the problem of retrieving similar files. Indyk et al. [IM98] introduced Locality Sensitive Hashing ....
R. Agrawal, K. Lin, H. S. Sawhney, and K. Shim. Fast Similarity Search in the Presence of Noise, Scaling and Translation in Time-Series Databases. Proceedings of VLDB, pages 490--501, September 1995.
....are often extremely large. Consider the MACHO project. This astronomical database contains a half terabyte of data and grows at the rate of several gigabytes a day [21, 32]. Given the magnitude of many time series databases, much research has been devoted to speeding up the search process [1, 2, 3, 6, 11, 14, 17, 18, 19, 22, 23, 24, 30, 35]. The most promising methods are techniques that first perform dimensionality reduction on the data, and then use spatial access methods to index the data in the transformed space. The technique was introduced in [1] and extended in [11, 23, 24, 35]. The original work by Agrawal et al. utilizes ....
....objects are missed because they appear distant in index space, are usually unacceptable. In this work, we will focus on admissible searching, indexing techniques that guarantee no false dismissals. Many inadmissible schemes have been proposed for similarity search in time series databases [2, 14, 19, 22, 30]. As they focus on speeding up search by sacrificing the guarantee of no false dismissals, we will not consider them further. As noted by Faloutsos et al. [11], there are several highly desirable properties for any indexing scheme: 1) It should be much faster than sequential scanning. 2) The ....
Agrawal, R., Lin, K. I., Sawhney, H. S., & Shim, K. (1995). Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proceedings of 21st International Conference on Very Large Data Bases. Zurich. pp. 490-501.
....[4, 8, 19, 22, 24, 26, 34, 36, 38, 42, 43, 48, 57]. Moreover, showing some examples of matching time series is of little utility unless some strawman comparison is used. Many papers ask us to consider the quality of their proposed similarity measure without a single comparison to another technique [2, 4, 8, 24, 31, 38, 39, 41, 42, 46, 57]. This is particularly surprising since the most obvious strawman, Euclidean distance, is trivial to implement (for example, in the Matlab programming language it requires only 19 characters: sqrt(sum((q-c).^2))). We believe that one of the best (subjective) ways to evaluate a proposed ....
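The Euclidean-distance strawman mentioned in the excerpt above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `euclidean_distance` and the equal-length check are assumptions made here, not part of the cited paper.

```python
import math


def euclidean_distance(q, c):
    """Euclidean distance between two equal-length numeric sequences.

    `q` and `c` are hypothetical names for a query series and a
    candidate series; the length check is an added safeguard.
    """
    if len(q) != len(c):
        raise ValueError("sequences must have the same length")
    # Square root of the sum of squared pointwise differences.
    return math.sqrt(sum((qi - ci) ** 2 for qi, ci in zip(q, c)))
```

Any proposed similarity measure could then be compared against this baseline on the same pairs of series.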
Agrawal, R., Lin, K. I., Sawhney, H. S. & Shim, K. (1995). Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proceedings of the 21st Int'l Conference on Very Large Databases. Zurich, Switzerland, Sept. pp. 490-501.
....can be built in linear time. Yi and Faloutsos [31] also show that this signature can be used with arbitrary $L_p$ norms, i.e. distance measures of the form $L_p(\vec{x}, \vec{y}) = \left(\sum_i |x_i - y_i|^p\right)^{1/p}$ (5), where $p = 1, 2, \ldots, \infty$, without changing the index structure, which is something no previous method (e.g. [1, 2, 7, 8, 27, 32]) could accomplish. This means that the distance norm to be used may be specified by the user. Preprocessing to make the index more robust in the face of such transformations as offset translation, amplitude scaling, and time scaling can also be performed. Keogh et al. demonstrate that the ....
Rakesh Agrawal, King-Ip Lin, Harpreet S. Sawhney, and Kyuseok Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In The VLDB Journal, pages 490--501, 1995.
....or nearest neighbor techniques to search for interesting information from the data set. Recently, there has been considerable interest in defining intuitive and easily computable measures of similarity between complex objects and in using abstract similarity notions in querying databases [1, 2, 10, 14, 17, 19, 22, 4, 12, 15]. Ideally, the similarity notion is defined by the user, who understands the domain concepts well and is able to explicate the notions needed for similarity computations. However, in many applications the domain expertise is not available. The users do not understand the interconnections between ....
R. Agrawal, K.-I. Lin, H. S. Sawhney and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proc. of the 21st Intl. Conf. on Very Large Data Bases (VLDB), 1995, pp 490--501.
....F2(Y). In many cases D is simply the p-norm distance between two k-dimensional vectors. Usually p is either 2 (Euclidean distance) or 1 (Manhattan distance). A different technique is to find the longest common subsequence (LCSS) of F(X) and F(Y) and set $D(F(X), F(Y)) = k - \mathrm{LCSS}(F(X), F(Y))$ [3, 7]. The LCSS shows how well the two sequences can match one another if we are allowed to stretch them but we cannot rearrange the sequence of values. Since the values are real numbers, we typically allow approximate matching, rather than exact matching. The LCSS model allows shifting of the time ....
....proposed. The alternative is to define a family of functions $\mathcal{F}$, such that $F_1, F_2 \in \mathcal{F}$. The objective is to find those $F_1, F_2$ in $\mathcal{F}$ that minimize the distance. The distance between two time series X, Y is then $\operatorname{argmin}_{F_1, F_2 \in \mathcal{F}} D(F_1(X), F_2(Y))$. The family of functions $\mathcal{F}$ can be global scaling, local scaling [3], global scaling and different baselines [7, 12, 10], or moving averaging [a0]. Similarity of generalized time series. Current work on the problem of computing the similarity of generalized time series is using the LCSS similarity model ( 53, 8]. This model is generally robust against outliers ....
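The LCSS idea described in the two excerpts above (order-preserving matching of real values with approximate rather than exact equality) can be sketched with the standard dynamic program. This is a minimal sketch, not the cited authors' implementation: the function name `lcss_length` and the matching tolerance `eps` are assumptions introduced for illustration.

```python
def lcss_length(x, y, eps=0.5):
    """Length of the longest common subsequence of x and y, where two
    values count as a match when they differ by at most `eps`.

    `eps` is a hypothetical tolerance standing in for the "approximate
    matching" mentioned in the excerpt.
    """
    n, m = len(x), len(y)
    # table[i][j] holds the LCSS length of x[:i] and y[:j].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(x[i - 1] - y[j - 1]) <= eps:
                # Matching pair: extend the best alignment of the prefixes.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # No match: skip one element from either sequence.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]
```

A distance can then be derived from the returned length, for example by subtracting it from the sequence length; the exact normalization is left open here because the excerpts do not state it unambiguously.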
R. Agrawal, K.-I. Lin, H.S. Sawhney, K. Shim. Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases. In Proc. 21st Int. Conf. on Very Large Databases (VLDB-95), 1995.
....with the explicit goal of conducting the most comprehensive and detailed set of time series indexing experiments ever attempted. In particular we have taken the following steps to ensure the most meaningful and generalizable results. Instead of testing on just one or two datasets as is typical [2, 3, 5, 7, 8, 10, 12, 13, 16, 19, 23, 24, 25, 30, 34], we tested all algorithms on 32 datasets. These datasets cover the complete spectrum of stationary/non-stationary, noisy/smooth, cyclical/non-cyclical, symmetric/asymmetric, etc. The data also represents the many areas in which DTW is used, including finance, medicine, biometrics, chemistry, ....
....for queries of length 256, with a 16-dimensional index, for increasingly large databases. Note that the X axis is in logarithmic scale, and denotes the number of items in the database. 6. Discussion and Conclusions. In one of the most referenced papers on time series similarity ever published [2], the authors explicitly state, "Dynamic time warping ... cannot be speeded up by indexing". This sentiment has since been echoed in several dozen other papers [6, 33]. How then have we achieved the seemingly impossible? Firstly, we have only considered the case where the two sequences are of the ....
Agrawal, R., Lin, K. I., Sawhney, H. S., & Shim, K. (1995). Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proc. 21st Int. Conf. on Very Large Databases, pp. 490-501.
....of the ISB representation, there are two linear regressions, realized by two time series z1 and z2, which have identical values on this subset but have different values on the other components of the ISB representation. To show that tb cannot be excluded, consider the time series z1: 0, 0, 0 over [0, 2] and z2: 0, 0 over [1, 2]; their linear regressions agree on t, but not on tb. Similarly, t cannot be removed. For , consider z1: 0, 0 and z2: 1, 1 over [0, 1]: tb = 0, t = 1, 0 for both, but = 0 for z1 and = 1 for z2. For , consider z1: 0, 0 and z2: 0, 1 over [0, 1]: tb = 0, t = 1, b = 0 for both, but : 0 for z1 ....
....there are two linear regressions, realized by two time series z1 and z2, which have identical values on this subset but have different values on the other components of the ISB representation. To show that tb cannot be excluded, consider the time series z1: 0, 0, 0 over [0, 2] and z2: 0, 0 over [1, 2]; their linear regressions agree on t, but not on tb. Similarly, t cannot be removed. For , consider z1: 0, 0 and z2: 1, 1 over [0, 1]: tb = 0, t = 1, 0 for both, but = 0 for z1 and = 1 for z2. For , consider z1: 0, 0 and z2: 0, 1 over [0, 1]: tb = 0, t = 1, b = 0 for both, but : 0 for z1 and : 1 for z2. The ....
R. Agrawal, K.-I. Lin, H.S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. VLDB'95.
....and interactive exploration, often emphasizing the periodic nature of some calendar-based data sets [6]. Work in data mining has addressed the need for additional tools to identify patterns of trends of interest in these data sets. Algorithmic and statistical methods for identifying patterns [1,2,3,5,8,11] have provided substantial functionality in a wide variety of situations. In domains such as stock price analysis, familiar patterns have been named and identified as shorthand approaches to identifying trends of interest [12]. Tools for specifying dynamic queries over these data sets have ....
....or for modifying queries based on example items. TimeSearcher uses Graph Envelope displays to provide overviews of the entire data set [9], and a simple drag-and-drop query-by-example mechanism supports the similarity queries often discussed in research in the mining of time series data [1,3,5,8,11]. TimeSearcher is implemented in Java, using the Swing toolkit for user interface components. Drawing and scenegraph control in the data and query displays is provided by Jazz, a zooming toolkit written in Java [4]. 3 An Augmented Query Mechanism. Timeboxes offer a very flexible query language, ....
R. Agrawal, K.-I. Lin, H. S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. The VLDB Journal, pages 490-501, 1995.
No context found.
Agrawal, R., Lin, K. I., Sawhney, H. S., and Shim, K. (1995). Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proceedings 21st VLDB.
No context found.
R. Agrawal, K.-I. Lin, H.S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In VLDB'95.
No context found.
R. Agrawal, K. Lin, H. Sawhney and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. Proc. 21st VLDB Conf., 1995.
No context found.
R. Agrawal, K. Lin, H. Sawhney and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. Proc. VLDB Conference, 1995.
No context found.
R. Agrawal, K.-I. Lin, H. S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proceedings of the 21st International Conference on Very Large Data Bases (VLDB'95), 1995.
No context found.
R. Agrawal, K.-I. Lin, H. S. Sawhney, K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. VLDB Conference, pages 490-501, 1995.
No context found.
R. Agrawal, K.-I. Lin, H. S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proceedings of the 21st International Conference on Very Large Data Bases, pages 490-501, Zurich, Switzerland, 1995.
No context found.
Agrawal, R., Lin, K.I., Sawhney, H.S., & Shim, K. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In VLDB, September 1995.
No context found.
R. Agrawal, K.-I. Lin, H. S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proc. of the VLDB Conf., Zurich, Switzerland, 1995.
No context found.
R. Agrawal, K. Lin, H. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In 21st VLDB, pp. 490-501, 1995.
No context found.
R. Agrawal, K.-I. Lin, H. S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proceedings of the 21st International Conference on Very Large Databases (VLDB), Zurich, Switzerland, September 1995.
No context found.
Agrawal R., Lin K.-I., Sawhney H., Shim K.: Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases. Proc. 21st Int. Conf. on Very Large Databases (VLDB 95), pages 490-501, 1995.
No context found.
R. Agrawal, K.-I. Lin, H. S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proceedings of the 21st International Conference on Very Large Data Bases (VLDB), pages 490-501, 1995.
No context found.
R. Agrawal, K.-I. Lin, H. S. Sawhney, and K. Shim. Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases. In Proceedings of the 21st VLDB Conference, pages 490-501, Zurich, Switzerland, September 1995.
No context found.
R. Agrawal, K. Lin, H. Sawhney, K. Shim. Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases. Proc. 21st VLDB Conf., pp. 490-501, 1995.
No context found.
R. Agrawal, K. Lin, H. S. Sawhney, and K. Shim. Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases. In Proc. of the 21st Int. Conf. on Very Large Databases, Zurich, Switzerland, September 1995.
No context found.
R. Agrawal, K. Lin, H. S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In VLDB, pages 490-501, 1995.
No context found.
R. Agrawal, K.-I. Lin, H. S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proc. the 21st Int'l Conf. on Very Large Data Bases, pages 490-501, 1995.
No context found.
R. Agrawal, K.-I. Lin, H. S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proceedings of the 21st International Conference on Very Large Data Bases (VLDB), pages 490-501, 1995.
No context found.
Agrawal, R., Lin, K.-I., Sawhney, H. S., and Shim, K. (1995). Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In Proc. of the 21st Int. Conf. on Very Large Databases, Zurich, Switzerland.
No context found.
R. Agrawal, K.-I. Lin, H. Sawhney, K. Shim. Fast Similarity Search in the presence of noise, scaling, and translation in time series databases. VLDB Conference, 1995.
No context found.
Agrawal, R., Lin, K. I., Sawhney, H. S., & Shim, K. (1995). Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In VLDB, September.