Clustering of Time Series Subsequences is Meaningless: Implications for Past and Future Research
- In Proc. of the 3rd IEEE International Conference on Data Mining, 2003
Cited by 58 (7 self)
Time series data is perhaps the most frequently encountered type of data examined by the data mining community. Clustering is perhaps the most frequently used data mining algorithm, being useful in its own right as an exploratory technique, and also as a subroutine in more complex data mining algorithms such as rule discovery, indexing, summarization, anomaly detection, and classification. Given these two facts, it is hardly surprising that time series clustering has attracted much attention. The data to be clustered can be in one of two formats: many individual time series, or a single time series, from which individual time series are extracted with a sliding window. Given the recent explosion of interest in streaming data and online algorithms, the latter case has received much attention. In this work we make a surprising claim. Clustering of streaming time series is completely meaningless. More concretely, clusters extracted from streaming time series are forced to obey a certain constraint that is pathologically unlikely to be satisfied by any dataset, and because of this, the clusters extracted by any clustering algorithm are essentially random. While this constraint can be intuitively demonstrated with a simple illustration and is simple to prove, it has never appeared in the literature. We can justify calling our claim surprising, since it invalidates the contribution of dozens of previously published papers. We will justify our claim with a theorem, illustrative examples, and a comprehensive set of experiments on reimplementations of previous work. Although the primary contribution of our work is to draw attention to the fact that an apparent solution to an important problem is incorrect and should no longer be used, we also introduce a novel method which, based on the concept of time series motifs, is able to meaningfully cluster some streaming time series datasets.
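As an illustration of the second data format mentioned in the abstract above, the following is a minimal sketch of extracting subsequences from a single series with a sliding window. The function name `sliding_windows` and the `width` parameter are hypothetical choices for this example and do not come from the paper; the sketch covers only the extraction step, not any clustering.

```python
def sliding_windows(series, width):
    """Return every contiguous subsequence of length `width` (hypothetical helper)."""
    if width < 1 or width > len(series):
        raise ValueError("width must be between 1 and len(series)")
    # A window starts at each index i for which i + width still fits in the series.
    return [series[i:i + width] for i in range(len(series) - width + 1)]


# Toy input: a series of 5 points with a window of 3 yields 5 - 3 + 1 = 3 subsequences.
series = [1, 2, 3, 4, 5]
windows = sliding_windows(series, 3)
```

Note that adjacent windows overlap in all but one point, so the extracted subsequences are far from independent samples.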
Detecting Correlation in Stock Market
- Physica A: Statistical Mechanics and its Applications, Volume 344, Issues 1-2, 2004
Cited by 7 (5 self)
We present a new method for detecting dependencies in the stock market. In order to find hidden correlations in the daily returns, we build cross prediction models and use the normalized modeling error as a generalized correlation measure that extends the concept of the classical correlation matrix.
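The idea of using a normalized modeling error as a dependence measure can be sketched as follows. This is an assumption-laden illustration, not the paper's method: it uses a plain least-squares line as the cross prediction model (the abstract does not say which model class the authors use), and the function name `normalized_modeling_error` is hypothetical. For this linear special case the error equals one minus the squared classical correlation, so it is near 0 for perfectly linearly dependent series and near 1 for linearly unrelated ones.

```python
def normalized_modeling_error(x, y):
    """Error of predicting y from x with a least-squares line, divided by the variance of y.

    Hypothetical helper: a linear stand-in for a cross prediction model.
    Assumes x and y have equal length and that neither series is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals of the fitted line.
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return sse / syy


# Toy series, invented for illustration only.
err_dependent = normalized_modeling_error([1, 2, 3, 4], [2, 4, 6, 8])
err_unrelated = normalized_modeling_error([1, 2, 3, 4], [1, -1, -1, 1])
```

Replacing the line with a more flexible predictor is what would let such a measure pick up dependencies that the classical correlation matrix misses.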
Analysis of time series data with predictive clustering trees
- In proceedings of the 5th International Workshop on Knowledge Discovery in Inductive Databases, 2006
Cited by 3 (1 self)
Predictive clustering is a general framework that unifies clustering and prediction. This paper investigates how to apply this framework to cluster time series data. The resulting system, Clus-TS, constructs predictive clustering trees (PCTs) that partition a given set of time series into homogeneous clusters. In addition, PCTs provide a symbolic description of the clusters. We evaluate Clus-TS on time series data from microarray experiments. Each data set records the change over time in the expression level of yeast genes in response to a change in environmental conditions. Our evaluation shows that Clus-TS is able to cluster genes with similar responses, and to predict the time series based on the description of a gene. Clus-TS is part of a larger project where the goal is to investigate how global models can be combined with inductive databases.
Clustering of Time Series Subsequences is Meaningless:
- In proceedings of the 3rd IEEE International Conference on Data Mining, 2003
Given the recent explosion of interest in streaming data and online algorithms, clustering of time series subsequences, extracted via a sliding window, has received much attention. In this work we make a surprising claim. Clustering of time series subsequences is meaningless. More concretely, clusters extracted from these time series are forced to obey a certain constraint that is pathologically unlikely to be satisfied by any dataset, and because of this, the clusters extracted by any clustering algorithm are essentially random.
COMOVEMENTS IN THE PRICES OF SECURITIES ISSUED BY LARGE COMPLEX FINANCIAL INSTITUTIONS
- 2004
In recent years, mergers, acquisitions and organic growth have meant that some of the largest and most complex financial groups have come to transcend national boundaries and traditionally defined business lines. As a result, they have become a potential channel for the cross-border and cross-market transmission of financial shocks. This paper analyses the degree of comovement in the prices of securities issued by a selected group of large complex financial institutions (LCFIs), and assesses the extent to which movements in the prices of these securities are driven by common factors. A relatively high degree of commonality is found for most LCFIs (compared to a control group of nonfinancials), although there are still noticeable divisions between sub-groups of LCFIs, both according to geography and primary business-line. This working paper is an extension of the December 2003 Financial Stability Review article “Large complex financial institutions: common influences on asset price behaviour?” The main changes are the inclusion of a section on principal component analysis, an extension of the econometric estimation of the factor models, and a more detailed description of the paper’s results. Christian Hawkesby and Ibrahim Stevens are at the Bank of England. Ian W. Marsh is at Cass Business School and
Financial Market- A Network Perspective
We construct a weighted financial network for a subset of NYSE traded stocks, in which the nodes correspond to stocks and edges to interactions between them. We identify clusters of stocks in the network, based on the Forbes business sector classification, and study their intensity and coherence. Our approach indicates to what extent the business sector classifications are visible in market prices, enabling us to gauge the extent of group-behaviour exhibited by stocks belonging to a given business sector.
Modeling share dynamics by extracting competition structure
- Physica D 198 (2004) 51–73
We propose a new method for analyzing multivariate time-series data governed by competitive dynamics such as fluctuations in the number of visitors to Web sites that form a market. To achieve this aim, we construct a probabilistic dynamical model using a replicator equation and derive its learning algorithm. This method is implemented for both categorizing the sites into groups of competitors and predicting the future shares of the sites based on the observed time-series data. We confirmed experimentally, using synthetic data, that the method successfully identifies the true model structure, and exhibits better prediction performance than conventional methods that leave competitive dynamics out of consideration. We also experimentally demonstrated, using real data of visitors to 20 Web sites offering streaming video contents, that the method suggested a reasonable competition structure that conventional methods failed to find and that it outperformed them in terms of predictive performance.
General Laws of Adaptation to Environmental Factors: from Ecological Stress to Financial Crisis
We study ensembles of similar systems under the load of environmental factors. The phenomenon of adaptation has similar properties for systems of different nature. Typically, when the load increases above some threshold, the adapting systems become more different (variance increases), but the correlation increases too. If the stress continues to increase, a second threshold appears: the correlation reaches its maximal value and starts to decrease, while the variance continues to increase. In many applications this second threshold signals an approaching fatal outcome. This effect is supported by many experiments and observations of groups of humans, mice, trees, and grassy plants, and of financial time series. A general approach to explaining the effect through the dynamics of adaptation is developed. H. Selye introduced “adaptation energy” to explain adaptation phenomena. We formalize this approach in factors–resource models and develop a hierarchy of models of adaptation. Different organization of the interaction between factors (Liebig’s versus synergistic systems) leads to different adaptation dynamics. This explains the qualitatively different dynamics of correlation under different types of load and some deviations from the typical reaction to stress. In addition to the “quasistatic” optimization factors–resource models, dynamical models of adaptation are developed, and a simple model (three variables) for adaptation to one-factor load is formulated explicitly.
unknown title
Large complex financial institutions: common influences on asset price behaviour?
Will the US Economy Recover in 2010? A Minimal Spanning Tree Study
- 2010
Based on the temporal distributions of clustered segments in the time series of the ten Dow Jones US (DJUS) economic sector indices, we calculated their cross correlations over the period February 2000 to August 2008, the two-year intervals 2002–2003, 2004–2005, 2008–2009, and also over 11 corresponding segments within the present financial crisis. From these cross-correlation matrices, we constructed minimal spanning trees (MSTs) of the US economy at the sector level. We find that the average cross correlation is higher when the market volatility is higher, and lower when the market volatility is lower. In all MSTs, a core-fringe structure is found, with CY, IN, and NC consistently making up the core, and BM, EN, HC, TL, UT residing predominantly on the fringe. Taking advantage of the high-resolution temporal information available from the clustered segments in each time series, we mapped out the progressions of shocks in the MSTs. We saw that shocks accompanying volatility movements always start at the fringe, sometimes in conjunction with anomalously high cross correlations here, and propagate inwards to the core of all MSTs of the 11 statistically-stationary corresponding segments. Most of these volatility shocks originate within the domestic fringe
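The step of building a minimal spanning tree from a cross-correlation matrix, as described in the abstract above, can be sketched as follows. Several things here are assumptions rather than details taken from the paper: the correlation-to-distance mapping d = sqrt(2(1 - rho)) is a common choice in the financial-network literature but the abstract does not say which mapping the authors use, the tree is grown with Prim's algorithm, the function names are hypothetical, and the 3x3 matrix is toy data rather than DJUS sector correlations.

```python
import math


def correlation_to_distance(rho):
    """Map a correlation to a distance (assumed mapping: sqrt(2 * (1 - rho)))."""
    # max() guards against tiny negative values from rounding when rho is about 1.
    return math.sqrt(max(0.0, 2.0 * (1.0 - rho)))


def minimal_spanning_tree(corr):
    """Grow an MST over the nodes of a symmetric correlation matrix with Prim's algorithm.

    Returns a list of (i, j, distance) edges in the order they were added.
    """
    n = len(corr)
    in_tree = {0}  # start from node 0; an MST spans all nodes regardless of the start
    edges = []
    while len(in_tree) < n:
        best = None
        # Find the shortest edge that connects the current tree to a new node.
        for i in in_tree:
            for j in range(n):
                if j in in_tree:
                    continue
                d = correlation_to_distance(corr[i][j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        edges.append((best[1], best[2], best[0]))
        in_tree.add(best[2])
    return edges


# Toy correlation matrix for three hypothetical sectors (not real data).
corr = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.5],
    [0.2, 0.5, 1.0],
]
tree = minimal_spanning_tree(corr)
tree_links = [(i, j) for i, j, _ in tree]
```

In this toy case the strongly correlated pair (0, 1) is linked first, and node 2 attaches through node 1 because 0.5 is its highest correlation with any node already in the tree.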

