A Dynamic HMM for On-line Segmentation of Sequential Data
- Advances in Neural Information Processing Systems 14 (NIPS 2001)
, 2002
Cited by 26 (1 self)
We propose a novel method for the analysis of sequential data that exhibits an inherent mode switching. In particular, the data might be a non-stationary time series from a dynamical system that switches between multiple operating modes. Unlike other approaches, our method processes the data incrementally and without any training of internal parameters. We use an HMM with a dynamically changing number of states and an on-line variant of the Viterbi algorithm that performs an unsupervised segmentation and classification of the data on-the-fly, i.e. the method is able to process incoming data in real-time. The main idea of the approach is to track and segment changes of the probability density of the data in a sliding window on the incoming data stream. The usefulness of the algorithm is demonstrated by an application to a switching dynamical system.
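The sliding-window idea in this abstract can be illustrated with a deliberately simplified sketch. It is not the authors' method: the dynamic HMM and on-line Viterbi step are replaced here by a plain threshold on the L1 distance between histograms, and every name and parameter (`detect_changes`, `window_size`, `bins`, `threshold`) is an assumption made for illustration only.

```python
from collections import deque


def histogram(window, lo, hi, bins):
    """Normalized histogram of the values in `window` over the range [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for value in window:
        # Clamp out-of-range values into the first or last bin.
        index = min(bins - 1, max(0, int((value - lo) / width)))
        counts[index] += 1
    return [count / len(window) for count in counts]


def detect_changes(stream, window_size, lo, hi, bins=10, threshold=1.0):
    """Yield the time indices at which the windowed density has moved away
    from the reference density of the current segment.

    Hypothetical simplification: a fixed threshold on the L1 histogram
    distance stands in for the HMM-based segmentation of the abstract.
    """
    window = deque(maxlen=window_size)  # holds only the most recent samples
    reference = None
    for t, value in enumerate(stream):
        window.append(value)
        if len(window) < window_size:
            continue  # wait until the first full window is available
        current = histogram(window, lo, hi, bins)
        if reference is None:
            reference = current
            continue
        distance = sum(abs(a - b) for a, b in zip(current, reference))
        if distance > threshold:
            yield t
            reference = current  # start a new segment from here
```

Because the function is a generator over an arbitrary iterable, it consumes one sample at a time, which mirrors the incremental processing the abstract describes. For example, `list(detect_changes(samples, window_size=10, lo=0.0, hi=1.0))` returns the indices where a new segment was started.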
Multiclassifier systems: Back to the future
- Multiple Classifier Systems (invited paper), pages 1–15. LNCS
, 2002
Cited by 20 (3 self)
While a variety of multiple classifier systems have been studied since at least the late 1950s, this area came alive in the 1990s with significant theoretical advances as well as numerous successful practical applications. This article argues that our current understanding of ensemble-type multiclassifier systems is now quite mature and exhorts the reader to consider a broader set of models and situations for further progress. Some of these scenarios have already been considered in the classical pattern recognition literature, but revisiting them often leads to new insights and progress. As an example, we consider how to integrate multiple clusterings, a problem central to several emerging distributed data mining applications. We also revisit output space decomposition to show how this can lead to extraction of valuable domain knowledge in addition to improved classification accuracy.
Robust order statistics based ensemble for distributed data mining
- In Advances in Distributed and Parallel Knowledge Discovery
, 2000
Cited by 18 (5 self)
Integrating the outputs of multiple classifiers via combiners or meta-learners has led to substantial improvements in several difficult pattern recognition problems. In the typical setting investigated until now, each classifier is trained on data taken or resampled from a common data set, or randomly selected partitions thereof, and thus experiences similar quality of training data. However, in distributed data mining involving heterogeneous databases, the nature, quality and quantity of data available to each site/classifier may vary substantially, leading to large discrepancies in their performance. In this chapter we introduce and investigate a family of meta-classifiers based on order statistics, for robust handling of such cases. Based on a mathematical modeling of how the decision boundaries are affected by order statistic combiners, we derive expressions for the reductions in error expected when such combiners are used. We show analytically that the selection of the median, the maximum and, in general, the ith order statistic improves classification performance. Furthermore, we introduce the trim and spread combiners, both based on linear combinations of the ordered classifier outputs, and empirically show that they are significantly superior in the presence of outliers or uneven classifier performance. They can therefore be fruitfully applied to several heterogeneous distributed data mining situations, especially when it is ...
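A minimal sketch of order-statistic combining may help make the idea concrete. It assumes each classifier emits one score per class; the function names are hypothetical, and the trimmed mean shown is a generic textbook form that may differ from the chapter's exact definition of the trim combiner.

```python
from statistics import median


def order_statistic_combine(outputs, rank):
    """Per class, return the rank-th smallest score (1-indexed) across
    classifiers. rank=len(outputs) gives the maximum combiner."""
    n_classes = len(outputs[0])
    combined = []
    for c in range(n_classes):
        column = sorted(scores[c] for scores in outputs)
        combined.append(column[rank - 1])
    return combined


def median_combine(outputs):
    """Per class, return the median score across classifiers."""
    n_classes = len(outputs[0])
    return [median(scores[c] for scores in outputs) for c in range(n_classes)]


def trimmed_mean_combine(outputs, trim):
    """Per class, average the scores left after dropping the `trim` lowest
    and `trim` highest values (generic trimmed mean, an assumption here)."""
    n = len(outputs)
    if 2 * trim >= n:
        raise ValueError("trim removes every classifier output")
    n_classes = len(outputs[0])
    combined = []
    for c in range(n_classes):
        column = sorted(scores[c] for scores in outputs)
        kept = column[trim:n - trim]
        combined.append(sum(kept) / len(kept))
    return combined


def predict(combined):
    """Index of the class with the largest combined score."""
    return max(range(len(combined)), key=combined.__getitem__)
```

With four classifiers that favour class 0 and one badly trained classifier that outputs `[0.0, 1.0]`, the median and trimmed-mean combiners still select class 0, while the maximum combiner follows the outlier. This is the kind of uneven-performance case the abstract has in mind.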
GAMLS: A Generalized framework for Associative Modular Learning Systems
- In Proceedings of the Applications and Science of Computational Intelligence II
, 1999
Cited by 13 (10 self)
Learning a large number of simple local concepts is both faster and easier than learning a single global concept. Inspired by this principle of divide and conquer, a number of modular learning approaches have been proposed by the computational intelligence community. In modular learning, the classification/regression/clustering problem is first decomposed into a number of simpler subproblems, a module is learned for each of these subproblems, and finally their results are integrated by a suitable combining method. Mixtures of experts and clustering are two of the techniques that are describable in this paradigm. In this paper we present a broad framework for Generalized Associative Modular Learning Systems (GAMLS). Modularity is introduced through soft association of each training pattern with every module. The coupled problems of learning the module parameters and learning associations are solved iteratively using deterministic annealing. Starting at a high temperature with only one modu...
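The soft association of a pattern with every module can be sketched with the Gibbs distribution commonly used in deterministic annealing. This is an illustration under stated assumptions, not the GAMLS formulation itself: modules are reduced to point prototypes, the cost is squared Euclidean distance, and the names `soft_associations` and `temperature` are hypothetical.

```python
import math


def soft_associations(x, prototypes, temperature):
    """Association probabilities of pattern `x` with each prototype.

    Uses p_j proportional to exp(-cost_j / temperature), where cost_j is the
    squared Euclidean distance to prototype j (an assumed cost function).
    """
    costs = [sum((xi - pi) ** 2 for xi, pi in zip(x, p)) for p in prototypes]
    lowest = min(costs)
    # Subtracting the lowest cost leaves the probabilities unchanged but
    # keeps exp() from underflowing to zero for every module at once.
    weights = [math.exp(-(cost - lowest) / temperature) for cost in costs]
    total = sum(weights)
    return [weight / total for weight in weights]
```

At a high temperature the associations are close to uniform, so all modules behave alike; as the temperature is lowered they approach a hard assignment to the nearest prototype. An annealing loop would alternate this step with a re-estimation of the module parameters while gradually reducing `temperature`.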
Frequency sensitive competitive learning for balanced clustering on high-dimensional hyperspheres
- IEEE TRANSACTIONS ON NEURAL NETWORKS
, 2004
Cited by 12 (7 self)
Competitive learning mechanisms for clustering in general suffer from poor performance for very high dimensional (> 1000) data because of “curse of dimensionality” effects. In applications such as document clustering, it is customary to normalize the high dimensional input vectors to unit length, and it is sometimes also desirable to obtain balanced clusters, i.e., clusters of comparable sizes. The spherical kmeans (spkmeans) algorithm, which normalizes the cluster centers as well as the inputs, has been successfully used to cluster normalized text documents in 2000+ dimensional space. Unfortunately, like regular kmeans and its soft EM based version, spkmeans tends to generate extremely imbalanced clusters in high dimensional spaces when the desired number of clusters is large (tens or more). In this paper, we first show that the spkmeans algorithm can be derived from a certain maximum likelihood formulation using a mixture of von Mises-Fisher distributions as the generative model, and that it can in fact be considered a batch mode version of (normalized) competitive learning. The proposed generative model is then adapted in a principled way to yield three frequency sensitive competitive learning variants that are applicable to static data and produce high-quality, well-balanced clusters for high-dimensional data. Like kmeans, each iteration is linear in the number of data points and in the number of clusters for all three algorithms. We also propose a frequency sensitive algorithm to cluster streaming data. Experimental results on clustering of high-dimensional text data sets are provided to show the effectiveness and applicability of the proposed techniques.
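For orientation, plain spherical kmeans as described in this abstract (unit-length inputs, cosine similarity for assignment, re-normalized centers) can be sketched as follows. The sketch covers only the baseline spkmeans step, not the paper's frequency sensitive variants, and the function names and the `iterations` parameter are assumptions.

```python
import math


def normalize(vector):
    """Scale `vector` to unit length; return None for a zero vector."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return None
    return [x / norm for x in vector]


def dot(a, b):
    """Inner product; equals cosine similarity for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))


def spkmeans(points, initial_centers, iterations=10):
    """Batch spherical kmeans on non-zero input vectors.

    Returns the labels from the last assignment pass and the final
    unit-length centers.
    """
    data = [normalize(p) for p in points]
    centers = [normalize(c) for c in initial_centers]
    labels = [0] * len(data)
    for _ in range(iterations):
        # Assignment: each point goes to the center with the highest cosine.
        labels = [
            max(range(len(centers)), key=lambda k: dot(x, centers[k]))
            for x in data
        ]
        # Update: sum the members of each cluster and project the sum back
        # onto the unit hypersphere.
        for k in range(len(centers)):
            members = [x for x, label in zip(data, labels) if label == k]
            if not members:
                continue  # keep the old center for an empty cluster
            summed = [sum(column) for column in zip(*members)]
            updated = normalize(summed)
            if updated is not None:
                centers[k] = updated
    return labels, centers
```

Each iteration touches every point once per center, which matches the abstract's remark that the cost is linear in the number of data points and in the number of clusters. Nothing in this baseline discourages empty or oversized clusters, which is the imbalance problem the paper sets out to address.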
Biologically Inspired Modular Neural Networks
, 2000
Cited by 8 (0 self)
This dissertation explores modular learning in artificial neural networks, driven mainly by inspiration from the neurobiological basis of human learning. The presented modularization approaches to neural network design and learning are inspired by engineering, complexity, psychological and neurobiological aspects. The main theme of this dissertation is to explore the organization and functioning of the brain to discover new structural and learning inspirations that can subsequently be used to design artificial neural networks. The artificial neural networks ...
Taxonomy for characterizing ensemble methods in classification tasks: A review and annotated bibliography
- Computational Statistics & Data Analysis, In Press, Corrected Proof
, 2009
Cited by 7 (1 self)
Ensemble methodology, which builds a classification model by integrating multiple classifiers, can be used for improving prediction performance. Researchers from various disciplines such as statistics, pattern recognition, and machine learning have seriously explored the use of ensemble methodology. This paper presents an updated survey of ensemble methods in classification tasks, while introducing a new taxonomy for characterizing them. The new taxonomy, presented from the algorithm designer’s point of view, is based on five dimensions: inducer, combiner, diversity, size, and member dependency. We also propose several selection criteria, presented from the practitioner’s point of view, for choosing the most suitable ensemble method.
An On-Line Method For Segmentation And Identification Of Non-Stationary Time Series
- In NNSP 2001: Neural Networks for Signal Processing XI
, 2001
Cited by 6 (2 self)
We present a method for the analysis of non-stationary time series from dynamical systems that switch between multiple operating modes. In contrast to other approaches, our method processes the data incrementally and without any training of internal parameters. It directly performs an unsupervised segmentation and classification of the data on-the-fly. In many cases it even allows incoming data to be processed in real time. The main idea of the approach is to track and segment changes of the probability density of the data in a sliding window on the incoming data stream. An application to a switching dynamical system demonstrates the potential usefulness of the algorithm in a broad range of applications.
Decomposition Methodology for Classification Tasks – A Meta Decomposer Framework
- PATTERN ANALYSIS AND APPLICATIONS
, 2006
Cited by 4 (2 self)
The idea of decomposition methodology for classification tasks is to break down a complex classification task into several simpler and more manageable sub-tasks that are solvable by using existing induction methods, then joining their solutions together in order to solve the original problem. In this paper we provide an overview of very popular but diverse decomposition methods and introduce a related taxonomy to categorize them. Subsequently we suggest using this taxonomy to create a novel meta-decomposer framework to automatically select the appropriate decomposition method for a given problem. The experimental study validates the effectiveness of the proposed meta-decomposer on a set of benchmark datasets.