Distributional Clustering Of English Words
In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, 1993
Cited by 629 (27 self)
We describe and evaluate experimentally a method for clustering words according to their distribution in particular syntactic contexts. Words are represented by the relative frequency distributions of contexts in which they appear, and relative entropy between those distributions is used as the similarity measure for clustering. Clusters are represented by average context distributions derived from the given words according to their probabilities of cluster membership. In many cases, the clusters can be thought of as encoding coarse sense distinctions. Deterministic annealing is used to find lowest distortion sets of clusters: as the annealing parameter increases, existing clusters become unstable and subdivide, yielding a hierarchical "soft" clustering of the data. Clusters are used as the basis for class models of word co-occurrence, and the models are evaluated with respect to held-out test data.
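The soft-clustering step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes memberships proportional to exp(-beta * D(p_w || q_c)) with uniform word weights, a fixed value of `beta`, and toy distributions. The function names, the sample data, and the initial centroids are all hypothetical. It does not implement the annealing schedule or the cluster splitting that the abstract mentions.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Relative entropy D(p || q) in nats; eps guards against division by zero."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def memberships(word_dists, centroids, beta):
    """Soft membership of each word in each cluster (assumed exponential form)."""
    result = []
    for p in word_dists:
        d = [kl_divergence(p, q) for q in centroids]
        m = min(d)  # subtract the minimum for numerical stability
        weights = [math.exp(-beta * (x - m)) for x in d]
        total = sum(weights)
        result.append([w / total for w in weights])
    return result

def update_centroids(word_dists, member):
    """Each centroid is the membership-weighted average of the word distributions.
    Assumes every cluster retains some membership mass."""
    n_words = len(word_dists)
    n_contexts = len(word_dists[0])
    centroids = []
    for c in range(len(member[0])):
        mass = sum(row[c] for row in member)
        centroids.append([
            sum(member[w][c] * word_dists[w][k] for w in range(n_words)) / mass
            for k in range(n_contexts)
        ])
    return centroids

def soft_cluster(word_dists, centroids, beta, iterations=50):
    """Alternate membership and centroid updates at a fixed beta."""
    for _ in range(iterations):
        member = memberships(word_dists, centroids, beta)
        centroids = update_centroids(word_dists, member)
    return member, centroids

# Toy data: four "words", each a distribution over three contexts.
words = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.2, 0.7], [0.1, 0.3, 0.6]]
member, centroids = soft_cluster(words, [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]], beta=10.0)
```

In this sketch a small `beta` gives nearly uniform memberships and a large `beta` approaches a hard assignment. An annealing procedure would rerun the updates while gradually raising `beta`.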
Mean shift, mode seeking, and clustering
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1995
Cited by 624 (0 self)
Mean shift, a simple iterative procedure that shifts each data point to the average of data points in its neighborhood, is generalized and analyzed in this paper. This generalization makes some k-means-like clustering algorithms its special cases. It is shown that mean shift is a mode-seeking process on a surface constructed with a “shadow” kernel. For Gaussian kernels, mean shift is a gradient mapping. Convergence is studied for mean shift iterations. Cluster analysis is treated as a deterministic problem of finding a fixed point of mean shift that characterizes the data. Applications in clustering and Hough transform are demonstrated. Mean shift is also considered as an evolutionary strategy that performs multistart global optimization.
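The basic iteration can be sketched in a few lines. This sketch assumes a Gaussian kernel and the variant in which the data stay fixed while a query point moves toward a mode; the paper's generalization covers other kernels and variants. The function names, the bandwidth value, and the sample data are hypothetical.

```python
import math

def mean_shift_step(point, data, bandwidth):
    """Move `point` to the Gaussian-weighted average of the data points."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(point, x)) / (2 * bandwidth ** 2))
        for x in data
    ]
    total = sum(weights)
    return tuple(
        sum(w * x[d] for w, x in zip(weights, data)) / total
        for d in range(len(point))
    )

def seek_mode(start, data, bandwidth, tol=1e-6, max_iter=500):
    """Iterate mean shift from `start` until the step is smaller than `tol`."""
    point = tuple(start)
    for _ in range(max_iter):
        nxt = mean_shift_step(point, data, bandwidth)
        if math.dist(point, nxt) < tol:
            return nxt
        point = nxt
    return point

# Two well-separated clumps of 2-D points.
data = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
modes = [seek_mode(x, data, bandwidth=1.0) for x in data]
```

Points whose iterations end at nearly the same location can then be grouped into one cluster. The bandwidth controls how many modes appear.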
A Unified Mixture Framework for Motion Segmentation: Incorporating Spatial Coherence and Estimating the Number of Models
Cited by 175 (5 self)
Describing a video sequence in terms of a small number of coherently moving segments is useful for tasks ranging from video compression to event perception. A promising approach is to view the motion segmentation problem in a mixture estimation framework. However, existing formulations generally use only the motion data and thus fail to make use of static cues when segmenting the sequence. Furthermore, the number of models is either specified in advance or estimated outside the mixture model framework. In this work we address both of these issues. We show how to add spatial constraints to the mixture formulations and present a variant of the EM algorithm that makes use of both the form and the motion constraints. Moreover, this algorithm estimates the number of segments given knowledge about the level of model failure expected in the sequence. The algorithm's performance is illustrated on synthetic and real image sequences.
SOM-Based Data Visualization Methods
Intelligent Data Analysis, 1999
Cited by 124 (4 self)
The Self-Organizing Map (SOM) is an efficient tool for visualization of multidimensional numerical data. In this paper, an overview and categorization of both old and new methods for the visualization of SOM is presented. The purpose is to give an idea of what kind of information can be acquired from different presentations and how the SOM can best be utilized in exploratory data visualization. Most of the presented methods can also be applied in the more general case of first making a vector quantization (e.g. k-means) and then a vector projection (e.g. Sammon's mapping).
Unsupervised Learning from Dyadic Data
1998
Cited by 122 (11 self)
Dyadic data refers to a domain with two finite sets of objects in which observations are made for dyads, i.e., pairs with one element from either set. This includes event co-occurrences, histogram data, and single stimulus preference data as special cases. Dyadic data arises naturally in many applications ranging from computational linguistics and information retrieval to preference analysis and computer vision. In this paper, we present a systematic, domain-independent framework for unsupervised learning from dyadic data by statistical mixture models. Our approach covers different models with flat and hierarchical latent class structures and unifies probabilistic modeling and structure discovery. Mixture models provide both a parsimonious yet flexible parameterization of probability distributions with good generalization performance on sparse data and structural information about data-inherent grouping structure. We propose an annealed version of the standard Expectation Maximization algorithm for model fitting, which is empirically evaluated on a variety of data sets from different domains.
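The abstract does not say how the EM algorithm is annealed. One common form, assumed here purely for illustration, tempers the E-step by raising each component's joint probability to a power `beta` before normalizing. The function name and the sample values below are hypothetical.

```python
import math

def tempered_posteriors(log_joint, beta):
    """Posterior over latent classes proportional to exp(beta * log_joint[k]).

    log_joint[k] is the log of (class prior times likelihood) for class k.
    beta = 1 gives the standard E-step; beta = 0 gives a uniform posterior.
    """
    scaled = [beta * v for v in log_joint]
    m = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

# Joint probabilities of one observation under three latent classes.
log_joint = [math.log(0.6), math.log(0.3), math.log(0.1)]
standard = tempered_posteriors(log_joint, beta=1.0)
smoothed = tempered_posteriors(log_joint, beta=0.5)
uniform = tempered_posteriors(log_joint, beta=0.0)
```

Under this assumption, values of `beta` below 1 flatten the posteriors, which tends to reduce overfitting on sparse counts. The M-step is left unchanged.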
Nonlinear Gated Experts for Time Series: Discovering Regimes and Avoiding Overfitting
1995
Cited by 110 (5 self)
this paper: ftp://ftp.cs.colorado.edu/pub/Time-Series/MyPapers/experts.ps.Z,
Resampling method for unsupervised estimation of cluster validity
Neural Computation, 2001
Cited by 88 (3 self)
We introduce a method for validation of results obtained by clustering analysis of data. The method is based on resampling the available data. A figure of merit that measures the stability of clustering solutions against resampling is introduced. Clusters which are stable against resampling give rise to local maxima of this figure of merit. This is presented first for a one-dimensional data set, for which an analytic approximation for the figure of merit is derived and compared with numerical measurements. Next, the applicability of the method is demonstrated for higher dimensional data, including gene microarray expression data.
A Survey of Fuzzy Clustering Algorithms for Pattern Recognition - Part II
Cited by 81 (2 self)
the concepts of fuzzy clustering and soft competitive learning in clustering algorithms is proposed on the basis of the existing literature. Moreover, a set of functional attributes is selected for use as dictionary entries in the comparison of clustering algorithms. In this paper, five clustering algorithms taken from the literature are reviewed, assessed and compared on the basis of the selected properties of interest. These clustering models are 1) self-organizing map (SOM); 2) fuzzy learning vector quantization (FLVQ); 3) fuzzy adaptive resonance theory (fuzzy ART); 4) growing neural gas (GNG); 5) fully self-organizing simplified adaptive resonance theory (FOSART). Although our theoretical comparison is fairly simple, it yields observations that may appear paradoxical. First, only FLVQ, fuzzy ART, and FOSART exploit concepts derived from fuzzy set theory (e.g., relative and/or absolute fuzzy membership functions). Secondly, only SOM, FLVQ, GNG, and FOSART employ soft competitive learning mechanisms, which are affected by asymptotic misbehaviors in the case of FLVQ, i.e., only SOM, GNG, and FOSART are considered effective fuzzy clustering algorithms. Index Terms: Ecological net, fuzzy clustering, modular architecture, relative and absolute membership function, soft and hard competitive learning, topologically correct mapping.
Annealed Competition of Experts for a Segmentation and Classification of Switching Dynamics
1996
Cited by 75 (22 self)
We present a method for the unsupervised segmentation of data streams originating from different unknown sources which alternate in time. We use an architecture consisting of competing neural networks. Memory is included in order to resolve ambiguities of input-output relations. In order to obtain maximal specialization, the competition is adiabatically increased during training. Our method achieves almost perfect identification and segmentation in the case of switching chaotic dynamics where input manifolds overlap and input-output relations are ambiguous. Only a small dataset is needed for the training procedure. Applications to time series from complex systems demonstrate the potential relevance of our approach for time series analysis and short-term prediction.
Data clustering using a model granular magnet
Neural Computation, 1997
Cited by 72 (4 self)
We present a new approach to clustering, based on the physical properties of an inhomogeneous ferromagnet. No assumption is made regarding the underlying distribution of the data. We assign a Potts spin to each data point and introduce an interaction between neighboring points, whose strength is a decreasing function of the distance between the neighbors. This magnetic system exhibits three phases. At very low temperatures, it is completely ordered; all spins are aligned. At very high temperatures, the system does not exhibit any ordering, and in an intermediate regime, clusters of relatively strongly coupled spins become ordered, whereas different clusters remain uncorrelated. This intermediate phase is identified by a jump in the order parameters. The spin-spin correlation function is used to partition the spins and the corresponding data points into clusters. We demonstrate on three synthetic and three real data sets how the method works. Detailed comparison to the performance of other techniques clearly indicates the relative success of our method.