Results 1  10
of
163
Distributional Clustering Of English Words
 In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics
, 1993
"... We describe and evaluate experimentally a method for clustering words according to their dis tribution in particular syntactic contexts. Words are represented by the relative frequency distributions of contexts in which they appear, and relative entropy between those distributions is used as the si ..."
Abstract

Cited by 629 (27 self)
 Add to MetaCart
We describe and evaluate experimentally a method for clustering words according to their dis tribution in particular syntactic contexts. Words are represented by the relative frequency distributions of contexts in which they appear, and relative entropy between those distributions is used as the similarity measure for clustering. Clusters are represented by average context distributions derived from the given words according to their probabilities of cluster membership. In many cases, the clusters can be thought of as encoding coarse sense distinctions. Deterministic annealing is used to find lowest distortion sets of clusters: as the an nealing parameter increases, existing clusters become unstable and subdivide, yielding a hierarchi cal "soft" clustering of the data. Clusters are used as the basis for class models of word coocurrence, and the models evaluated with respect to heldout test data.
Mean shift, mode seeking, and clustering
 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 1995
"... Mean shift, a simple iterative procedure that shifts each data point to the average of data points in its neighborhood, is generalized and analyzed in this paper. This generalization makes some kmeans like clustering algorithms its special cases. It is shown that mean shift is a modeseeking proce ..."
Abstract

Cited by 624 (0 self)
 Add to MetaCart
Mean shift, a simple iterative procedure that shifts each data point to the average of data points in its neighborhood, is generalized and analyzed in this paper. This generalization makes some kmeans like clustering algorithms its special cases. It is shown that mean shift is a modeseeking process on a surface constructed with a “shadow ” kernel. For Gaussian kernels, mean shift is a gradient mapping. Convergence is studied for mean shift iterations. Cluster analysis is treated as a deterministic problem of finding a fixed point of mean shift that characterizes the data. Applications in clustering and Hough transform are demontrated. Mean shift is also considered as an evolutionary strategy that performs multistart global optimization.
A Unified Mixture Framework for Motion Segmentation: Incorporating Spatial Coherence and Estimating the Number of Models
"... Describing a video sequence in terms of a small number of coherently moving segments is useful for tasks ranging from video compression to event perception. A promising approach is to view the motion segmentation problem in a mixture estimation framework. However, existing formulations generally use ..."
Abstract

Cited by 175 (5 self)
 Add to MetaCart
Describing a video sequence in terms of a small number of coherently moving segments is useful for tasks ranging from video compression to event perception. A promising approach is to view the motion segmentation problem in a mixture estimation framework. However, existing formulations generally use only the motion data and thus fail to make use of static cues when segmenting the sequence. Furthermore, the number of models is either specified in advance or estimated outside the mixturemodel framework. In this work we address both of these issues. We show how to add spatial constraints to the mixture formulations and present a variant of the EM algorithm that makes use of both the form and the motion constraints. Moreover this algorithm estimates the number of segments given knowledge about the level of model failure expected in the sequence. The algorithm's performance is illustrated on synthetic and real image sequences.
SOMBased Data Visualization Methods
 Intelligent Data Analysis
, 1999
"... The SelfOrganizing Map (SOM) is an efficient tool for visualization of multidimensional numerical data. In this paper, an overview and categorization of both old and new methods for the visualization of SOM is presented. The purpose is to give an idea of what kind of information can be acquired fro ..."
Abstract

Cited by 124 (4 self)
 Add to MetaCart
(Show Context)
The SelfOrganizing Map (SOM) is an efficient tool for visualization of multidimensional numerical data. In this paper, an overview and categorization of both old and new methods for the visualization of SOM is presented. The purpose is to give an idea of what kind of information can be acquired from different presentations and how the SOM can best be utilized in exploratory data visualization. Most of the presented methods can also be applied in the more general case of first making a vector quantization (e.g. kmeans) and then a vector projection (e.g. Sammon's mapping).
Unsupervised Learning from Dyadic Data
, 1998
"... Dyadic data refers to a domain with two finite sets of objects in which observations are made for dyads, i.e., pairs with one element from either set. This includes event cooccurrences, histogram data, and single stimulus preference data as special cases. Dyadic data arises naturally in many applic ..."
Abstract

Cited by 122 (11 self)
 Add to MetaCart
(Show Context)
Dyadic data refers to a domain with two finite sets of objects in which observations are made for dyads, i.e., pairs with one element from either set. This includes event cooccurrences, histogram data, and single stimulus preference data as special cases. Dyadic data arises naturally in many applications ranging from computational linguistics and information retrieval to preference analysis and computer vision. In this paper, we present a systematic, domainindependent framework for unsupervised learning from dyadic data by statistical mixture models. Our approach covers different models with flat and hierarchical latent class structures and unifies probabilistic modeling and structure discovery. Mixture models provide both, a parsimonious yet flexible parameterization of probability distributions with good generalization performance on sparse data, as well as structural information about datainherent grouping structure. We propose an annealed version of the standard Expectation Maximization algorithm for model fitting which is empirically evaluated on a variety of data sets from different domains.
Nonlinear Gated Experts for Time Series: Discovering Regimes and Avoiding Overfitting
, 1995
"... this paper: ftp://ftp.cs.colorado.edu/pub/TimeSeries/MyPapers/experts.ps.Z, ..."
Abstract

Cited by 110 (5 self)
 Add to MetaCart
this paper: ftp://ftp.cs.colorado.edu/pub/TimeSeries/MyPapers/experts.ps.Z,
Resampling method for unsupervised estimation of cluster validity
 Neural Computation
, 2001
"... We introduce a method for validation of results obtained by clustering analysis of data. The method is based on resampling the available data. A figure of merit that measures the stability of clustering solutions against resampling is introduced. Clusters which are stable against resampling give ris ..."
Abstract

Cited by 88 (3 self)
 Add to MetaCart
(Show Context)
We introduce a method for validation of results obtained by clustering analysis of data. The method is based on resampling the available data. A figure of merit that measures the stability of clustering solutions against resampling is introduced. Clusters which are stable against resampling give rise to local maxima of this figure of merit. This is presented first for a onedimensional data set, for which an analytic approximation for the figure of merit is derived and compared with numerical measurements. Next, the applicability of the method is demonstrated for higher dimensional data, including gene microarray expression data. 1
A Survey of Fuzzy Clustering Algorithms for Pattern Recognition  Part 11
"... the concepts of fuzzy clustering and soft competitive learning in clustering algorithms is proposed on the basis of the existing literature. Moreover, a set of functional attributes is selected for use as dictionary entries in the comparison of clustering algorithms. In this paper, five clustering a ..."
Abstract

Cited by 81 (2 self)
 Add to MetaCart
(Show Context)
the concepts of fuzzy clustering and soft competitive learning in clustering algorithms is proposed on the basis of the existing literature. Moreover, a set of functional attributes is selected for use as dictionary entries in the comparison of clustering algorithms. In this paper, five clustering algorithms taken from the literature are reviewed, assessed and compared on the basis of the selected properties of interest. These clustering models are 1) selforganizing map (SOM); 2) fuzzy learning vector quantization (FLVQ); 3) fuzzy adaptive resonance theory (fuzzy ART); 4) growing neural gas (GNG); 5) fully selforganizing simplified adaptive resonance theory (FOSART). Although our theoretical comparison is fairly simple, it yields observations that may appear parodoxical. First, only FLVQ, fuzzy ART, and FOSART exploit concepts derived from fuzzy set theory (e.g., relative and/or absolute fuzzy membership functions). Secondly, only SOM, FLVQ, GNG, and FOSART employ soft competitive learning mechanisms, which are affected by asymptotic misbehaviors in the case of FLVQ, i.e., only SOM, GNG, and FOSART are considered effective fuzzy clustering algorithms. Index Terms—Ecological net, fuzzy clustering, modular architecture, relative and absolute membership function, soft and hard competitive learning, topologically correct mapping. I.
Annealed Competition of Experts for a Segmentation and Classification of Switching Dynamics
, 1996
"... We present a method for the unsupervised segmentation of data streams originating from different unknown sources which alternate in time. We use an architecture consisting of competing neural networks. Memory is included in order to resolve ambiguities of inputoutput relations. In order to obtain m ..."
Abstract

Cited by 75 (22 self)
 Add to MetaCart
We present a method for the unsupervised segmentation of data streams originating from different unknown sources which alternate in time. We use an architecture consisting of competing neural networks. Memory is included in order to resolve ambiguities of inputoutput relations. In order to obtain maximal specialization, the competition is adiabatically increased during training. Our method achieves almost perfect identification and segmentation in the case of switching chaotic dynamics where input manifolds overlap and inputoutput relations are ambiguous. Only a small dataset is needed for the training proceedure. Applications to time series from complex systems demonstrate the potential relevance of our approach for time series analysis and shortterm prediction. 1 Introduction Neural networks provide frameworks for the representation of relations present in data. Especially in the fields of classification and time series prediction, neural networks Corresponding author, email:k...
Data clustering using a model granular magnet
 Neural Computation
, 1997
"... We present a new approach to clustering, based on the physical properties of an inhomogeneous ferromagnet. No assumption is made regarding the underlying distribution of the data. We assign a Potts spin to each data point and introduce an interaction between neighboring points, whose strength is a d ..."
Abstract

Cited by 72 (4 self)
 Add to MetaCart
We present a new approach to clustering, based on the physical properties of an inhomogeneous ferromagnet. No assumption is made regarding the underlying distribution of the data. We assign a Potts spin to each data point and introduce an interaction between neighboring points, whose strength is a decreasing function of the distance between the neighbors. This magnetic system exhibits three phases. At very low temperatures, it is completely ordered; all spins are aligned. At very high temperatures, the system does not exhibit any ordering, and in an intermediate regime, clusters of relatively strongly coupled spins become ordered, whereas different clusters remain uncorrelated. This intermediate phase is identified by a jump in the order parameters. The spinspin correlation function is used to partition the spins and the corresponding data points into clusters. We demonstrate on three synthetic and three real data sets how the method works. Detailed comparison to the performance of other techniques clearly indicates the relative success of our method. 1