A simple method for generating additive clustering models with limited complexity
- Machine Learning
, 2002
Cited by 9 (3 self)
Abstract. Additive clustering was originally developed within cognitive psychology to enable the development of featural models of human mental representation. The representational flexibility of additive clustering, however, suggests its more general application to modeling complicated relationships between objects in non-psychological domains of interest. This paper describes, demonstrates, and evaluates a simple method for learning additive clustering models, based on the combinatorial optimization approach known as Population-Based Incremental Learning. The performance of this new method is shown to be comparable with previously developed methods over a set of ‘benchmark’ data sets. In addition, the method developed here has the potential, by using a Bayesian analysis of model complexity that relies on an estimate of data precision, to determine the appropriate number of clusters to include in a model.
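As a rough illustration of how Population-Based Incremental Learning can search for binary cluster memberships, the sketch below keeps one probability per membership bit, samples candidate membership vectors, and nudges the probabilities toward the best candidate of each generation. It is a simplification and not the paper's procedure: the cluster weights are held fixed, no complexity penalty is applied, and all names (`pbil`, `squared_error`, `rate`, `pop_size`) are hypothetical.

```python
import random

def predicted_similarity(bits, i, j, m, weights):
    """Additive-model similarity of objects i and j (row-major n x m membership bits)."""
    return sum(weights[k] * bits[i * m + k] * bits[j * m + k] for k in range(m))

def squared_error(bits, target, n, m, weights):
    """Sum of squared errors over all distinct object pairs."""
    return sum(
        (target[i][j] - predicted_similarity(bits, i, j, m, weights)) ** 2
        for i in range(n) for j in range(i + 1, n)
    )

def pbil(target, weights, n, m, pop_size=30, generations=60, rate=0.1, seed=0):
    """Search for membership bits with PBIL; weights are fixed (an assumption)."""
    rng = random.Random(seed)
    probs = [0.5] * (n * m)              # one probability per membership bit
    best_bits, best_err = None, float("inf")
    for _ in range(generations):
        # Sample a population of candidate bit vectors from the probability vector.
        population = [
            [1 if rng.random() < p else 0 for p in probs] for _ in range(pop_size)
        ]
        winner = min(population, key=lambda b: squared_error(b, target, n, m, weights))
        winner_err = squared_error(winner, target, n, m, weights)
        if winner_err < best_err:
            best_bits, best_err = winner, winner_err
        # Shift each probability toward the corresponding bit of the generation's best.
        probs = [(1 - rate) * p + rate * b for p, b in zip(probs, winner)]
    return best_bits, best_err

# Toy data: 4 objects, 2 clusters, similarities generated from a known structure.
true_bits = [1, 0, 1, 0, 0, 1, 0, 1]
weights = [0.7, 0.4]
target = [[predicted_similarity(true_bits, i, j, 2, weights) for j in range(4)]
          for i in range(4)]
best_bits, best_err = pbil(target, weights, n=4, m=2)
print(best_bits, best_err)
```

A fuller treatment would also estimate the weights for each candidate and trade fit against the number of clusters, which this sketch leaves out.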
Latent Features in Similarity Judgments: A Nonparametric Bayesian Approach
Cited by 6 (1 self)
One of the central problems in cognitive science is determining the mental representations that underlie human inferences. Solutions to this problem often rely on the analysis of subjective similarity judgments, on the assumption that recognizing “likenesses” between people, objects and events is crucial to everyday inference. One such solution is provided by the additive clustering model, which is widely used to infer the features of a set of stimuli from their similarities, on the assumption that similarity is a weighted linear function of common features. Existing approaches for implementing additive clustering often lack a complete framework for statistical inference, particularly with respect to choosing the number of features. To address these problems, this paper develops a fully Bayesian formulation of the additive clustering model, using methods from nonparametric Bayesian statistics to allow the number of features to vary. We use this to explore several approaches to parameter estimation, showing that the nonparametric Bayesian approach provides a straightforward way to obtain estimates of both the number of features and their importance.
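The assumption that similarity is a weighted linear function of common features can be made concrete with a small sketch. Here each stimulus is a row of binary features, and the similarity of two stimuli is the sum of the weights of the features they share. The function name, the example data, and the optional additive `constant` are illustrative assumptions, not taken from the paper.

```python
def additive_similarity(features, weights, constant=0.0):
    """Return s[i][j] = constant + sum_k weights[k] * f[i][k] * f[j][k] for i != j."""
    n = len(features)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # self-similarities are not modeled in this sketch
            shared = sum(
                w * fi * fj for w, fi, fj in zip(weights, features[i], features[j])
            )
            sim[i][j] = constant + shared
    return sim

# Three stimuli, two binary features (hypothetical data).
features = [
    [1, 0],  # stimulus 0 has feature 0 only
    [1, 1],  # stimulus 1 has both features
    [0, 1],  # stimulus 2 has feature 1 only
]
weights = [0.6, 0.3]
sim = additive_similarity(features, weights)
print(sim[0][1], sim[1][2], sim[0][2])
```

Stimuli 0 and 1 share only the first feature, stimuli 1 and 2 share only the second, and stimuli 0 and 2 share none, so their model similarities are the first weight, the second weight, and zero respectively.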
Common and Distinctive Features in Stimulus Similarity: A Modified Version of the Contrast Model
, 2002
Cited by 2 (1 self)
Featural representations of similarity data assume that people represent stimuli in terms of a set of discrete properties. We consider the differences in featural representations that arise from making four different assumptions about how similarity is measured. Three of these similarity models (the common features model, the distinctive features model, and Tversky's seminal contrast model) have been considered previously. The other model is new, and modifies the contrast model by assuming that each individual feature only ever acts as a common or distinctive feature. Each of the four models is tested on previously examined similarity data, relating to kinship terms, and on a new data set, relating to faces. In fitting the models, we use the Geometric Complexity Criterion to balance the competing demands of data-fit and model complexity. The results show that both common and distinctive features are important for stimulus representation, and we argue that the modified contrast model combines these two components in a more effective and interpretable way than Tversky's original formulation.
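To make the four assumptions easier to compare, the sketch below computes a similarity score for one pair of stimuli under each of them. The parameterization is an assumption made for illustration: features are binary, each has a weight, the contrast model uses Tversky's form (common features weighted by `theta`, minus the two sets of distinctive features weighted by `alpha` and `beta`), and the modified model is read as tagging each feature as either "common" or "distinctive". The paper's exact formulation may differ.

```python
def common_weight(a, b, weights):
    """Total weight of features possessed by both stimuli."""
    return sum(w for w, x, y in zip(weights, a, b) if x and y)

def distinctive_weight(a, b, weights):
    """Total weight of features possessed by exactly one of the stimuli."""
    return sum(w for w, x, y in zip(weights, a, b) if x != y)

def contrast_similarity(a, b, weights, theta=1.0, alpha=0.5, beta=0.5):
    """Tversky-style contrast: common features minus each side's distinctive features."""
    a_only = sum(w for w, x, y in zip(weights, a, b) if x and not y)
    b_only = sum(w for w, x, y in zip(weights, a, b) if y and not x)
    return theta * common_weight(a, b, weights) - alpha * a_only - beta * b_only

def modified_contrast_similarity(a, b, weights, kinds):
    """Each feature acts only as 'common' or only as 'distinctive' (assumed reading)."""
    score = 0.0
    for w, x, y, kind in zip(weights, a, b, kinds):
        if kind == "common" and x and y:
            score += w   # shared common-type feature raises similarity
        elif kind == "distinctive" and x != y:
            score -= w   # unshared distinctive-type feature lowers similarity
    return score

# Hypothetical pair of stimuli over three binary features.
a = [1, 1, 0]
b = [1, 0, 1]
weights = [0.5, 0.2, 0.1]
kinds = ["common", "distinctive", "distinctive"]

common = common_weight(a, b, weights)
distinct = distinctive_weight(a, b, weights)
contrast = contrast_similarity(a, b, weights)
modified = modified_contrast_similarity(a, b, weights, kinds)
print(common, distinct, contrast, modified)
```

In the common features model only `common` contributes to similarity, while in the distinctive features model `distinct` acts as a dissimilarity; the two contrast variants combine both components.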
Clustering using the contrast model
- In
, 2001
Cited by 1 (1 self)
An algorithm is developed for generating featural representations from similarity data using Tversky’s (1977) Contrast Model. Unlike previous additive clustering approaches, the algorithm fits a representational model that allows for stimulus similarity to be measured in terms of both common and distinctive features. The important issue of striking an appropriate balance between data fit and representational complexity is addressed through the use of the Geometric Complexity Criterion to guide model selection. The ability of the algorithm to recover known featural representations from noisy data is tested, and it is also applied to real data measuring the similarity of kinship terms.
Combinatorial Optimization in Clustering
Contents
1 Introduction
2 Types of Data
3 Cluster Structures
4 Clustering Criteria
5 Single Cluster Clustering
5.1 Clustering Approaches
5.1.1 Definition-based Clusters
5.1.2 Direct Algorithms
5.1.3 Optimal Clusters
5.2 Single and Monotone Linkage Clusters
5.2.1 MST and Single Linkage Clustering
5.2.2 Monotone Linkage Clusters
5.2.3 Modeling Skeletons in Digital Image Processing
5.2.4 Linkage-based Convex Criteria
5.3 Moving Center and Approximation Clusters
5.3.1 Criteria for Moving Center Methods
5.3.2 Principal Cluster
5.3.3 Additive Cluster
5.3.4 Seriation with Returns
6 Partitioning
Approximation Clustering: a Mine of Semidefinite Programming Problems
Clustering is a discipline devoted to finding homogeneous groups of data entities. In contrast to conventional clustering, which involves data processing in terms of either entities or variables, approximation clustering is aimed at processing the data matrices as they are. Currently, approximation clustering is a set of clustering models and methods based on approximate decomposition of the data table into scalar product matrices representing weighted subsets, partitions or hierarchies as the sought clustering structures. Some of the problems involved are semidefinite programming problems; the others seem quite similar.
1 Introduction
Clustering models may differ depending on the nature of data. We distinguish here among three types of data: column-conditional, similarity and aggregable ones. The first two are those usually considered in clustering: a column-conditional data set is represented by an entity-to-variable matrix so that the entries within any column (variable) can be c...
Constructing and Mapping Fuzzy Thematic Clusters to Higher Ranks in a Taxonomy
Abstract. We present a method for mapping a structure to a related taxonomy in a thematically consistent way. The components of the structure are supplied with fuzzy profiles over the taxonomy. These are then generalized in two steps: first, by fuzzy clustering, and then by mapping the clusters to higher ranks of the taxonomy. To be specific, we concentrate on the Computer Sciences area represented by the ACM Computing Classification System (ACM-CCS), but the approach is also applicable to other taxonomies. We build fuzzy clusters of the taxonomy subjects according to the similarity between individual profiles. Clusters are extracted using an original additive spectral clustering method involving a number of model-based stopping conditions. The clusters are parsimoniously lifted to higher ranks of the taxonomy using an original recursive algorithm for minimizing a penalty function that involves “head subjects” on the higher ranks of the taxonomy along with their “gaps” and “offshoots”. An example is given illustrating the method applied to real-world data.
ACM Classification Can Be Used for Representing Research Organizations
, 2007
1. Several visits to DIMACS 2002–2005 contributed to this work; a visit to DIMACS in August 2007 helped to finalize the report.
2. Supported by the grant PTDC/EIA/69988/2006 from the Portuguese Science & Technology Foundation.
Latent Features in Similarity Judgments: A Nonparametric Bayesian Approach
Article communicated by Michael Lee
One of the central problems in cognitive science is determining the mental representations that underlie human inferences. Solutions to this problem often rely on the analysis of subjective similarity judgments, on the assumption that recognizing likenesses between people, objects, and events is crucial to everyday inference. One such solution is provided by the additive clustering model, which is widely used to infer the features of a set of stimuli from their similarities, on the assumption that similarity is a weighted linear function of common features. Existing approaches for implementing additive clustering often lack a complete framework for statistical inference, particularly with respect to choosing the number of features. To address these problems, this article develops a fully Bayesian formulation of the additive clustering model, using methods from nonparametric Bayesian statistics to allow the number of features to vary. We use this to explore several approaches to parameter estimation, showing that the nonparametric Bayesian approach provides a straightforward way to obtain estimates of both the number of features and their importance.
A Hybrid Cluster-Lift Method for the Analysis of Research Activities
Abstract. A hybrid of two novel methods, additive fuzzy spectral clustering and a lifting method over a taxonomy, is applied to analyse the research activities of a department. To be specific, we concentrate on the Computer Sciences area represented by the ACM Computing Classification System (ACM-CCS), but the approach is also applicable to other taxonomies. Clusters of the taxonomy subjects are extracted using an original additive spectral clustering method involving a number of model-based stopping conditions. The clusters are then parsimoniously lifted to higher ranks of the taxonomy by minimizing the count of “head subjects” along with their “gaps” and “offshoots”. An example is given illustrating the method applied to real-world data.

