Results 1-10 of 25
Cluster Ensembles - A Knowledge Reuse Framework for Combining Multiple Partitions
Journal of Machine Learning Research, 2002
Cited by 603 (20 self)
This paper introduces the problem of combining multiple partitionings of a set of objects into a single consolidated clustering without accessing the features or algorithms that determined these partitionings. We first identify several application scenarios for the resultant 'knowledge reuse' framework that we call cluster ensembles. The cluster ensemble problem is then formalized as a combinatorial optimization problem in terms of shared mutual information. In addition to a direct maximization approach, we propose three effective and efficient techniques for obtaining high-quality combiners (consensus functions). The first combiner induces a similarity measure from the partitionings and then reclusters the objects. The second combiner is based on hypergraph partitioning. The third one collapses groups of clusters into meta-clusters which then compete for each object to determine the combined clustering. Due to the low computational costs of our techniques, it is quite feasible to use a supra-consensus function that evaluates all three approaches against the objective function and picks the best solution for a given situation. We evaluate the effectiveness of cluster ensembles in three qualitatively different application scenarios: (i) where the original clusters were formed based on non-identical sets of features, (ii) where the original clustering algorithms worked on non-identical sets of objects, and (iii) where a common dataset is used and the main purpose of combining multiple clusterings is to improve the quality and robustness of the solution. Promising results are obtained in all three situations for synthetic as well as real datasets.
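The first combiner described in the abstract (induce a similarity from the partitionings, then recluster) can be illustrated with a minimal sketch. The function names `co_association` and `recluster_by_threshold` are hypothetical, and the reclustering step here is a simple thresholded connected-components pass chosen for brevity; it is an assumption standing in for whatever reclustering algorithm the paper actually uses.

```python
def co_association(labelings):
    """Pairwise similarity: fraction of partitionings that put i and j together.

    Only label equality within one partitioning matters, so permuted label
    names across partitionings do not affect the result.
    """
    n = len(labelings[0])
    r = len(labelings)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            agree = sum(1 for lab in labelings if lab[i] == lab[j])
            sim[i][j] = agree / r
    return sim


def recluster_by_threshold(sim, threshold=0.5):
    """Stand-in reclustering (assumption): connected components of the graph
    that links every pair whose similarity exceeds the threshold."""
    n = len(sim)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if labels[v] == -1 and sim[u][v] > threshold:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels


# Three partitionings of six objects; the second is the first with labels swapped.
labelings = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 2, 2],
]
sim = co_association(labelings)
consensus = recluster_by_threshold(sim)
print(consensus)
```

The sketch never touches object features, only the cluster labels, which matches the knowledge-reuse setting the abstract describes.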
On a Mirkin-Muchnik-Smith conjecture for comparing molecular phylogenies
Journal of Computational Biology, 1997
Cited by 53 (5 self)
A conjecture of Mirkin, Muchnik and Smith is answered affirmatively. The conjecture connects the inconsistency function, a biologically meaningful similarity/dissimilarity measure for a gene tree and a species tree, to the mutation cost function, a combinatorial measure based on the mapping of trees. A linear-time algorithm for computing the mutation cost function is also derived from the conjecture.
Metric graph theory and geometry: a survey
 CONTEMPORARY MATHEMATICS
Cited by 44 (14 self)
The article surveys structural characterizations of several graph classes defined by distance properties, which have in part a general algebraic flavor and can be interpreted as subdirect decomposition. The graphs we feature in the first place are the median graphs and their various kinds of generalizations, e.g., weakly modular graphs, or fiber-complemented graphs, or l1-graphs. Several kinds of l1-graphs admit natural geometric realizations as polyhedral complexes. Particular instances of these graphs also occur in other geometric contexts, for example, as dual polar graphs, basis graphs of (even Δ-)matroids, tope graphs, lopsided sets, or plane graphs with vertex degrees and face sizes bounded from below. Several other classes of graphs, e.g., Helly graphs (as injective objects), or bridged graphs (generalizing chordal graphs), or tree-like graphs such as distance-hereditary graphs occur in the investigation of graphs satisfying some basic properties of the distance function, such as the Helly property for balls, or the convexity of balls or of the neighborhoods of convex sets, etc. Operators between graphs or complexes relate some of the
On the question “Who is a J?”: A social choice approach
Logique et Analyse 160: 385–395, 1997
Cited by 35 (1 self)
The determination of “who is a J” within a society is treated as an aggregation of the views of the members of the society regarding this question. Methods similar to those used in social choice theory are applied to axiomatize three criteria for determining who is a J: (1) a J is whoever defines oneself to be a J; (2) a J is whoever a “dictator” determines is a J; (3) a J is whoever an “oligarchy” of individuals agrees is a J. * We wish to thank Dubi Samet for his useful comments on an earlier version of this paper. ** Laura Schwarz-Kipp Chair of Professional Ethics and Philosophy of Practice and
The average consensus procedure: Combination of weighted trees containing identical or overlapping sets of taxa
Syst. Biol., 1997
Cited by 33 (3 self)
The average consensus procedure, originally proposed to combine dendrograms (i.e., ultrametric trees), is extended here to apply to any type of tree with branch lengths (ultrametric or not) containing identical, inclusive, or overlapping sets of taxa. The method proceeds in two steps. First, the average of the path-length matrices corresponding to the trees to be combined is computed. Then, a least-squares algorithm is applied to this average matrix to obtain a consensus solution. The average consensus tree is a solution that minimizes the sum of squared distances between the consensus and the trees in the input profile. An application of the method to combine phylogenetic hypotheses for kangaroos is presented. [Consensus method; kangaroo; Macropodidae; phylogeny; supertree; weighted trees.] Given a set of phylogenies, how could we combine them to obtain a consensus solution that is representative of the entire set? Different methods could lead to various consensus solutions that may be considered representative (for a discussion,
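The first step of the procedure, and the objective that the second step minimizes, can be sketched as follows. This is a minimal illustration that assumes all trees share an identical taxon set and are already given as symmetric path-length matrices; the function names and the toy matrices are hypothetical, and the least-squares tree-fitting step itself is not implemented here.

```python
def average_matrix(matrices):
    """Step 1: element-wise average of the path-length matrices."""
    k = len(matrices)
    n = len(matrices[0])
    return [
        [sum(m[i][j] for m in matrices) / k for j in range(n)]
        for i in range(n)
    ]


def sum_squared_distance(candidate, matrices):
    """Objective: sum over input trees of squared differences in path length,
    counting each unordered pair of taxa once."""
    n = len(candidate)
    total = 0.0
    for m in matrices:
        for i in range(n):
            for j in range(i + 1, n):
                total += (candidate[i][j] - m[i][j]) ** 2
    return total


# Two toy path-length matrices over the same three taxa (made-up values).
tree_a = [[0, 2, 4], [2, 0, 4], [4, 4, 0]]
tree_b = [[0, 4, 6], [4, 0, 6], [6, 6, 0]]

avg = average_matrix([tree_a, tree_b])
cost_avg = sum_squared_distance(avg, [tree_a, tree_b])
cost_a = sum_squared_distance(tree_a, [tree_a, tree_b])
print(avg, cost_avg, cost_a)
```

In general the averaged matrix need not correspond to any tree, which is why the abstract's second step fits a tree to it by least squares.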
The Complexity of Computing Medians of Relations
1998
Cited by 22 (0 self)
Let $N$ be a finite set and $\mathcal{R}$ be the set of all binary relations on $N$. Consider $\mathcal{R}$ endowed with a metric $d$, the symmetric difference distance. For a given $m$-tuple $\Pi = (R_1, \ldots, R_m) \in \mathcal{R}^m$, a relation $R \in \mathcal{R}$ that minimizes the function $\sum_{k=1}^{m} d(R_k, R)$ is called a median relation of $\Pi$. In the social sciences, in qualitative data analysis and in multicriteria decision making, problems occur in which the $m$-tuple $\Pi$ represents collected data (preferences, similarities, games) and the objective is that of finding a median relation of $\Pi$ with some special feature (representing, for example, consensus of preferences, clustering of similar objects, ranking of teams, etc.). In this paper we analyse the computational complexity of all such problems in which the median is required to satisfy one or more of the properties: reflexivity, symmetry, antisymmetry, transitivity and completeness. We prove that whenever transitivity is required (except when symmetry and completeness are also si...
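To make the definition concrete, the following sketch finds median relations by exhaustive search on a three-element set. It is an illustration under stated assumptions, not the paper's method: relations are represented as frozensets of ordered pairs, the names `median_relations` and `keep` are hypothetical, and brute force is only feasible for very small $N$ because there are $2^{|N|^2}$ candidate relations.

```python
from itertools import product


def sym_diff_distance(r, s):
    """Symmetric difference distance: number of ordered pairs in exactly one relation."""
    return len(r ^ s)


def is_transitive(r):
    """True if (a, b) and (b, d) in r always imply (a, d) in r."""
    return all((a, d) in r for (a, b) in r for (c, d) in r if b == c)


def median_relations(profile, elements, keep=lambda r: True):
    """Enumerate every relation on `elements` accepted by `keep` and return
    the minimum total distance to the profile plus all relations attaining it."""
    pairs = [(a, b) for a in elements for b in elements]
    best_cost, best = None, []
    for mask in product([False, True], repeat=len(pairs)):
        r = frozenset(p for p, bit in zip(pairs, mask) if bit)
        if not keep(r):
            continue
        cost = sum(sym_diff_distance(rk, r) for rk in profile)
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, [r]
        elif cost == best_cost:
            best.append(r)
    return best_cost, best


# A cyclic profile of three strict rankings on {1, 2, 3}.
profile = [
    frozenset({(1, 2), (2, 3), (1, 3)}),
    frozenset({(2, 3), (3, 1), (2, 1)}),
    frozenset({(3, 1), (1, 2), (3, 2)}),
]
free_cost, free_best = median_relations(profile, [1, 2, 3])
trans_cost, trans_best = median_relations(profile, [1, 2, 3], keep=is_transitive)
print(free_cost, trans_cost, len(trans_best))
```

With no constraint, the median of this profile is the pairwise-majority relation, which is cyclic; requiring transitivity forces a strictly larger total distance, a small-scale view of why the transitive cases are the hard ones.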
Multiclassifier systems: Back to the future
Multiple Classifier Systems (invited paper), pages 1–15. LNCS, 2002
Cited by 20 (3 self)
While a variety of multiple classifier systems have been studied since at least the late 1950’s, this area came alive in the 90’s with significant theoretical advances as well as numerous successful practical applications. This article argues that our current understanding of ensemble-type multiclassifier systems is now quite mature and exhorts the reader to consider a broader set of models and situations for further progress. Some of these scenarios have already been considered in classical pattern recognition literature, but revisiting them often leads to new insights and progress. As an example, we consider how to integrate multiple clusterings, a problem central to several emerging distributed data mining applications. We also revisit output space decomposition to show how this can lead to extraction of valuable domain knowledge in addition to improved classification accuracy. 1 A Brief History of Multilearner Systems: Multiple classifier systems are special cases of approaches that integrate several
A multifacility location problem on median spaces
Discrete Appl. Math., 1996
Strategyproof Classification
2011
Cited by 3 (2 self)
Experts reporting the labels used by a learning algorithm cannot always be assumed to be truthful. We describe recent advances in the design and analysis of strategyproof mechanisms for binary classification, and their relation to other mechanism design problems.