Prototype selection for dissimilarity-based classifiers
 Pattern Recognition
, 2006
Cited by 66 (9 self)
A conventional way to discriminate between objects represented by dissimilarities is the nearest neighbor method. A more efficient and sometimes more accurate solution is offered by other dissimilarity-based classifiers. They construct a decision rule based on the entire training set, but they need just a small set of prototypes, the so-called representation set, as a reference for classifying new objects. Such alternative approaches may be especially advantageous for non-Euclidean or even non-metric dissimilarities. The choice of a proper representation set for dissimilarity-based classifiers is not yet fully investigated. It appears that a random selection may work well. In this paper, a number of experiments have been conducted on various metric and non-metric dissimilarity representations and prototype selection methods. Several procedures, like traditional feature selection methods (here effectively searching for prototypes), mode seeking and linear programming, are compared to the random selection. In general, we find that systematic approaches lead to better results than the random selection, especially for a small number of prototypes. Although there is no single winner, as it depends on data characteristics, the k-centres method works well in general. For two-class problems, an important observation is that our dissimilarity-based discrimination functions relying on significantly reduced prototype sets (3–10% of the training objects) offer a similar or much better classification accuracy than the best k-NN rule on the entire training set. This may be reached for multiclass data as well; however, such problems are more difficult.
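The idea of classifying objects through their dissimilarities to a small representation set can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure evaluated in the paper: the dissimilarity is plain Euclidean distance, prototypes are picked at random (the baseline the abstract mentions), the classifier is an ordinary least-squares linear discriminant, and all function names are hypothetical.

```python
import numpy as np

def dissimilarity_representation(X, prototypes):
    """Describe each row of X by its Euclidean distances to the prototypes."""
    diff = X[:, None, :] - prototypes[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def select_random_prototypes(n_objects, n_prototypes, rng):
    """Random selection baseline: indices of the objects used as prototypes."""
    return rng.choice(n_objects, size=n_prototypes, replace=False)

def fit_linear_classifier(D, y):
    """Least-squares linear discriminant on dissimilarity vectors (y in {-1, +1})."""
    A = np.hstack([D, np.ones((len(D), 1))])  # append a bias column
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict(D, w):
    A = np.hstack([D, np.ones((len(D), 1))])
    return np.where(A @ w >= 0, 1, -1)

# Toy data (assumption): two Gaussian classes in the plane.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

proto_idx = select_random_prototypes(len(X), 5, rng)
D = dissimilarity_representation(X, X[proto_idx])  # 100 objects x 5 prototypes
w = fit_linear_classifier(D, y)                    # trained on ALL objects
accuracy = (predict(D, w) == y).mean()
```

The classifier is trained on every training object, but a new object only needs its distances to the five prototypes. A systematic selection method would replace `select_random_prototypes` without changing the rest.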
Categorization Approach to Automated Ontological Function Annotation
 Protein Science
, 2006
Cited by 29 (4 self)
Automated Function Prediction (AFP) methods increasingly use knowledge discovery algorithms to map sequence, structure, literature, and/or pathway information about proteins whose functions are unknown into functional ontologies, typically (a portion of) the Gene Ontology (GO). While there are a growing number of methods within this paradigm, the general problem of assessing the accuracy of such prediction algorithms has not been seriously addressed. We present first an application for function prediction from protein sequences using the POSet Ontology Categorizer (POSOC) to produce new annotations by analyzing collections of GO nodes derived from annotations of protein BLAST neighborhoods. We then also present hierarchical precision and hierarchical recall as new evaluation metrics for assessing the accuracy of any predictions in hierarchical ontologies, and discuss results on a test set of protein sequences. We show that our method provides substantially improved hierarchical precision (measure of predictions made which are correct) when applied to the nearest BLAST neighbors of target proteins, as compared with simply imputing that neighborhood’s annotations to the target. Moreover, when our method is applied to a broader BLAST neighborhood, hierarchical precision is enhanced even further. In all cases, such increased hierarchical precision performance is purchased at a modest expense of hierarchical recall (measure of all annotations which get predicted at all).
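As a rough illustration of precision and recall in a hierarchy, the sketch below uses one common set-based formulation: both the predicted and the true node sets are extended with all of their ancestors before the overlap is counted. This formulation is an assumption and may differ from the exact definition used in the paper. The toy ontology and the function names are hypothetical.

```python
def with_ancestors(nodes, parents):
    """Return the given nodes together with all of their ancestors."""
    closed = set()
    stack = list(nodes)
    while stack:
        node = stack.pop()
        if node not in closed:
            closed.add(node)
            stack.extend(parents.get(node, ()))
    return closed

def hierarchical_pr(predicted, true, parents):
    """Set-based hierarchical precision and recall (assumed formulation)."""
    p = with_ancestors(predicted, parents)
    t = with_ancestors(true, parents)
    overlap = len(p & t)
    precision = overlap / len(p) if p else 0.0
    recall = overlap / len(t) if t else 0.0
    return precision, recall

# Toy ontology (assumption): child -> list of parents, root is "a".
parents = {"b": ["a"], "c": ["a"], "d": ["b"], "e": ["b"]}

# Predicting a sibling of the true node still earns partial credit.
hp_sibling, hr_sibling = hierarchical_pr({"d"}, {"e"}, parents)

# Predicting the parent of the true node is fully precise but not fully complete.
hp_parent, hr_parent = hierarchical_pr({"b"}, {"d"}, parents)
```

Unlike flat precision and recall, a near miss in the ontology is rewarded in proportion to the ancestors it shares with the true annotation. Note that this version counts the root node; some formulations exclude it.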
Weighted clustering ensembles
 In Proceedings of The 6th SIAM International Conference on Data Mining
, 2006
Cited by 28 (7 self)
Cluster ensembles offer a solution to challenges inherent to clustering arising from its ill-posed nature. Cluster ensembles can provide robust and stable solutions by leveraging the consensus across multiple clustering results, while averaging out emergent spurious structures that arise due to the various biases to which each participating algorithm is tuned. In this paper, we address the problem of combining multiple weighted clusters which belong to different subspaces of the input space. We leverage the diversity of the input clusterings in order to generate a consensus partition that is superior to the participating ones. Since we are dealing with weighted clusters, our consensus function makes use of the weight vectors associated with the clusters. The experimental results show that our ensemble technique is capable of producing a partition that is as good as or better than the best individual clustering.
Weighted Cluster Ensembles: Methods and Analysis
 TKDD
Cited by 24 (4 self)
Cluster ensembles offer a solution to challenges inherent to clustering arising from its ill-posed nature. Cluster ensembles can provide robust and stable solutions by leveraging the consensus across multiple clustering results, while averaging out emergent spurious structures that arise due to the various biases to which each participating algorithm is tuned. In this article, we address the problem of combining multiple weighted clusters that belong to different subspaces of the input space. We leverage the diversity of the input clusterings in order to generate a consensus partition that is superior to the participating ones. Since we are dealing with weighted clusters, our consensus functions make use of the weight vectors associated with the clusters. We demonstrate the effectiveness of our techniques by running experiments with several real datasets, including high-dimensional text data. Furthermore, we investigate in depth the issue of diversity and accuracy for our ensemble methods. Our analysis and experimental results show that the proposed techniques are capable of producing a partition that is as good as or better than the best individual clustering.
Kernel discriminant analysis for positive definite and indefinite kernels
, 2008
Cited by 24 (1 self)
Kernel methods are a class of well-established and successful algorithms for pattern analysis due to their mathematical elegance and good performance. Numerous nonlinear extensions of pattern recognition techniques have been proposed so far based on the so-called kernel trick. The objective of this paper is twofold. First, we derive an additional kernel tool that is still missing, namely kernel quadratic discriminant (KQD). We discuss different formulations of KQD based on the regularized kernel Mahalanobis distance in both complete and class-related subspaces. Second, we propose suitable extensions of kernel linear and quadratic discriminants to indefinite kernels. We provide classifiers that are applicable to kernels defined by any symmetric similarity measure. This is important in practice because problem-suited proximity measures often violate the requirement of positive definiteness. As in the traditional case, KQD can be advantageous for data with unequal class spreads in the kernel-induced spaces, which cannot be well separated by a linear discriminant. We illustrate this on artificial and real data for both positive definite and indefinite kernels. Index Terms: Machine learning, pattern recognition, kernel methods, indefinite kernels, discriminant analysis.
Non-Euclidean or Non-Metric Measures Can Be Informative
 SSPR & SPR
, 2006
Cited by 18 (4 self)
Statistical learning algorithms often rely on the Euclidean distance. In practice, non-Euclidean or non-metric dissimilarity measures may arise when contours, spectra or shapes are compared by edit distances or as a consequence of robust object matching [1,2]. It is an open issue whether such measures are advantageous for statistical learning or whether they should be constrained to obey the metric axioms. The k-nearest neighbor (NN) rule is widely applied to general dissimilarity data as the most natural approach. Alternative methods exist that embed such data into suitable representation spaces in which statistical classifiers are constructed [3]. In this paper, we investigate the relation between non-Euclidean aspects of dissimilarity data and the classification performance of the direct NN rule and some classifiers trained in representation spaces. This is evaluated on a parameterized family of edit distances, in which parameter values control the strength of non-Euclidean behavior. Our finding is that the discriminative power of this measure increases with increasing non-Euclidean and non-metric aspects until a certain optimum is reached. The conclusion is that statistical classifiers perform well and the optimal values of the parameters characterize a non-Euclidean and somewhat non-metric measure.
Transforming strings to vector spaces using prototype selection
, 2006
Cited by 15 (6 self)
A common way of expressing string similarity in structural pattern recognition is the edit distance. It allows one to apply the k-NN rule in order to classify a set of strings. However, compared to the wide range of elaborated classifiers known from statistical pattern recognition, this is only a very basic method. In the present paper we propose a method for transforming strings into n-dimensional real vector spaces based on prototype selection. This allows us to subsequently classify the transformed strings with more sophisticated classifiers, such as support vector machines and other kernel-based methods. In a number of experiments, we show that the recognition rate can be significantly improved by means of this procedure.
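The string-to-vector transformation can be illustrated with a minimal sketch: each string becomes the vector of its edit distances to n chosen prototype strings. The unit-cost Levenshtein distance, the hand-picked prototypes, and the function names below are assumptions for illustration. The abstract does not specify the edit costs or the selection method.

```python
def edit_distance(a, b):
    """Unit-cost Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def embed(strings, prototypes):
    """Map each string to its vector of edit distances to the prototypes."""
    return [[edit_distance(s, p) for p in prototypes] for s in strings]

# Hand-picked prototypes (assumption); n = 2 gives 2-dimensional vectors.
prototypes = ["kitten", "flaw"]
vectors = embed(["sitting", "lawn"], prototypes)
```

The resulting rows are ordinary real vectors, so any vector-space classifier can be trained on them.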
Beyond traditional kernels: Classification in two dissimilarity-based representation spaces
 IEEE Trans. Syst., Man Cybern., Part C: Appl. Rev
, 2008
Cited by 14 (1 self)
Proximity captures the degree of similarity between examples and is thereby fundamental in learning. Learning from pairwise proximity data usually relies on either kernel methods for specifically designed kernels or the nearest neighbor (NN) rule. Kernel methods are powerful, but often cannot handle arbitrary proximities without necessary corrections. The NN rule can work well in such cases, but suffers from local decisions. The aim of this paper is to provide an indispensable explanation and insights about two simple yet powerful alternatives when neither conventional kernel methods nor the NN rule can perform best. These strategies use two proximity-based representation spaces (RSs) in which accurate classifiers are trained on all training objects and demand comparisons to a small set of prototypes. They can handle all meaningful dissimilarity measures, including non-Euclidean and non-metric ones. Practical examples illustrate that these RSs can be highly advantageous in supervised learning. Simple classifiers built there tend to outperform the NN rule. Moreover, computational complexity may be controlled. Consequently, these approaches offer an appealing alternative to learn from proximity data for which kernel methods cannot directly be applied, are too costly or impractical, while the NN rule leads to noisy results. Index Terms: Classifier design and evaluation, indefinite kernels, similarity measures, statistical learning.
On learning with dissimilarity functions
 In Proceedings of the 24th international conference on Machine learning
Cited by 12 (2 self)
We study the problem of learning a classification task in which only a dissimilarity function of the objects is accessible. That is, data are not represented by feature vectors but in terms of their pairwise dissimilarities. We investigate the sufficient conditions for dissimilarity functions to allow building accurate classifiers. Our results have the advantages that they apply to unbounded dissimilarities and are invariant to order-preserving transformations. The theory immediately suggests a learning paradigm: construct an ensemble of decision stumps, each depending on a pair of examples, then find a convex combination of them to achieve a large margin. We next develop a practical algorithm called Dissimilarity-based Boosting (DBoost) for learning with dissimilarity functions under the theoretical guidance. Experimental results demonstrate that DBoost compares favorably with several existing approaches on a variety of databases and under different conditions.
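A decision stump that depends on a pair of examples can be sketched as below. This is one plausible reading of the abstract and not the DBoost algorithm itself: the stump votes +1 when an object is closer to example a than to example b, and the final decision is a convex combination of such votes. The weights are fixed by hand rather than learned by boosting, and the function names and toy dissimilarity are hypothetical.

```python
def make_stump(d, a, b):
    """Stump tied to the pair (a, b): +1 if x is closer to a than to b, else -1."""
    def h(x):
        return 1 if d(x, a) < d(x, b) else -1
    return h

def convex_vote(stumps, weights, x):
    """Sign of the convex combination of stump votes (weights are normalized)."""
    total = sum(weights)
    score = sum(w * h(x) for h, w in zip(stumps, weights)) / total
    return 1 if score >= 0 else -1

# Toy dissimilarity on real numbers (assumption).
def d(u, v):
    return abs(u - v)

# The same dissimilarity after an order-preserving transformation.
def d_cubed(u, v):
    return abs(u - v) ** 3

pairs = [(0.0, 10.0), (1.0, 9.0), (8.0, 2.0)]
weights = [0.4, 0.4, 0.2]  # hand-set here; a boosting procedure would learn them

stumps = [make_stump(d, a, b) for a, b in pairs]
stumps_cubed = [make_stump(d_cubed, a, b) for a, b in pairs]

label_low = convex_vote(stumps, weights, 1.5)
label_high = convex_vote(stumps, weights, 9.0)
```

Because each stump only compares two dissimilarities, replacing `d` with `d_cubed` leaves every vote unchanged, which matches the invariance to order-preserving transformations that the abstract mentions.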
A comparison of binless spike train measures
, 2009
Cited by 12 (1 self)
Several binless spike train measures which avoid the limitations of binning have recently been proposed in the literature. This paper presents a systematic comparison of these measures in three simulated paradigms designed to address specific situations of interest in spike train analysis where the relevant feature may be in the form of firing rate, firing rate modulations and/or synchrony. The measures are first disseminated and extended for ease of comparison. It is also discussed how the measures can be used to measure dissimilarity in spike trains' firing rate despite their explicit formulation for synchrony.