Results 1–10 of 21
On combining classifiers
 IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998
Abstract

Cited by 1420 (33 self)
We develop a common theoretical framework for combining classifiers which use distinct pattern representations and show that many existing schemes can be considered as special cases of compound classification where all the pattern representations are used jointly to make a decision. An experimental comparison of various classifier combination schemes demonstrates that the combination rule developed under the most restrictive assumptions, the sum rule, outperforms other classifier combination schemes. A sensitivity analysis of the various schemes to estimation errors is carried out to show that this finding can be justified theoretically.
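A minimal sketch of the sum rule this abstract refers to, assuming each classifier outputs a posterior probability vector over the classes (the posteriors and function name here are illustrative, not from the paper):

```python
import numpy as np

def sum_rule_decision(posteriors):
    """Combine per-classifier posteriors by averaging and pick the arg max.

    Averaging is the sum rule up to a constant factor, so the decision
    is identical.
    """
    combined = np.stack(posteriors).mean(axis=0)  # (n_classes,)
    return int(np.argmax(combined))

# Hypothetical posteriors from three classifiers on a 3-class problem.
p1 = np.array([0.6, 0.3, 0.1])
p2 = np.array([0.4, 0.5, 0.1])
p3 = np.array([0.7, 0.2, 0.1])
print(sum_rule_decision([p1, p2, p3]))  # -> 0
```

The product rule would instead multiply the posteriors element-wise; the paper's point is that the sum rule is less sensitive to estimation errors in any single classifier's output.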
Local Cascade Generalization, 1998
Abstract

Cited by 55 (1 self)
In a previous work we have presented Cascade Generalization, a new general method for merging classifiers. The basic idea of Cascade Generalization is to sequentially run the set of classifiers, at each step performing an extension of the original data by the insertion of new attributes. The new attributes are derived from the probability class distribution given by a base classifier. This constructive step extends the representational language for the high level classifiers, relaxing their bias. In this paper we extend this work by applying Cascade locally. At each iteration of a divide and conquer algorithm, a reconstruction of the instance space occurs by the addition of new attributes. Each new attribute represents the probability that an example belongs to a class given by a base classifier. We have implemented three Local Generalization Algorithms. The first merges a linear discriminant with a decision tree, the second merges a naive Bayes with a decision tree, and the third mer...
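The constructive step described above, appending a base classifier's class-probability outputs as new attributes, can be sketched as follows (the toy base learner here is a stand-in assumption, not one of the paper's classifiers):

```python
import numpy as np

def cascade_extend(X, predict_proba):
    """Extend the data by appending the base classifier's
    class-probability distribution as new attributes."""
    return np.hstack([X, predict_proba(X)])  # (n_samples, n_attrs + n_classes)

def toy_proba(X):
    """Hypothetical base learner: probability of class 0 decreases as the
    first attribute moves from 0.0 toward 1.0."""
    d0 = np.abs(X[:, 0] - 0.0)
    d1 = np.abs(X[:, 0] - 1.0)
    p0 = d1 / (d0 + d1)
    return np.column_stack([p0, 1.0 - p0])

X = np.array([[0.2, 5.0], [0.9, 3.0]])
X_ext = cascade_extend(X, toy_proba)
print(X_ext.shape)  # (2, 4): two original attributes plus two class probabilities
```

A higher-level classifier trained on `X_ext` can then split or weight on the base learner's probability estimates, which is how the cascade relaxes its bias.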
On data and algorithms: understanding inductive performance
 Machine Learning, 2004
Abstract

Cited by 29 (4 self)
In this paper we address two symmetrical issues: the discovery of similarities among classification algorithms, and among datasets. Both are addressed on the basis of error measures, which we use to define the error correlation between two algorithms and to determine the relative performance of a list of algorithms. We use the first to discover similarities between learners, and both of them to discover similarities between datasets. The latter sketch maps of the dataset space. Regions within each map exhibit specific patterns of error correlation or relative performance. To acquire an understanding of the factors determining these regions we describe them using simple characteristics of the datasets. Descriptions of each region are given in terms of the distributions of dataset characteristics within it.
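One simple notion of error correlation between two algorithms is the fraction of examples on which both err; this is a sketch under that assumption, and the paper's exact definition may differ:

```python
import numpy as np

def error_correlation(pred_a, pred_b, y):
    """Fraction of examples on which both algorithms misclassify
    (an illustrative definition, not necessarily the paper's)."""
    err_a = pred_a != y
    err_b = pred_b != y
    return float(np.mean(err_a & err_b))

# Hypothetical labels and predictions from two algorithms.
y      = np.array([0, 1, 1, 0, 1])
pred_a = np.array([0, 1, 0, 1, 1])   # errs on indices 2 and 3
pred_b = np.array([0, 0, 0, 0, 1])   # errs on indices 1 and 2
print(error_correlation(pred_a, pred_b, y))  # both err only on index 2 -> 0.2
```

Computed pairwise over many datasets, such a measure yields the similarity matrix from which maps of learners (or of datasets) can be drawn.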
Dynamically weighted ensemble neural networks for classification
 In Proceedings of the 1998 International Joint Conference on Neural Networks, 1998
Abstract

Cited by 24 (4 self)
Combining the outputs of several neural networks into an aggregate output often gives improved accuracy over any individual output. The set of networks is known as an ensemble or committee. This paper presents an ensemble method for classification that has advantages over other techniques for linear combining. Normally, the output of an ensemble is a weighted sum whose weights are fixed, having been determined from the training or validation data. Our ensembles are weighted dynamically, with the weights determined from the respective certainties of the network outputs. The more certain a network seems to be of its decision, the higher the weight.
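A minimal sketch of this dynamic weighting, taking each network's certainty to be its maximum output activation (one possible certainty measure; the paper may use another):

```python
import numpy as np

def dynamic_weighted_vote(outputs):
    """Weight each network's output vector by its certainty, here the
    maximum activation, then pick the arg max of the weighted sum."""
    outputs = np.stack(outputs)            # (n_nets, n_classes)
    certainties = outputs.max(axis=1)      # per-network certainty
    weights = certainties / certainties.sum()
    combined = weights @ outputs           # per-example weighted sum
    return int(np.argmax(combined))

o1 = np.array([0.9, 0.1])    # confident vote for class 0
o2 = np.array([0.45, 0.55])  # nearly undecided vote for class 1
print(dynamic_weighted_vote([o1, o2]))  # -> 0: the confident network dominates
```

Unlike fixed linear combining, the weights here are recomputed per example, so an undecided network contributes less on exactly the inputs where it is unsure.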
Combining Classifiers by Constructive Induction, 1998
Abstract

Cited by 12 (1 self)
Using multiple classifiers for increasing learning accuracy is an active research area. In this paper we present a new general method for merging classifiers. The basic idea of Cascade Generalization is to sequentially run the set of classifiers, at each step performing an extension of the original data set by adding new attributes. The new attributes are derived from the probability class distribution given by a base classifier. This constructive step extends the representational language for the high-level classifiers, relaxing their bias. Cascade Generalization produces a single but structured model for the data that combines the model class representation of the base classifiers. We have performed an empirical evaluation of Cascade composition of three well-known classifiers: Naive Bayes, Linear Discriminant, and C4.5. Composite models show an increase of performance, sometimes impressive, when compared with the corresponding single models, with significant statistical confidenc...
Limiting the Number of Trees in Random Forests
 In Proceedings of MCS 2001, LNCS 2096, 2001
Abstract

Cited by 6 (0 self)
Abstract. The aim of this paper is to propose a simple procedure that a priori determines a minimum number of classifiers to combine in order to obtain a prediction accuracy level similar to the one obtained with the combination of larger ensembles. The procedure is based on the McNemar nonparametric test of significance. Knowing a priori the minimum size of the classifier ensemble giving the best prediction accuracy constitutes a gain in time and memory costs, especially for huge databases and real-time applications. Here we applied this procedure to four multiple classifier systems using the C4.5 decision tree (Breiman’s Bagging, Ho’s Random subspaces, their combination, which we labeled ‘Bagfs’, and Breiman’s Random forests) and five large benchmark databases. It is worth noticing that the proposed procedure may easily be extended to base learning algorithms other than a decision tree as well. The experimental results showed that it is possible to significantly limit the number of trees. We also showed that the minimum number of trees required for obtaining the best prediction accuracy may vary from one classifier combination method to another.
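The McNemar test underlying this procedure compares two classifiers on paired per-example errors; a minimal sketch, with hypothetical error vectors (the stopping policy around it is the paper's, not shown here):

```python
def mcnemar_significant(errors_small, errors_large, chi2_crit=3.841):
    """Continuity-corrected McNemar test on paired per-example errors.

    errors_*: booleans, True where that ensemble misclassifies the example.
    Returns True when the two ensembles differ significantly (alpha = 0.05,
    chi-square critical value with 1 degree of freedom).
    """
    b = sum(s and not l for s, l in zip(errors_small, errors_large))
    c = sum(l and not s for s, l in zip(errors_small, errors_large))
    if b + c == 0:
        return False  # no discordant pairs, no detectable difference
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    return stat > chi2_crit

# Hypothetical paired error vectors over 30 validation examples.
small = [True] * 20 + [False] * 10   # small ensemble errs on 20 examples
large = [False] * 30                 # full ensemble errs on none
print(mcnemar_significant(small, large))  # True: the small ensemble is worse
```

Grown incrementally, the ensemble can stop at the first size whose errors no longer differ significantly from those of the full ensemble, which is the memory and time saving the abstract describes.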
Osareh A: Ontology alignment using machine learning techniques. Int J Comput Sci Inf Technol 2011, 3(2):139–150.
Ichise R: Machine Learning Approac
 Seventh IEEE/ACIS International Conference on Computer and Information Science, (ICIS
Aligning Ontologies Intelligently
Abstract
In the semantic web, ontologies play an important role in providing formal definitions of concepts and relationships. Due to the presence of several similar ontologies in the same domain, there might be several definitions for a given concept. Ontology alignment overcomes these difficulties by deriving a mapping between similar entities that refer to the same concept in two different ontologies. This paper proposes a method to combine similarity measures of different categories, such as string, linguistic, structural and instance-based similarity measures. To align different ontologies efficiently, K-Nearest Neighbor (KNN), Support Vector Machine (SVM) and Decision Tree (DT) classifiers and Bayesian networks are investigated. Each classifier is optimized for lower cost and a better classification rate. Experimental results demonstrate that the F-measure improves to up to 98% using feature selection and a combination of classifiers, which outperforms previously reported F-measures.
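A sketch of turning an entity pair into a feature vector of similarity measures for such a classifier; the three measures here are simple illustrative stand-ins, not the paper's actual string, linguistic, structural, and instance-based measures:

```python
from difflib import SequenceMatcher

def similarity_features(label_a, label_b):
    """Feature vector of similarity measures for one entity pair
    (illustrative measures only)."""
    a, b = label_a.lower(), label_b.lower()
    string_sim = SequenceMatcher(None, a, b).ratio()      # edit-based similarity
    prefix_sim = 1.0 if a.startswith(b[:3]) else 0.0      # crude prefix match
    len_sim = min(len(a), len(b)) / max(len(a), len(b))   # length ratio
    return [string_sim, prefix_sim, len_sim]

print(similarity_features("Author", "Authors"))
```

Each pair's feature vector, labeled as matching or non-matching, becomes a training example for the KNN, SVM, DT, or Bayesian-network classifiers the abstract compares.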
Editors
"... to be publicly examined in the university's Agora building (Ag Aud. 2) ..."
Bayesian Integration of Rule Models
Abstract
Although Bayesian model averaging (BMA) is in principle the optimal method for combining learned models, it has received relatively little attention in the machine learning literature. This article describes an extensive empirical study of the application of BMA to rule induction. BMA is applied to a variety of tasks and compared with more ad hoc alternatives like bagging. In each case, BMA typically leads to higher error rates than the ad hoc alternative. This is found to be due to the exponential sensitivity of the likelihood to small variations in the sample, leading to effectively very little averaging being performed even when all models have similar error rates. Coupled with the generation of many models, this causes BMA to have a strong tendency to overfit. An attempt to combat this problem using carefully designed priors is described. These and further experiments suggest that methods like bagging succeed not because they approximate the optimal BMA procedure better than a sing...
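The exponential sensitivity described above can be seen in a minimal BMA weight computation, assuming equal model priors so that posterior weights are proportional to exp(log-likelihood); the log-likelihood values are hypothetical:

```python
import math

def bma_weights(log_likelihoods):
    """Posterior model weights under equal priors: proportional to
    exp(log-likelihood), computed stably by shifting by the maximum."""
    m = max(log_likelihoods)
    unnorm = [math.exp(ll - m) for ll in log_likelihoods]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Log-likelihood gaps of only 5 nats between models of similar quality
# still concentrate almost all weight on one model.
print(bma_weights([-100.0, -105.0, -110.0]))  # first weight is about 0.99
```

Because the best-scoring model receives nearly all the weight, BMA here behaves almost like single-model selection, which is the mechanism the abstract blames for its overfitting.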