Results 1  10
of
893
Induction of Decision Trees
 MACH. LEARN
, 1986
"... The technology for building knowledgebased systems by inductive inference from examples has been demonstrated successfully in several practical applications. This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, and it describes one such syste ..."
Abstract

Cited by 4303 (4 self)
 Add to MetaCart
The technology for building knowledgebased systems by inductive inference from examples has been demonstrated successfully in several practical applications. This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, and it describes one such system, ID3, in detail. Results from recent studies show ways in which the methodology can be modified to deal with information that is noisy and/or incomplete. A reported shortcoming of the basic algorithm is discussed and two means of overcoming it are compared. The paper concludes with illustrations of current research directions.
Random forests
 Machine Learning
, 2001
"... Abstract. Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the fo ..."
Abstract

Cited by 3433 (2 self)
 Add to MetaCart
Abstract. Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International conference, ∗∗∗, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.
Rapid object detection using a boosted cascade of simple features
 ACCEPTED CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION 2001
, 2001
"... This paper describes a machine learning approach for visual object detection which is capable of processing images extremely rapidly and achieving high detection rates. This work is distinguished by three key contributions. The first is the introduction of a new image representation called the " ..."
Abstract

Cited by 3222 (9 self)
 Add to MetaCart
(Show Context)
This paper describes a machine learning approach for visual object detection which is capable of processing images extremely rapidly and achieving high detection rates. This work is distinguished by three key contributions. The first is the introduction of a new image representation called the "Integral Image" which allows the features used by our detector to be computed very quickly. The second is a learning algorithm, based on AdaBoost, which selects a small number of critical visual features from a larger set and yields extremely efficient classifiers[6]. The third contribution is a method for combining increasingly more complex classifiers in a "cascade" which allows background regions of the image to be quickly discarded while spending more computation on promising objectlike regions. The cascade can be viewed as an object specific focusofattention mechanism which unlike previous approaches provides statistical guarantees that discarded regions are unlikely to contain the object of interest. In the domain of face detection the system yields detection rates comparable to the best previous systems. Used in realtime applications, the detector runs at 15 frames per second without resorting to image differencing or skin color detection.
Additive Logistic Regression: a Statistical View of Boosting
 Annals of Statistics
, 1998
"... Boosting (Freund & Schapire 1996, Schapire & Singer 1998) is one of the most important recent developments in classification methodology. The performance of many classification algorithms can often be dramatically improved by sequentially applying them to reweighted versions of the input dat ..."
Abstract

Cited by 1719 (25 self)
 Add to MetaCart
Boosting (Freund & Schapire 1996, Schapire & Singer 1998) is one of the most important recent developments in classification methodology. The performance of many classification algorithms can often be dramatically improved by sequentially applying them to reweighted versions of the input data, and taking a weighted majority vote of the sequence of classifiers thereby produced. We show that this seemingly mysterious phenomenon can be understood in terms of well known statistical principles, namely additive modeling and maximum likelihood. For the twoclass problem, boosting can be viewed as an approximation to additive modeling on the logistic scale using maximum Bernoulli likelihood as a criterion. We develop more direct approximations and show that they exhibit nearly identical results to boosting. Direct multiclass generalizations based on multinomial likelihood are derived that exhibit performance comparable to other recently proposed multiclass generalizations of boosting in most...
On combining classifiers
 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 1998
"... We develop a common theoretical framework for combining classifiers which use distinct pattern representations and show that many existing schemes can be considered as special cases of compound classification where all the pattern representations are used jointly to make a decision. An experimental ..."
Abstract

Cited by 1392 (32 self)
 Add to MetaCart
We develop a common theoretical framework for combining classifiers which use distinct pattern representations and show that many existing schemes can be considered as special cases of compound classification where all the pattern representations are used jointly to make a decision. An experimental comparison of various classifier combination schemes demonstrates that the combination rule developed under the most restrictive assumptions—the sum rule—outperforms other classifier combinations schemes. A sensitivity analysis of the various schemes to estimation errors is carried out to show that this finding can be justified theoretically.
Robust Realtime Object Detection
 International Journal of Computer Vision
, 2001
"... This paper describes a visual object detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the “Integral Image ” which allows the features ..."
Abstract

Cited by 1169 (4 self)
 Add to MetaCart
(Show Context)
This paper describes a visual object detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the “Integral Image ” which allows the features used by our detector to be computed very quickly. The second is a learning algorithm, based on AdaBoost, which selects a small number of critical visual features and yields extremely efficient classifiers [6]. The third contribution is a method for combining classifiers in a “cascade ” which allows background regions of the image to be quickly discarded while spending more computation on promising objectlike regions. A set of experiments in the domain of face detection are presented. The system yields face detection performace comparable to the best previous systems [18, 13, 16, 12, 1]. Implemented on a conventional desktop, face detection proceeds at 15 frames per second. 1.
Statistical pattern recognition: A review
 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2000
"... The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, neural network techniques ..."
Abstract

Cited by 998 (30 self)
 Add to MetaCart
The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, neural network techniques and methods imported from statistical learning theory have bean receiving increasing attention. The design of a recognition system requires careful attention to the following issues: definition of pattern classes, sensing environment, pattern representation, feature extraction and selection, cluster analysis, classifier design and learning, selection of training and test samples, and performance evaluation. In spite of almost 50 years of research and development in this field, the general problem of recognizing complex patterns with arbitrary orientation, location, and scale remains unsolved. New and emerging applications, such as data mining, web searching, retrieval of multimedia data, face recognition, and cursive handwriting recognition, require robust and efficient pattern recognition techniques. The objective of this review paper is to summarize and compare some of the wellknown methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.
Improved Boosting Algorithms Using Confidencerated Predictions
 MACHINE LEARNING
, 1999
"... We describe several improvements to Freund and Schapire’s AdaBoost boosting algorithm, particularly in a setting in which hypotheses may assign confidences to each of their predictions. We give a simplified analysis of AdaBoost in this setting, and we show how this analysis can be used to find impr ..."
Abstract

Cited by 933 (26 self)
 Add to MetaCart
We describe several improvements to Freund and Schapire’s AdaBoost boosting algorithm, particularly in a setting in which hypotheses may assign confidences to each of their predictions. We give a simplified analysis of AdaBoost in this setting, and we show how this analysis can be used to find improved parameter settings as well as a refined criterion for training weak hypotheses. We give a specific method for assigning confidences to the predictions of decision trees, a method closely related to one used by Quinlan. This method also suggests a technique for growing decision trees which turns out to be identical to one proposed by Kearns and Mansour. We focus next on how to apply the new boosting algorithms to multiclass classification problems, particularly to the multilabel case in which each example may belong to more than one class. We give two boosting methods for this problem, plus a third method based on output coding. One of these leads to a new method for handling the singlelabel case which is simpler but as effective as techniques suggested by Freund and Schapire. Finally, we give some experimental results comparing a few of the algorithms discussed in this paper.
An Efficient Boosting Algorithm for Combining Preferences
, 1999
"... The problem of combining preferences arises in several applications, such as combining the results of different search engines. This work describes an efficient algorithm for combining multiple preferences. We first give a formal framework for the problem. We then describe and analyze a new boosting ..."
Abstract

Cited by 707 (18 self)
 Add to MetaCart
The problem of combining preferences arises in several applications, such as combining the results of different search engines. This work describes an efficient algorithm for combining multiple preferences. We first give a formal framework for the problem. We then describe and analyze a new boosting algorithm for combining preferences called RankBoost. We also describe an efficient implementation of the algorithm for certain natural cases. We discuss two experiments we carried out to assess the performance of RankBoost. In the first experiment, we used the algorithm to combine different WWW search strategies, each of which is a query expansion for a given domain. For this task, we compare the performance of RankBoost to the individual search strategies. The second experiment is a collaborativefiltering task for making movie recommendations. Here, we present results comparing RankBoost to nearestneighbor and regression algorithms.