Greedy function approximation: A gradient boosting machine (2001)

by J H Friedman
Venue: Annals of Statistics

Results 1 - 10 of 1,001

Stochastic Gradient Boosting

by Jerome H. Friedman - Computational Statistics and Data Analysis, 1999
Abstract - Cited by 285 (1 self)
Gradient boosting constructs additive regression models by sequentially fitting a simple parameterized function (base learner) to current "pseudo"--residuals by least--squares at each iteration. The pseudo--residuals are the gradient of the loss functional being minimized, with respect to the model values at each training data point, evaluated at the current step. It is shown that both the approximation accuracy and execution speed of gradient boosting can be substantially improved by incorporating randomization into the procedure. Specifically, at each iteration a subsample of the training data is drawn at random (without replacement) from the full training data set. This randomly selected subsample is then used in place of the full sample to fit the base learner and compute the model update for the current iteration. This randomized approach also increases robustness against overcapacity of the base learner. 1 Gradient Boosting In the function estimation problem one has a system con...
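The randomized procedure described in this abstract can be sketched compactly. The following is a minimal illustration (not Friedman's implementation), assuming squared-error loss, regression stumps as a hypothetical base learner, and a shrinkage factor:

```python
import numpy as np

def fit_stump(X, r):
    """Least-squares regression stump: returns (feature, threshold, left_value, right_value)."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left = X[:, j] <= t
            if left.all() or not left.any():
                continue
            cl, cr = r[left].mean(), r[~left].mean()
            err = ((r[left] - cl) ** 2).sum() + ((r[~left] - cr) ** 2).sum()
            if err < best_err:
                best_err, best = err, (j, t, cl, cr)
    return best

def stump_predict(stump, X):
    j, t, cl, cr = stump
    return np.where(X[:, j] <= t, cl, cr)

def stochastic_gb(X, y, n_iter=50, subsample=0.5, lr=0.1, seed=0):
    """Stochastic gradient boosting for squared-error loss (a sketch).

    Each iteration fits the base learner to the current pseudo-residuals
    (for squared loss, simply y - f) on a random subsample drawn without
    replacement, then updates the full model with a shrunken step.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    f0 = y.mean()                           # initial constant model
    f = np.full(n, f0)
    stumps = []
    for _ in range(n_iter):
        resid = y - f                       # negative gradient of squared loss
        idx = rng.choice(n, size=max(2, int(subsample * n)), replace=False)
        stump = fit_stump(X[idx], resid[idx])
        f = f + lr * stump_predict(stump, X)
        stumps.append(stump)
    return f0, stumps

def gb_predict(f0, stumps, X, lr=0.1):
    return f0 + lr * sum(stump_predict(s, X) for s in stumps)
```

Setting `subsample=1.0` recovers ordinary (deterministic) gradient boosting; the paper's point is that values well below 1 often improve both accuracy and speed.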

Sharing Visual Features for Multiclass And Multiview Object Detection

by Antonio Torralba, Kevin P. Murphy, William T. Freeman , 2004
Abstract - Cited by 279 (6 self)
We consider the problem of detecting a large number of different classes of objects in cluttered scenes. Traditional approaches require applying a battery of different classifiers to the image, at multiple locations and scales. This can be slow and can require a lot of training data, since each classifier requires the computation of many different image features. In particular, for independently trained detectors, the (run-time) computational complexity, and the (training-time) sample complexity, scales linearly with the number of classes to be detected. It seems unlikely that such an approach will scale up to allow recognition of hundreds or thousands of objects.

Citation Context

...ackpropagation (gradient descent), the parameters are learned sequentially using weighted least squares plus exhaustive search (although boosting can be viewed as gradient descent in a function space [13].) In practice, boosting is orders of magnitude faster than backprop. It is also more general in the sense that the weak learners do not have to be simple linear threshold units (decision stumps). 8 C...

Visual Tracking with Online Multiple Instance Learning

by Boris Babenko, Ming-hsuan Yang, Serge Belongie , 2009
Abstract - Cited by 261 (19 self)
In this paper, we address the problem of learning an adaptive appearance model for object tracking. In particular, a class of tracking techniques called “tracking by detection” have been shown to give promising results at realtime speeds. These methods train a discriminative classifier in an online manner to separate the object from the background. This classifier bootstraps itself by using the current tracker state to extract positive and negative examples from the current frame. Slight inaccuracies in the tracker can therefore lead to incorrectly labeled training examples, which degrades the classifier and can cause further drift. In this paper we show that using Multiple Instance Learning (MIL) instead of traditional supervised learning avoids these problems, and can therefore lead to a more robust tracker with fewer parameter tweaks. We present a novel online MIL algorithm for object tracking that achieves superior results with real-time performance. 1.

Citation Context

...olving the MIL problem [9, 2, 23]. The algorithm that is most closely related to our work is the MILBoost algorithm proposed by Viola et al. in [23]. MILBoost uses the gradient boosting framework [13] to train a boosting classifier that maximizes the log likelihood of bags:

log L = Σ_i log p(y_i | X_i)    (4)

Notice that the likelihood is defined over bags and not instances, because instance labels a...
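The bag likelihood in the context above can be evaluated directly once instance probabilities are available. A small sketch, assuming the noisy-OR combination commonly attributed to MILBoost (a bag is positive if at least one of its instances is) and a hypothetical `instance_prob` scorer, e.g. a sigmoid of the current boosted score:

```python
import math

def bag_log_likelihood(bags, instance_prob):
    """Bag-level log-likelihood of the MILBoost form (a sketch).

    bags: list of (label, instances), label 1 (positive) or 0 (negative).
    instance_prob: hypothetical callable mapping an instance to p(y=1).
    Instance probabilities are combined with the noisy-OR model:
        p(y_i = 1 | X_i) = 1 - prod_j (1 - p_ij)
    """
    ll = 0.0
    for label, instances in bags:
        p_bag = 1.0 - math.prod(1.0 - instance_prob(x) for x in instances)
        p = p_bag if label == 1 else 1.0 - p_bag
        ll += math.log(max(p, 1e-12))   # clamp to avoid log(0)
    return ll
```

In training, the gradient of this log-likelihood with respect to the instance scores would supply the per-instance weights for the next boosting round.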

Boosting with the L_2-Loss: Regression and Classification

by Peter Bühlmann, Bin Yu , 2001
Abstract - Cited by 208 (17 self)
This paper investigates a variant of boosting, L2Boost, which is constructed from a functional gradient descent algorithm with the L2-loss function. Based on an explicit stagewise refitting expression of L2Boost, the case of (symmetric) linear weak learners is studied in detail in both regression and two-class classification. In particular, with the boosting iteration m working as the smoothing or regularization parameter, a new exponential bias-variance trade-off is found with the variance (complexity) term bounded as m tends to infinity. When the weak learner is a smoothing spline, an optimal rate of convergence result holds for both regression and two-class classification. And this boosted smoothing spline adapts to higher order, unknown smoothness. Moreover, a simple expansion of the 0-1 loss function is derived to reveal the importance of the decision boundary, bias reduction, and impossibility of an additive bias-variance decomposition in classification. Finally, simulation and real data set results are obtained to demonstrate the attractiveness of L2Boost, particularly with a novel component-wise cubic smoothing spline as an effective and practical weak learner.
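For a fixed linear weak learner, the stagewise refitting described in this abstract reduces to a few lines. A sketch, assuming the weak learner is a linear smoother given by a matrix S acting on the response vector:

```python
import numpy as np

def l2boost(S, y, m):
    """L2Boost with a fixed linear smoother S (a sketch).

    Each step refits the current residuals with the same weak learner,
        f_{k+1} = f_k + S (y - f_k),
    so after m steps the fitted values are (I - (I - S)^m) y, and the
    iteration count m plays the role of the smoothing parameter.
    """
    f = np.zeros_like(y, dtype=float)
    for _ in range(m):
        f = f + S @ (y - f)   # fit residuals, add to the ensemble
    return f
```

The closed form makes the regularization role of m explicit: as m grows, (I - S)^m shrinks and the fit interpolates y more closely, which is exactly the bias-variance trade-off the paper analyzes.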

Using the Forest to See the Trees: A Graphical Model Relating Features, Objects, and Scenes

by Kevin Murphy, Antonio Torralba, William T. Freeman , 2003
Abstract - Cited by 175 (13 self)
Standard approaches to object detection focus on local patches of the image, and try to classify them as background or not. We propose to use the scene context (image as a whole) as an extra source of (global) information, to help resolve local ambiguities. We present a conditional random field for jointly solving the tasks of object detection and scene classification.

Citation Context

...l support, we need to down-sample the image to detect large objects. 2.2 Classifier Following [24], our detectors are based on a classifier trained using boosting. There are many variants of boosting [10, 9, 17], which differ in the loss function they are trying to optimize, and in the gradient directions which they follow. We, and others [14], have found that GentleBoost [10] gives higher performance than A...

FloatBoost Learning and Statistical Face Detection

by Stan Z. Li, ZhenQiu Zhang - IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE , 2004
Abstract - Cited by 168 (6 self)
A novel learning procedure, called FloatBoost, is proposed for learning a boosted classifier for achieving the minimum error rate. FloatBoost learning uses a backtrack mechanism after each iteration of AdaBoost learning to minimize the error rate directly, rather than minimizing an exponential function of the margin as in the traditional AdaBoost algorithms. A second contribution of the paper is a novel statistical model for learning best weak classifiers using a stagewise approximation of the posterior probability. These novel techniques lead to a classifier which requires fewer weak classifiers than AdaBoost yet achieves lower error rates in both training and testing, as demonstrated by extensive experiments. Applied to face detection, the FloatBoost learning method, together with a proposed detector pyramid architecture, leads to the first real-time multiview face detection system reported.
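The add-then-backtrack structure described in this abstract can be sketched as a skeleton, with the AdaBoost forward step and the error evaluation left as hypothetical callables (`add_best`, `error_of`), not the paper's exact procedure:

```python
def floatboost(add_best, error_of, n_rounds):
    """Skeleton of FloatBoost's add-then-backtrack loop (a sketch).

    add_best(ensemble) -> new weak classifier, chosen as in AdaBoost
    error_of(ensemble) -> error rate of the ensemble
    (both hypothetical callables supplied by the caller).
    After each forward step, weak classifiers are removed as long as
    dropping one lowers the current error (the floating-search backtrack).
    """
    ensemble = []
    for _ in range(n_rounds):
        ensemble.append(add_best(ensemble))       # forward AdaBoost step
        improved = True
        while improved and len(ensemble) > 1:
            improved = False
            err = error_of(ensemble)
            # try removing each member; keep the removal that helps most
            best_drop, best_err = None, err
            for i in range(len(ensemble)):
                e = error_of(ensemble[:i] + ensemble[i + 1:])
                if e < best_err:
                    best_drop, best_err = i, e
            if best_drop is not None:
                del ensemble[best_drop]
                improved = True
    return ensemble
```

The backtrack step is what lets the final classifier use fewer weak learners than plain AdaBoost: a weak classifier that looked good early can be discarded once later additions make it redundant.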

Predicting clicks: Estimating the click-through rate for new ads

by Matthew Richardson - In Proceedings of the 16th International World Wide Web Conference (WWW-07), 2007
Abstract - Cited by 166 (1 self)
Search engine advertising has become a significant element of the Web browsing experience. Choosing the right ads for the query and the order in which they are displayed greatly affects the probability that a user will see and click on each ad. This ranking has a strong impact on the revenue the search engine receives from the ads. Further, showing the user an ad that they prefer to click on improves user satisfaction. For these reasons, it is important to be able to accurately estimate the click-through rate of ads in the system. For ads that have been displayed repeatedly, this is empirically measurable, but for new ads, other means must be used. We show that we can use features of ads, terms, and advertisers to learn a model that accurately predicts the click-through rate for new ads. We also show that using our model improves the convergence and performance of an advertising system. As a result, our model increases both revenue and user satisfaction.

Citation Context

...er were found to be statistically significant (p < 0.01). In preliminary experiments, we also measured the performance using boosted regression trees (we used MART: multiple additive regression trees [9]). They were found to have no significant improvement over logistic regression. Thus, for ease of interpretation and simplicity, we continued with logistic regression for the remainder of the experime...

Empirical margin distributions and bounding the generalization error of combined classifiers

by V. Koltchinskii, D. Panchenko - Ann. Statist , 2002
Abstract - Cited by 158 (11 self)
Dedicated to A.V. Skorohod on his seventieth birthday We prove new probabilistic upper bounds on generalization error of complex classifiers that are combinations of simple classifiers. Such combinations could be implemented by neural networks or by voting methods of combining the classifiers, such as boosting and bagging. The bounds are in terms of the empirical distribution of the margin of the combined classifier. They are based on the methods of the theory of Gaussian and empirical processes (comparison inequalities, symmetrization method, concentration inequalities) and they improve previous results of Bartlett (1998) on bounding the generalization error of neural networks in terms of ℓ1-norms of the weights of neurons and of Schapire, Freund, Bartlett and Lee (1998) on bounding the generalization error of boosting. We also obtain rates of convergence in Lévy distance of empirical margin distribution to the true margin distribution uniformly over the classes of classifiers and prove the optimality of these rates.

Boosting Algorithms as Gradient Descent

by Llew Mason, Jonathan Baxter, Peter Bartlett, Marcus Frean , 2000
Abstract - Cited by 156 (1 self)
Much recent attention, both experimental and theoretical, has been focussed on classification algorithms which produce voted combinations of classifiers. Recent theoretical work has shown that the impressive generalization performance of algorithms like AdaBoost can be attributed to the classifier having large margins on the training data. We present an abstract algorithm for finding linear combinations of functions that minimize arbitrary cost functionals (i.e. functionals that do not necessarily depend on the margin). Many existing voting methods can be shown to be special cases of this abstract algorithm. Then, following previous theoretical results bounding the generalization performance of convex combinations of classifiers in terms of general cost functions of the margin, we present a new algorithm (DOOM II) for performing a gradient descent optimization of such cost functions. Experiments on
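The abstract algorithm described here, gradient descent on a cost functional over linear combinations of base functions, can be sketched as follows. This is an illustration of the idea only, not the paper's exact procedure: `grad_cost` is a hypothetical callable returning the gradient of the cost with respect to the vector of ensemble outputs, and a fixed step size stands in for a line search:

```python
import numpy as np

def anyboost(weak_learners, grad_cost, X, n_rounds=20, step=0.5):
    """Functional-gradient-descent boosting over a finite pool (a sketch).

    weak_learners: candidate base functions h, each mapping X to an
    n-vector of outputs in {-1, +1}.
    grad_cost(F) -> gradient of the cost functional w.r.t. the ensemble
    outputs F (hypothetical callable supplied by the caller).
    Each round picks the base function most anti-correlated with the
    gradient, i.e. the steepest feasible descent direction in function space.
    """
    F = np.zeros(X.shape[0])
    combo = []                                 # (coefficient, learner) pairs
    for _ in range(n_rounds):
        g = grad_cost(F)
        scores = [-(g @ h(X)) for h in weak_learners]
        best = int(np.argmax(scores))
        if scores[best] <= 0:                  # no descent direction left
            break
        F = F + step * weak_learners[best](X)  # fixed step for simplicity
        combo.append((step, weak_learners[best]))
    return F, combo
```

Choosing AdaBoost's exponential cost for `grad_cost` recovers (up to the step-size rule) the standard boosting weight updates, which is the sense in which existing voting methods are special cases of the abstract algorithm.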

Citation Context

...g algorithm [2] have been found to give significant performance improvements over algorithms for the corresponding base classifiers [6, 9, 16, 5], and have led to the study of many related algorithms [3, 19, 12, 17, 7, 11, 8]. Recent theoretical results suggest that the effectiveness of these algorithms is due to their tendency to produce large margin classifiers. The margin of an example is defined as the difference bet...

Robust Object Tracking with Online Multiple Instance Learning

by Boris Babenko, Ming-hsuan Yang, Serge Belongie , 2011
Abstract - Cited by 140 (7 self)
In this paper, we address the problem of tracking an object in a video given its location in the first frame and no other information. Recently, a class of tracking techniques called “tracking by detection ” has been shown to give promising results at real-time speeds. These methods train a discriminative classifier in an online manner to separate the object from the background. This classifier bootstraps itself by using the current tracker state to extract positive and negative examples from the current frame. Slight inaccuracies in the tracker can therefore lead to incorrectly labeled training examples, which degrade the classifier and can cause drift. In this paper, we show that using Multiple Instance Learning (MIL) instead of traditional supervised learning avoids these problems and can therefore lead to a more robust tracker with fewer parameter tweaks. We propose a novel online MIL algorithm for object tracking that achieves superior results with real-time performance. We present thorough experimental results (both qualitative and quantitative) on a number of challenging video clips.