Results 1 - 10 of 57,630
Greedy Function Approximation: A Gradient Boosting Machine
- Annals of Statistics, 2000
"... Function approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization. A general gradient-descent "boosting" paradigm is developed for additive ..."
Cited by 951 (12 self)
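The stagewise recipe this abstract describes, fitting each new base learner to the negative gradient of the loss at the current fit, reduces to residual fitting under squared-error loss. A minimal sketch of that special case (not Friedman's reference implementation; function names and hyperparameter values are illustrative):

```python
# Minimal gradient boosting for squared-error loss: each stage fits a
# shallow tree to the residuals, i.e. the negative gradient of
# 1/2 * (y - F(x))^2 with respect to the current prediction F(x).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost(X, y, n_stages=100, learning_rate=0.1, max_depth=3):
    f0 = y.mean()                     # initial constant model
    pred = np.full(len(y), f0)
    trees = []
    for _ in range(n_stages):
        residual = y - pred           # negative gradient for squared loss
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residual)
        trees.append(tree)
        pred += learning_rate * tree.predict(X)   # stagewise additive update
    return f0, trees

def predict(f0, trees, X, learning_rate=0.1):
    return f0 + learning_rate * sum(t.predict(X) for t in trees)
```

Shrinking each stage by `learning_rate` gives the slow, stagewise accumulation that the function-space steepest-descent view motivates.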
Stagewise Lasso
"... Many statistical machine learning algorithms minimize either an empirical loss function as in AdaBoost, or a penalized empirical loss as in Lasso or SVM. A single regularization tuning parameter controls the trade-off between fidelity to the data and generalizability, or equivalently between bias and variance ..."
Cited by 17 (3 self)
Forward Stagewise Fitting (FSF) (aka e-Boosting) is of great interest because of the sparse models it produces for interpretation in addition to prediction. In this paper, we propose the BLasso algorithm, which ties the FSF (e-Boosting) algorithm to the Lasso method that minimizes the L1-penalized L2 loss. BLasso ...
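The forward step that FSF/e-Boosting takes is easy to state: nudge the coefficient of the predictor most correlated with the residual by a tiny epsilon. A sketch of that forward step only (BLasso's backward step and its Lasso connection are not shown; step size and iteration count are illustrative):

```python
# Forward Stagewise Fitting (e-Boosting) sketch: at each step, move the
# coefficient of the predictor most correlated with the current residual
# by a small epsilon in the direction of that correlation.
import numpy as np

def forward_stagewise(X, y, eps=0.01, n_steps=5000):
    n, p = X.shape
    beta = np.zeros(p)
    residual = y.astype(float).copy()       # residual of the zero model
    for _ in range(n_steps):
        corr = X.T @ residual                # correlation with the residual
        j = np.argmax(np.abs(corr))          # most correlated predictor
        step = eps * np.sign(corr[j])
        beta[j] += step
        residual -= step * X[:, j]           # incremental residual update
    return beta
```

Because most coefficients are never touched, early stopping leaves the sparse models the abstract emphasizes.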
Projection Pursuit Regression
- Journal of the American Statistical Association, 1981
"... A new method for nonparametric multiple regression is presented. The procedure models the regression surface as a sum of general smooth functions of linear combinations of the predictor variables in an iterative manner. It is more general than standard stepwise and stagewise regression procedures ..."
Cited by 555 (6 self)
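The iteration described here alternates between choosing a projection direction and smoothing the residuals along it. A loose sketch rather than Friedman and Stuetzle's procedure: a low-degree polynomial stands in for the paper's adaptive smoother, and a generic optimizer for its direction search; all names are illustrative.

```python
# Projection pursuit regression sketch: approximate y by a sum of smooth
# functions g_k(w_k . x), fitting each term to the current residuals.
import numpy as np
from scipy.optimize import minimize

def fit_ridge_function(X, r, w, degree=5):
    z = X @ (w / np.linalg.norm(w))         # project onto the unit direction
    coefs = np.polyfit(z, r, degree)        # smooth fit along the projection
    return coefs, r - np.polyval(coefs, z)  # (fit, updated residuals)

def ppr(X, y, n_terms=3):
    residual, terms = y.astype(float).copy(), []
    for _ in range(n_terms):
        # Search for the direction whose smoothed fit leaves the least
        # residual sum of squares.
        sse = lambda w: np.sum(fit_ridge_function(X, residual, w)[1] ** 2)
        w = minimize(sse, np.ones(X.shape[1])).x
        w /= np.linalg.norm(w)
        coefs, residual = fit_ridge_function(X, residual, w)
        terms.append((w, coefs))
    return terms
```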
Least angle regression
- Annals of Statistics, 2004
"... The purpose of model selection algorithms such as All Subsets, Forward Selection and Backward Elimination is to choose a linear model on the basis of the same set of data to which the model will be applied. Typically we have available a large collection of possible covariates from which we hope to select a parsimonious set for the efficient prediction of a response variable. Least Angle Regression (LARS), a new model selection algorithm, is a useful and less greedy version of traditional forward selection methods. Three main properties are derived: (1) A simple modification of the LARS algorithm ..."
Cited by 1308 (43 self)
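The equiangular-direction algebra behind LARS is too long to sketch inline, but scikit-learn ships an implementation of Efron et al.'s algorithm. A usage sketch, assuming `sklearn.linear_model.Lars` is available (the toy data are invented):

```python
# LARS usage sketch: select a parsimonious set of covariates by stopping
# after a fixed number of predictors have entered the active set.
import numpy as np
from sklearn.linear_model import Lars

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
y = X[:, 0] - 2 * X[:, 3] + 0.1 * rng.standard_normal(100)

model = Lars(n_nonzero_coefs=5).fit(X, y)   # stop after 5 active predictors
print(np.nonzero(model.coef_)[0])           # indices of selected covariates
```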
Cost-Aware WWW Proxy Caching Algorithms
- In Proceedings of the 1997 USENIX Symposium on Internet Technologies and Systems
"... Web caches can not only reduce network traffic and downloading latency, but can also affect the distribution of web traffic over the network through cost-aware caching. This paper introduces GreedyDual-Size, which incorporates locality with cost and size concerns in a simple and non-parameterized fashion ..."
Cited by 544 (6 self)
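As the abstract describes it, GreedyDual-Size credits each object with H = L + cost/size on a fetch or hit and evicts the minimum-H object, raising the inflation value L to that minimum. A compact sketch; the class name and interface are invented for illustration:

```python
# GreedyDual-Size sketch: raising L on each eviction implicitly "ages"
# every remaining object, so recency, cost, and size all influence
# eviction without any tunable parameter.
class GreedyDualSizeCache:
    def __init__(self, capacity):
        self.capacity = capacity      # total bytes available
        self.used = 0
        self.L = 0.0                  # global inflation value
        self.entries = {}             # key -> (H_value, size)

    def access(self, key, size, cost):
        """Record a request; return True if the object is cached afterwards."""
        if key not in self.entries:
            if size > self.capacity:
                return False          # can never fit; bypass the cache
            while self.used + size > self.capacity:
                # Evict the minimum-H object and raise L to its H value.
                victim = min(self.entries, key=lambda k: self.entries[k][0])
                self.L = self.entries[victim][0]
                self.used -= self.entries.pop(victim)[1]
            self.used += size
        self.entries[key] = (self.L + cost / size, size)   # (re)set credit
        return True

cache = GreedyDualSizeCache(capacity=100)
cache.access("/a.html", size=40, cost=2.0)
```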
For Most Large Underdetermined Systems of Linear Equations the Minimal ℓ1-norm Solution is also the Sparsest Solution
- Communications on Pure and Applied Mathematics, 2004
"... We consider linear equations y = Φα where y is a given vector in R^n, Φ is a given n × m matrix with n < m ≤ An, and we wish to solve for α ∈ R^m. We suppose that the columns of Φ are normalized to unit ℓ2 norm and we place uniform measure on such Φ. We prove the existence of ρ = ρ(A) so that ..."
Cited by 560 (10 self)
... In contrast, heuristic attempts to sparsely solve such systems – greedy algorithms and thresholding – perform poorly in this challenging setting. The techniques include the use of random proportional embeddings and almost-spherical sections in Banach space theory, and deviation bounds for the eigenvalues ...
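The minimal-ℓ1-norm solution the title refers to can be computed by linear programming, splitting α into nonnegative parts. A sketch using scipy's linprog; the problem sizes and the sparse test vector are illustrative:

```python
# Basis-pursuit sketch: find the minimal-l1-norm solution of y = Phi @ a
# via the LP  min 1'(u + v)  s.t.  Phi @ (u - v) = y,  u, v >= 0,
# where a = u - v.
import numpy as np
from scipy.optimize import linprog

def min_l1_solution(Phi, y):
    n, m = Phi.shape
    c = np.ones(2 * m)                       # objective: sum(u) + sum(v)
    A_eq = np.hstack([Phi, -Phi])            # encodes Phi @ (u - v) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
    u, v = res.x[:m], res.x[m:]
    return u - v

rng = np.random.default_rng(1)
n, m = 50, 100
Phi = rng.standard_normal((n, m))
Phi /= np.linalg.norm(Phi, axis=0)           # unit l2-norm columns, as in the paper
alpha = np.zeros(m)
alpha[[3, 17, 42]] = [1.0, -2.0, 0.5]        # a sparse ground truth
y = Phi @ alpha
print(np.allclose(min_l1_solution(Phi, y), alpha, atol=1e-6))
```

For a sufficiently sparse α, the LP recovers it exactly, which is the phenomenon the paper quantifies.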
Managing Energy and Server Resources in Hosting Centers
- In Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP), 2001
"... Internet hosting centers serve multiple service sites from a common hardware base. This paper presents the design and implementation of an architecture for resource management in a hosting center operating system, with an emphasis on energy as a driving resource management issue for large server clusters ..."
Cited by 558 (37 self)
... by estimating the value of their effects on service performance. A greedy resource allocation algorithm adjusts resource prices to balance supply and demand, allocating resources to their most efficient use. A reconfigurable server switching infrastructure directs request traffic to the servers assigned to each service ...
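The paper's economic allocator is more elaborate than a snippet allows, but its greedy core, granting each successive resource unit to the service with the highest marginal benefit, can be illustrated with an invented concave utility; nothing below is the Muse system's actual code:

```python
# Toy greedy allocator: hand out server units one at a time to whichever
# service gains the most from the next unit. The log utility is an
# invented stand-in for measured service performance; its diminishing
# returns are what make greedy allocation sensible here.
import math

def greedy_allocate(demands, total_units):
    alloc = {s: 0 for s in demands}
    for _ in range(total_units):
        def marginal(s):
            u = alloc[s]   # utility gain from one more unit for service s
            return demands[s] * (math.log(2 + u) - math.log(1 + u))
        best = max(alloc, key=marginal)
        alloc[best] += 1
    return alloc

print(greedy_allocate({"siteA": 3.0, "siteB": 1.0, "siteC": 2.0}, 10))
```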
A Threshold of ln n for Approximating Set Cover
- Journal of the ACM, 1998
"... Given a collection F of subsets of S = {1, ..., n}, set cover is the problem of selecting as few as possible subsets from F such that their union covers S, and max k-cover is the problem of selecting k subsets from F such that their union has maximum cardinality. Both these problems are NP-hard. We prove that (1 − o(1)) ln n is a threshold below which set cover cannot be approximated efficiently, unless NP has slightly superpolynomial time algorithms. This closes the gap (up to low order terms) between the ratio of approximation achievable by the greedy algorithm (which is (1 − ..."
Cited by 778 (5 self)
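The greedy algorithm whose ratio this threshold matches is the textbook one: repeatedly take the subset covering the most still-uncovered elements. A short sketch; the example instance is invented:

```python
# Greedy set cover: pick, at each step, the subset covering the largest
# number of still-uncovered elements. Feige's result says its roughly
# ln(n) approximation ratio is essentially optimal in polynomial time.
def greedy_set_cover(universe, subsets):
    uncovered = set(universe)
    cover = []
    while uncovered:
        best = max(subsets, key=lambda s: len(uncovered & s))
        if not uncovered & best:
            return None                 # the instance is not coverable
        cover.append(best)
        uncovered -= best
    return cover

S = range(1, 8)
F = [{1, 2, 3, 4}, {4, 5}, {5, 6, 7}, {1, 3, 5, 7}]
print(greedy_set_cover(S, F))           # e.g. [{1, 2, 3, 4}, {5, 6, 7}]
```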
Additive Logistic Regression: a Statistical View of Boosting
- Annals of Statistics, 1998
"... Boosting (Freund & Schapire 1996, Schapire & Singer 1998) is one of the most important recent developments in classification methodology. The performance of many classification algorithms can often be dramatically improved by sequentially applying them to reweighted versions of the input data, and taking a weighted majority vote of the sequence of classifiers thereby produced. We show that this seemingly mysterious phenomenon can be understood in terms of well known statistical principles, namely additive modeling and maximum likelihood. For the two-class problem, boosting can be viewed as an approximation to additive modeling on the logistic scale using maximum Bernoulli likelihood as a criterion. We develop more direct approximations and show that they exhibit nearly identical results to boosting. Direct multi-class generalizations based on multinomial likelihood are derived that exhibit performance comparable to other recently proposed multi-class generalizations of boosting in most ..."
Cited by 1719 (25 self)
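The reweighting-and-vote procedure the abstract refers to is discrete AdaBoost. A sketch with decision stumps as base classifiers, assuming labels in {+1, -1} (hyperparameters are illustrative; this is not the paper's more direct LogitBoost-style approximations):

```python
# Discrete AdaBoost sketch: upweight examples the previous stump got
# wrong, then combine stumps by a weighted majority vote. The paper's
# point is that this is stagewise additive modeling on the logistic scale.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost(X, y, n_rounds=50):
    y = np.asarray(y)                          # labels must be +1 / -1
    n = len(y)
    w = np.full(n, 1.0 / n)                    # uniform initial weights
    ensemble = []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = w[pred != y].sum()               # weighted training error
        if err == 0 or err >= 0.5:
            break
        alpha = 0.5 * np.log((1 - err) / err)  # vote weight of this stump
        w *= np.exp(-alpha * y * pred)         # upweight the mistakes
        w /= w.sum()
        ensemble.append((alpha, stump))
    return ensemble

def predict(ensemble, X):
    return np.sign(sum(a * s.predict(X) for a, s in ensemble))
```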