Results 1 – 10 of 325,787
A dual coordinate descent method for large-scale linear SVM
In ICML, 2008
Cited by 193 (18 self)
"... In many applications, data appear with a huge number of instances as well as features. Linear Support Vector Machines (SVM) is one of the most popular tools to deal with such large-scale sparse data. This paper presents a novel dual coordinate descent method for linear SVM with L1- and L2-l ..."
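The per-coordinate update this abstract describes has a simple closed form. Below is a minimal sketch of dual coordinate descent for the L1-loss linear SVM dual, on synthetic data; the shrinking and random-permutation heuristics of the actual paper are omitted, and all names and the toy data are illustrative, not taken from the paper's code.

```python
import numpy as np

def dcd_l1_svm(X, y, C=1.0, epochs=20):
    """Sketch of dual coordinate descent for the L1-loss linear SVM dual:
    min_a 0.5*a'Qa - e'a  s.t. 0 <= a_i <= C,  Q_ij = y_i y_j x_i'x_j."""
    n, d = X.shape
    alpha = np.zeros(n)                 # dual variables, 0 <= alpha_i <= C
    w = np.zeros(d)                     # primal weights: w = sum_i alpha_i y_i x_i
    Qii = np.einsum('ij,ij->i', X, X)   # diagonal of Q: ||x_i||^2
    for _ in range(epochs):
        for i in range(n):
            if Qii[i] == 0:
                continue
            G = y[i] * X[i].dot(w) - 1.0                      # dual gradient coord i
            a_new = min(max(alpha[i] - G / Qii[i], 0.0), C)   # project onto [0, C]
            w += (a_new - alpha[i]) * y[i] * X[i]             # keep w consistent
            alpha[i] = a_new
    return w, alpha

# toy, nearly separable data: label is the sign of the first feature plus noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=200))
w, alpha = dcd_l1_svm(X, y)
acc = np.mean(np.sign(X.dot(w)) == y)
```

The key point of the method is visible here: each coordinate step needs only one inner product with the maintained primal vector w, so the cost per update is O(nnz(x_i)) rather than a pass over all data.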
Dual Coordinate Descent Methods for Logistic Regression and Maximum Entropy Models
Cited by 20 (8 self)
"... Most optimization methods for logistic regression or maximum entropy solve the primal problem. They range from iterative scaling, coordinate descent, quasi-Newton, to truncated Newton. Less effort has been made to solve the dual problem. In contrast, for support vector machines (SVM), methods hav ..."
Dual Coordinate Descent Methods for Logistic Regression and Maximum Entropy Models
Machine Learning Journal (manuscript version of the entry above)
"... Most optimization methods for logistic regression or maximum entropy solve the primal problem. They range from iterative scaling, coordinate descent, quasi-Newton, to truncated Newton. Less effort has been made to solve the dual problem. In contrast, for linear support vector machines (S ..."
Regularization paths for generalized linear models via coordinate descent
2009
Cited by 691 (14 self)
"... We develop fast algorithms for estimation of generalized linear models with convex penalties. The models include linear regression, two-class logistic regression, and multinomial regression problems, while the penalties include ℓ1 (the lasso), ℓ2 (ridge regression), and mixtures of the two (the elastic net). The algorithms use cyclical coordinate descent, computed along a regularization path. The methods can handle large problems and can also deal efficiently with sparse features. In comparative timings we find that the new algorithms are considerably faster than competing methods."
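For the lasso, the cyclical coordinate descent this abstract refers to reduces to a soft-thresholding update per coefficient. The sketch below assumes standardized predictors and illustrates only the basic cycle; it is not the glmnet implementation, which adds warm starts along the λ path, covariance updates, and active-set screening.

```python
import numpy as np

def soft_threshold(z, lam):
    """S(z, lam) = sign(z) * max(|z| - lam, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, epochs=100):
    """Cyclical coordinate descent for (1/2n)||y - X b||^2 + lam * ||b||_1,
    assuming each column of X has mean 0 and variance 1."""
    n, d = X.shape
    beta = np.zeros(d)
    r = y - X.dot(beta)                   # residual, maintained incrementally
    for _ in range(epochs):
        for j in range(d):
            r += X[:, j] * beta[j]        # partial residual: remove coord j
            z = X[:, j].dot(r) / n        # univariate least-squares coefficient
            beta[j] = soft_threshold(z, lam)  # exact since (1/n)||x_j||^2 = 1
            r -= X[:, j] * beta[j]        # restore full residual
    return beta

# toy sparse regression: only coefficients 0 and 3 are nonzero
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
X = (X - X.mean(0)) / X.std(0)            # standardize columns
beta_true = np.array([3.0, 0.0, 0.0, -2.0, 0.0])
y = X.dot(beta_true) + 0.1 * rng.normal(size=100)
beta = lasso_cd(X, y, lam=0.1)
```

Maintaining the residual r makes each coordinate step O(n) (or O(nnz) for sparse columns), which is what makes the full cycle cheap enough to repeat along a grid of λ values.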
Convergence of a block coordinate descent method for nondifferentiable minimization
J. Optim. Theory Appl., 2001
Cited by 291 (3 self)
"... We study the convergence properties of a (block) coordinate descent method applied to minimize a nondifferentiable (nonconvex) function f(x1, ..., xN) with certain separability and regularity properties. Assuming that f is continuous on a compact level set, the subsequence convergence of the iterate ..."
Pathwise coordinate optimization
2007
Cited by 319 (17 self)
"... We consider "one-at-a-time" coordinate-wise descent algorithms for a class of convex optimization problems. An algorithm of this kind has been proposed for the L1-penalized regression (lasso) in the literature, but it seems to have been largely ignored. Indeed, it seems that coordinate-wise algorithms are not often used in convex optimization. We show that this algorithm is very competitive with the well-known LARS (or homotopy) procedure in large lasso problems, and that it can be applied to related methods such as the garotte and elastic net. It turns out that coordinate-wise descent does ..."
Coordinate descent
"... Minimize for x ∈ R^N the composite function F:

    min_{x ∈ R^N} F(x) = f(x) + ψ(x)

• f: R^N → R, convex, differentiable, not strongly convex
• ψ: R^N → R ∪ {+∞}, convex, separable: ψ(x) = ∑_{i=1}^{N} ψ_i(x_i) ..."
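Because ψ is separable, one coordinate of this composite objective can be minimized cheaply via a proximal step. The standard closed-form update for such problems is shown below; the notation (a per-coordinate Lipschitz constant L_i for ∇_i f) is an assumption of this sketch, since the indexed slides are truncated.

```latex
x_i^{k+1} \;=\; \operatorname{prox}_{\psi_i/L_i}\!\left( x_i^{k} - \frac{1}{L_i}\,\nabla_i f(x^{k}) \right),
\qquad
\operatorname{prox}_{\psi_i/L_i}(v) \;=\; \arg\min_{t \in \mathbb{R}} \;\frac{L_i}{2}\,(t - v)^2 + \psi_i(t).
```

With ψ_i(t) = λ|t|, this prox is exactly the soft-thresholding operator used in coordinate descent lasso solvers.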
Understanding Fault-Tolerant Distributed Systems
Communications of the ACM, 1993
Cited by 374 (23 self)
"... We propose a small number of basic concepts that can be used to explain the architecture of fault-tolerant distributed systems, and we discuss a list of architectural issues that we find useful to consider when designing or examining such systems. For each issue we present known solutions and design alternatives, we discuss their relative merits, and we give examples of systems which adopt one approach or the other. The aim is to introduce some order into the complex discipline of designing and understanding fault-tolerant distributed systems."
Stochastic Dual Coordinate Ascent Methods
2013
Cited by 86 (11 self)
"... Stochastic Gradient Descent (SGD) has become popular for solving large-scale supervised machine learning optimization problems such as SVM, due to its strong theoretical guarantees. While the closely related Dual Coordinate Ascent (DCA) method has been implemented in various software packages, it ..."
Model Selection Through Sparse Maximum Likelihood Estimation for Multivariate Gaussian or Binary Data
Journal of Machine Learning Research, 2008
Cited by 322 (2 self)
"... We consider the problem of estimating the parameters of a Gaussian or binary distribution in such a way that the resulting undirected graphical model is sparse. Our approach is to solve a maximum likelihood problem with an added ℓ1-norm penalty term. The problem as formulated is convex, but the memory requirements and complexity of existing interior point methods are prohibitive for problems with more than tens of nodes. We present two new algorithms for solving problems with at least a thousand nodes in the Gaussian case. Our first algorithm uses block coordinate descent, and can ..."