Results 1–10 of 14
Efficient Accelerated Coordinate Descent Methods and Faster Algorithms for Solving Linear Systems
In this paper we show how to accelerate randomized coordinate descent methods and achieve faster convergence rates without paying per-iteration costs in asymptotic running time. In particular, we show how to generalize and efficiently implement a method proposed by Nesterov, giving faster asymptotic running times for various algorithms that use standard coordinate descent as a black box. In addition to providing a proof of convergence for this new general method, we show that it is numerically stable, efficiently implementable, and, in certain regimes, asymptotically optimal. To highlight the computational power of this algorithm, we show how it can be used to create faster linear system solvers in several regimes:
• We show how this method achieves a faster asymptotic runtime than conjugate gradient for solving a broad class of symmetric positive definite systems of equations.
• We improve the best known asymptotic convergence guarantees for Kaczmarz methods, a popular technique for image reconstruction and solving overdetermined systems of equations, by accelerating a randomized algorithm of Strohmer and Vershynin.
• We achieve the best known running time for solving Symmetric Diagonally Dominant (SDD) systems of equations in the unit-cost RAM model, obtaining an O(m log^{3/2} n √(log log n) log((log n)/ε)) asymptotic running time by accelerating a recent solver by Kelner et al.
Beyond the independent interest of these solvers, we believe they highlight the versatility of the approach of this paper, and we hope that they will open the door for further algorithmic improvements in the future.
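The Kaczmarz result above builds on the Strohmer–Vershynin iteration, which projects the current iterate onto one randomly chosen equation at a time. Below is a minimal pure-Python sketch of that (unaccelerated) baseline that the paper accelerates; the function name, the fixed seed, and the toy 2×2 system are illustrative, not from the paper.

```python
import random

def randomized_kaczmarz(A, b, iters=3000, seed=0):
    """Strohmer-Vershynin randomized Kaczmarz: repeatedly project the
    current iterate onto a random hyperplane <a_i, x> = b_i, choosing
    row i with probability proportional to ||a_i||^2."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    norms = [sum(a * a for a in row) for row in A]  # squared row norms
    total = sum(norms)
    x = [0.0] * n
    for _ in range(iters):
        # sample a row index with probability ||a_i||^2 / ||A||_F^2
        r, i = rng.random() * total, 0
        while r > norms[i]:
            r -= norms[i]
            i += 1
        i = min(i, m - 1)  # guard against floating-point overshoot
        row = A[i]
        # orthogonal projection onto {x : <row, x> = b_i}
        resid = b[i] - sum(row[j] * x[j] for j in range(n))
        step = resid / norms[i]
        x = [x[j] + step * row[j] for j in range(n)]
    return x
```

Sampling rows proportionally to ‖a_i‖² is what yields the expected linear convergence rate of this baseline, which the paper's accelerated variant then improves.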
Navigating Central Path with Electrical Flows: From Flows to Matchings, and Back
 FOCS
, 2013
"... We present an Õ(m ..."
Matching the universal barrier without paying the costs: Solving linear programs with Õ(√rank) linear system solves
 CoRR
In this paper we present a new algorithm for solving linear programs that requires only Õ(√rank(A) · L) iterations, where A is the constraint matrix of a linear program with m constraints and n variables and L is the bit complexity of the linear program. Each iteration of our method consists of solving Õ(1) linear systems and additional nearly linear time computation. Our method improves upon the previous best iteration bound by a factor of Ω̃((m/rank(A))^{1/4}) for methods with polynomial time computable iterations and by Ω̃((m/rank(A))^{1/2}) for methods which solve at most Õ(1) linear systems in each iteration. Our method is parallelizable and amenable to linear algebraic techniques for accelerating the linear system solver. As such, up to polylogarithmic factors we either match or improve upon the best previous running times for solving linear programs in both depth and work for different ratios of m and rank(A). Moreover, our method matches up to polylogarithmic factors a theoretical limit established by Nesterov and Nemirovski in 1994 regarding the use of a “universal barrier” for interior point methods, thereby resolving a longstanding open question regarding the running time of polynomial time interior point methods for linear programming.
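The method above is a sophisticated interior point scheme; as background only, here is a minimal pure-Python sketch of the classical log-barrier path-following idea it descends from, on the one-dimensional toy LP "minimize x subject to 0 ≤ x ≤ 1". All parameter choices (t0, mu, the Newton-step count) are illustrative assumptions, not the paper's.

```python
def central_path_1d(t0=1.0, mu=1.5, t_max=1e8, newton_steps=20):
    """Follow the central path for: minimize x subject to 0 <= x <= 1.
    Barrier objective: f_t(x) = t*x - log(x) - log(1 - x).
    At each t, recenter with damped Newton steps, then increase t by mu;
    the central point x(t) approaches the optimum x* = 0 as t grows."""
    x, t = 0.5, t0
    while t < t_max:
        for _ in range(newton_steps):
            g = t - 1.0 / x + 1.0 / (1.0 - x)    # f_t'(x)
            h = 1.0 / x**2 + 1.0 / (1.0 - x)**2  # f_t''(x) > 0
            step = g / h
            # damp the step so the iterate stays strictly inside (0, 1)
            while not (0.0 < x - step < 1.0):
                step *= 0.5
            x -= step
        t *= mu
    return x
```

The geometric schedule for t and the Newton recentering are the two ingredients whose iteration count (how fast t can safely grow) is exactly what barrier theory, and the paper's Õ(√rank) bound, is about.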
A Novel, Simple Interpretation of Nesterov’s Accelerated Method as a Combination of Gradient and Mirror Descent. ArXiv e-prints, abs/1407.1537
, 2014
First-order methods play a central role in large-scale convex optimization. Even though many variations exist, each suited to a particular problem form, almost all such methods fundamentally rely on two types of algorithmic steps and two corresponding types of analysis: gradient-descent steps, which yield primal progress, and mirror-descent steps, which yield dual progress. In this paper, we observe that the performances of these two types of steps are complementary, so that faster algorithms can be designed by coupling the two steps and combining their analyses. In particular, we show how to obtain a conceptually simple interpretation of Nesterov’s accelerated gradient method [Nes83, Nes04, Nes05], a cornerstone algorithm in convex optimization. Nesterov’s method is the optimal first-order method for the class of smooth convex optimization problems. However, to the best of our knowledge, the proof of the fast convergence of Nesterov’s method has not found a clear interpretation and is still regarded by many as crucially relying on an “algebraic trick” [Jud13]. We apply our novel insights to express Nesterov’s algorithm as a natural coupling of gradient descent and mirror descent, and to write its proof of convergence as a simple combination of the convergence analyses of the two underlying steps. We believe that the complementary view of gradient descent and mirror descent proposed in this paper will prove very useful in the design of first-order methods, as it allows us to design fast algorithms in a conceptually easier way. For instance, our view greatly facilitates the adaptation of nontrivial variants of Nesterov’s method to specific scenarios, such as packing and covering problems [AO14b, AO14a].
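A minimal sketch of the coupling the abstract describes, for an L-smooth function with the Euclidean mirror map (so the mirror step is itself a gradient-like step with a growing step size). The schedule τ_k = 2/(k+2), α_{k+1} = (k+2)/(2L) follows the standard linear-coupling presentation; the function name and calling convention are illustrative.

```python
def accelerated_via_coupling(grad, L, x0, iters=500):
    """Nesterov-style acceleration written as a coupling of a gradient
    step (primal progress) on y and a Euclidean mirror-descent step
    (dual progress) on z, queried at their convex combination x."""
    y = list(x0)  # gradient-descent sequence
    z = list(x0)  # mirror-descent (dual) sequence
    n = len(x0)
    for k in range(iters):
        tau = 2.0 / (k + 2)                           # coupling weight
        x = [tau * z[i] + (1 - tau) * y[i] for i in range(n)]
        g = grad(x)
        y = [x[i] - g[i] / L for i in range(n)]       # gradient step, 1/L
        alpha = (k + 2) / (2.0 * L)                   # growing mirror step
        z = [z[i] - alpha * g[i] for i in range(n)]   # mirror step
    return y
```

Coupling the conservative gradient steps with the aggressive mirror steps is what turns gradient descent's O(L/k) rate into the accelerated O(L/k²) rate on smooth convex objectives.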
Following the Path of Least Resistance: An Õ(m√n) Algorithm for the Minimum Cost Flow Problem
, 2013
In this paper we present an Õ(m√n log^2 U) time algorithm for solving the maximum flow problem on directed graphs with m edges, n vertices, and capacity ratio U. This improves upon the previous fastest running time of O(m · min(n^{2/3}, m^{1/2}) · log(n^2/m) · log U) achieved over 15 years ago by Goldberg and Rao [8], and improves upon the previous best running times for solving dense directed unit-capacity graphs of O(min{m^{3/2}, m·n^{2/3}}) achieved by Even and Tarjan [6] over 35 years ago and a running time of O(m^{10/7}) achieved recently by Madry [21]. We achieve these results through the development and application of a new general interior point method that we believe is of independent interest. The number of iterations required by this algorithm is better than that predicted by analyzing the best self-concordant barrier of the feasible region. By applying this method to the linear programming formulations of maximum flow, minimum cost flow, and lossy generalized minimum cost flow, and applying analysis by Daitch and Spielman [5], we achieve a running time of Õ(m√n log^2(U/ε)) for these problems as well. Furthermore, our algorithm is parallelizable: using a recent nearly-linear-work, polylogarithmic-depth Laplacian system solver of Spielman and Peng [25], we achieve an algorithm with Õ(√n log^2(U/ε)) depth and Õ(m√n log^2(U/ε)) work for solving these problems.
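The paper's interior point method is well beyond a short sketch, but for readers new to the problem it studies, here is the classical Edmonds–Karp augmenting-path algorithm for maximum flow as a pure-Python baseline; its O(V·E²) combinatorial bound is the kind of running time that results like Õ(m√n log² U) improve upon. The dense adjacency-matrix representation and the example graph are illustrative.

```python
from collections import deque

def edmonds_karp(n, edges, s, t):
    """Maximum s-t flow by repeatedly augmenting along shortest
    (BFS) paths in the residual graph (Edmonds-Karp)."""
    cap = [[0] * n for _ in range(n)]  # residual capacities
    for u, v, c in edges:
        cap[u][v] += c
    flow = 0
    while True:
        # BFS for a shortest augmenting path s -> t
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            return flow  # no augmenting path left: flow is maximum
        # find the bottleneck capacity along the path, then push flow
        bottleneck, v = float('inf'), t
        while v != s:
            u = parent[v]
            bottleneck = min(bottleneck, cap[u][v])
            v = u
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck  # residual (reverse) capacity
            v = u
        flow += bottleneck
```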
Matching triangles and basing hardness on an extremely popular conjecture
 STOC'15
, 2015
Constructing Linear-Sized Spectral Sparsification in Almost-Linear Time
We present the first almost-linear time algorithm for constructing linear-sized spectral sparsification for graphs. This improves all previous constructions of linear-sized spectral sparsification, which require Ω(n^2) time [1], [2], [3]. A key ingredient in our algorithm is a novel combination of two techniques used in the literature for constructing spectral sparsification: random sampling by effective resistance [4], and adaptive constructions based on barrier functions [1], [3]. Keywords: algorithmic spectral graph theory; spectral sparsification
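The first ingredient named above, sampling by effective resistance, rests on the quantity R_eff(u,v) = (e_u − e_v)ᵀ L⁺ (e_u − e_v). Below is a small, dense pure-Python sketch that computes it by grounding one vertex of the graph Laplacian and solving the reduced system per edge; the paper, of course, uses fast approximate solvers rather than Gaussian elimination, and the function names here are illustrative.

```python
def solve(M, b):
    """Dense Gaussian elimination with partial pivoting (tiny systems only)."""
    n = len(b)
    A = [row[:] + [b[i]] for i, row in enumerate(M)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (A[r][n] - sum(A[r][k] * x[k] for k in range(r + 1, n))) / A[r][r]
    return x

def effective_resistances(n, edges):
    """R_eff(u, v) = (e_u - e_v)^T L^+ (e_u - e_v), computed by grounding
    vertex 0 and solving the reduced (n-1)x(n-1) Laplacian system."""
    L = [[0.0] * n for _ in range(n)]
    for u, v, w in edges:
        L[u][u] += w; L[v][v] += w
        L[u][v] -= w; L[v][u] -= w
    Lr = [row[1:] for row in L[1:]]  # ground vertex 0
    out = []
    for u, v, w in edges:
        b = [0.0] * (n - 1)
        if u > 0: b[u - 1] = 1.0
        if v > 0: b[v - 1] = -1.0
        x = [0.0] + solve(Lr, b)     # potentials, with vertex 0 at 0
        out.append(x[u] - x[v])
    return out
```

A sparsifier is then obtained by sampling each edge with probability proportional to w_e · R_eff(e) and reweighting the kept edges, which preserves the Laplacian quadratic form with high probability.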
Stochastic Spectral Descent for Discrete Graphical Models
Abstract—Interest in deep probabilistic graphical models has increased in recent years, due to their state-of-the-art performance on many machine learning applications. Such models are typically trained with the stochastic gradient method, which can take a significant number of iterations to converge. Since the computational cost of gradient estimation is prohibitive even for modestly sized models, training becomes slow and practically usable models are kept small. In this paper we propose a new, largely tuning-free algorithm to address this problem. Our approach derives novel majorization bounds based on the Schatten-∞ norm. Intriguingly, the minimizers of these bounds can be interpreted as gradient methods in a non-Euclidean space. We thus propose using a stochastic gradient method in non-Euclidean space. We both provide simple conditions under which our algorithm is guaranteed to converge, and demonstrate empirically that our algorithm leads to dramatically faster training and improved predictive ability compared to stochastic gradient descent for both directed and undirected graphical models.
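To give a concrete feel for "gradient methods in a non-Euclidean space": for vectors, steepest descent with respect to the ℓ∞ norm moves along sign(−g) with step ‖g‖₁/L. This is only a vector-space analogue of the paper's Schatten-∞ majorization steps (the paper works with matrix-valued parameters and stochastic gradients); the function name and the constants below are illustrative.

```python
def linf_steepest_descent(grad, L, x0, iters=3000):
    """Steepest descent w.r.t. the l_inf norm: every coordinate moves by
    the same magnitude ||g||_1 / L in the direction of -sign(g), where L
    is the smoothness constant measured in the l_inf norm."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        t = sum(abs(gi) for gi in g) / L  # step size: dual (l_1) norm / L
        x = [xi - t * (1 if gi > 0 else -1 if gi < 0 else 0)
             for xi, gi in zip(x, g)]
    return x
```

Loosely speaking, the update moves all coordinates by equal magnitude; the matrix analogue under the Schatten-∞ bound similarly equalizes singular values of the step, which is the geometry the paper exploits.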