Results 1–6 of 6
Convex relaxations of structured matrix factorizations, 2013
Abstract

Cited by 7 (3 self)
We consider the factorization of a rectangular matrix X into a positive linear combination of rank-one factors of the form uv⊤, where u and v belong to certain sets U and V that may encode specific structures of the factors, such as positivity or sparsity. In this paper, we show that computing the optimal decomposition is equivalent to computing a certain gauge function of X, and we provide a detailed analysis of these gauge functions and their polars. Since these gauge functions are typically hard to compute, we present semidefinite relaxations and several algorithms that may recover approximate decompositions with approximation guarantees. We illustrate our results with simulations on finding decompositions with elements in {0,1}. As side contributions, we present a detailed analysis of variational quadratic representations of norms as well as a new iterative basis pursuit algorithm that can deal with inexact first-order oracles.
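The decomposition model is easiest to see in a special case not requiring the paper's relaxations: when U and V are Euclidean unit balls, the gauge of X is the nuclear norm, and an optimal decomposition into positively weighted rank-one factors can be read off the SVD. A minimal numpy sketch of this special case only:

```python
import numpy as np

# Special case U = V = Euclidean unit ball: the gauge of X is the nuclear
# norm, and the SVD gives an optimal decomposition of X as a positive
# combination of rank-one factors u v^T.  Illustration of the model only,
# not the paper's semidefinite relaxations.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))

U, s, Vt = np.linalg.svd(X, full_matrices=False)
# Rebuild X as a nonnegative combination of rank-one outer products.
X_rec = sum(c * np.outer(u, v) for c, u, v in zip(s, U.T, Vt))

assert np.allclose(X, X_rec)                          # exact decomposition
assert np.all(s >= 0)                                 # nonnegative weights
assert np.isclose(s.sum(), np.linalg.norm(X, 'nuc'))  # gauge = nuclear norm
```

For structured sets U and V (e.g. sparse or {0,1}-valued factors) no such closed form exists, which is where the gauge-function analysis and relaxations come in.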
Globally Convergent Parallel MAP LP Relaxation Solver using the Frank-Wolfe Algorithm
Abstract

Cited by 1 (1 self)
Estimating the most likely configuration (MAP) is one of the fundamental tasks in probabilistic models. While MAP inference is typically intractable for many real-world applications, linear programming relaxations have proven very effective. Dual block-coordinate descent methods are among the most efficient solvers; however, they are prone to getting stuck in suboptimal points. Although subgradient approaches achieve global convergence, they are typically slower in practice. To improve convergence speed, algorithms which compute the steepest descent direction by solving a quadratic program have been proposed. In this paper we suggest decoupling the quadratic program based on the Frank-Wolfe approach. This allows us to obtain an efficient and easy-to-parallelize algorithm while retaining the global convergence properties. Our method proves superior to existing algorithms on a set of spin-glass models and protein design tasks.
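The Frank-Wolfe step underlying this solver can be sketched generically. Below is a minimal conditional gradient loop on the probability simplex for a toy quadratic objective; it illustrates only the generic method (a vertex oracle plus convex averaging), not the paper's decoupled MAP-LP solver:

```python
import numpy as np

# Generic Frank-Wolfe (conditional gradient) on the probability simplex,
# minimizing f(x) = 0.5 * ||x - b||^2.  The linear minimization oracle
# over the simplex returns a vertex (a basis vector), and the convex
# averaging step keeps every iterate feasible without any projection.
def frank_wolfe_simplex(grad, x0, steps=2000):
    x = x0.copy()
    for t in range(steps):
        g = grad(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0            # LMO: best vertex of the simplex
        gamma = 2.0 / (t + 2.0)          # standard open-loop step size
        x = (1 - gamma) * x + gamma * s  # convex combination stays feasible
    return x

b = np.array([0.1, 0.6, 0.3])            # target already lies in the simplex
x = frank_wolfe_simplex(lambda x: x - b, np.array([1.0, 0.0, 0.0]))
assert np.linalg.norm(x - b) < 0.1       # O(1/t) objective-gap guarantee
```

The appeal for MAP LP relaxations is that the linear minimization oracle decomposes over factors, which is what makes a parallel variant natural.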
A Greedy Framework for First-Order Optimization
Abstract

Cited by 1 (0 self)
Introduction. Recent work has shown many connections between conditional gradient and other first-order optimization methods, such as herding [3] and subgradient descent [2]. By considering a type of proximal conditional method, which we call boosted mirror descent (BMD), we are able to unify all of these algorithms into a single framework, which can be interpreted as taking successive argmins of a sequence of surrogate functions. Using a standard online learning analysis based on
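One concrete member of the family being unified is entropic mirror descent on the simplex (the exponentiated-gradient update). The sketch below is this standard textbook instance on a toy quadratic objective, not the BMD algorithm of the note:

```python
import numpy as np

# Entropic mirror descent on the probability simplex: a multiplicative
# (exponentiated-gradient) update followed by renormalization, which is
# the Bregman projection under the entropy mirror map.  Standard instance
# of mirror descent, shown on a toy quadratic objective.
def mirror_descent_simplex(grad, x0, eta=0.1, steps=2000):
    x = x0.copy()
    for _ in range(steps):
        x = x * np.exp(-eta * grad(x))   # multiplicative update
        x /= x.sum()                     # project back onto the simplex
    return x

b = np.array([0.2, 0.5, 0.3])            # interior minimizer of 0.5*||x-b||^2
x = mirror_descent_simplex(lambda x: x - b, np.full(3, 1.0 / 3.0))
assert np.allclose(x, b, atol=1e-4)
```

Viewed through the surrogate-function lens, each update is the argmin of a linearized objective plus an entropy Bregman term, which is the pattern the framework generalizes.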
• Assumptions – f : R^n → R Lipschitz-continuous ⇒ f* has compact support C
, 2013
Abstract
Wolfe’s universal algorithm www.di.ens.fr/~fbach/wolfe_anonymous.pdf
Conditional gradients everywhere
• Conditional gradient and subgradient method – Fenchel duality – Generalized conditional gradient and mirror descent
• Conditional gradient and greedy algorithms – Relationship with basis pursuit, matching pursuit
• Conditional gradient and herding – Properties of conditional gradient iterates – Relationships with sampling
Composite optimization problems: min_{x ∈ R^p} h(x) + f(Ax)
Larry Wasserman, Co-Chair, 2015
Abstract
analysis, randomized algorithms. To my pampering parents, C.P. and Nalini. This thesis makes fundamental computational and statistical advances in testing and estimation, making critical progress in the theory and application of classical statistical methods like classification, regression and hypothesis testing, and in understanding the relationships between them. Our work connects multiple fields in often counterintuitive and surprising ways, leading to new theory, new algorithms, and new insights, and ultimately to a cross-fertilization of varied fields like optimization, statistics and machine learning. The first of three thrusts has to do with active learning, a form of sequential learning from feedback-driven queries that often has a provable statistical advantage over passive learning. We unify concepts from two seemingly different areas, active learning and stochastic first-order optimization. We use this unified view to develop new lower bounds for stochastic optimization using tools from active learning, and new algorithms for active learning using ideas from optimization. We also study the effect of feature noise, or errors-in-variables, on