Results 1–10 of 31
Robust 1-Bit Compressive Sensing via Binary Stable Embeddings of Sparse Vectors
, 2011
Abstract

Cited by 85 (26 self)
The Compressive Sensing (CS) framework aims to ease the burden on analog-to-digital converters (ADCs) by reducing the sampling rate required to acquire and stably recover sparse signals. Practical ADCs not only sample but also quantize each measurement to a finite number of bits; moreover, there is an inverse relationship between the achievable sampling rate and the bit depth. In this paper, we investigate an alternative CS approach that shifts the emphasis from the sampling rate to the number of bits per measurement. In particular, we explore the extreme case of 1-bit CS measurements, which capture just their sign. Our results come in two flavors. First, we consider ideal reconstruction from noiseless 1-bit measurements and provide a lower bound on the best achievable reconstruction error. We also demonstrate that a large class of measurement mappings achieves this optimal bound. Second, we consider reconstruction robustness to measurement errors and noise and introduce the Binary ɛ-Stable Embedding (BɛSE) property, which characterizes the robustness of the measurement process to sign changes. We show that the same class of matrices that provides optimal noiseless performance also enables such a robust mapping. On the practical side, we introduce the Binary Iterative Hard Thresholding (BIHT) algorithm for signal reconstruction from 1-bit measurements, which offers state-of-the-art performance.
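The BIHT iteration described in the abstract is short enough to sketch directly. The following is a minimal illustration, not the authors' reference implementation: the step size τ = 1/m, the Gaussian measurement ensemble, and the problem sizes are assumptions made for the demo.

```python
import numpy as np

def biht(y_sign, A, k, iters=100):
    """Binary Iterative Hard Thresholding: estimate a k-sparse, unit-norm
    signal x from 1-bit measurements y_sign = sign(A @ x)."""
    m, n = A.shape
    tau = 1.0 / m                       # assumed step size
    x = np.zeros(n)
    for _ in range(iters):
        # Gradient step that pushes sign(A @ x) toward the observed signs.
        x = x + tau * (A.T @ (y_sign - np.sign(A @ x)))
        # Hard threshold: zero out all but the k largest-magnitude entries.
        x[np.argsort(np.abs(x))[:-k]] = 0.0
    nrm = np.linalg.norm(x)
    return x / nrm if nrm > 0 else x    # signs only fix x up to scale

# Demo: recover a 5-sparse unit vector from 512 one-bit measurements.
rng = np.random.default_rng(0)
n, m, k = 128, 512, 5
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
x_true /= np.linalg.norm(x_true)
A = rng.standard_normal((m, n))
x_hat = biht(np.sign(A @ x_true), A, k)
```

Since only signs are observed, the signal's amplitude is unrecoverable, which is why the estimate is normalized onto the unit sphere.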
Better Mini-Batch Algorithms via Accelerated Gradient Methods
Abstract

Cited by 29 (6 self)
Mini-batch algorithms have been proposed as a way to speed up stochastic convex optimization. We study how such algorithms can be improved using accelerated gradient methods. We provide a novel analysis, which shows how standard gradient methods may sometimes be insufficient to obtain a significant speed-up, and propose a novel accelerated gradient algorithm which deals with this deficiency, enjoys a uniformly superior guarantee, and works well in practice.
Projection-free Online Learning
Abstract

Cited by 26 (5 self)
The computational bottleneck in applying online learning to massive data sets is usually the projection step. We present efficient online learning algorithms that eschew projections in favor of much more efficient linear optimization steps using the Frank-Wolfe technique. We obtain a range of regret bounds for online convex optimization, with better bounds for specific cases such as stochastic online smooth convex optimization. Besides the computational advantage, other desirable features of our algorithms are that they are parameter-free in the stochastic case and produce sparse decisions. We apply our algorithms to computationally intensive applications of collaborative filtering, and show the theoretical improvements to be clearly visible on standard datasets.
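The projection-free idea is easy to illustrate: over the ℓ1 ball, the Frank-Wolfe linear-optimization oracle costs O(n), whereas a projection requires sorting. The sketch below is an assumed minimal variant (fixed 2/(t+2)-style step sizes, one oracle call per round), not the paper's exact algorithm.

```python
import numpy as np

def l1_linear_oracle(grad, radius=1.0):
    """argmin over the l1 ball of <grad, v>: a signed, scaled basis vector.
    This O(n) step replaces the projection entirely."""
    v = np.zeros_like(grad)
    i = np.argmax(np.abs(grad))
    v[i] = -radius * np.sign(grad[i])
    return v

def online_frank_wolfe(grad_fns, n, radius=1.0):
    """One Frank-Wolfe step per round: move toward the oracle's vertex with
    a decaying step size. Iterates stay feasible by convexity."""
    x = np.zeros(n)
    for t, grad_fn in enumerate(grad_fns, start=1):
        v = l1_linear_oracle(grad_fn(x), radius)
        gamma = 2.0 / (t + 2)
        x = (1 - gamma) * x + gamma * v   # convex combination stays in the ball
    return x

# Stochastic special case: every round reveals the same smooth loss.
target = np.array([0.3, -0.2, 0.0, 0.0, 0.1])   # optimum inside the ball
x_final = online_frank_wolfe([lambda x: 2 * (x - target)] * 200, 5)
```

Note that every iterate is a convex combination of at most t vertices of the ball, which is what makes the decisions sparse.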
Concentration-based guarantees for low-rank matrix reconstruction
In 24th Annual Conference on Learning Theory (COLT)
, 2011
Abstract

Cited by 19 (5 self)
We consider the problem of approximately reconstructing a partially observed, approximately low-rank matrix. This problem has received much attention lately, mostly using the trace norm as a surrogate for the rank. Here we study low-rank matrix reconstruction using both the trace norm and the less-studied max norm, and present reconstruction guarantees based on existing analyses of the Rademacher complexity of the unit balls of these norms. We show how these are superior in several ways to recently published guarantees based on specialized analysis.
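For context, the trace-norm reconstruction these guarantees concern can be sketched via proximal gradient with singular-value soft-thresholding; the regularization weight τ = 0.1 and the rank-1 demo below are illustrative assumptions, not anything taken from the paper.

```python
import numpy as np

def tracenorm_complete(M_obs, mask, tau=0.1, iters=300):
    """Minimize 0.5 * ||mask * (X - M_obs)||_F^2 + tau * ||X||_tr by proximal
    gradient (unit step: the data-fit gradient is 1-Lipschitz for 0/1 masks).
    The trace-norm prox soft-thresholds the singular values."""
    X = np.zeros_like(M_obs)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X - mask * (X - M_obs), full_matrices=False)
        X = (U * np.maximum(s - tau, 0.0)) @ Vt   # soft-threshold spectrum
    return X

# Demo: reconstruct a rank-1 matrix from roughly 70% of its entries.
rng = np.random.default_rng(2)
M = np.outer(rng.standard_normal(10), rng.standard_normal(10))
mask = (rng.random((10, 10)) < 0.7).astype(float)
X_hat = tracenorm_complete(mask * M, mask)
```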
Sparse Prediction with the k-Support Norm
In Neural Information Processing Systems (NIPS)
, 2014
Beating SGD: Learning SVMs in sublinear time
In NIPS
, 2011
Abstract

Cited by 13 (2 self)
We present an optimization approach for linear SVMs based on a stochastic primal-dual approach, where the primal step is akin to an importance-weighted SGD, and the dual step is a stochastic update on the importance weights. This yields an optimization method with a sublinear dependence on the training-set size, and the first method for learning linear SVMs whose runtime is less than the size of the training set required for learning!
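The primal-dual interplay can be sketched as follows. This is a deliberately simplified illustration: it recomputes all margins exactly, so each step costs linear time, whereas the paper's algorithm estimates margins by sampling, which is what makes its total runtime sublinear. The step sizes and the uniform-mixing stabilizer are assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(4)

def primal_dual_svm(X, y, T=3000, eta_w=0.05, eta_p=0.01):
    """Simplified primal-dual hinge-loss trainer: the primal step is an SGD
    update on an example drawn from the importance weights p; the dual step
    shifts p multiplicatively toward small-margin (hard) examples."""
    n, d = X.shape
    w = np.zeros(d)
    p = np.ones(n) / n
    for _ in range(T):
        i = rng.choice(n, p=p)
        if y[i] * (X[i] @ w) < 1.0:            # hinge subgradient step
            w += eta_w * y[i] * X[i]
        margins = y * (X @ w)                  # exact margins (linear time)
        p = p * np.exp(-eta_p * np.clip(margins, -1.0, 1.0))
        p = 0.99 * p / p.sum() + 0.01 / n      # renormalize; mix with uniform
    return w

# Demo on linearly separable data.
Xp = rng.standard_normal((200, 2)) + 2.0
Xn = rng.standard_normal((200, 2)) - 2.0
X = np.vstack([Xp, Xn])
y = np.array([1.0] * 200 + [-1.0] * 200)
w = primal_dual_svm(X, y)
```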
One-Pass AUC Optimization
Abstract

Cited by 7 (4 self)
AUC is an important performance measure, and many algorithms have been devoted to AUC optimization, mostly by minimizing a surrogate convex loss on a training data set. In this work, we focus on one-pass AUC optimization, which requires going through the training data only once without storing the entire training dataset; conventional online learning algorithms cannot be applied directly because AUC is measured by a sum of losses defined over pairs of instances from different classes. We develop a regression-based algorithm which only needs to maintain the first- and second-order statistics of the training data in memory, resulting in a storage requirement independent of the size of the training data. To efficiently handle high-dimensional data, we develop a randomized algorithm that approximates the covariance matrices by low-rank matrices. We verify, both theoretically and empirically, the effectiveness of the proposed algorithm.
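The key storage idea, keeping only first- and second-order statistics over a single pass, can be sketched as below. This is a simplified stand-in, not the paper's update: it derives a Fisher-style direction from the accumulated statistics, and the ridge term 1e-6 is an assumption for numerical stability.

```python
import numpy as np

def one_pass_scorer(X, y):
    """Single pass over the data: accumulate per-class streaming means and
    scatter matrices (Welford updates), so memory is O(d^2) regardless of
    the number of examples, then solve for a linear scoring direction."""
    d = X.shape[1]
    n = {1: 0, -1: 0}
    mu = {1: np.zeros(d), -1: np.zeros(d)}
    S = {1: np.zeros((d, d)), -1: np.zeros((d, d))}
    for x, c in zip(X, y):
        c = int(c)
        n[c] += 1
        delta = x - mu[c]
        mu[c] += delta / n[c]                 # streaming mean update
        S[c] += np.outer(delta, x - mu[c])    # streaming scatter update
    cov = (S[1] + S[-1]) / max(n[1] + n[-1] - 2, 1)
    return np.linalg.solve(cov + 1e-6 * np.eye(d), mu[1] - mu[-1])

def auc(scores, y):
    """Fraction of positive-negative pairs ranked correctly."""
    pos, neg = scores[y == 1], scores[y == -1]
    return float(np.mean(pos[:, None] > neg[None, :]))

# Demo: two Gaussian classes, shuffled into a single stream.
rng = np.random.default_rng(3)
Xp = rng.standard_normal((500, 5)) + np.array([3.0, 0, 0, 0, 0])
Xn = rng.standard_normal((500, 5))
X = np.vstack([Xp, Xn])
y = np.array([1] * 500 + [-1] * 500)
perm = rng.permutation(1000)
w = one_pass_scorer(X[perm], y[perm])
```

Note that the exact AUC computed here costs a pairwise comparison over the whole set; only the training pass is one-shot and constant-memory.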
Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm
In Advances in Neural Information Processing Systems
, 2014
Learning Optimally Sparse Support Vector Machines
Abstract

Cited by 6 (1 self)
We show how to train SVMs with an optimal guarantee on the number of support vectors (up to constants), and with sample complexity and training runtime bounds matching the best known for kernel SVM optimization (i.e., without any additional asymptotic cost beyond standard SVM training). Our method is simple to implement and works well in practice.
Collaborative Filtering with the Trace Norm: Learning, Bounding, and Transducing
Abstract

Cited by 6 (1 self)
Trace-norm regularization is a widely used and successful approach for collaborative filtering and matrix completion. However, its theoretical understanding is surprisingly weak, and despite previous attempts, there are no distribution-free, non-trivial learning guarantees currently known. In this paper, we bridge this gap by providing such guarantees, under mild assumptions which correspond to collaborative filtering as performed in practice. In fact, we claim that previous difficulties partially stemmed from a mismatch between the standard learning-theoretic modeling of collaborative filtering and its practical application. Our results also shed some light on the issue of collaborative filtering with bounded models, which enforce predictions to lie within a certain range. In particular, we provide experimental and theoretical evidence that such models lead to a modest yet significant improvement.