Results 1–10 of 43
Near-Optimal Signal Recovery From Random Projections: Universal Encoding Strategies?
, 2004
"... Suppose we are given a vector f in RN. How many linear measurements do we need to make about f to be able to recover f to within precision ɛ in the Euclidean (ℓ2) metric? Or more exactly, suppose we are interested in a class F of such objects— discrete digital signals, images, etc; how many linear m ..."
Abstract

Cited by 1513 (20 self)
Suppose we are given a vector f in ℝ^N. How many linear measurements do we need to make about f to be able to recover f to within precision ε in the Euclidean (ℓ2) metric? Or more exactly, suppose we are interested in a class F of such objects (discrete digital signals, images, etc.); how many linear measurements do we need to recover objects from this class to within accuracy ε? This paper shows that if the objects of interest are sparse or compressible, in the sense that the reordered entries of a signal f ∈ F decay like a power law (or if the coefficient sequence of f in a fixed basis decays like a power law), then it is possible to reconstruct f to within very high accuracy from a small number of random measurements. A typical result is as follows: we rearrange the entries of f (or its coefficients in a fixed basis) in decreasing order of magnitude |f|(1) ≥ |f|(2) ≥ ... ≥ |f|(N), and define the weak-ℓp ball as the class F of those elements whose entries obey the power decay law |f|(n) ≤ C · n^(−1/p). We take measurements 〈f, Xk〉, k = 1, ..., K, where the Xk are N-dimensional Gaussian ...
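As a concrete illustration of the measurement scheme described above, the following sketch (my own, with illustrative values of N, K, and p rather than anything taken from the paper) draws a signal whose reordered entries obey the power decay law, takes K Gaussian measurements, and reconstructs by ℓ1 minimization (basis pursuit), posed here as a linear program via scipy:

# A minimal sketch (not from the paper): power-law-compressible signal,
# K random Gaussian measurements <f, X_k>, reconstruction by basis
# pursuit. N, K, p are illustrative choices.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
N, K, p = 128, 60, 0.5

# Compressible signal: reordered entry magnitudes decay like n^(-1/p).
magnitudes = np.arange(1, N + 1) ** (-1.0 / p)
f = rng.permutation(magnitudes) * rng.choice([-1.0, 1.0], size=N)

# K Gaussian measurements y_k = <f, X_k>.
X = rng.standard_normal((K, N))
y = X @ f

# Basis pursuit: min ||g||_1 subject to X g = y, via the standard LP
# split g = u - v with u, v >= 0, minimizing sum(u) + sum(v).
c = np.ones(2 * N)
A_eq = np.hstack([X, -X])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * N))
g = res.x[:N] - res.x[N:]

print("relative l2 error:", np.linalg.norm(f - g) / np.linalg.norm(f))

With these parameters the recovery error is small because the n^(−2) decay makes f well approximated by its few largest entries, which is exactly the compressibility the theorem exploits.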
Minimax rates of estimation for high-dimensional linear regression over ℓq-balls
, 2009
"... Abstract—Consider the highdimensional linear regression model,where is an observation vector, is a design matrix with, is an unknown regression vector, and is additive Gaussian noise. This paper studies the minimax rates of convergence for estimating in eitherloss andprediction loss, assuming tha ..."
Abstract

Cited by 97 (19 self)
Abstract—Consider the high-dimensional linear regression model y = Xβ* + w, where y ∈ ℝ^n is an observation vector, X ∈ ℝ^(n×d) is a design matrix with d > n, β* ∈ ℝ^d is an unknown regression vector, and w is additive Gaussian noise. This paper studies the minimax rates of convergence for estimating β* in either ℓ2-loss or ℓ2-prediction loss, assuming that β* belongs to an ℓq-ball for some q ∈ [0, 1]. It is shown that under suitable regularity conditions on the design matrix, the minimax optimal rate in ℓ2-loss and ℓ2-prediction loss scales as Θ(Rq ((log d)/n)^(1−q/2)). The analysis in this paper reveals that conditions on the design matrix enter into the rates for ℓ2-error and ℓ2-prediction error in complementary ways in the upper and lower bounds. Our proofs of the lower bounds are information-theoretic in nature, based on Fano's inequality and results on the metric entropy of the ℓq-balls, whereas our proofs of the upper bounds are constructive, involving direct analysis of least squares over ℓq-balls. For the special case q = 0, corresponding to models with an exact sparsity constraint, our results show that although computationally efficient ℓ1-based methods can achieve the minimax rates up to constant factors, they require slightly stronger assumptions on the design matrix than optimal algorithms involving least squares over the ℓ0-ball. Index Terms—Compressed sensing, minimax techniques, regression analysis.
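To make the rate concrete, here is a small simulation (my own illustration, not the paper's experiments): in the exact-sparsity case q = 0, the squared ℓ2-error of the Lasso is compared against the standard σ²·k·log(d/k)/n scaling as n grows. The regularization level σ√(2 log d / n) is a common theoretical default assumed here, and sklearn's Lasso stands in for generic ℓ1-based methods.

# A minimal sketch comparing Lasso error to the k log(d/k)/n rate for
# an exactly k-sparse regression vector. All parameters are illustrative.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
d, k, sigma = 500, 5, 1.0
beta = np.zeros(d)
beta[:k] = 1.0  # exactly k-sparse truth

for n in [100, 200, 400, 800]:
    X = rng.standard_normal((n, d))
    y = X @ beta + sigma * rng.standard_normal(n)
    lam = sigma * np.sqrt(2 * np.log(d) / n)  # standard theoretical choice
    fit = Lasso(alpha=lam, max_iter=50000).fit(X, y)
    err2 = np.sum((fit.coef_ - beta) ** 2)
    rate = sigma**2 * k * np.log(d / k) / n
    print(f"n={n:4d}  squared l2 error={err2:.4f}  theoretical rate={rate:.4f}")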
Combining forecasting procedures: some theoretical results
 Econometric Theory
, 2004
"... We study some methods of combining procedures for forecasting a continuous random variable. Statistical risk bounds under the square error loss are obtained under mild distributional assumptions on the future given the current outside information and the past observations. The risk bounds show that ..."
Abstract

Cited by 39 (6 self)
We study some methods of combining procedures for forecasting a continuous random variable. Statistical risk bounds under the squared error loss are obtained under mild distributional assumptions on the future given the current outside information and the past observations. The risk bounds show that the combined forecast automatically achieves the best performance among the candidate procedures up to a constant factor and an additive penalty term. In terms of the rate of convergence, the combined forecast performs as well as if one knew in advance which candidate forecasting procedure is the best. Empirical studies suggest that combining procedures can sometimes improve forecasting accuracy compared to the original procedures. Risk bounds are derived to theoretically quantify the potential gain from, and the price of, linearly combining forecasts. The result supports the empirical finding that it is not automatically a good idea to combine forecasts: blind combining can degrade performance dramatically owing to the large variability in estimating the best combining weights. An automated combining method is shown in theory to achieve a balance between the potential gain and the complexity penalty (the price of combining); to take advantage (if any) of sparse combining; and to maintain the best performance (in rate) among the candidate forecasting procedures if linear or sparse combining does not help.
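As one concrete instance of such a combining scheme (exponentially weighted averaging under squared error; the specific algorithm, learning rate, and data below are my illustrative assumptions, not the paper's exact procedure), the following sketch combines three candidate forecasters online and tracks the best of them:

# A minimal sketch of forecast combination by exponential weighting.
import numpy as np

rng = np.random.default_rng(2)
T = 500
truth = np.sin(np.arange(T) / 10.0)

# Three candidate forecasters of varying quality.
forecasts = np.stack([
    truth + 0.1 * rng.standard_normal(T),         # good
    truth + 0.5 * rng.standard_normal(T),         # mediocre
    0.8 * truth + 1.0 * rng.standard_normal(T),   # poor, biased
])

eta = 2.0           # learning rate (tuning constant)
log_w = np.zeros(3)
combined = np.empty(T)
for t in range(T):
    # Weights depend only on losses observed before time t.
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    combined[t] = w @ forecasts[:, t]
    # Exponentiated update by the squared-error loss at time t.
    log_w -= eta * (forecasts[:, t] - truth[t]) ** 2

mse = lambda f: np.mean((f - truth) ** 2)
print("candidate MSEs:", [round(mse(f), 4) for f in forecasts])
print("combined MSE:  ", round(mse(combined), 4))

The combined MSE lands near that of the best candidate, illustrating the "as well as if one knew the best procedure in advance" behaviour the risk bounds formalize.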
Entropy and approximation numbers of embeddings of function spaces with Muckenhoupt weights, I
 Rev. Mat. Complut.
"... We study compact embeddings for weighted spaces of Besov and TriebelLizorkin type where the weight belongs to some Muckenhoupt Ap class. For weights of purely polynomial growth, both near some singular point and at infinity, we obtain sharp asymptotic estimates for the entropy numbers and approxima ..."
Abstract

Cited by 22 (4 self)
We study compact embeddings for weighted spaces of Besov and Triebel-Lizorkin type where the weight belongs to some Muckenhoupt Ap class. For weights of purely polynomial growth, both near some singular point and at infinity, we obtain sharp asymptotic estimates for the entropy numbers and approximation numbers of this embedding. The main tool is a discretization in terms of wavelet bases. Key words: wavelet bases, Muckenhoupt weighted function spaces, compact embeddings, entropy numbers, approximation numbers.
Kernel-Dependent Support Vector Error Bounds
, 1999
"... Model selection in Support Vector machines is usually carried out by minimizing the quotient of the radius of the smallest enclosing sphere of the data and the observed margin on the training set. We provide a new criterion taking the distribution within that sphere into account by considering the ..."
Abstract

Cited by 19 (3 self)
Model selection in Support Vector machines is usually carried out by minimizing the quotient of the radius of the smallest enclosing sphere of the data and the observed margin on the training set. We provide a new criterion taking the distribution within that sphere into account by considering the eigenvalue distribution of the Gram matrix of the data. Experimental results on real-world data show that this new criterion provides a good prediction of the shape of the curve relating generalization error to kernel width.

1 Introduction

Support Vector (SV) machines traditionally carry out model selection by minimizing the ratio between the radius of the smallest sphere enclosing the data in feature space and the width of the margin 1/‖w‖, since this corresponds to a classifier with minimal fat-shattering dimension [4]. Whilst in general capturing the correct scaling behaviour in terms of the weight vector w, this approach has the shortcoming that it completely ignores the information abo...
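For reference, here is a sketch of the classical radius/margin quotient R²·‖w‖² that the abstract starts from, computed from the Gram matrix of the data (my own illustration; the paper's eigenvalue-based refinement is not reproduced, and the crude radius estimate via the distance to the feature-space centroid is an assumption of this sketch):

# A minimal sketch: R^2 * ||w||^2 across RBF kernel widths, where
# 1/||w|| is the margin and R estimates the enclosing-sphere radius.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=5, random_state=3)
ys = 2 * y - 1  # labels in {-1, +1}

for gamma in [0.01, 0.1, 1.0, 10.0]:
    K = rbf_kernel(X, X, gamma=gamma)
    svc = SVC(kernel="precomputed", C=10.0).fit(K, ys)
    # ||w||^2 = sum_ij (y_i a_i)(y_j a_j) K_ij over support vectors;
    # sklearn exposes the signed duals y_i a_i as dual_coef_.
    a = svc.dual_coef_.ravel()
    sv = svc.support_
    w2 = a @ K[np.ix_(sv, sv)] @ a
    # Crude radius estimate: max feature-space distance to the centroid,
    # computable from the Gram matrix alone.
    d2 = np.diag(K) - 2 * K.mean(axis=1) + K.mean()
    R2 = d2.max()
    print(f"gamma={gamma:5.2f}  R^2*||w||^2 = {R2 * w2:10.2f}")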
The Gelfand widths of ℓp-balls for 0 < p ≤ 1
 J. Complexity
"... We provide sharp lower and upper bounds for the Gelfand widths of ℓpballs in the Ndimensional ℓ N qspace for 0 < p ≤ 1 and p < q ≤ 2. Such estimates are highly relevant to the novel theory of compressive sensing, and our proofs rely on methods from this area. ..."
Abstract

Cited by 18 (9 self)
We provide sharp lower and upper bounds for the Gelfand widths of ℓp-balls in the N-dimensional ℓ_q^N space for 0 < p ≤ 1 and p < q ≤ 2. Such estimates are highly relevant to the novel theory of compressive sensing, and our proofs rely on methods from this area.
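For orientation, a reminder of the definition involved (standard, and my own addition rather than part of the abstract): writing B_p^N for the unit ℓp-ball, the Gelfand width of order m is

d^m(B_p^N, \ell_q^N) = \inf_{\substack{V \subseteq \mathbb{R}^N \\ \operatorname{codim} V \le m}} \; \sup_{x \in B_p^N \cap V} \|x\|_q,
\qquad B_p^N = \{ x \in \mathbb{R}^N : \|x\|_p \le 1 \}.

Up to constants, this is the smallest worst-case ℓq reconstruction error achievable over B_p^N from m linear measurements, which is the connection to compressive sensing that the abstract mentions.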
Aggregating Regression Procedures for a Better Performance
 Bernoulli
, 1999
"... Methods have been proposed to linearly combine candidate regression procedures to improve estimation accuraccy. Applications of these methods in many examples are very succeesful, pointing to the great potential of combining procedures. A fundamental question regarding combining procedure is: What i ..."
Abstract

Cited by 17 (3 self)
Methods have been proposed to linearly combine candidate regression procedures to improve estimation accuracy. Applications of these methods in many examples are very successful, pointing to the great potential of combining procedures. A fundamental question regarding combining procedures is: what is the potential gain, and how much does one need to pay for it? A partial answer to this question is obtained by Juditsky and Nemirovski (1996) for the case when a large number of procedures are to be combined. We attempt to give a more general solution. Under an ℓ1 constraint on the linear coefficients, we show that for pursuing the best linear combination over n procedures, in terms of rate of convergence under the squared L2 loss, one can pay a price of order O(log n / n^(1−ξ)) when 0 < ξ < 1/2 and a price of order O((log n / n)^(1/2)) when 1/2 ≤ ξ ≤ 1. These rates cannot be improved or essentially improved in a uniform sense. This result suggests that one should be cautious...
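A minimal sketch of what ℓ1-constrained linear combination looks like in practice (my own illustration: a projected-gradient solver with constraint radius 1 over synthetic candidate procedures, not the estimator analyzed in the paper):

# Linear aggregation of candidate predictions under an l1 constraint
# on the combining weights, via projected gradient descent.
import numpy as np

def project_l1(v, radius=1.0):
    """Euclidean projection of v onto the l1 ball of the given radius."""
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

rng = np.random.default_rng(4)
n_obs, n_procs = 300, 20
truth = rng.standard_normal(n_obs)
# Candidate procedures: mostly noise, with two informative ones.
preds = truth[:, None] * 0.2 + rng.standard_normal((n_obs, n_procs))
preds[:, 0] = truth + 0.1 * rng.standard_normal(n_obs)
preds[:, 1] = truth + 0.2 * rng.standard_normal(n_obs)

w = np.zeros(n_procs)
step = 1.0 / np.linalg.norm(preds, 2) ** 2  # 1/L for the quadratic loss
for _ in range(2000):
    grad = preds.T @ (preds @ w - truth)
    w = project_l1(w - step * grad, radius=1.0)

print("weights (rounded):", np.round(w, 2))
print("combined MSE:", np.mean((preds @ w - truth) ** 2))

The ℓ1 constraint concentrates the weights on the few informative candidates, which is the "sparse combining" regime the abstract refers to.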
Widths of embeddings in function spaces
, 2007
"... We study the approximation, Gelfand and Kolmogorov numbers of embeddings in function spaces of Besov and TriebelLizorkin type. Our aim here is to provide sharp estimates in several cases left open in the literature and give a complete overview of the known results. We also add some historical remar ..."
Abstract

Cited by 15 (0 self)
We study the approximation, Gelfand and Kolmogorov numbers of embeddings in function spaces of Besov and Triebel-Lizorkin type. Our aim here is to provide sharp estimates in several cases left open in the literature and give a complete overview of the known results. We also add some historical remarks.