Results 1  10
of
14
Ideal spatial adaptation by wavelet shrinkage
 Biometrika
, 1994
"... With ideal spatial adaptation, an oracle furnishes information about how best to adapt a spatially variable estimator, whether piecewise constant, piecewise polynomial, variable knot spline, or variable bandwidth kernel, to the unknown function. Estimation with the aid of an oracle o ers dramatic ad ..."
Abstract

Cited by 1251 (5 self)
 Add to MetaCart
(Show Context)
With ideal spatial adaptation, an oracle furnishes information about how best to adapt a spatially variable estimator, whether piecewise constant, piecewise polynomial, variable knot spline, or variable bandwidth kernel, to the unknown function. Estimation with the aid of an oracle o ers dramatic advantages over traditional linear estimation by nonadaptive kernels � however, it is a priori unclear whether such performance can be obtained by a procedure relying on the data alone. We describe a new principle for spatiallyadaptive estimation: selective wavelet reconstruction. Weshowthatvariableknot spline ts and piecewisepolynomial ts, when equipped with an oracle to select the knots, are not dramatically more powerful than selective wavelet reconstruction with an oracle. We develop a practical spatially adaptive method, RiskShrink, which works by shrinkage of empirical wavelet coe cients. RiskShrink mimics the performance of an oracle for selective wavelet reconstruction as well as it is possible to do so. A new inequality inmultivariate normal decision theory which wecallthe oracle inequality shows that attained performance di ers from ideal performance by at most a factor 2logn, where n is the sample size. Moreover no estimator can give a better guarantee than this. Within the class of spatially adaptive procedures, RiskShrink is essentially optimal. Relying only on the data, it comes within a factor log 2 n of the performance of piecewise polynomial and variableknot spline methods equipped with an oracle. In contrast, it is unknown how or if piecewise polynomial methods could be made to function this well when denied access to an oracle and forced to rely on data alone.
Minimax Estimation via Wavelet Shrinkage
, 1992
"... We attempt to recover an unknown function from noisy, sampled data. Using orthonormal bases of compactly supported wavelets we develop a nonlinear method which works in the wavelet domain by simple nonlinear shrinkage of the empirical wavelet coe cients. The shrinkage can be tuned to be nearly minim ..."
Abstract

Cited by 322 (32 self)
 Add to MetaCart
(Show Context)
We attempt to recover an unknown function from noisy, sampled data. Using orthonormal bases of compactly supported wavelets we develop a nonlinear method which works in the wavelet domain by simple nonlinear shrinkage of the empirical wavelet coe cients. The shrinkage can be tuned to be nearly minimax over any member of a wide range of Triebel and Besovtype smoothness constraints, and asymptotically minimax over Besov bodies with p q. Linear estimates cannot achieve even the minimax rates over Triebel and Besov classes with p <2, so our method can signi cantly outperform every linear method (kernel, smoothing spline, sieve,:::) in a minimax sense. Variants of our method based on simple threshold nonlinearities are nearly minimax. Our method possesses the interpretation of spatial adaptivity: it reconstructs using a kernel which mayvary in shape and bandwidth from point to point, depending on the data. Least favorable distributions for certain of the Triebel and Besov scales generate objects with sparse wavelet transforms. Many real objects have similarly sparse transforms, which suggests that these minimax results are relevant for practical problems. Sequels to this paper discuss practical implementation, spatial adaptation properties and applications to inverse problems.
Wavelet shrinkage: asymptopia
 Journal of the Royal Statistical Society, Ser. B
, 1995
"... Considerable e ort has been directed recently to develop asymptotically minimax methods in problems of recovering in nitedimensional objects (curves, densities, spectral densities, images) from noisy data. A rich and complex body of work has evolved, with nearly or exactly minimax estimators bein ..."
Abstract

Cited by 297 (36 self)
 Add to MetaCart
(Show Context)
Considerable e ort has been directed recently to develop asymptotically minimax methods in problems of recovering in nitedimensional objects (curves, densities, spectral densities, images) from noisy data. A rich and complex body of work has evolved, with nearly or exactly minimax estimators being obtained for a variety of interesting problems. Unfortunately, the results have often not been translated into practice, for a variety of reasons { sometimes, similarity to known methods, sometimes, computational intractability, and sometimes, lack of spatial adaptivity. We discuss a method for curve estimation based on n noisy data; one translates the empirical wavelet coe cients towards the origin by an amount p p 2 log(n) = n. The method is di erent from methods in common use today, is computationally practical, and is spatially adaptive; thus it avoids a number of previous objections to minimax estimators. At the same time, the method is nearly minimax for a wide variety of loss functions { e.g. pointwise error, global error measured in L p norms, pointwise and global error in estimation of derivatives { and for a wide range of smoothness classes, including standard Holder classes, Sobolev classes, and Bounded Variation. This is amuch broader nearoptimality than anything previously proposed in the minimax literature. Finally, the theory underlying the method is interesting, as it exploits a correspondence between statistical questions and questions of optimal recovery and informationbased complexity.
A SELECTIVE OVERVIEW OF VARIABLE SELECTION IN HIGH DIMENSIONAL FEATURE SPACE
, 2010
"... High dimensional statistical problems arise from diverse fields of scientific research and technological development. Variable selection plays a pivotal role in contemporary statistical learning and scientific discoveries. The traditional idea of best subset selection methods, which can be regarded ..."
Abstract

Cited by 70 (6 self)
 Add to MetaCart
High dimensional statistical problems arise from diverse fields of scientific research and technological development. Variable selection plays a pivotal role in contemporary statistical learning and scientific discoveries. The traditional idea of best subset selection methods, which can be regarded as a specific form of penalized likelihood, is computationally too expensive for many modern statistical applications. Other forms of penalized likelihood methods have been successfully developed over the last decade to cope with high dimensionality. They have been widely applied for simultaneously selecting important variables and estimating their effects in high dimensional statistical inference. In this article, we present a brief account of the recent developments of theory, methods, and implementations for high dimensional variable selection. What limits of the dimensionality such methods can handle, what the role of penalty functions is, and what the statistical properties are rapidly drive the advances of the field. The properties of nonconcave penalized likelihood and its roles in high dimensional statistical modeling are emphasized. We also review some recent advances in ultrahigh dimensional variable selection, with emphasis on independence screening and twoscale methods.
Statistical challenges with high dimensionality: feature selection in knowledge discovery
, 2006
"... ..."
(Show Context)
Minimax bayes, asymptotic minimax and sparse wavelet priors, in
 Sciences Paris (A
, 1994
"... Pinsker(1980) gave a precise asymptotic evaluation of the minimax mean squared error of estimation of a signal in Gaussian noise when the signal is known a priori to lie in a compact ellipsoid in Hilbert space. This `Minimax Bayes ' method can be applied to a variety of global nonparametric es ..."
Abstract

Cited by 46 (10 self)
 Add to MetaCart
Pinsker(1980) gave a precise asymptotic evaluation of the minimax mean squared error of estimation of a signal in Gaussian noise when the signal is known a priori to lie in a compact ellipsoid in Hilbert space. This `Minimax Bayes ' method can be applied to a variety of global nonparametric estimation settings with parameter spaces far from ellipsoidal. For example it leads to a theory of exact asymptotic minimax estimation over norm balls in Besov and Triebel spaces using simple coordinatewise estimators and wavelet bases. This paper outlines some features of the method common to several applications. In particular, we derive new results on the exact asymptotic minimax risk over weak `p balls in Rn as n!1, and also for a class of `local ' estimators on the Triebel scale. By its very nature, the method reveals the structure of asymptotically least favorable distributions. Thus wemaysimulate `least favorable ' sample paths. We illustrate this for estimation of a signal in Gaussian white noise over norm balls in certain Besov spaces. In wavelet bases, when p<2, the least favorable priors are sparse, and the resulting sample paths strikingly di erent from those observed in Pinsker's ellipsoidal setting (p =2).
Neoclassical minimax problems, thresholding and adaptive function estimation Bernoulli
, 1996
"... 2 We study the problem of estimating from data Y N ( ; ) under squarederror loss. We de ne three new scalar minimax problems in which the risk is weighted by the size of. Simple thresholding gives asymptotically minimax estimates of all three problems. We indicate the relationships of the new probl ..."
Abstract

Cited by 22 (1 self)
 Add to MetaCart
2 We study the problem of estimating from data Y N ( ; ) under squarederror loss. We de ne three new scalar minimax problems in which the risk is weighted by the size of. Simple thresholding gives asymptotically minimax estimates of all three problems. We indicate the relationships of the new problems to each other and to two other neoclassical problems: the problems of the bounded normal mean and of the riskconstrained normal mean. Via the wavelet transform, these results have implications for adaptive function estimation, to: (1) estimating functions of unknown type and degree of smoothness in a global ` 2 norm; (2) estimating a function of unknown degree of local Holder smoothness at a xed point. In setting (2), the scalar minimax results imply: (a) that it is not possible to fully adapt to unknown degree of smoothness { adaptation imposes a performance cost; and (b) that simple thresholding of the empirical wavelet transform gives an estimate of a function at a xed point which is, to within constants, optimally adaptive to unknown degree of smoothness.
WaveShrink: Shrinkage Functions and Thresholds
 Proc. SPIE
, 1995
"... Donoho and Johnstone's WaveShrink procedure has proven valuable for signal denoising and nonparametric regression. WaveShrink is based on the principle of shrinking wavelet coefficients towards zero to remove noise. WaveShrink has very broad asymptotic nearoptimality properties. In this pape ..."
Abstract

Cited by 7 (1 self)
 Add to MetaCart
(Show Context)
Donoho and Johnstone's WaveShrink procedure has proven valuable for signal denoising and nonparametric regression. WaveShrink is based on the principle of shrinking wavelet coefficients towards zero to remove noise. WaveShrink has very broad asymptotic nearoptimality properties. In this paper, we introduce a new shrinkage scheme, semisoft, which generalizes hard and soft shrinkage. We study the properties of the shrinkage functions, and demonstrate that semisoft shrinkage offers advantages over both hard shrinkage (uniformly smaller risk and less sensitivity to small perturbations in the data) and soft shrinkage (smaller bias and overall L 2 risk). We also construct approximate pointwise confidence intervals for WaveShrink and address the problem of threshold selection. Key Words: WaveShrink; Bias Estimation; Confidence Interval; Minimax Thresholds; Nonparametric Regression; SemiSoft Shrinkage; Signal Denoising; Variance Estimation; Wavelet Transform; 1 INTRODUCTION Suppose we ...
Minimax Risk over. . .
, 1994
"... Consider estimating the mean vector ` from data N n (`; oe 2 I) with l q norm loss, q 1, when ` is known to lie in an ndimensional l p ball, p 2 (0; 1). For large n, the ratio of minimax linear risk to minimax risk can be arbitrarily large if p ! q. Obvious exceptions aside, the limiting rati ..."
Abstract

Cited by 2 (2 self)
 Add to MetaCart
Consider estimating the mean vector ` from data N n (`; oe 2 I) with l q norm loss, q 1, when ` is known to lie in an ndimensional l p ball, p 2 (0; 1). For large n, the ratio of minimax linear risk to minimax risk can be arbitrarily large if p ! q. Obvious exceptions aside, the limiting ratio equals 1 only if p = q = 2. Our arguments are mostly indirect, involving a reduction to a univariate Bayes minimax problem. When p ! q, simple nonlinear coordinatewise threshold rules are asymptotically minimax at small signaltonoise ratios, and within a bounded factor of asymptotic minimaxity in general. Our results are basic to a theory of estimation in Besov spaces using wavelet bases (to appear elsewhere). Key Words. Minimax Decision Theory. Minimax Bayes Estimation. Fisher Information. Nonlinear estimation. White noise model. Loss convexity. Estimating a bounded normal mean. Running Title: Minimax risk over l p balls. AMS 1980 Subject Classification (1985 Rev): Primary: 62C20...