Results 21–30 of 183
Innovated higher criticism for detecting sparse signals in correlated noise
 Ann. Statist
, 2010
"... Higher Criticism is a method for detecting signals that are both sparse and weak. Although first proposed in cases where the noise variables are independent, Higher Criticism also has reasonable performance in settings where those variables are correlated. In this paper we show that, by exploiting t ..."
Abstract

Cited by 41 (8 self)
Higher Criticism is a method for detecting signals that are both sparse and weak. Although first proposed in cases where the noise variables are independent, Higher Criticism also has reasonable performance in settings where those variables are correlated. In this paper we show that performance can be improved by a modified approach that exploits the nature of the correlation. Indeed, it turns out that the case of independent noise is the most difficult of all, from a statistical viewpoint, and that more accurate signal detection (for a given level of signal sparsity and strength) can be obtained when correlation is present. We characterize the advantages of correlation by showing how to incorporate them into the definition of an optimal detection boundary. The boundary has particularly attractive properties when correlation decays at a polynomial rate or the correlation matrix is Toeplitz.
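As background for this entry, the standard Higher Criticism statistic of Donoho and Jin can be sketched as below; the innovated variant studied in this paper additionally transforms the data using the correlation structure, which this minimal sketch does not attempt to reproduce. The function name and the scan fraction `alpha0` are illustrative choices.

```python
import math

def hc_statistic(pvalues, alpha0=0.5):
    """Standard Higher Criticism statistic: scan the smallest fraction
    alpha0 of the sorted p-values and return
    max_i sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i)))."""
    n = len(pvalues)
    best = float("-inf")
    for i, p in enumerate(sorted(pvalues), start=1):
        if i > alpha0 * n:
            break
        if 0 < p < 1:  # guard the denominator
            hc = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1 - p))
            best = max(best, hc)
    return best

# Under the global null, p-values look Uniform(0,1) and HC stays moderate;
# a few very small p-values (sparse weak signals) push it up sharply.
null_ps = [(i + 0.5) / 100 for i in range(100)]          # ideal uniform grid
signal_ps = [1e-6] * 5 + [(i + 0.5) / 100 for i in range(95)]
print(hc_statistic(null_ps) < hc_statistic(signal_ps))   # True
```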
Estimating the null and the proportion of nonnull effects in large-scale multiple comparisons
 J. Amer. Statist. Assoc
, 2007
"... An important issue raised by Efron [7] in the context of largescale multiple comparisons is that in many applications the usual assumption that the null distribution is known is incorrect, and seemingly negligible differences in the null may result in large differences in subsequent studies. This s ..."
Abstract

Cited by 39 (6 self)
An important issue raised by Efron [7] in the context of large-scale multiple comparisons is that in many applications the usual assumption that the null distribution is known is incorrect, and seemingly negligible differences in the null may result in large differences in subsequent studies. This suggests that a careful study of estimation of the null is indispensable. In this paper, we consider the problem of estimating a null normal distribution, and a closely related problem, estimation of the proportion of nonnull effects. We develop an approach based on the empirical characteristic function and Fourier analysis. The estimators are shown to be uniformly consistent over a wide class of parameters. Numerical performance of the estimators is investigated using both simulated and real data. In particular, we apply our …
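The characteristic-function idea behind this entry can be illustrated with a toy sketch: if most observations come from a N(u, σ²) null, the empirical characteristic function at a fixed frequency t is approximately exp(iut − σ²t²/2), so the null parameters can be read off from its modulus and argument. This is only the core heuristic under the assumption that the non-null fraction is negligible; the paper's actual estimators choose t much more carefully.

```python
import cmath
import math
import random

def estimate_null(xs, t=1.0):
    """Toy null estimation from the empirical characteristic function:
    phi_n(t) = (1/n) * sum_j exp(i * t * x_j) ~ exp(i*u*t - sigma^2*t^2/2),
    so sigma^2 ~ -2*log|phi_n(t)|/t^2 and u ~ arg(phi_n(t))/t."""
    n = len(xs)
    phi = sum(cmath.exp(1j * t * x) for x in xs) / n
    sigma2_hat = -2.0 * math.log(abs(phi)) / t ** 2
    u_hat = cmath.phase(phi) / t
    return u_hat, sigma2_hat

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(20000)]
u, s2 = estimate_null(data)
# For this purely-null sample, u should be near 0 and s2 near 1.
```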
OPTIMAL RATES OF CONVERGENCE FOR SPARSE COVARIANCE MATRIX ESTIMATION
 SUBMITTED TO THE ANNALS OF STATISTICS
"... This paper considers estimation ofsparse covariance matrices and establishes the optimal rate of convergence under a range of matrix operator norm and Bregman divergence losses. A major focus is on the derivation of a rate sharp minimax lower bound. The problem exhibits new features that are signifi ..."
Abstract

Cited by 34 (10 self)
This paper considers estimation of sparse covariance matrices and establishes the optimal rate of convergence under a range of matrix operator norm and Bregman divergence losses. A major focus is on the derivation of a rate-sharp minimax lower bound. The problem exhibits new features that are significantly different from those that occur in the conventional nonparametric function estimation problems. Standard techniques fail to yield good results and new tools are thus needed. We first develop a lower bound technique that is particularly well suited for treating “two-directional” problems such as estimating sparse covariance matrices. The result can be viewed as a generalization of Le Cam’s method in one direction and Assouad’s Lemma in another. This lower bound technique is of independent interest and can be used for other matrix estimation problems. We then establish a rate-sharp minimax lower bound for estimating sparse covariance matrices under the spectral norm by applying the general lower bound technique. A thresholding estimator is shown to attain the optimal rate of convergence under the spectral norm. The results are then extended to the general matrix ℓw operator norms for 1 ≤ w ≤ ∞. In addition, we give a unified result on the minimax rate of convergence for sparse covariance matrix estimation under a class of Bregman divergence losses.
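The thresholding estimator mentioned in the abstract is, in its simplest form, entrywise hard thresholding of the sample covariance matrix. The sketch below assumes that form (keep the diagonal, zero out small off-diagonal entries); the function name and the toy data are illustrative, and in the theory the threshold `lam` is taken of order sqrt(log p / n).

```python
def threshold_covariance(xs, lam):
    """Hard-thresholding estimator of a sparse covariance matrix:
    compute the sample covariance, keep the diagonal, and zero out
    off-diagonal entries with |s_ij| <= lam.
    xs: list of n observations, each a list of length p."""
    n, p = len(xs), len(xs[0])
    mean = [sum(x[j] for x in xs) / n for j in range(p)]
    s = [[sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in xs) / n
          for j in range(p)] for i in range(p)]
    return [[s[i][j] if (i == j or abs(s[i][j]) > lam) else 0.0
             for j in range(p)] for i in range(p)]

# Coordinates 0 and 1 move together; coordinate 2 is unrelated, so the
# thresholded estimate is sparse off the (0, 1) block.
xs = [[1, 1, 0], [-1, -1, 0], [1, 1, 0], [-1, -1, 0], [0, 0, 1], [0, 0, -1]]
sigma_hat = threshold_covariance(xs, 0.1)
```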
GAUSSIAN MODEL SELECTION WITH AN UNKNOWN VARIANCE
 SUBMITTED TO THE ANNALS OF STATISTICS
, 2007
"... Let Y be a Gaussian vector whose components are independent with a common unknown variance. We consider the problem of estimating the mean µ of Y by model selection. More precisely, we start with a collection S = {Sm, m ∈ M} of linear subspaces of R n and associate to each of these the leastsquares ..."
Abstract

Cited by 34 (15 self)
Let Y be a Gaussian vector whose components are independent with a common unknown variance. We consider the problem of estimating the mean µ of Y by model selection. More precisely, we start with a collection S = {Sm, m ∈ M} of linear subspaces of R^n and associate to each of these the least-squares estimator of µ on Sm. Then, we use a data-driven penalized criterion in order to select one estimator among these. Our first objective is to analyze the performance of estimators associated to classical criteria such as FPE, AIC, BIC and AMDL. Our second objective is to propose better penalties that are versatile enough to take into account both the complexity of the collection S and the sample size. We then apply those to solve various statistical problems such as variable selection, change-point detection and signal estimation. Our results are based on a non-asymptotic risk bound with respect to the Euclidean loss for the selected estimator. Some analogous results are also established for the Kullback loss.
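The classical unknown-variance criteria this entry analyzes can be sketched in the simplest nested setting, where S_m is the span of the first m coordinates; there the least-squares fit on S_m keeps the first m coordinates of Y and the criteria take the form n·log(RSS_m/n) + pen(m), with pen(m) = 2m for an AIC-type penalty and m·log n for a BIC-type one. This is a toy illustration, not the paper's proposal; note the search is deliberately restricted to m ≤ n/2, since over an unrestricted collection these classical criteria can overfit badly, which is part of the paper's motivation.

```python
import math

def select_model(y, penalty):
    """Pick m minimizing the unknown-variance criterion
        crit(m) = n * log(RSS_m / n) + penalty(m)
    over the nested collection S_m = span{e_1, ..., e_m}, for which
    RSS_m is the sum of squares of the last n - m coordinates of y.
    The search is restricted to m <= n/2 (see lead-in)."""
    n = len(y)
    best_m, best_crit = 0, float("inf")
    for m in range(n // 2):
        rss = sum(v * v for v in y[m:])
        crit = n * math.log(rss / n) + penalty(m)
        if crit < best_crit:
            best_m, best_crit = m, crit
    return best_m

# Three strong mean coordinates, then a noise-like tail of size 0.5:
# both criteria recover the true dimension 3 here.
y = [5.0, 4.0, 3.0] + [0.5 * (-1) ** i for i in range(27)]
m_aic = select_model(y, lambda m: 2 * m)                 # AIC-type penalty
m_bic = select_model(y, lambda m: m * math.log(len(y)))  # BIC-type penalty
```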
Formalized data snooping based on generalized error rates. Econometric Theory
, 2008
"... It is common in econometric applications that several hypothesis tests are carried out simultaneously+ The problem then becomes how to decide which hypotheses to reject, accounting for the multitude of tests+ The classical approach is to control the familywise error rate ~FWE!, which is the probabil ..."
Abstract

Cited by 33 (9 self)
It is common in econometric applications that several hypothesis tests are carried out simultaneously. The problem then becomes how to decide which hypotheses to reject, accounting for the multitude of tests. The classical approach is to control the familywise error rate (FWE), which is the probability of one or more false rejections. But when the number of hypotheses under consideration is large, control of the FWE can become too demanding. As a result, the number of false hypotheses rejected may be small or even zero. This suggests replacing control of the FWE by a more liberal measure. To this end, we review a number of recent proposals from the statistical literature. We briefly discuss how these procedures apply to the general problem of model selection. A simulation study and two empirical applications illustrate the methods.
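One of the simplest generalized error rates of the kind reviewed here is the k-FWE, the probability of k or more false rejections. A minimal single-step sketch is the generalized Bonferroni rule of Lehmann and Romano, which rejects H_i when p_i ≤ kα/n; this is one instance of the idea, not the specific stepwise procedures the paper reviews, and the function name is illustrative.

```python
def k_fwe_rejections(pvalues, k=1, alpha=0.05):
    """Single-step generalized Bonferroni: reject H_i when
    p_i <= k * alpha / n, which controls the k-FWE (the probability of
    k or more false rejections) at level alpha.
    k = 1 recovers ordinary Bonferroni, i.e. classical FWE control."""
    n = len(pvalues)
    cutoff = k * alpha / n
    return [i for i, p in enumerate(pvalues) if p <= cutoff]

# Allowing up to 2 false rejections (k = 3) relaxes the cutoff from
# 0.05/5 = 0.01 to 0.15/5 = 0.03 and picks up a third hypothesis.
pvals = [0.0001, 0.0008, 0.02, 0.2, 0.5]
print(k_fwe_rejections(pvals, k=1))  # [0, 1]
print(k_fwe_rejections(pvals, k=3))  # [0, 1, 2]
```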
GENERAL MAXIMUM LIKELIHOOD EMPIRICAL BAYES ESTIMATION OF NORMAL MEANS
, 908
"... We propose a general maximum likelihood empirical Bayes (GMLEB) method for the estimation of a mean vector based on observations with i.i.d. normal errors. We prove that under mild moment conditions on the unknown means, the average mean squared error (MSE) of the GMLEB is within an infinitesimal f ..."
Abstract

Cited by 27 (1 self)
We propose a general maximum likelihood empirical Bayes (GMLEB) method for the estimation of a mean vector based on observations with i.i.d. normal errors. We prove that under mild moment conditions on the unknown means, the average mean squared error (MSE) of the GMLEB is within an infinitesimal fraction of the minimum average MSE among all separable estimators which use a single deterministic estimating function on individual observations, provided that the risk is of greater order than (log n)^5/n. We also prove that the GMLEB is uniformly approximately minimax in regular and weak ℓp balls when the order of the length-normalized norm of the unknown means is between (log n)^{κ1}/n …
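The GMLEB idea can be sketched in miniature: estimate the prior on the means nonparametrically by maximum likelihood over a discrete grid (fit here with plain EM iterations), then plug it into the posterior mean. This is a simplified toy under unit noise variance, with an illustrative function name; the paper's actual computation and theory are considerably more refined.

```python
import math

def gmleb_estimates(xs, n_iter=200, grid_size=50):
    """Toy general maximum likelihood empirical Bayes for X_i ~ N(mu_i, 1):
    fit a discrete prior on a grid by EM-style nonparametric MLE, then
    return the posterior mean of each mu_i given x_i."""
    lo, hi = min(xs), max(xs)
    grid = [lo + (hi - lo) * j / (grid_size - 1) for j in range(grid_size)]
    w = [1.0 / grid_size] * grid_size           # prior weights on the grid

    def phi(z):                                 # standard normal density
        return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

    for _ in range(n_iter):                     # EM updates for the NPMLE
        new_w = [0.0] * grid_size
        for x in xs:
            lik = [w[j] * phi(x - grid[j]) for j in range(grid_size)]
            tot = sum(lik)
            for j in range(grid_size):
                new_w[j] += lik[j] / tot
        w = [v / len(xs) for v in new_w]

    out = []
    for x in xs:                                # posterior means
        lik = [w[j] * phi(x - grid[j]) for j in range(grid_size)]
        tot = sum(lik)
        out.append(sum(grid[j] * lik[j] for j in range(grid_size)) / tot)
    return out

# Two clusters of observations: the fitted prior concentrates near +/-3,
# so each estimate is shrunk toward its cluster center.
est = gmleb_estimates([-3.2, -3.0, -2.8, 2.8, 3.0, 3.2])
```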
Feature selection by higher criticism thresholding: Optimal phase diagram
, 2008
"... We consider twoclass linear classification in a highdimensional, lowsample size setting. Only a small fraction of the features are useful, the useful features are unknown to us, and each useful feature contributes weakly to the classification decision – this setting was called the rare/weak model ..."
Abstract

Cited by 26 (6 self)
We consider two-class linear classification in a high-dimensional, low sample size setting. Only a small fraction of the features are useful, the useful features are unknown to us, and each useful feature contributes weakly to the classification decision; this setting was called the rare/weak model (RW Model) in [11]. We select features by thresholding feature z-scores. The threshold is set by higher criticism (HC) [11]. Let πi denote the P-value associated to the ith z-score and π(i) denote the ith order statistic of the collection of P-values. The HC threshold (HCT) is the order statistic of the z-score corresponding to the index i maximizing (i/n − π(i)) / √(π(i)(1 − π(i))). The ideal threshold optimizes the classification error. In [11] we showed that HCT was numerically close to the ideal threshold. We formalize an asymptotic framework for studying the RW model, considering a sequence of problems with increasingly many features and relatively fewer observations. We show that along this sequence, the limiting performance of ideal HCT is essentially just as good as the limiting performance of ideal thresholding. Our results describe two-dimensional …
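The HCT recipe in this abstract can be written down directly: compute two-sided P-values from the z-scores, find the index maximizing the HC objective, and use the corresponding |z| order statistic as the feature-selection threshold. The function name and the toy z-scores below are illustrative.

```python
import math

def hc_threshold(zscores):
    """Higher criticism threshold (HCT): with P-values
    pi_i = 2 * (1 - Phi(|z_i|)) and order statistics pi_(i), pick the
    index i maximizing (i/n - pi_(i)) / sqrt(pi_(i) * (1 - pi_(i)))
    and return the matching |z| order statistic."""
    n = len(zscores)
    Phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
    pairs = sorted((2 * (1 - Phi(abs(z))), abs(z)) for z in zscores)
    best_obj, best_t = float("-inf"), 0.0
    for i, (p, abs_z) in enumerate(pairs, start=1):
        if not 0 < p < 1:
            continue
        obj = (i / n - p) / math.sqrt(p * (1 - p))
        if obj > best_obj:
            best_obj, best_t = obj, abs_z
    return best_t

# Ten useful features with z = 3 among ninety noise-level scores:
# HCT lands at 3.0, so exactly the useful features are selected.
zs = [3.0] * 10 + [0.3 * (-1) ** i for i in range(90)]
t = hc_threshold(zs)
selected = [i for i, z in enumerate(zs) if abs(z) >= t]
```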
General empirical Bayes wavelet methods and exactly adaptive minimax estimation

, 2005
"... In many statistical problems, stochastic signals can be represented as a sequence of noisy wavelet coefficients. In this paper, we develop general empirical Bayes methods for the estimation of true signal. Our estimators approximate certain oracle separable rules and achieve adaptation to ideal risk ..."
Abstract

Cited by 25 (3 self)
In many statistical problems, stochastic signals can be represented as a sequence of noisy wavelet coefficients. In this paper, we develop general empirical Bayes methods for the estimation of true signal. Our estimators approximate certain oracle separable rules and achieve adaptation to ideal risks and exact minimax risks in broad collections of classes of signals. In particular, our estimators are uniformly adaptive to the minimum risk of separable estimators and the exact minimax risks simultaneously in Besov balls of all smoothness and shape indices, and they are uniformly superefficient in convergence rates in all compact sets in Besov spaces with a finite secondary shape parameter. Furthermore, in classes nested between Besov balls of the same smoothness index, our estimators dominate threshold and James–Stein estimators within an infinitesimal fraction of the minimax risks. More general block empirical Bayes estimators are developed. Both white noise with drift and nonparametric regression are considered.
ON THE FALSE DISCOVERY RATE AND AN ASYMPTOTICALLY OPTIMAL REJECTION CURVE
, 903
"... In this paper we introduce and investigate a new rejection curve for asymptotic control of the false discovery rate (FDR) in multiple hypotheses testing problems. We first give a heuristic motivation for this new curve and propose some procedures related to it. Then we introduce a set of possible as ..."
Abstract

Cited by 25 (3 self)
In this paper we introduce and investigate a new rejection curve for asymptotic control of the false discovery rate (FDR) in multiple hypotheses testing problems. We first give a heuristic motivation for this new curve and propose some procedures related to it. Then we introduce a set of possible assumptions and give a unifying short proof of FDR control for procedures based on Simes’ critical values, whereby certain types of dependency are allowed. This methodology of proof is then applied to other fixed rejection curves including the proposed new curve. Among others, we investigate the problem of finding least favorable parameter configurations such that the FDR becomes largest. We then derive a series of results concerning asymptotic FDR control for procedures based on the new curve and discuss several example procedures in more detail. A main result will be an asymptotic optimality statement for various procedures based on the new curve in the class of fixed rejection curves. Finally, we briefly discuss strict FDR control for a finite number of hypotheses.
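For reference, the best-known procedure based on Simes' critical values i·α/n is the Benjamini-Hochberg linear step-up rule, sketched below; the fixed rejection curve it corresponds to is the baseline against which this paper's new curve is compared. The function name and example p-values are illustrative.

```python
def bh_rejections(pvalues, alpha=0.05):
    """Linear step-up procedure based on Simes' critical values i*alpha/n
    (Benjamini-Hochberg): find the largest i with p_(i) <= i*alpha/n and
    reject the hypotheses with the i smallest p-values."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / n:
            k = rank
    return sorted(order[:k])

# The step-up character: p_(3) and p_(4) fail their own critical values,
# but p_(5) <= 5 * 0.05 / 6, so all five smallest p-values are rejected.
print(bh_rejections([0.001, 0.008, 0.039, 0.040, 0.041, 0.9]))
# [0, 1, 2, 3, 4]
```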
Minimax Estimation of Large Covariance Matrices under ℓ1-Norm
"... Driven byawide rangeofapplicationsin highdimensionaldata analysis, therehas been significant recent interest in the estimation of large covariance matrices. In this paper, we consideroptimal estimation ofacovariancematrix as well asits inverseover several commonly used parameter spaces under the ma ..."
Abstract

Cited by 24 (3 self)
Driven by a wide range of applications in high-dimensional data analysis, there has been significant recent interest in the estimation of large covariance matrices. In this paper, we consider optimal estimation of a covariance matrix as well as its inverse over several commonly used parameter spaces under the matrix ℓ1 norm. Both minimax lower and upper bounds are derived. The lower bounds are established by using hypothesis testing arguments, where at the core are a novel construction of collections of least favorable multivariate normal distributions and the bounding of the affinities between mixture distributions. The lower bound analysis also provides insight into where the difficulties of the covariance matrix estimation problem arise. A specific thresholding estimator and tapering estimator are constructed and shown to be minimax rate optimal. The optimal rates of convergence established in the paper can serve as a benchmark for the performance of covariance matrix estimation methods.
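A tapering estimator of the kind this abstract mentions damps the sample covariance entries by a weight depending on the distance from the diagonal: weight 1 up to bandwidth k/2, linear decay to 0 at bandwidth k. The sketch below shows that generic taper applied to a given sample covariance matrix; the rate-optimal choice of k depends on the parameter space and is not reproduced here, and the function name is illustrative.

```python
def taper_covariance(s, k):
    """Tapering estimator: multiply entry s_ij of a sample covariance
    matrix by a weight that is 1 for |i-j| <= k/2, decays linearly,
    and vanishes for |i-j| >= k."""
    p = len(s)

    def weight(d):
        if d <= k / 2:
            return 1.0
        if d >= k:
            return 0.0
        return 2.0 - 2.0 * d / k   # linear decay on (k/2, k)

    return [[s[i][j] * weight(abs(i - j)) for j in range(p)]
            for i in range(p)]

# With bandwidth k = 2, only the diagonal and first off-diagonal survive.
s = [[0.5 ** abs(i - j) for j in range(5)] for i in range(5)]
tapered = taper_covariance(s, 2)
```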