Smooth Discrimination Analysis
 Ann. Statist.
, 1998
Abstract

Cited by 154 (3 self)
Discriminant analysis for two data sets in ℝ^d with probability densities f and g can be based on the estimation of the set G = {x : f(x) ≥ g(x)}. We consider applications where it is appropriate to assume that the region G has a smooth boundary. In particular, this assumption makes sense if discriminant analysis is used as a data-analytic tool. We discuss optimal rates for estimation of G. 1991 AMS: primary 62G05, secondary 62G20. Keywords and phrases: discrimination analysis, minimax rates, Bayes risk. Short title: Smooth discrimination analysis. This research was supported by the Deutsche Forschungsgemeinschaft, Sonderforschungsbereich 373 "Quantifikation und Simulation ökonomischer Prozesse", Humboldt-Universität zu Berlin. 1 Introduction. Assume that one observes two independent samples X = (X_1, ..., X_n) and Y = (Y_1, ..., Y_m) of ℝ^d-valued i.i.d. observations with densities f and g, respectively. The densities f and g are unknown. An additional random variabl...
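The plug-in idea behind estimating G can be sketched in a few lines (a minimal one-dimensional illustration with assumed Gaussian data and a kernel density estimator of my own choosing, not the paper's procedure): estimate f and g from the two samples and take the points where the estimate of f dominates.

```python
import numpy as np

def gaussian_kde(sample, x, bandwidth=0.5):
    """One-dimensional Gaussian kernel density estimate evaluated at points x."""
    diffs = (x[:, None] - sample[None, :]) / bandwidth
    return np.exp(-0.5 * diffs**2).mean(axis=1) / (bandwidth * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
X = rng.normal(-1.0, 1.0, size=500)  # sample with density f
Y = rng.normal(+1.0, 1.0, size=500)  # sample with density g

grid = np.linspace(-4.0, 4.0, 201)
# Plug-in estimate of G = {x : f(x) >= g(x)} evaluated on the grid.
in_G = gaussian_kde(X, grid) >= gaussian_kde(Y, grid)
```

For these two Gaussians the true region is G = (-∞, 0], so the estimated region should cover roughly the left half of the grid; the paper's point is that rates for recovering the boundary of G improve under smoothness assumptions on that boundary.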
Optimal rates of convergence for covariance matrix estimation
 Ann. Statist.
, 2010
Abstract

Cited by 93 (18 self)
The covariance matrix plays a central role in multivariate statistical analysis. Significant advances have been made recently on developing both theory and methodology for estimating large covariance matrices. However, a minimax theory has yet to be developed. In this paper we establish the optimal rates of convergence for estimating the covariance matrix under both the operator norm and the Frobenius norm. It is shown that the optimal procedures under the two norms are different, and consequently matrix estimation under the operator norm is fundamentally different from vector estimation. The minimax upper bound is obtained by constructing a special class of tapering estimators and by studying their risk properties. A key step in obtaining the optimal rate of convergence is the derivation of the minimax lower bound. The technical analysis requires new ideas that are quite different from those used in the more conventional function/sequence estimation problems. 1. Introduction. Suppose
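The tapering construction can be sketched concretely (my own minimal illustration with one common taper shape and an arbitrary bandwidth k, not the authors' exact weight specification): the sample covariance is multiplied entrywise by weights equal to 1 near the diagonal, decaying linearly, and 0 beyond the bandwidth.

```python
import numpy as np

def tapering_weights(p, k):
    """Entrywise taper weights: 1 within k/2 of the diagonal, linear decay
    to 0 at distance k, and 0 beyond (one common taper shape)."""
    d = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    return np.clip(2.0 - 2.0 * d / k, 0.0, 1.0)

rng = np.random.default_rng(0)
p, n, k = 20, 200, 6
Sigma = 0.5 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))  # AR(1)-type truth
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
S = np.cov(X, rowvar=False)            # sample covariance, shape (p, p)
S_taper = tapering_weights(p, k) * S   # tapering estimator: entries far from the diagonal are zeroed
```

The bandwidth k trades bias (entries zeroed too aggressively) against variance (too many noisy entries kept); the paper's optimal rate comes from balancing the two.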
Fast learning rates in statistical inference through aggregation
 SUBMITTED TO THE ANNALS OF STATISTICS
, 2008
Abstract

Cited by 42 (8 self)
We develop minimax optimal risk bounds for the general learning task consisting in predicting as well as the best function in a reference set G up to the smallest possible additive term, called the convergence rate. When the reference set is finite and when n denotes the size of the training data, we provide minimax convergence rates of the form C (log|G| / n)^v, with tight evaluation of the positive constant C and with exact 0 < v ≤ 1, the latter value depending on the convexity of the loss function and on the level of noise in the output distribution. The risk upper bounds are based on a sequential randomized algorithm, which at each step concentrates on functions having both low risk and low variance with respect to the previous step's prediction function. Our analysis puts forward the links between the probabilistic and worst-case viewpoints, and allows us to obtain risk bounds unachievable with the standard statistical learning approach. One of the key ideas of this work is to use probabilistic inequalities with respect to appropriate (Gibbs) distributions on the prediction function space, instead of using them with respect to the distribution generating the data. The risk lower bounds are based on refinements of the Assouad lemma that take particular account of the properties of the loss function. Our key example to illustrate the upper and lower bounds is the Lq-regression setting, for which an exhaustive analysis of the convergence rates is given as q ranges in [1, +∞).
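The role of Gibbs distributions over the reference set can be illustrated with the classical exponential-weights scheme (a generic sketch, not the paper's specific sequential algorithm): after each round, every function g in the finite set G receives weight proportional to exp(-η × cumulative loss of g).

```python
import numpy as np

def exponential_weights(losses, eta):
    """Gibbs weights over a finite reference set: after round t,
    w_g is proportional to exp(-eta * cumulative loss of g).
    `losses` has shape (n_rounds, n_functions)."""
    cum = np.cumsum(losses, axis=0)
    w = np.exp(-eta * (cum - cum.min(axis=1, keepdims=True)))  # shift for numerical stability
    return w / w.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n, K = 200, 5
losses = rng.uniform(0.0, 1.0, size=(n, K))
losses[:, 2] *= 0.5                    # function 2 incurs systematically lower loss
w = exponential_weights(losses, eta=1.0)
```

As the rounds accumulate, the Gibbs distribution concentrates on the function with the smallest cumulative loss, which is the mechanism behind aggregating "as well as the best function in G" up to an additive term.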
On the fundamental limits of adaptive sensing
, 2011
Abstract

Cited by 25 (2 self)
Suppose we can sequentially acquire arbitrary linear measurements of an n-dimensional vector x resulting in the linear model y = Ax + z, where z represents measurement noise. If the signal is known to be sparse, one would expect the following folk theorem to be true: choosing an adaptive strategy which cleverly selects the next row of A based on what has been previously observed should do far better than a nonadaptive strategy which sets the rows of A ahead of time, thus not trying to learn anything about the signal in between observations. This paper shows that the folk theorem is false. We prove that the advantages offered by clever adaptive strategies and sophisticated estimation procedures, no matter how intractable, over classical compressed acquisition/recovery schemes are, in general, minimal.
Minimax Estimation of Large Covariance Matrices under ℓ1-Norm
Abstract

Cited by 24 (3 self)
Driven by a wide range of applications in high-dimensional data analysis, there has been significant recent interest in the estimation of large covariance matrices. In this paper, we consider optimal estimation of a covariance matrix as well as its inverse over several commonly used parameter spaces under the matrix ℓ1 norm. Both minimax lower and upper bounds are derived. The lower bounds are established by using hypothesis testing arguments, where at the core are a novel construction of collections of least favorable multivariate normal distributions and the bounding of the affinities between mixture distributions. The lower bound analysis also provides insight into where the difficulties of the covariance matrix estimation problem arise. A specific thresholding estimator and tapering estimator are constructed and shown to be minimax rate optimal. The optimal rates of convergence established in the paper can serve as a benchmark for the performance of covariance matrix estimation methods.
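A hard-thresholding estimator of the kind mentioned can be sketched as follows (an illustrative version with an ad hoc threshold constant, not the paper's exact construction): off-diagonal entries of the sample covariance whose magnitude falls below a level of order sqrt(log p / n) are set to zero.

```python
import numpy as np

def hard_threshold(S, t):
    """Hard-threshold the off-diagonal entries of a covariance estimate;
    the diagonal is conventionally left untouched."""
    T = np.where(np.abs(S) >= t, S, 0.0)
    np.fill_diagonal(T, np.diag(S))
    return T

rng = np.random.default_rng(1)
p, n = 30, 300
Sigma = np.eye(p)
Sigma[0, 1] = Sigma[1, 0] = 0.6        # a single strong off-diagonal entry
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
S = np.cov(X, rowvar=False)

t = 2.0 * np.sqrt(np.log(p) / n)       # threshold of order sqrt(log p / n); constant chosen ad hoc
S_hat = hard_threshold(S, t)
```

The threshold level separates genuine covariance entries from pure sampling noise, so the strong (0, 1) entry survives while most spurious off-diagonal entries are zeroed.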
ESTIMATOR SELECTION WITH RESPECT TO HELLINGER-TYPE RISKS
, 2009
Abstract

Cited by 13 (5 self)
We observe a random measure N and aim at estimating its intensity s. This statistical framework allows us to deal simultaneously with the problems of estimating a density, the marginals of a multivariate distribution, the mean of a random vector with nonnegative components, and the intensity of a Poisson process. Our estimation strategy is based on estimator selection. Given a family of estimators of s based on the observation of N, we propose a selection rule, based on N as well, for choosing among them. Little assumption is made on the collection of estimators. The procedure offers the possibility to perform model selection and also to select among estimators associated with different model selection strategies. Besides, it provides an alternative to the T-estimators as studied recently in Birgé (2006). For illustration, we consider the problems of estimation and (complete) variable selection in various regression settings.
Model selection for Poisson processes
, 2007
Abstract

Cited by 7 (1 self)
Our purpose in this paper is to apply the general methodology for model selection based on T-estimators developed in Birgé [Ann. Inst. H. Poincaré Probab. Statist. 42 (2006) 273–325] to the particular situation of the estimation of the unknown mean measure of a Poisson process. We introduce a Hellinger-type distance between finite positive measures to serve as our loss function, and we build suitable tests between balls (with respect to this distance) in the set of mean measures. As a consequence of the existence of such tests, given a suitable family of approximating models, we can build T-estimators for the mean measure based on this family of models and analyze their performances. We provide a number of applications to adaptive intensity estimation when the square root of the intensity belongs to various smoothness classes. We also give a method for aggregation of preliminary estimators.
Lower bounds for estimating a hazard
 Advances in survival analysis, 209–226, Handbook of Statist
, 2004
Abstract

Cited by 3 (0 self)
Abstract: The minimax global asymptotic rate of convergence for the estimation of a hazard function in the presence of random right censoring is obtained using the link which relates the Kullback distance between two probability measures to an Lp-type distance between their corresponding hazards.
The Brouwer Lecture 2005: Statistical estimation with model selection
, 2006
Abstract

Cited by 3 (1 self)
The purpose of this paper is to explain the interest and importance of (approximate) models and model selection in statistics. Starting from the very elementary example of histograms, we present a general notion of a finite-dimensional model for statistical estimation, and we explain what type of risk bounds can be expected from the use of such a model. We then give the performance of suitable model selection procedures based on a family of such models. We illustrate our point of view with two main examples: the choice of a partition for designing a histogram from an n-sample and the problem of variable selection in the context of Gaussian regression.
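The histogram example can be made concrete with a toy penalized criterion (the penalty constant below is an ad hoc choice for illustration, not a penalty the paper endorses): for a regular D-bin histogram on [0, 1], minimize the empirical least-squares contrast plus a term growing linearly in the model dimension D.

```python
import numpy as np

def select_bins(sample, max_bins=50, c=2.0):
    """Choose the number D of regular bins on [0, 1] by minimizing
    crit(D) = -(D / n^2) * sum_j N_j^2  +  c * D / n,
    i.e. the empirical least-squares contrast of the histogram estimator
    plus a linear-in-dimension penalty (c is an ad hoc constant here)."""
    n = len(sample)
    best_D, best_crit = 1, np.inf
    for D in range(1, max_bins + 1):
        counts, _ = np.histogram(sample, bins=D, range=(0.0, 1.0))
        crit = -(D / n**2) * np.sum(counts**2) + c * D / n
        if crit < best_crit:
            best_D, best_crit = D, crit
    return best_D

rng = np.random.default_rng(0)
D_flat = select_bins(rng.uniform(0.0, 1.0, size=1000))  # flat density: few bins should suffice
```

The criterion embodies the trade-off the paper discusses: a finer partition fits the data better (smaller contrast) but uses a higher-dimensional model (larger penalty).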