Results 1–10 of 44
Aggregation for Gaussian regression
 SUBMITTED TO THE ANNALS OF STATISTICS
, 2007
"... This paper studies statistical aggregation procedures in the regression setting. A motivating factor is the existence of many different methods of estimation, leading to possibly competing estimators. We consider here three different types of aggregation: model selection (MS) aggregation, convex (C) ..."
Abstract

Cited by 144 (17 self)
 Add to MetaCart
(Show Context)
This paper studies statistical aggregation procedures in the regression setting. A motivating factor is the existence of many different methods of estimation, leading to possibly competing estimators. We consider here three different types of aggregation: model selection (MS) aggregation, convex (C) aggregation and linear (L) aggregation. The objective of (MS) is to select the optimal single estimator from the list; that of (C) is to select the optimal convex combination of the given estimators; and that of (L) is to select the optimal linear combination of the given estimators. We are interested in evaluating the rates of convergence of the excess risks of the estimators obtained by these procedures. Our approach is motivated by recent minimax results in [34, 40]. There exist competing aggregation procedures achieving optimal convergence rates for each of the (MS), (C) and (L) cases separately. Since these procedures are not directly comparable with each other, we suggest an alternative solution. We prove that all three optimal rates, as well as those for the newly introduced (S)
Lectures on the central limit theorem for empirical processes
 Probability and Banach Spaces
, 1986
"... Abstract. Concentration inequalities are used to derive some new inequalities for ratiotype suprema of empirical processes. These general inequalities are used to prove several new limit theorems for ratiotype suprema and to recover anumber of the results from [1] and [2]. As a statistical applica ..."
Abstract

Cited by 135 (9 self)
 Add to MetaCart
(Show Context)
Concentration inequalities are used to derive some new inequalities for ratio-type suprema of empirical processes. These general inequalities are used to prove several new limit theorems for ratio-type suprema and to recover a number of the results from [1] and [2]. As a statistical application, an oracle inequality for nonparametric regression is obtained via ratio bounds.
Oracle Inequalities for Inverse Problems
, 2000
"... We consider a sequence space model of statistical linear inverse problems where we need to estimate a function f from indirect noisy observations. Let a finite set of linear estimators be given. Our aim is to mimic the estimator in that has the smallest risk on the true f . Under general conditions, ..."
Abstract

Cited by 73 (9 self)
 Add to MetaCart
We consider a sequence space model of statistical linear inverse problems where we need to estimate a function f from indirect noisy observations. Let a finite set of linear estimators be given. Our aim is to mimic the estimator in this set that has the smallest risk on the true f. Under general conditions, we show that this can be achieved by simple minimization of an unbiased risk estimator, provided the singular values of the operator of the inverse problem decrease as a power law. The main result is a nonasymptotic oracle inequality that is shown to be asymptotically exact. This inequality can also be used to obtain sharp minimax adaptive results. In particular, we apply it to show that minimax adaptation on ellipsoids in the multivariate anisotropic case is realized by minimization of an unbiased risk estimator without any loss of efficiency with respect to optimal nonadaptive procedures.
Mathematics Subject Classifications: 62G05, 62G20
Key Words: Statistical inverse problems, Oracle inequaliti...
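The abstract's central device, selecting among a finite set of linear estimators by minimizing an unbiased risk estimator, can be sketched in a toy Gaussian sequence model. The signal, noise level and projection family below are illustrative choices, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n, eps = 200, 0.1
theta = 1.0 / np.arange(1, n + 1)            # toy signal with decaying coefficients
y = theta + eps * rng.standard_normal(n)

# Candidate linear estimators: projections keeping the first k coefficients.
candidates = [(np.arange(n) < k).astype(float) for k in range(1, n + 1)]

def ure(lam):
    # Unbiased estimate of the risk E||lam*y - theta||^2 of the diagonal
    # linear estimator lam*y: sum_i (1 - lam_i)^2 y_i^2 + eps^2 (2 lam_i - 1).
    return float(np.sum((1.0 - lam) ** 2 * y ** 2 + eps ** 2 * (2.0 * lam - 1.0)))

best = min(candidates, key=ure)              # mimic the best estimator in the set
theta_hat = best * y
```

The oracle-inequality results concern how close the risk of `theta_hat` comes to that of the best candidate; here the URE-selected projection easily beats the raw observations.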
Smoothing Splines Estimators in Functional Linear Regression with Errors-in-Variables
, 2006
"... This work deals with a generalization of the Total Least Squares method in the context of the functional linear model. We first propose a smoothing splines estimator of the functional coefficient of the model without noise in the covariates and we obtain an asymptotic result for this estimator. Then ..."
Abstract

Cited by 67 (3 self)
 Add to MetaCart
(Show Context)
This work deals with a generalization of the Total Least Squares method in the context of the functional linear model. We first propose a smoothing splines estimator of the functional coefficient of the model without noise in the covariates and we obtain an asymptotic result for this estimator. Then, we adapt this estimator to the case where the covariates are noisy and we also derive an upper bound for the convergence speed. Our estimation procedure is evaluated by means of simulations.
Modulation Estimators and Confidence Sets
 ANN. STATIST
, 1999
"... An unknown signal plus white noise is observed at n discrete time points. Within a large convex class of linear estimators of , we choose the estimator b that minimizes estimated quadratic risk. By construction, b is nonlinear. This estimation is done after orthogonal transformation of the data to ..."
Abstract

Cited by 47 (11 self)
 Add to MetaCart
An unknown signal plus white noise is observed at n discrete time points. Within a large convex class of linear estimators of the signal, we choose the estimator that minimizes estimated quadratic risk. By construction, this estimator is nonlinear. This estimation is done after orthogonal transformation of the data to a reasonable coordinate system. The procedure adaptively tapers the coefficients of the transformed data. If the class of candidate estimators satisfies a uniform entropy condition, then the estimator is asymptotically minimax in Pinsker's sense over certain ellipsoids in the parameter space and shares one such asymptotic minimax property with the James-Stein estimator. We describe computational algorithms for the estimator and construct confidence sets for the unknown signal. These confidence sets are centered at the estimator, have correct asymptotic coverage probability, and have relatively small risk as set-valued estimators of the signal.
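A minimal sketch of the positive-part James-Stein estimator with which the abstract's procedure shares an asymptotic minimax property: a single data-driven constant modulates every coordinate. The signal and noise level are illustrative choices:

```python
import numpy as np

def james_stein(y, sigma):
    # Positive-part James-Stein estimator: one data-driven constant
    # ("modulator") shrinking every coordinate of y toward zero.
    n = y.size
    shrink = max(0.0, 1.0 - (n - 2) * sigma ** 2 / float(np.dot(y, y)))
    return shrink * y

rng = np.random.default_rng(1)
n, sigma = 500, 1.0
theta = np.full(n, 0.3)                      # weak constant signal (illustrative)
y = theta + sigma * rng.standard_normal(n)

jse = james_stein(y, sigma)
loss_js = float(np.sum((jse - theta) ** 2))
loss_raw = float(np.sum((y - theta) ** 2))
```

In this weak-signal regime the shrinkage factor is small and the James-Stein loss is far below that of the raw data; the paper's modulation estimators generalize this single constant to richer classes of coordinatewise tapers.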
REACT Scatterplot Smoothers: Superefficiency through Basis Economy
 J. AMER. STATIST. ASSOC
, 1999
"... ..."
Sharp Oracle Inequalities for Aggregation of Affine Estimators
, 2012
"... We consider the problem of combining a (possibly uncountably infinite) set of affine estimators in nonparametric regression model with heteroscedastic Gaussian noise. Focusing on the exponentially weighted aggregate, we prove a PACBayesian type inequality that leads to sharp oracle inequalities in ..."
Abstract

Cited by 18 (0 self)
 Add to MetaCart
We consider the problem of combining a (possibly uncountably infinite) set of affine estimators in the nonparametric regression model with heteroscedastic Gaussian noise. Focusing on the exponentially weighted aggregate, we prove a PAC-Bayesian type inequality that leads to sharp oracle inequalities in discrete but also in continuous settings. The framework is general enough to cover combinations of various procedures such as least squares regression, kernel ridge regression, shrinkage estimators and many other estimators used in the literature on statistical inverse problems. As a consequence, we show that the proposed aggregate provides an adaptive estimator in the exact minimax sense without either discretizing the range of tuning parameters or splitting the set of observations. We also illustrate numerically the good performance achieved by the exponentially weighted aggregate.
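An exponentially weighted aggregate over a small family of shrinkage estimators can be sketched as follows. The candidate grid, the temperature `beta` and the use of unbiased risk estimates in the weights' exponents are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np

rng = np.random.default_rng(2)
n, sigma = 100, 1.0
theta = np.sin(np.linspace(0.0, 3.0, n))
y = theta + sigma * rng.standard_normal(n)

lams = np.linspace(0.0, 1.0, 21)             # candidate shrinkage estimators lam * y

def risk_estimate(lam):
    # Unbiased estimate of E||lam*y - theta||^2 for the estimator lam*y.
    return (1.0 - lam) ** 2 * float(np.dot(y, y)) + n * sigma ** 2 * (2.0 * lam - 1.0)

beta = 8.0 * sigma ** 2                      # temperature (illustrative)
r = np.array([risk_estimate(l) for l in lams])
w = np.exp(-(r - r.min()) / beta)
w /= w.sum()

# Aggregating lam_j * y with weights w_j yields a single data-driven
# shrinkage level sum_j w_j lam_j.
f_ewa = float((w * lams).sum()) * y
```

Because every candidate is a multiple of `y`, the aggregate collapses to one shrinkage level; with general affine candidates the aggregate is a genuine pointwise mixture.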
Sparse estimation by exponential weighting
 Statist. Sci
, 2012
"... Abstract. Consider a regression model with fixed design and Gaussian noise where the regression function can potentially be well approximated by a function that admits a sparse representation in a given dictionary. This paper resorts to exponential weights to exploit this underlying sparsity by impl ..."
Abstract

Cited by 16 (1 self)
 Add to MetaCart
Consider a regression model with fixed design and Gaussian noise where the regression function can potentially be well approximated by a function that admits a sparse representation in a given dictionary. This paper resorts to exponential weights to exploit this underlying sparsity by implementing the principle of sparsity pattern aggregation. This model selection take on sparse estimation allows us to derive sparsity oracle inequalities in several popular frameworks, including ordinary sparsity, fused sparsity and group sparsity. One striking aspect of these theoretical results is that they hold under no condition on the dictionary. Moreover, we describe an efficient implementation of the sparsity pattern aggregation principle that compares favorably to state-of-the-art procedures on some basic numerical examples.
Key words and phrases: High-dimensional regression, exponential weights, sparsity, fused sparsity, group sparsity, sparsity oracle inequalities, sparsity pattern aggregation, sparsity prior, sparse regression.
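Sparsity pattern aggregation can be illustrated on a tiny dictionary: fit least squares on every sparsity pattern, then combine the fits with exponential weights penalized by pattern size. The design, temperature `beta_temp` and size penalty below are illustrative stand-ins for the paper's calibrated sparsity prior:

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
n, p, sigma = 50, 5, 1.0
X = rng.standard_normal((n, p))
beta_true = np.array([2.0, 0.0, 0.0, -1.5, 0.0])   # sparse ground truth
y = X @ beta_true + sigma * rng.standard_normal(n)

def ls_fit(pattern):
    # Least-squares fit of y on the columns in `pattern`; returns fitted values.
    if not pattern:
        return np.zeros(n)
    Xs = X[:, list(pattern)]
    coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    return Xs @ coef

beta_temp = 4.0 * sigma ** 2                       # temperature (illustrative)
fits, logw = [], []
for r in range(p + 1):
    for pattern in itertools.combinations(range(p), r):
        f = ls_fit(pattern)
        # Penalize each pattern by its size (a stand-in for the sparsity prior).
        logw.append(-(np.sum((y - f) ** 2) + 2.0 * sigma ** 2 * r) / beta_temp)
        fits.append(f)
logw = np.array(logw)
w = np.exp(logw - logw.max())
w /= w.sum()
f_agg = np.sum(w[:, None] * np.array(fits), axis=0)
```

With `p = 5` there are only 32 patterns; the weights concentrate on patterns containing the true support, and no assumption on the dictionary's columns is needed for the construction itself.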
Model Selection
 In The Handbook Of Financial Time Series
, 2008
"... Model selection has become an ubiquitous statistical activity in the last decades, none the least due to the computational ease with which many statistical models can be fitted to data with the help of modern computing equipment. In this article we provide an introduction to the statistical aspect ..."
Abstract

Cited by 14 (0 self)
 Add to MetaCart
Model selection has become a ubiquitous statistical activity in recent decades, not least due to the computational ease with which many statistical models can be fitted to data with the help of modern computing equipment. In this article we provide an introduction to the statistical aspects and implications of model selection and we review the relevant literature.
1.1 A General Formulation
When modeling data Y, a researcher often has available a menu of competing candidate models which could be used to describe the data. Let M denote the collection of these candidate models. Each model M, i.e., each element of M, can – from a mathematical point of view – be viewed as a collection of probability distributions for Y implied by the model. That is, M is given by M = {Pη : η ∈ H}, where Pη denotes a probability distribution for Y and H represents the 'parameter' space (which can be different across different models M). The 'parameter' space H need not be finite-dimensional. Often, the 'parameter' η will be partitioned into (η1, η2), where η1 is a finite-dimensional parameter whereas η2 is infinite-dimensional. In case the parameterization is identified, i.e., the map η → Pη is injective on H, we will often not distinguish between M and H and will use them synonymously. The model selection problem is now to select – based on the data Y – a model M̂ = M̂(Y) in M such that M̂ is a 'good' model for the data Y. Of course, the sense in which the selected model should be a 'good' model needs to be made precise and is a crucial point in the analysis. This is particularly important if – as is usually the case – selecting the model M̂ is not the final
Aggregation for regression learning
, 2004
"... This paper studies statistical aggregation procedures in regression setting. A motivating factor is the existence of many different methods of estimation, leading to possibly competing estimators. We consider here three different types of aggregation: model selection (MS) aggregation, convex (C) ag ..."
Abstract

Cited by 13 (3 self)
 Add to MetaCart
This paper studies statistical aggregation procedures in the regression setting. A motivating factor is the existence of many different methods of estimation, leading to possibly competing estimators. We consider here three different types of aggregation: model selection (MS) aggregation, convex (C) aggregation and linear (L) aggregation. The objective of (MS) is to select the optimal single estimator from the list; that of (C) is to select the optimal convex combination of the given estimators; and that of (L) is to select the optimal linear combination of the given estimators. We are interested in evaluating the rates of convergence of the excess risks of the estimators obtained by these procedures. Our approach is motivated by recent minimax results in Nemirovski (2000) and Tsybakov (2003). There exist competing aggregation procedures achieving optimal convergence rates separately for each of the (MS), (C) and (L) cases. Since the bounds in these results are not directly comparable with each other, we suggest an alternative solution. We prove that all three optimal bounds can be nearly achieved via a single "universal" aggregation procedure. We propose such a procedure, which consists in mixing the initial estimators with weights obtained by penalized least squares. Two different penalties are considered: one of them is related to hard thresholding techniques; the second is a data-dependent L1-type penalty.
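The mixing step described above, weights obtained by penalized least squares with an L1-type penalty, can be sketched with a few kernel-type candidate estimators and a plain ISTA (soft-thresholding) solver. The candidates, penalty level `gamma` and step size are illustrative choices, not the paper's calibrated procedure:

```python
import numpy as np

rng = np.random.default_rng(3)
n, sigma = 120, 0.5
x = np.linspace(0.0, 1.0, n)
theta = np.sin(2 * np.pi * x)
y = theta + sigma * rng.standard_normal(n)

def running_mean(z, h):
    # Kernel-type candidate estimator: moving average with window h.
    return np.convolve(z, np.ones(h) / h, mode="same")

# Initial estimators to be mixed (columns of F); windows are illustrative.
windows = (3, 7, 15, 31, 61)
F = np.column_stack([running_mean(y, h) for h in windows])
M = F.shape[1]

# L1-penalized least squares over the mixing weights w:
#   minimize (1/(2n)) ||y - F w||^2 + gamma ||w||_1
gamma = 0.1                                  # illustrative penalty level
L = np.linalg.norm(F, 2) ** 2 / n            # Lipschitz constant of the gradient
w = np.zeros(M)
for _ in range(500):                         # ISTA: gradient step + soft-threshold
    w = w - (F.T @ (F @ w - y) / n) / L
    w = np.sign(w) * np.maximum(np.abs(w) - gamma / L, 0.0)

f_mix = F @ w                                # the aggregated estimator
```

The L1 penalty keeps the weight vector from extrapolating wildly across the correlated candidates, which is exactly the overfitting that unpenalized least-squares mixing would exhibit here.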