Results 1–10 of 40
Information, Divergence and Risk for Binary Experiments
Journal of Machine Learning Research, 2009
Cited by 37 (8 self)
Abstract:
We unify f-divergences, Bregman divergences, surrogate regret bounds, proper scoring rules, cost curves, ROC curves and statistical information. We do this by systematically studying integral and variational representations of these various objects, and in so doing identify their primitives, all of which are related to cost-sensitive binary classification. As well as developing relationships between generative and discriminative views of learning, the new machinery leads to tight and more general surrogate regret bounds and generalised Pinsker inequalities relating f-divergences to variational divergence. The new viewpoint also illuminates existing algorithms: it provides a new derivation of Support Vector Machines in terms of divergences and relates Maximum Mean Discrepancy to Fisher Linear Discriminants.
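For reference, the classical Pinsker inequality that the generalised versions mentioned above extend can be stated as follows (standard material, not quoted from the paper):

```latex
% Classical Pinsker inequality: for probability measures P and Q,
% with variational divergence V(P,Q) = \int |dP - dQ| and
% Kullback--Leibler divergence KL(P\|Q) (the f-divergence
% generated by f(t) = t \log t),
V(P,Q) \le \sqrt{2\,\mathrm{KL}(P \,\|\, Q)}.
```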
Regression in Random Design and Warped Wavelets
Bernoulli, 10, 2004
Cited by 28 (0 self)
Abstract:
We consider the problem of estimating an unknown function f in a regression setting with random design. Instead of expanding the function on a regular wavelet basis, we expand it on the basis {ψjk(G), j, k} warped with the design. This allows us to apply a very stable and easily computable thresholding algorithm. We investigate the properties of this new basis. In particular, we prove that if the design has a property of Muckenhoupt type, this new basis behaves quite similarly to a regular wavelet basis. This enables us to prove that the associated thresholding procedure achieves rates of convergence which have been proved to be minimax in the uniform design case.
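A minimal numpy sketch of the warping idea — an illustration of the probability integral transform underlying the warped basis, not the paper's estimator; the design density and all names are assumptions:

```python
import numpy as np

# Assume the design X_i has density 2x on [0, 1], so its distribution
# function is G(x) = x^2.  Composing with G ("warping with the design")
# maps the random design to a uniform one, which is why equispaced
# wavelet machinery can be reused on the warped basis.
rng = np.random.default_rng(0)
n = 100_000
x = np.sqrt(rng.random(n))   # draws with cdf G(x) = x^2
u = x ** 2                   # warped design points G(X_i)

# First two moments of u match the uniform distribution on [0, 1].
mean_u, var_u = u.mean(), u.var()   # ~0.5 and ~1/12
```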
General empirical Bayes wavelet methods and exactly adaptive minimax estimation
2005
Cited by 25 (3 self)
Abstract:
In many statistical problems, stochastic signals can be represented as a sequence of noisy wavelet coefficients. In this paper, we develop general empirical Bayes methods for the estimation of the true signal. Our estimators approximate certain oracle separable rules and achieve adaptation to ideal risks and exact minimax risks in broad collections of classes of signals. In particular, our estimators are uniformly adaptive to the minimum risk of separable estimators and the exact minimax risks simultaneously in Besov balls of all smoothness and shape indices, and they are uniformly superefficient in convergence rates in all compact sets in Besov spaces with a finite secondary shape parameter. Furthermore, in classes nested between Besov balls of the same smoothness index, our estimators dominate threshold and James–Stein estimators within an infinitesimal fraction of the minimax risks. More general block empirical Bayes estimators are developed. Both white noise with drift and nonparametric regression are considered.
Asymptotic equivalence of spectral density estimation and Gaussian white noise
2009
Cited by 25 (8 self)
Abstract:
We consider the statistical experiment given by a sample y(1),...,y(n) of a stationary Gaussian process with an unknown smooth spectral density f. Asymptotic equivalence, in the sense of Le Cam’s deficiency ∆-distance, to two Gaussian experiments with simpler structure is established. The first is given by independent zero-mean Gaussians with variance approximately f(ωi), where the ωi form a uniform grid of points in (−π, π) (nonparametric Gaussian scale regression). This approximation is closely related to well-known asymptotic independence results for the periodogram and corresponding inference methods. The second asymptotic equivalence is to a Gaussian white noise model where the drift function is the log-spectral density. This represents the step from a Gaussian scale model to a location model, and also has a counterpart in established inference methods, namely log-periodogram regression. The problem of simple explicit equivalence maps (Markov kernels), which would allow inference to be carried over directly, appears in this context but is not solved here.
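A minimal numpy sketch of the periodogram approximation behind the first equivalence, assuming unit-variance white noise so that f(ω) = 1/(2π); the normalisation convention and all names are mine, not the paper's:

```python
import numpy as np

def periodogram(y):
    """Periodogram I(w_j) = |sum_t y_t e^{-i w_j t}|^2 / (2*pi*n)
    at the Fourier frequencies w_j = 2*pi*j/n."""
    n = len(y)
    return np.abs(np.fft.fft(y)) ** 2 / (2 * np.pi * n)

rng = np.random.default_rng(1)
n = 4096
y = rng.normal(size=n)      # white noise with spectral density 1/(2*pi)
I = periodogram(y)

# Ordinates at distinct Fourier frequencies in (0, pi) are approximately
# independent exponentials with mean f(w_j); averaging them recovers f.
avg = I[1 : n // 2].mean()  # close to 1/(2*pi) ~ 0.159
```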
Equivalence Theory for Density Estimation, Poisson Processes and Gaussian White Noise with Drift
Cited by 25 (4 self)
Abstract:
This paper establishes the global asymptotic equivalence between a Poisson process with variable intensity and white noise with drift under sharp smoothness conditions on the unknown function. This equivalence is also extended to density estimation models by Poissonization. The asymptotic equivalences are established by constructing explicit equivalence mappings. The impact of such asymptotic equivalence results is that an investigation in one of these nonparametric models automatically yields asymptotically analogous results in the other models.
Variance Estimation in Nonparametric Regression via the Difference Sequence Method
Ann. Statist., 2006
Cited by 24 (4 self)
Abstract:
Consider a Gaussian nonparametric regression problem having both an unknown mean function and unknown variance function. This article presents a class of difference-based kernel estimators for the variance function. Optimal convergence rates that are uniform over broad functional classes and bandwidths are fully characterized, and asymptotic normality is also established. We also show that for suitable asymptotic formulations our estimators achieve the minimax rate.
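The simplest difference-based estimator of this kind, for a constant variance, is the first-order (Rice-type) estimator; a minimal sketch assuming homoscedastic Gaussian noise — the function name and test signal are illustrative, not from the article:

```python
import numpy as np

def rice_variance(y):
    """First-order difference estimator of a constant noise variance:
    sigma^2 ~ mean((y[i+1] - y[i])**2) / 2.  Successive differences
    cancel the smooth mean function up to O(1/n) per term, leaving
    (eps[i+1] - eps[i]), which has variance 2*sigma^2."""
    d = np.diff(y)
    return np.mean(d ** 2) / 2.0

rng = np.random.default_rng(0)
n = 10_000
x = np.linspace(0.0, 1.0, n)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.5, size=n)  # sigma^2 = 0.25
est = rice_variance(y)  # close to 0.25
```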
Asymptotic equivalence for nonparametric regression
Math. Methods Statist.
Cited by 22 (0 self)
Abstract:
We consider a nonparametric model En, generated by independent observations Xi, i = 1,..., n, with densities p(x, θi), i = 1,..., n, whose parameters θi = f(i/n) ∈ Θ are driven by the values of an unknown function f: [0, 1] → Θ in a smoothness class. The main result of the paper is that, under regularity assumptions, this model can be approximated, in the sense of the Le Cam deficiency pseudo-distance, by a nonparametric Gaussian shift model Yi = Γ(f(i/n)) + εi, where ε1,..., εn are i.i.d. standard normal random variables, the function Γ(θ): Θ → R satisfies Γ′(θ) = √I(θ), and I(θ) is the Fisher information corresponding to the density p(x, θ).
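As a standard worked instance of such a variance-stabilising map Γ (textbook examples, not taken from the paper):

```latex
% Map \Gamma with \Gamma'(\theta) = \sqrt{I(\theta)}.
% Bernoulli(\theta):
I(\theta) = \frac{1}{\theta(1-\theta)}, \qquad
\Gamma(\theta) = \int_0^{\theta} \frac{dt}{\sqrt{t(1-t)}}
             = 2 \arcsin\sqrt{\theta}
% (the classical arcsine transform).
% Poisson(\theta): I(\theta) = 1/\theta and \Gamma(\theta) = 2\sqrt{\theta}.
```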
Asymptotic equivalence for nonparametric regression with multivariate and random design
2008
Cited by 20 (2 self)
Abstract:
We show that nonparametric regression is asymptotically equivalent, in Le Cam’s sense, to a sequence of Gaussian white noise experiments as the number of observations tends to infinity. We propose a general constructive framework based on approximation spaces, which makes it possible to achieve asymptotic equivalence even in the cases of multivariate and random design.
Model Selection For Gaussian Regression with . . .
Université Paris, 2002
Cited by 17 (6 self)
Abstract:
This paper is about Gaussian regression with random design, where the observations are i.i.d. It is known from Le Cam (1973, 1975 and 1986) that the rate of convergence of optimal estimators is closely connected to the metric structure of the parameter space with respect to the Hellinger distance. In particular, this metric structure essentially determines the risk when the loss function is a power of the Hellinger distance. For random design regression, one typically uses as loss function the squared L2 distance between the estimator and the parameter.
Asymptotic statistical equivalence for scalar ergodic diffusions
Probab. Theory Rel. Fields, 2006
Cited by 16 (3 self)
Abstract:
For scalar diffusion models with unknown drift function, asymptotic equivalence in the sense of Le Cam’s deficiency between statistical experiments is considered under long-time asymptotics. A local asymptotic equivalence result is established with an accompanying sequence of simple Gaussian shift experiments. Corresponding globally asymptotically equivalent experiments are obtained as compound experiments. The results are extended in several directions, including time discretisation. An explicit transformation of decision functions from the Gaussian to the diffusion experiment is constructed.