Results 1-10 of 202
Estimating the Support of a High-Dimensional Distribution, 1999
Cited by 766 (29 self)
Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S is bounded by some a priori specified value between 0 and 1. We propose a method to approach this problem by trying to estimate a function f which is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a preliminary theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabelled d...
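For orientation, the quadratic program mentioned in this abstract is commonly written in the following form. This is a sketch of the standard ν-formulation of the one-class support vector machine, not a quotation from the entry: the symbols ℓ (sample size), Φ (feature map), ξ_i (slack variables), ρ (offset), α_i (expansion coefficients) and k (kernel) are notation assumed here, and ν plays the role of the a priori specified bound between 0 and 1.

```latex
\min_{w,\,\xi,\,\rho}\ \frac{1}{2}\|w\|^2 + \frac{1}{\nu \ell}\sum_{i=1}^{\ell}\xi_i - \rho
\quad\text{subject to}\quad
\langle w, \Phi(x_i)\rangle \ge \rho - \xi_i,\qquad \xi_i \ge 0,

f(x) = \operatorname{sgn}\Big(\sum_{i=1}^{\ell}\alpha_i\, k(x_i, x) - \rho\Big).
```

Only training points with nonzero α_i enter the kernel expansion, which is why f can depend on a small subset of the data.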
Optimal aggregation of classifiers in statistical learning. Ann. Statist., 2004
Cited by 225 (7 self)
Classification can be considered as nonparametric estimation of sets, where the risk is defined by means of a specific distance between sets associated with misclassification error. It is shown that the rates of convergence of classifiers depend on two parameters: the complexity of the class of candidate sets and the margin parameter. The dependence is explicitly given, indicating that optimal fast rates approaching O(n^{-1}) can be attained, where n is the sample size, and that the proposed classifiers have the property of robustness to the margin. The main result of the paper concerns optimal aggregation of classifiers: we suggest a classifier that automatically adapts both to the complexity and to the margin, and attains the optimal fast rates, up to a logarithmic factor.
Smooth Discrimination Analysis. Ann. Statist., 1998
Cited by 154 (3 self)
Discriminant analysis for two data sets in R^d with probability densities f and g can be based on the estimation of the set G = {x : f(x) ≥ g(x)}. We consider applications where it is appropriate to assume that the region G has a smooth boundary. In particular, this assumption makes sense if discriminant analysis is used as a data analytic tool. We discuss optimal rates for estimation of G.
1991 AMS: primary 62G05, secondary 62G20.
Keywords and phrases: discrimination analysis, minimax rates, Bayes risk.
Short title: Smooth discrimination analysis.
This research was supported by the Deutsche Forschungsgemeinschaft, Sonderforschungsbereich 373 "Quantifikation und Simulation ökonomischer Prozesse", Humboldt-Universität zu Berlin.
1 Introduction. Assume that one observes two independent samples X = (X_1, ..., X_n) and Y = (Y_1, ..., Y_m) of R^d-valued i.i.d. observations with densities f or g, respectively. The densities f and g are unknown. An additional random variabl...
Wedgelets: nearly-minimax estimation of edges. Ann. Statist., 1999
Cited by 115 (8 self)
We study a simple “Horizon Model” for the problem of recovering an image from noisy data; in this model the image has an edge with α-Hölder regularity. Adopting the viewpoint of computational harmonic analysis, we develop an overcomplete collection of atoms called wedgelets, dyadically organized indicator functions with a variety of locations, scales, and orientations. The wedgelet representation provides nearly optimal representations of objects in the Horizon Model, as measured by minimax description length. We show how to rapidly compute a wedgelet approximation to noisy data by finding a special edgelet-decorated recursive partition which minimizes a complexity-penalized sum of squares. This estimate, using sufficient subpixel resolution, achieves nearly the minimax mean-squared error in the Horizon Model. In fact, the method is adaptive in the sense that it achieves nearly the minimax risk for any value of the unknown degree of regularity of the Horizon, 1 ≤ α ≤ 2. Wedgelet analysis and denoising may be used successfully outside the Horizon Model. We study images modelled as indicators of star-shaped sets with smooth boundaries and show that complexity-penalized wedgelet partitioning achieves nearly the minimax risk in that setting also.
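The idea of a recursive partition that minimizes a complexity-penalized sum of squares can be illustrated with a much simpler one-dimensional analog. The sketch below is not the wedgelet algorithm: it fits a constant on each dyadic interval of a 1-D signal instead of wedge-shaped atoms on dyadic squares, and the function name `best_partition` and the per-leaf `penalty` are hypothetical choices made for illustration.

```python
def best_partition(y, lo, hi, penalty):
    """Prune a recursive dyadic partition of y[lo:hi] bottom-up.

    Returns (cost, segments), where cost is the residual sum of squares
    plus `penalty` per segment, and segments is a list of (lo, hi, mean).
    Simplified 1-D analog with constant fits; not the wedgelet method.
    """
    seg = y[lo:hi]
    mean = sum(seg) / len(seg)
    # Cost of keeping this interval as a single leaf.
    leaf_cost = sum((v - mean) ** 2 for v in seg) + penalty
    if hi - lo < 2:
        return leaf_cost, [(lo, hi, mean)]
    mid = (lo + hi) // 2
    left_cost, left_segs = best_partition(y, lo, mid, penalty)
    right_cost, right_segs = best_partition(y, mid, hi, penalty)
    # Keep the split only if it strictly lowers the penalized cost.
    if left_cost + right_cost < leaf_cost:
        return left_cost + right_cost, left_segs + right_segs
    return leaf_cost, [(lo, hi, mean)]


# A noiseless step signal: the pruned partition should isolate the jump.
signal = [0] * 8 + [1] * 8
cost, segments = best_partition(signal, 0, len(signal), penalty=0.1)
```

A larger `penalty` favours coarser partitions, while a smaller one lets the partition follow the data more closely; this is the trade-off that the complexity penalty controls.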
Platelets: A Multiscale Approach for Recovering Edges and Surfaces in Photon-Limited Medical Imaging. IEEE Transactions on Medical Imaging, 2003
Cited by 99 (20 self)
The nonparametric multiscale platelet algorithms presented in this paper, unlike traditional wavelet-based methods, are both well suited to photon-limited medical imaging applications involving Poisson data and capable of better approximating edge contours. This paper introduces platelets, localized functions at various scales, locations, and orientations that produce piecewise linear image approximations, and a new multiscale image decomposition based on these functions. Platelets are well suited for approximating images consisting of smooth regions separated by smooth boundaries. For smoothness measured in certain Hölder classes, it is shown that the error of m-term platelet approximations can decay significantly faster than that of m-term approximations in terms of sinusoids, wavelets, or wedgelets. This suggests that platelets may outperform existing techniques for image denoising and reconstruction. Fast, platelet-based, maximum penalized likelihood methods for photon-limited image denoising, deblurring and tomographic reconstruction problems are developed. Because platelet decompositions of Poisson distributed images are tractable and computationally efficient, existing image reconstruction methods based on expectation-maximization type algorithms can be easily enhanced with platelet techniques. Experimental results suggest that platelet-based methods can outperform standard reconstruction methods currently in use in confocal microscopy, image restoration, and emission tomography.
Backcasting: adaptive sampling for sensor networks. In Proc. Information Processing in Sensor Networks, 2004
Cited by 93 (4 self)
Wireless sensor networks provide an attractive approach to spatially monitoring environments. Wireless technology makes these systems relatively flexible, but also places heavy demands on energy consumption for communications. This raises a fundamental tradeoff: using higher densities of sensors provides more measurements, higher resolution and better accuracy, but requires more communications and processing. This paper proposes a new approach, called “backcasting,” which can significantly reduce communications and energy consumption while maintaining high accuracy. Backcasting operates by first having a small subset of the wireless sensors communicate their information to a fusion center. This provides an initial estimate of the environment being sensed, and guides the allocation of additional network resources. Specifically, the fusion center backcasts information based on the initial estimate to the network at large, selectively activating additional sensor nodes in order to achieve a target error level. The key idea is that the initial estimate can detect correlations in the environment, indicating that many sensors may not need to be activated by the fusion center. Thus, adaptive sampling can save energy compared to dense, nonadaptive sampling. This method is theoretically analyzed in the context of field estimation and it is shown that the energy savings can be quite significant compared to conventional
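The two-stage scheme described in this abstract can be sketched on a toy one-dimensional field. This is a simplified illustration under stated assumptions, not the estimator analyzed in the paper: the function `backcast`, the regular preview `stride`, the jump `threshold`, and the nearest-neighbour reconstruction are all hypothetical choices.

```python
def backcast(field, stride, threshold):
    """Toy two-stage adaptive sampling on a 1-D list of sensor readings.

    Stage 1: every `stride`-th sensor (plus the last one) reports.
    Stage 2: sensors between two preview sensors are activated only if
    the two preview readings differ by more than `threshold`.
    Returns (estimate, activated), where activated is a sorted index list.
    """
    n = len(field)
    preview = list(range(0, n, stride))
    if preview[-1] != n - 1:
        preview.append(n - 1)
    activated = set(preview)
    # The fusion center flags intervals where the preview suggests a change.
    for a, b in zip(preview, preview[1:]):
        if abs(field[a] - field[b]) > threshold:
            activated.update(range(a, b + 1))
    ordered = sorted(activated)
    # Reconstruct each location from the nearest activated sensor.
    estimate = [field[min(ordered, key=lambda j: abs(j - i))] for i in range(n)]
    return estimate, ordered


# A field with one jump: only sensors near the jump are switched on.
field = [0.0] * 37 + [1.0] * 27
estimate, activated = backcast(field, stride=8, threshold=0.5)
```

In this toy case the field is recovered exactly while only a fraction of the sensors report, which is the kind of saving the abstract attributes to adaptive sampling.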
Minimax bounds for active learning. In COLT, 2007
Cited by 87 (10 self)
This paper aims to shed light on achievable limits in active learning. Using minimax analysis techniques, we study the achievable rates of classification error convergence for broad classes of distributions characterized by decision boundary regularity and noise conditions. The results clearly indicate the conditions under which one can expect significant gains through active learning. Furthermore we show that the learning rates derived are tight for “boundary fragment” classes in d-dimensional feature spaces when the feature marginal density is bounded from above and below.
Sharp Adaptation for Inverse Problems With Random Noise, 2000
Cited by 84 (8 self)
We consider a heteroscedastic sequence space setup with polynomially increasing variances of observations, which makes it possible to treat a number of inverse problems, in particular multivariate ones. We propose an adaptive estimator that simultaneously attains exact asymptotic minimax constants on every ellipsoid of functions within a wide scale (that includes ellipsoids with polynomially and exponentially decreasing axes) and, at the same time, satisfies asymptotically exact oracle inequalities within any class of linear estimates having monotone nondecreasing weights. As an application, we construct sharp adaptive estimators in the problems of deconvolution and tomography.
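A heteroscedastic sequence space setup of this kind is usually written as below. This is a sketch in assumed notation, not taken from the entry: θ_k are the unknown coefficients, ε is the noise level, σ_k are the noise scales (growing polynomially in k, with a hypothetical exponent β), ξ_k are i.i.d. standard Gaussian variables, and λ_k are the weights of a linear estimate.

```latex
y_k = \theta_k + \varepsilon\,\sigma_k\,\xi_k,\qquad k = 1, 2, \dots,\qquad
\sigma_k \asymp k^{\beta},\quad \xi_k \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, 1),

\hat\theta_k = \lambda_k\, y_k,\qquad
\mathbb{E}\,\|\hat\theta - \theta\|^2
  = \sum_{k}(1 - \lambda_k)^2\,\theta_k^2
  + \varepsilon^2 \sum_{k}\sigma_k^2\,\lambda_k^2 .
```

The first sum is the squared bias and the second is the variance; an oracle inequality compares the adaptive estimator's risk with the smallest value of this expression over the class of weights considered.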
Recovering Edges in Ill-Posed Inverse Problems: Optimality of Curvelet Frames, 2000
Cited by 78 (14 self)
We consider a model problem of recovering a function f(x1, x2) from noisy Radon data. The function f to be recovered is assumed smooth apart from a discontinuity along a C^2 curve, i.e. an edge. We use the continuum white noise model, with noise level ε. Traditional linear methods for solving such inverse problems behave poorly in the presence of edges. Qualitatively, the reconstructions are blurred near the edges; quantitatively, they give in our model Mean Squared Errors (MSEs) that tend to zero with noise level ε only as O(ε^{1/2}) as ε → 0. A recent innovation, nonlinear shrinkage in the wavelet domain, visually improves edge sharpness and improves MSE convergence to O(ε^{2/3}). However, as we show here, this rate is not optimal. In fact, essentially optimal performance is obtained by deploying the recently introduced tight frames of curvelets in this setting. Curvelets are smooth, highly anisotropic elements ideally suited for detecting and synthesizing curved edges. To deploy them in the Radon setting, we construct a curvelet-based biorthogonal decomposition
Sparse components of images and optimal atomic decomposition. Constr. Approx.
Cited by 73 (4 self)
Recently, Field, Lewicki, Olshausen, and Sejnowski have reported efforts to identify the “Sparse Components” of image data. Their empirical findings indicate that such components have elongated shapes and assume a wide range of positions, orientations, and scales. To date, Sparse Components Analysis (SCA) has only been conducted on databases of small (e.g. 16-by-16) image patches and there seems limited prospect of dramatically increased resolving power. In this article, we apply mathematical analysis to a specific formalization of SCA using synthetic image models, hoping to gain insight into what might emerge from a higher-resolution SCA based on n-by-n image patches for large n but constant field of view. In our formalization, we study a class of objects F in a functional space; they are to be represented by linear combinations of atoms from an overcomplete dictionary, and sparsity is measured by the ℓ^p norm of the coefficients in the linear combination. We focus on the class F = Star^α of black-and-white images with the black region consisting of a star-shaped set with α-smooth boundary. We aim to find an optimal dictionary, one achieving the optimal