Results 1–10 of 21
Estimating the Support of a High-Dimensional Distribution
, 1999
Abstract

Cited by 766 (29 self)
Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S is bounded by some a priori specified ν between 0 and 1. We propose a method to approach this problem by trying to estimate a function f which is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a preliminary theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabelled d...
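As a rough illustration of the algorithm (now commonly known as the one-class SVM), the sketch below uses scikit-learn's OneClassSVM. The Gaussian sample and the gamma and nu values are illustrative choices, not taken from the paper; nu plays the role of the a priori bound on the probability of falling outside S.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Draw a sample from the underlying distribution P (here a 2-D Gaussian).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))

# nu upper-bounds the fraction of training points left outside the estimated
# region S; the estimated f is a kernel expansion over the support vectors.
clf = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.1).fit(X)

inlier_rate = float((clf.predict(X) == 1).mean())  # roughly 1 - nu on training data
far_point = int(clf.predict([[6.0, 6.0]])[0])      # a point far from the data
```

A point well outside the bulk of the sample gets label -1 (outside S), while roughly a 1 - nu fraction of the training sample is labelled +1.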
Learning minimum volume sets
 J. Machine Learning Res
, 2006
Abstract

Cited by 41 (9 self)
Given a probability measure P and a reference measure µ, one is often interested in the minimum µ-measure set with P-measure at least α. Minimum volume sets of this type summarize the regions of greatest probability mass of P, and are useful for detecting anomalies and constructing confidence regions. This paper addresses the problem of estimating minimum volume sets based on independent samples distributed according to P. Other than these samples, no other information is available regarding P, but the reference measure µ is assumed to be known. We introduce rules for estimating minimum volume sets that parallel the empirical risk minimization and structural risk minimization principles in classification. As in classification, we show that the performances of our estimators are controlled by the rate of uniform convergence of empirical to true probabilities over the class from which the estimator is drawn. Thus we obtain finite sample size performance bounds in terms of VC dimension and related quantities. We also demonstrate strong universal consistency and an oracle inequality. Estimators based on histograms and dyadic partitions illustrate the proposed rules.
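The histogram estimators mentioned in the last sentence can be sketched in a few lines. This 1-D toy (Lebesgue reference measure; the bin count and α are illustrative) greedily keeps the empirically densest bins until their total mass reaches α:

```python
import numpy as np

def mv_set_histogram(x, alpha, bins=30):
    """Greedy histogram estimate of a minimum volume set with mass >= alpha:
    take bins in decreasing order of empirical density until their total
    empirical mass reaches alpha; the estimate is the union of those bins."""
    counts, edges = np.histogram(x, bins=bins)
    widths = np.diff(edges)
    density = counts / counts.sum() / widths   # empirical density per bin
    order = np.argsort(density)[::-1]          # densest bins first
    cum_mass = np.cumsum(counts[order]) / counts.sum()
    k = int(np.searchsorted(cum_mass, alpha)) + 1  # fewest bins reaching alpha
    chosen = order[:k]
    return float(widths[chosen].sum()), float(cum_mass[k - 1])

rng = np.random.default_rng(0)
x = rng.standard_normal(5000)
vol, mass = mv_set_histogram(x, alpha=0.9)
# For N(0,1) the true 0.9 minimum volume set is about [-1.645, 1.645], volume 3.29.
```

The estimated volume should be close to 3.29, up to bin granularity and sampling noise.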
Kernel estimation of density level sets
 J. Multivariate Anal
, 2006
Abstract

Cited by 24 (2 self)
Abstract. Let f be a multivariate density and fn be a kernel estimate of f drawn from the n-sample X1, . . . , Xn of i.i.d. random variables with density f. We compute the asymptotic rate of convergence towards 0 of the volume of the symmetric difference between the t-level set {f ≥ t} and its plug-in estimator {fn ≥ t}. As a corollary, we obtain the exact rate of convergence of a plug-in type estimate of the density level set corresponding to a fixed probability for the law induced by f.
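A numpy-only sketch of the plug-in estimator {fn ≥ t} in one dimension, approximating the volume of the symmetric difference against the true level set on a grid. The Gaussian kernel, Silverman-type bandwidth, grid, and level t are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
X = rng.standard_normal(n)
h = 1.06 * X.std() * n ** (-1 / 5)           # rule-of-thumb bandwidth

grid = np.linspace(-4.0, 4.0, 801)
dx = grid[1] - grid[0]
# Gaussian kernel density estimate fn evaluated on the grid.
fn = np.exp(-0.5 * ((grid[:, None] - X[None, :]) / h) ** 2).mean(axis=1) \
     / (h * np.sqrt(2 * np.pi))
f = np.exp(-0.5 * grid ** 2) / np.sqrt(2 * np.pi)   # true N(0,1) density

t = 0.1
# Volume of the symmetric difference between {fn >= t} and {f >= t}.
sym_diff = float(np.sum((fn >= t) != (f >= t)) * dx)
```

With this sample size the symmetric difference is small, consistent with the convergence the paper quantifies.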
A Limit Theorem for Solutions of Inequalities
 Scandinavian Journal of Statistics
, 1998
Abstract

Cited by 17 (0 self)
Let H(p) be the set {x ∈ X : h(x) ≤ p}, where h is a real-valued lower semicontinuous function on a locally compact second countable metric space X. A limit theorem is proved for the empirical counterpart of H(p) obtained by replacing h with its estimator. AMS Subject Classification (1991): 52A22, 60D05, 60F05, 62G99. Keywords & Phrases: Aumann expectation, polar set, random set, Hausdorff metric, weak convergence
How to Divide a Territory? A New Simple Differential Formalism for Optimization of Set Functions
 International Journal of Intelligent Systems
Discrimination of locally stationary time series based on the excess mass functional
 J. Amer. Statist. Assoc.
, 2006
Abstract

Cited by 9 (3 self)
Discrimination of time series is an important practical problem with applications in various scientific fields. We propose and study a novel approach to this problem. Our approach is applicable to cases where time series in different categories have a different “shape.” Although based on the idea of feature extraction, our method is not distance-based, and as such does not require aligning the time series. Instead, features are measured for each time series, and discrimination is based on these individual measures. An AR process with a time-varying variance is used as an underlying model. Our method then uses shape measures or, better, measures of concentration of the variance function, as a criterion for discrimination. It is this concentration aspect or shape aspect that makes the approach intuitively appealing. We provide some mathematical justification for our proposed methodology, as well as a simulation study and an application to the problem of discriminating earthquakes and explosions.
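The toy below is not the authors' estimator, only a crude illustration of the concentration idea: estimate the variance function on disjoint blocks and use the mass carried by its most concentrated stretch as a discriminating feature. Block length and the two simulated series are illustrative.

```python
import numpy as np

def variance_concentration(y, block=50):
    """Crude shape feature: estimate the variance function on disjoint blocks,
    normalise it to a probability vector, and report the largest mass carried
    by any run of one quarter of the blocks (high = concentrated variance)."""
    v = np.array([y[i:i + block].var()
                  for i in range(0, len(y) - block + 1, block)])
    p = v / v.sum()
    q = max(1, len(p) // 4)
    return float(np.convolve(p, np.ones(q), mode="valid").max())

rng = np.random.default_rng(4)
t = np.linspace(0.0, 1.0, 1000)
flat = rng.standard_normal(1000)            # constant variance throughout
burst = rng.standard_normal(1000) * (0.3 + 3.0 * np.exp(-((t - 0.5) / 0.1) ** 2))

c_flat = variance_concentration(flat)
c_burst = variance_concentration(burst)
```

A series whose variance is concentrated in a burst scores much higher than a stationary one, so the feature separates the two shapes without any alignment of the series.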
Smallest nonparametric tolerance regions
 Ann. Statist
, 2001
Abstract

Cited by 8 (0 self)
We present a new, natural way to construct nonparametric multivariate tolerance regions. Unlike the classical nonparametric tolerance intervals, where the endpoints are determined by order statistics chosen beforehand, we take the shortest interval that contains a certain number of observations. We extend this idea to higher dimensions by replacing the class of intervals by a general class of indexing sets, which specializes to the classes of ellipsoids, hyperrectangles or convex sets. The asymptotic behavior of our tolerance regions is derived using empirical process theory, in particular the concept of generalized quantiles. Finite sample properties of our tolerance regions are investigated through a simulation study. Real data examples are also presented.
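In one dimension the construction reduces to a sliding scan over order statistics; a minimal sketch (the coverage level is illustrative):

```python
import numpy as np

def shortest_interval(x, coverage=0.9):
    """Shortest interval containing at least ceil(coverage * n) observations:
    slide a window of k consecutive order statistics and keep the narrowest."""
    xs = np.sort(np.asarray(x))
    n = len(xs)
    k = int(np.ceil(coverage * n))
    widths = xs[k - 1:] - xs[: n - k + 1]   # width of each candidate interval
    i = int(np.argmin(widths))
    return float(xs[i]), float(xs[i + k - 1])

rng = np.random.default_rng(2)
x = rng.standard_normal(1000)
lo, hi = shortest_interval(x, coverage=0.9)
# For N(0,1) the shortest 0.9 interval is about (-1.645, 1.645), width 3.29.
```

The scan costs O(n log n) for the sort plus O(n) for the window pass, and by construction the returned interval covers at least the requested fraction of the sample.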
Rates of convergence for a Bayesian level set estimation
 Scand. J. Statist
, 2005
Abstract

Cited by 5 (0 self)
ABSTRACT. We are interested in estimating level sets using a Bayesian nonparametric approach, from an independent and identically distributed sample drawn from an unknown distribution. Under fairly general conditions on the prior, we provide an upper bound on the rate of convergence of the Bayesian level set estimate, via the rate at which the posterior distribution concentrates around the true level set. We then consider, as an application, the log-spline prior in the two-dimensional unit cube. Assuming that the true distribution belongs to a Hölder class, we provide an upper bound on the rate of convergence of the Bayesian level set estimates. We compare our results with existing rates of convergence in the frequentist nonparametric literature: the Bayesian level set estimator proves to be competitive and is also easy to compute, which is of no small importance. A simulation study is given as an illustration. Key words: Bayesian nonparametric estimation, convergence rates of the posterior distribution, level set
A review of RKHS methods in machine learning
, 2006
Abstract

Cited by 2 (1 self)
Over the last ten years, estimation and learning methods utilizing positive definite kernels have become rather popular, particularly in machine learning. Since these methods have a stronger mathematical slant than earlier machine learning methods (e.g., neural networks), there is also
The excess mass approach and the analysis of multimodality
 Proc. 18th Annual Conference of the GfKl
, 1996
Abstract

Cited by 1 (0 self)
Summary: The excess mass approach is a general approach to statistical analysis. It can be used to formulate a probabilistic model for clustering and can be applied to the analysis of multimodality. Intuitively, a mode is present where an excess of probability mass is concentrated. This intuitive idea can be formalized directly by means of the excess mass functional. There is no need for intervening steps like initial density estimation. The excess mass measures the local difference of a given distribution to a reference model, usually the uniform distribution. The excess mass defines a functional which can be estimated efficiently from the data and can be used to test for multimodality.
1. The problem of multimodality. We want to find the number of modes of a distribution in R^k, based on a sample of n independent observations. There are many approaches to this problem. Any approach has to face an inherent difficulty of the modality problem: the functional which associates the number of modes to a distribution is only semicontinuous. In any neighbourhood (with respect to the testing topology) of a given
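For intervals on the real line, the empirical excess mass functional can be computed by a direct scan over pairs of order statistics; a small sketch (λ, the sample, and the O(n²) scan are illustrative, not the efficient estimator the paper refers to):

```python
import numpy as np

def excess_mass_intervals(x, lam):
    """Empirical excess mass over single intervals:
    max over order statistics x_(i) <= x_(j) of
    (j - i + 1)/n - lam * (x_(j) - x_(i)),
    i.e. empirical mass minus lam times Lebesgue measure."""
    xs = np.sort(np.asarray(x))
    n = len(xs)
    best = 0.0
    for i in range(n):                        # O(n^2) scan, fine for small n
        mass = np.arange(1, n - i + 1) / n    # empirical mass of [x_(i), x_(j)]
        best = max(best, float(np.max(mass - lam * (xs[i:] - xs[i]))))
    return best

rng = np.random.default_rng(3)
x = rng.standard_normal(400)
e = excess_mass_intervals(x, lam=0.2)
# Population value for N(0,1) at lam = 0.2 is roughly 0.29.
```

Comparing the excess mass attainable with one interval against that attainable with two disjoint intervals is the basic idea behind the multimodality test the abstract describes.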