Invariant coordinate selection
, 2007
Cited by 16 (8 self)
A general method for exploring multivariate data by comparing different estimates of multivariate scatter is presented. The method is based upon the eigenvalue-eigenvector decomposition of one scatter matrix relative to another. In particular, it is shown that the eigenvectors can be used to generate an affine invariant coordinate system for the multivariate data. Consequently, we view this method as a method for invariant coordinate selection (ICS). By plotting the data with respect to this new invariant coordinate system, various data structures can be revealed. For example, under certain independent components models, it is shown that the invariant coordinates correspond to the independent components. Another example pertains to mixtures of elliptical distributions. In this case, it is shown that a subset of the invariant coordinates corresponds to Fisher’s linear discriminant subspace, even though the class identifications of the data points are unknown. Some illustrative examples are given.
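The decomposition of one scatter matrix relative to another, as described in this abstract, can be illustrated with a minimal sketch. The choice of the sample covariance and a fourth-moment scatter matrix as the two scatter estimates is an assumption made here for illustration; the abstract does not fix a particular pair. The function name `ics` is hypothetical.

```python
import numpy as np

def ics(X):
    """Sketch of invariant coordinate selection with two scatter matrices.

    Assumption: S1 is the sample covariance and S2 a fourth-moment
    scatter matrix; other affine equivariant pairs could be used instead.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)

    # First scatter: ordinary sample covariance.
    S1 = np.cov(Xc, rowvar=False)

    # Second scatter: observations weighted by squared Mahalanobis distance.
    d2 = np.einsum("ij,jk,ik->i", Xc, np.linalg.inv(S1), Xc)
    S2 = (Xc * d2[:, None]).T @ Xc / (n * (p + 2))

    # Eigen-decomposition of S2 relative to S1, done through the
    # symmetric matrix S1^{-1/2} S2 S1^{-1/2} for numerical stability.
    w, V = np.linalg.eigh(S1)
    S1_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    lam, U = np.linalg.eigh(S1_inv_sqrt @ S2 @ S1_inv_sqrt)

    # Sort eigenvalues in decreasing order and map data to new coordinates.
    order = np.argsort(lam)[::-1]
    lam, U = lam[order], U[:, order]
    B = S1_inv_sqrt @ U          # columns are the relative eigenvectors
    return lam, Xc @ B
```

Under these assumptions, applying a nonsingular affine transformation to the data leaves the returned eigenvalues unchanged, and the new coordinates change at most by sign when the eigenvalues are distinct.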
A canonical definition of shape
, 2007
Cited by 11 (6 self)
Very general concepts of scatter, extending the traditional notion of covariance matrices, have become classical tools in robust multivariate analysis. In many problems of practical importance (principal components, canonical correlation, testing for sphericity), only homogeneous functions of the scatter matrix are of interest. In line with this fact, scatter functionals often are only defined up to a positive scalar factor, yielding a family of scatter matrices rather than a uniquely defined one. In such families, it is natural to single out one representative by imposing a normalization constraint: this normalized scatter is called a shape matrix. In the particular case of elliptical families, this constraint in turn induces a concept of scale; along with a location center and a standardized radial density, the shape and scale parameters entirely characterize an elliptical density. In this paper, we show that one and only one normalization has the additional properties that (i) the resulting Fisher information matrices for shape and scale, in locally asymptotically normal (LAN) elliptical families, are block-diagonal, and that (ii) the semiparametric elliptical families indexed by location, shape, and completely unspecified radial densities are adaptive. This particular normalization, which imposes that the determinant of the shape matrix be equal to one, therefore can be considered canonical.
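The determinant-one normalization singled out in this abstract can be sketched in a few lines. The function name `shape_and_scale` and the convention of returning the scalar factor alongside the shape matrix are assumptions made for illustration, not notation taken from the paper.

```python
import numpy as np

def shape_and_scale(scatter):
    """Split a scatter matrix into a determinant-one shape matrix
    and the positive scalar factor that was divided out."""
    S = np.asarray(scatter, dtype=float)
    p = S.shape[0]

    # slogdet avoids overflow or underflow of the determinant.
    sign, logdet = np.linalg.slogdet(S)
    if sign <= 0:
        raise ValueError("scatter matrix must have a positive determinant")

    factor = np.exp(logdet / p)   # det(S) ** (1 / p)
    return S / factor, factor
```

Because the shape matrix is unchanged when the input is multiplied by any positive constant, two scatter functionals that agree up to a scalar factor yield the same shape under this normalization.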
MULTIVARIATE REGRESSION S-ESTIMATORS FOR ROBUST ESTIMATION AND INFERENCE
Cited by 10 (6 self)
In this paper we consider S-estimators for multivariate regression. We study the robustness of the estimators in terms of their breakdown point and influence function. Our results extend results on S-estimators in the context of univariate regression and multivariate location and scatter. Furthermore, we develop a fast and robust bootstrap method for the multivariate S-estimators to obtain inference for the regression parameters. Extensive simulation studies are performed to investigate finite-sample properties. The use of the S-estimators and the fast, robust bootstrap method is illustrated on some real-data examples.
OPTIMAL RANK-BASED TESTS FOR HOMOGENEITY OF SCATTER
, 806
Cited by 8 (8 self)
We propose a class of locally and asymptotically optimal tests, based on multivariate ranks and signs, for the homogeneity of scatter matrices in m elliptical populations. Contrary to the existing parametric procedures, these tests remain valid without any moment assumptions, and thus are perfectly robust against heavy-tailed distributions (validity robustness). Nevertheless, they reach semiparametric efficiency bounds at correctly specified elliptical densities and maintain high powers under all (efficiency robustness). In particular, their normal-score version outperforms traditional Gaussian likelihood ratio tests and their pseudo-Gaussian robustifications under a very broad range of non-Gaussian densities including, for instance, all multivariate Student and power-exponential distributions. 1. Introduction. 1.1. Homogeneity of variances and covariance matrices. The assumption of variance homogeneity is central to the theory and practice of univariate
Optimal tests for homogeneity of covariance, scale, and shape
 J. Multivariate Anal
, 2008
Cited by 7 (4 self)
The assumption of homogeneity of covariance matrices is the fundamental prerequisite of a number of classical procedures in multivariate analysis. Despite its importance and long history, however, this problem so far has not been completely settled beyond the traditional and highly unrealistic context of multivariate Gaussian models. And the modified likelihood ratio tests (MLRT) that are used in everyday practice are known to be highly sensitive to violations of Gaussian assumptions. In this paper, we provide a complete and systematic study of the problem, and propose test statistics which, while preserving the optimality features of the MLRT under multinormal assumptions, remain valid under unspecified elliptical densities with finite fourth-order moments. As a first step, the Le Cam LAN approach is used for deriving locally and asymptotically optimal testing procedures $\phi^{(n)}_f$ for any specified $m$-tuple of radial densities $f = (f_1, \ldots, f_m)$. Combined with an estimation of the $m$ densities $f_1, \ldots, f_m$, these procedures can be used to construct adaptive tests for the problem. Adaptive tests however typically require very large samples, and pseudo-Gaussian tests (namely, tests that are locally and asymptotically optimal at Gaussian densities while remaining valid under a much broader class of distributions) in general are preferable. We therefore construct two pseudo-Gaussian modifications of the Gaussian version $\phi^{(n)}_{\mathcal{N}}$ of the optimal test $\phi^{(n)}_f$. The first one, φ
Multivariate Generalized S-estimators
, 2008
Cited by 4 (0 self)
In this paper we introduce generalized S-estimators for the multivariate regression model. This class of estimators combines high robustness and high efficiency. They are defined by minimizing the determinant of a robust estimator of the scatter matrix of differences of residuals. In the special case of a multivariate location model, the generalized S-estimator has the important independence property, and can be used for high breakdown estimation in independent component analysis. Robustness properties of the estimators are investigated by deriving their breakdown point and the influence function. We also study the efficiency of the estimators, both asymptotically and at finite samples. To obtain inference for the regression parameters, we discuss the fast and robust bootstrap for multivariate generalized
Robust estimation of the vector autoregressive model by a least trimmed squares procedure
 COMPSTAT 2008: Proceedings in computational statistics
, 2008
Cited by 4 (0 self)
Robust estimation of the vector autoregressive model by a trimmed least squares procedure
OUTLIER DETECTION FOR SKEWED DATA
, 2007
Cited by 3 (0 self)
Most outlier detection rules for multivariate data are based on the assumption of elliptical symmetry of the underlying distribution. We propose an outlier detection method which does not need the assumption of symmetry and does not rely on visual inspection. Our method is a generalization of the Stahel-Donoho outlyingness. The latter approach assigns to each observation a measure of outlyingness, which is obtained by projection pursuit techniques that only use univariate robust measures of location and scale. To allow skewness in the data, we adjust this measure of outlyingness by using a robust measure of skewness as well. The observations corresponding to an outlying value of the adjusted outlyingness are then considered as outliers. For bivariate data, our approach leads to two graphical representations. The first one is a contour plot of the adjusted outlyingness values, and can be considered as an approximation of the density contours of the underlying distribution. We also construct an extension of the boxplot for bivariate data, in the spirit of the bagplot [1], which is based on the concept of halfspace depth. We illustrate our outlier detection method on several simulated and real data sets.
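As a rough illustration of the projection-based outlyingness that this abstract generalizes, the sketch below computes the classical Stahel-Donoho outlyingness only; the skewness adjustment proposed in the paper is not implemented. Approximating the supremum over directions by a finite set of random unit vectors, and using the median and MAD as the univariate location and scale, are assumptions made here. The function name and its parameters are hypothetical.

```python
import numpy as np

def stahel_donoho_outlyingness(X, n_directions=500, seed=0):
    """Approximate classical Stahel-Donoho outlyingness.

    For each observation, take the largest robustly standardized
    distance from the median over a set of random projections.
    Assumption: random directions stand in for the full supremum.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)

    # Random unit directions, one per row.
    A = rng.standard_normal((n_directions, X.shape[1]))
    A /= np.linalg.norm(A, axis=1, keepdims=True)

    # Project the data: column j holds the projections on direction j.
    P = X @ A.T
    med = np.median(P, axis=0)
    mad = np.median(np.abs(P - med), axis=0)

    # Worst-case standardized deviation for each observation.
    return np.max(np.abs(P - med) / mad, axis=1)
```

Observations with a large value are candidates for outliers; a cutoff would have to be chosen separately, and the MAD is assumed to be nonzero in every direction, which holds for continuous data.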
Statistical procedures and robust statistics
 Estadistica
Cited by 2 (0 self)
It is argued that a main aim of statistics is to produce statistical procedures, which in this article are defined as algorithms with inputs and outputs. The structure and properties of such procedures are investigated with special reference to topological and testing considerations. Procedures which work well in a large variety of situations are often based on robust statistical functionals. In the final section, some aspects of robust statistics are discussed, again with special reference to topology and continuity.
Recall that a sequence Qk of laws (probability measures), here on R
, 2005
said to converge weakly to a law $Q$ if $\int f\,dQ_k \to \int f\,dQ$ for every bounded continuous function $f$. There exists a metric $\rho$ on the set of all laws on $\mathbb{R}^d$ which metrizes weak convergence; in other words, $Q_k \to Q$ weakly if and only if $\rho(Q_k, Q) \to 0$, e.g. Dudley (2002, Sec. 11.3). A set $U$ of laws is called weakly open if and only if whenever $Q \in U$ and $Q_k \to Q$ weakly we have $Q_k \in U$ for all $k$ large enough. Equivalently, for each $Q \in U$, there is an $r > 0$ such that whenever $\rho(Q, P) < r$ we have $P \in U$. Much of robustness theory emphasizes mixture laws $$P = (1 - \lambda)F_0 + \lambda Q \tag{1}$$ where $Q$ is an arbitrary “contaminating” distribution, $F_0$ is a special distribution with a density, say for definiteness a normal, and $0 \le \lambda < 1/2$, e.g. Huber [20, pp. 86, 89]. Despite the generality of $Q$, the contamination model (1) doesn’t include some, perhaps the majority, of laws $P$ treated as