Results 1 - 9 of 9
OPTIMAL RANK-BASED TESTING FOR PRINCIPAL COMPONENTS
"... This paper provides parametric and rankbased optimal tests for eigenvectors and eigenvalues of covariance or scatter matrices in elliptical families. The parametric tests extend the Gaussian likelihood ratio tests of Anderson (1963) and their pseudoGaussian robustifications by Tyler (1981, 1983) a ..."
Abstract

Cited by 13 (11 self)
This paper provides parametric and rank-based optimal tests for eigenvectors and eigenvalues of covariance or scatter matrices in elliptical families. The parametric tests extend the Gaussian likelihood ratio tests of Anderson (1963) and their pseudo-Gaussian robustifications by Tyler (1981, 1983) and Davis (1977), with which their Gaussian versions are shown to coincide, asymptotically, under Gaussian or finite fourth-order moment assumptions, respectively. Such assumptions, however, restrict the scope to covariance-based principal component analysis. The rank-based tests we are proposing remain valid without such assumptions. Hence, they address a much broader class of problems, where covariance matrices need not exist and principal components are associated with more general scatter matrices. Asymptotic relative efficiencies moreover show that those rank-based tests are quite powerful; when based on van der Waerden or normal scores, they even uniformly dominate the pseudo-Gaussian versions.
Fast and Robust Bootstrap
 STATISTICAL METHODS AND APPLICATIONS
"... In this paper we review recent developments on a bootstrap method for robust estimators which is computationally faster and more resistant to outliers than the classical bootstrap. This fast and robust bootstrap method is, under reasonable regularity conditions, asymptotically consistent. We describ ..."
Abstract

Cited by 13 (3 self)
In this paper we review recent developments on a bootstrap method for robust estimators which is computationally faster and more resistant to outliers than the classical bootstrap. This fast and robust bootstrap method is, under reasonable regularity conditions, asymptotically consistent. We describe the method in general and then consider its application to perform inference based on robust estimators for the linear regression and multivariate location-scatter models. In particular, we study confidence and prediction intervals and tests of hypotheses for linear regression models, inference for location-scatter parameters and principal components, and classification error estimation for discriminant analysis.
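The key idea of the fast and robust bootstrap — reweight once instead of refitting the robust estimator to convergence on every resample — can be sketched for a univariate Huber location M-estimate. This is a toy stand-in for the paper's regression and location-scatter settings, with function names of our choosing; the actual method also applies a linear correction factor to the one-step replicates, which is omitted here.

```python
import numpy as np

def huber_weight(r, c=1.345):
    # Huber weights: 1 inside [-c, c], c/|r| outside (bounds outlier influence).
    a = np.abs(r)
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))

def m_location(x, c=1.345, tol=1e-10, max_iter=200):
    # Fully iterated estimate: iteratively reweighted mean with a fixed MAD scale.
    theta = np.median(x)
    s = np.median(np.abs(x - theta)) / 0.6745  # MAD, consistent for normal sd
    for _ in range(max_iter):
        w = huber_weight((x - theta) / s, c)
        new = np.sum(w * x) / np.sum(w)
        if abs(new - theta) < tol:
            break
        theta = new
    return theta, s

def fast_robust_bootstrap(x, B=1000, c=1.345, seed=0):
    # One weighted-average step per resample, starting from (and using the
    # weights of) the full-sample fit -- no per-resample iteration.
    rng = np.random.default_rng(seed)
    theta, s = m_location(x, c)
    w = huber_weight((x - theta) / s, c)  # weights computed once
    n = len(x)
    reps = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, n)
        reps[b] = np.sum(w[idx] * x[idx]) / np.sum(w[idx])
    return theta, reps
```

Because the weights are frozen at the full-sample solution, an outlier that was downweighted once stays downweighted in every bootstrap replicate, which is what makes the replicates both cheap and resistant.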
Optimal tests for homogeneity of covariance, scale, and shape
 J. Multivariate Anal
, 2008
"... The assumption of homogeneity of covariance matrices is the fundamental prerequisite of a number of classical procedures in multivariate analysis. Despite its importance and long history, however, this problem so far has not been completely settled beyond the traditional and highly unrealistic cont ..."
Abstract

Cited by 7 (4 self)
The assumption of homogeneity of covariance matrices is the fundamental prerequisite of a number of classical procedures in multivariate analysis. Despite its importance and long history, however, this problem so far has not been completely settled beyond the traditional and highly unrealistic context of multivariate Gaussian models. And the modified likelihood ratio tests (MLRT) that are used in everyday practice are known to be highly sensitive to violations of Gaussian assumptions. In this paper, we provide a complete and systematic study of the problem, and propose test statistics which, while preserving the optimality features of the MLRT under multinormal assumptions, remain valid under unspecified elliptical densities with finite fourth-order moments. As a first step, the Le Cam LAN approach is used for deriving locally and asymptotically optimal testing procedures φ^(n)_f for any specified m-tuple of radial densities f = (f1,..., fm). Combined with an estimation of the m densities f1,..., fm, these procedures can be used to construct adaptive tests for the problem. Adaptive tests however typically require very large samples, and pseudo-Gaussian tests — namely, tests that are locally and asymptotically optimal at Gaussian densities while remaining valid under a much broader class of distributions — in general are preferable. We therefore construct two pseudo-Gaussian modifications of the Gaussian version φ^(n)_N of the optimal test φ^(n)_f. The first one, φ
Robust Model Selection Using Fast and Robust Bootstrap
"... Robust model selection procedures control the undue influence that outliers can have on the selection criteria by using both robust point estimators and a bounded loss function when measuring either the goodnessoffit or the expected prediction error of each model. Furthermore, to avoid favoring ov ..."
Abstract

Cited by 2 (0 self)
Robust model selection procedures control the undue influence that outliers can have on the selection criteria by using both robust point estimators and a bounded loss function when measuring either the goodness-of-fit or the expected prediction error of each model. Furthermore, to avoid favoring overfitting models, these two measures can be combined with a penalty term for the size of the model. The expected prediction error conditional on the observed data may be estimated using the bootstrap. However, bootstrapping robust estimators becomes extremely time consuming on moderate- to high-dimensional data sets. It is shown that the expected prediction error can be estimated using a very fast and robust bootstrap method, and that this approach yields a consistent model selection method that is computationally feasible even for a relatively large number of covariates. Moreover, as opposed to other bootstrap methods, this proposal avoids the numerical problems associated with the small bootstrap samples required to obtain consistent model selection criteria. The finite-sample performance of the fast and robust bootstrap model selection method is investigated through a simulation study, while its feasibility and good performance on moderately large regression models are illustrated on several real data examples. (Preprint submitted to Elsevier, 17 March 2008.)
A DETERMINISTIC ALGORITHM FOR THE MCD
, 2010
"... The minimum covariance determinant (MCD) method is a robust estimator of multivariate location and scatter (Rousseeuw, 1984). The MCD is highly resistant to outliers, and it is often applied by itself and as a building block for other robust multivariate methods. Computing the exact MCD is very hard ..."
Abstract
The minimum covariance determinant (MCD) method is a robust estimator of multivariate location and scatter (Rousseeuw, 1984). The MCD is highly resistant to outliers, and it is often applied by itself and as a building block for other robust multivariate methods. Computing the exact MCD is very hard, so in practice one resorts to approximate algorithms. Most often the FAST-MCD algorithm of Rousseeuw and Van Driessen (1999) is used. This algorithm starts by drawing many random subsets, followed by so-called concentration steps. The FAST-MCD algorithm is affine equivariant but not permutation invariant. In this article we present a deterministic algorithm, denoted as DetMCD, which does not use random subsets and is even faster. It is permutation invariant and very close to affine equivariant. We illustrate DetMCD on real and simulated data sets, with applications involving principal component analysis, multivariate regression, and classification. Supplemental material (Matlab code of the DetMCD algorithm and the data sets) is available online.
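The concentration step shared by these algorithms admits a compact sketch (a simplified illustration in our own notation, not the authors' code): each iteration keeps the h points closest in Mahalanobis distance to the current fit, and the covariance determinant can only decrease, so the iteration converges. FAST-MCD would seed `subset` with many random draws; DetMCD with a few deterministic starts.

```python
import numpy as np

def c_step(X, h, subset, max_iter=100):
    # Concentration (C-) step: refit on the current h-subset, then keep the
    # h points with smallest Mahalanobis distance; det(cov) never increases.
    H = np.array(subset)
    prev_det = np.inf
    for _ in range(max_iter):
        mu = X[H].mean(axis=0)
        S = np.cov(X[H], rowvar=False)
        det = np.linalg.det(S)
        if det >= prev_det:  # monotone and bounded below => converged
            break
        prev_det = det
        # Squared Mahalanobis distance of every point to the current fit.
        d = np.einsum('ij,jk,ik->i', X - mu, np.linalg.inv(S), X - mu)
        H = np.argsort(d)[:h]
    return mu, S, H
```

Starting from an initial subset that happens to contain some outliers, the step quickly concentrates on the clean bulk of the data, which is why a modest number of starts followed by C-steps approximates the exact MCD well in practice.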
ROBUST BOOTSTRAP: AN ALTERNATIVE TO BOOTSTRAPPING ROBUST ESTIMATORS
"... • There is a vast literature on robust estimators, but in some situations it is still not easy to make inferences, such as confidence regions and hypothesis testing. This is mainly due to the following facts. On one hand, in most situations, it is difficult to derive the exact distribution of the es ..."
Abstract
There is a vast literature on robust estimators, but in some situations it is still not easy to make inferences, such as confidence regions and hypothesis tests. This is mainly due to the following facts. On the one hand, in most situations, it is difficult to derive the exact distribution of the estimator. On the other hand, even if its asymptotic behaviour is known, in many cases the convergence to the limiting distribution may be rather slow, so bootstrap methods are preferable since they often give better small-sample results. However, resampling methods have several disadvantages, including the propagation of anomalous data all along the new samples. In this paper, we discuss the problems arising in the bootstrap when outlying observations are present. We argue that it is preferable to use a robust bootstrap rather than to bootstrap robust estimators, and we discuss a robust bootstrap method, the Influence Function Bootstrap, denoted IFB. We illustrate the performance of the IFB intervals in the univariate location case and in the logistic regression model. We derive some asymptotic properties of the IFB. Finally, we introduce a generalization of the Influence Function Bootstrap in order to improve the IFB behaviour. Keywords: influence function; resampling methods; robust inference.
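The general idea — resample bounded empirical influence values rather than the raw observations, so that an outlier's effect on every replicate stays bounded — can be sketched for univariate location. This is our own minimal illustration, using the sample median as the fit and a Huber ψ to bound influence; it is not the paper's exact IFB construction, and all names are ours.

```python
import numpy as np

def if_bootstrap(x, B=2000, c=1.345, seed=0):
    # Influence-function bootstrap sketch: replicates are the original fit
    # plus the average of resampled (bounded) empirical influence values.
    rng = np.random.default_rng(seed)
    theta = np.median(x)                       # robust location fit
    s = np.median(np.abs(x - theta)) / 0.6745  # MAD scale
    r = (x - theta) / s
    psi = np.clip(r, -c, c)                    # Huber psi: bounded in +-c
    dpsi = (np.abs(r) <= c).mean()             # average psi' (normalizer)
    infl = s * psi / dpsi                      # empirical influence values
    n = len(x)
    idx = rng.integers(0, n, (B, n))           # B resamples of size n
    return theta + infl[idx].mean(axis=1)      # theta*_1, ..., theta*_B
```

Since `infl` is clipped, even a resample that draws the same gross outlier many times cannot drag a replicate far away — contrast this with the classical bootstrap, where such resamples produce wild replicates and ruin percentile intervals.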
Robust Learning from Bites for Data Mining
"... Some methods from statistical machine learning and from robust statistics have two drawbacks. Firstly, they are computerintensive such that they can hardly be used for massive data sets, say with millions of data points. Secondly, robust and nonparametric confidence intervals for the predictions a ..."
Abstract
Some methods from statistical machine learning and from robust statistics have two drawbacks. Firstly, they are computer-intensive, so that they can hardly be used for massive data sets, say with millions of data points. Secondly, robust and nonparametric confidence intervals for the predictions according to the fitted models are often unknown. Here, we propose a simple but general method to overcome these problems in the context of huge data sets. The method is scalable to the memory of the computer, can be distributed over several processors if available, and can help to reduce the computation time substantially. Our main focus is on robust general support vector machines (SVM) based on minimizing regularized risks. The method offers distribution-free confidence intervals for the median of the predictions. The approach can also be helpful to fit robust estimators in parametric models for huge data sets.
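The scheme can be sketched in a few lines: split the data into disjoint random chunks ("bites"), fit one robust model per chunk (independently, hence parallelizable), and form a distribution-free sign-test interval for the median of the per-chunk results from their order statistics. This is a toy version with a univariate estimator standing in for the paper's SVM setting; function names are ours.

```python
import numpy as np
from math import comb

def bite_estimates(x, n_bites, estimator=np.median, seed=0):
    # Disjoint random bites; one robust fit per bite. Each fit only ever
    # touches len(x)/n_bites points, so memory and time stay bounded.
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(x)), n_bites)
    return np.array([estimator(x[p]) for p in parts])

def median_ci(values, alpha=0.05):
    # Distribution-free CI for the median via the sign test: take the
    # largest k with P(Binom(b, 1/2) <= k - 1) <= alpha/2, then the
    # interval [v_(k), v_(b-k+1)] has coverage >= 1 - alpha.
    v = np.sort(values)
    b = len(v)  # assumes b large enough that the interval is nonempty
    k = 1
    while sum(comb(b, i) for i in range(k + 1)) / 2 ** b <= alpha / 2:
        k += 1
    return v[k - 1], v[b - k]
```

The interval needs no variance estimate and no distributional assumption on the per-bite fits beyond independence, which is exactly what makes it attractive when the limiting distribution of the robust estimator is unknown or slow to kick in.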
OPTIMAL RANK-BASED TESTS FOR HOMOGENEITY OF SCATTER
 The Annals of Statistics
"... We propose a class of locally and asymptotically optimal tests, based on multivariate ranks and signs, for the homogeneity of scatter matrices in m elliptical populations. Contrary to the existing parametric procedures, these tests remain valid without any moment assumptions, and thus are perfectl ..."
Abstract
We propose a class of locally and asymptotically optimal tests, based on multivariate ranks and signs, for the homogeneity of scatter matrices in m elliptical populations. Contrary to the existing parametric procedures, these tests remain valid without any moment assumptions, and thus are perfectly robust against heavy-tailed distributions (validity robustness). Nevertheless, they reach semiparametric efficiency bounds at correctly specified elliptical densities and maintain high powers under all of them (efficiency robustness). In particular, their normal-score version outperforms traditional Gaussian likelihood ratio tests and their pseudo-Gaussian robustifications under a very broad range of non-Gaussian densities including, for instance, all multivariate Student and power-exponential distributions. *The authors are also members of ECORE, the recently created association between CORE and ECARES.
Robust Sensor Array Processing for Nonstationary Signals
"... I would like to thank all people who have helped me during my PhD studies. First of all, I would like to express my special appreciation and sincere gratitude to my advisor Prof. Dr.Ing. Abdelhak M. Zoubir. You have been a tremendous mentor for me and I could not have imagined having a better super ..."
Abstract
I would like to thank all people who have helped me during my PhD studies. First of all, I would like to express my special appreciation and sincere gratitude to my advisor Prof. Dr.-Ing. Abdelhak M. Zoubir. You have been a tremendous mentor for me and I could not have imagined having better supervision. I would like to thank you for your encouragement and support throughout the course of my PhD studies. Your advice on both research as well as on my career has always been invaluable. I am also grateful to Prof. Dr. Wilhelm Stannat for his supervision, guidance and precious time. I really benefited a lot from all my interactions with you. I also acknowledge my gratitude to Prof. Dr.-Ing. Thomas Weiland, Prof. Dr.-Ing. Ralf Steinmetz and Prof. Dr.-Ing. Ulrich Konigorski, who acted as the chair and examiners in the PhD committee. I would like to thank all my colleagues at the Signal Processing Group at TU Darmstadt. I treasure my memories of those joyous days. Many thanks to Michael Muma,