Results 1–10 of 75
Stepwise multiple testing as formalized data snooping
 Econometrica, 2005
Abstract
Cited by 88 (10 self)
It is common in econometric applications that several hypothesis tests are carried out at the same time. The problem then becomes how to decide which hypotheses to reject, accounting for the multitude of tests. In this paper, we suggest a stepwise multiple testing procedure which asymptotically controls the familywise error rate at a desired level. Compared to related single-step methods, our procedure is more powerful in the sense that it often will reject more false hypotheses. In addition, we advocate the use of studentization when it is feasible. Unlike some stepwise methods, our method implicitly captures the joint dependence structure of the test statistics, which results in increased ability to detect alternative hypotheses. We prove our method asymptotically controls the familywise error rate under minimal assumptions. We present our methodology in the context of comparing several strategies to a common benchmark and deciding which strategies actually beat the benchmark. However, our ideas can easily be extended and/or modified to other contexts, such as making inference for the individual regression coefficients in a multiple regression framework. Some simulation studies show the improvements of our methods over previous proposals. We also provide an application to a set of real data.
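The bootstrap-studentized stepwise procedure of the paper captures the joint dependence of the test statistics and is beyond a short sketch; as a point of reference, the classical p-value-based stepwise baseline it improves upon, Holm's step-down, looks like this (the p-values are illustrative only):

```python
import numpy as np

def holm_stepdown(pvals, alpha=0.05):
    """Holm's step-down procedure: controls the FWER at level alpha
    under arbitrary dependence of the p-values."""
    p = np.asarray(pvals, dtype=float)
    s = len(p)
    order = np.argsort(p)          # indices of p-values, smallest first
    reject = np.zeros(s, dtype=bool)
    for step, idx in enumerate(order):
        # step j compares the j-th smallest p-value against alpha / (s - j)
        if p[idx] <= alpha / (s - step):
            reject[idx] = True
        else:
            break                  # stop at the first non-rejection
    return reject

print(holm_stepdown([0.001, 0.010, 0.030, 0.400], alpha=0.05))
```

Unlike single-step Bonferroni, the threshold relaxes after each rejection, which is the sense in which stepwise methods reject more false hypotheses.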
Microarrays, empirical Bayes and the two-groups model
 Statist. Sci., 2006
Abstract
Cited by 75 (10 self)
The classic frequentist theory of hypothesis testing developed by Neyman, Pearson, and Fisher has a claim to being the Twentieth Century’s most influential piece of applied mathematics. Something new is happening in the Twenty-First Century: high-throughput devices, such as microarrays, routinely require simultaneous hypothesis tests for thousands of individual cases, not at all what the classical theory had in mind. In these situations empirical Bayes information begins to force itself upon frequentists and Bayesians alike. The two-groups model is a simple Bayesian construction that facilitates empirical Bayes analysis. This article concerns the interplay of Bayesian and frequentist ideas in the two-groups setting, with particular attention focussed on Benjamini and Hochberg’s False Discovery Rate method. Topics include the choice and meaning of the null hypothesis in large-scale testing situations, power considerations, the limitations of permutation methods, significance testing for groups of cases (such as pathways in microarray studies), correlation effects, multiple confidence intervals, and Bayesian competitors to the two-groups model.
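The two-groups model posits a mixture density f(z) = π0·f0(z) + (1 − π0)·f1(z) and yields the local false discovery rate fdr(z) = π0·f0(z)/f(z). A minimal sketch, using a theoretical N(0,1) null and a purely hypothetical Gaussian non-null component (pi0, mu1, sigma1 below are made-up illustration values, not estimates from any data):

```python
import math

def local_fdr(z, pi0=0.9, mu1=2.5, sigma1=1.0):
    """Two-groups local false discovery rate fdr(z) = pi0*f0(z)/f(z),
    assuming an N(0,1) null f0 and, for illustration only, an
    N(mu1, sigma1^2) non-null component (parameters hypothetical)."""
    def phi(x, mu=0.0, sd=1.0):
        # Gaussian density
        return math.exp(-((x - mu) / sd) ** 2 / 2) / (sd * math.sqrt(2 * math.pi))
    f0 = phi(z)
    f1 = phi(z, mu1, sigma1)
    f = pi0 * f0 + (1 - pi0) * f1   # marginal mixture density
    return pi0 * f0 / f
```

Near z = 0 the statistic looks null (fdr close to 1); far in the tail the non-null component dominates and fdr drops toward 0, which is the empirical Bayes analogue of a rejection.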
Control of generalized error rates in multiple testing
 IEW Working Paper iewwp245, Institute for Empirical Research in Economics (IEW), 2005. Available at http://ideas.repec.org/p/zur/iewwpx/245.html
Abstract
Cited by 37 (7 self)
Consider the problem of testing s hypotheses simultaneously. The usual approach to dealing with the multiplicity problem is to restrict attention to procedures that control the probability of even one false rejection, the familiar familywise error rate (FWER). In many applications, particularly if s is large, one might be willing to tolerate more than one false rejection if the number of such cases is controlled, thereby increasing the ability of the procedure to reject false null hypotheses. One possibility is to replace control of the FWER by control of the probability of k or more false rejections, which is called the k-FWER. We derive both single-step and step-down procedures that control the k-FWER in finite samples or asymptotically, depending on the situation. Lehmann and Romano (2005a) derive some exact methods for this purpose, which apply whenever p-values are available for individual tests; no assumptions are made on the joint dependence of the p-values. In contrast, we construct methods that implicitly take into account the dependence structure of the individual test statistics in order to further increase the ability to detect false null hypotheses. We also consider the false discovery proportion (FDP), defined as the number of false rejections divided by the total number of rejections (and defined to be 0 if there are no rejections). The false discovery rate proposed by Benjamini and Hochberg (1995) controls E(FDP). Here, the goal is to construct methods which satisfy, for a given γ and α, P{FDP > γ} ≤ α, at least asymptotically.
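The exact p-value-based approach of Lehmann and Romano referenced above has a particularly simple single-step form, a generalized Bonferroni rule: reject H_i whenever p_i ≤ kα/s, which controls the k-FWER under arbitrary dependence. A minimal sketch (p-values illustrative only):

```python
def kfwer_bonferroni(pvals, k=2, alpha=0.05):
    """Generalized Bonferroni single-step procedure: rejects H_i when
    p_i <= k*alpha/s, controlling the probability of k or more false
    rejections (the k-FWER) at level alpha under arbitrary dependence."""
    s = len(pvals)
    cutoff = k * alpha / s
    return [p <= cutoff for p in pvals]

print(kfwer_bonferroni([0.002, 0.008, 0.012, 0.200, 0.600], k=2, alpha=0.05))
```

Setting k = 1 recovers the usual Bonferroni cutoff α/s; larger k relaxes the cutoff and rejects more, at the cost of allowing up to k − 1 false rejections with high probability.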
Formalized data snooping based on generalized error rates
 Econometric Theory, 2008
Abstract
Cited by 33 (9 self)
It is common in econometric applications that several hypothesis tests are carried out simultaneously. The problem then becomes how to decide which hypotheses to reject, accounting for the multitude of tests. The classical approach is to control the familywise error rate (FWE), which is the probability of one or more false rejections. But when the number of hypotheses under consideration is large, control of the FWE can become too demanding. As a result, the number of false hypotheses rejected may be small or even zero. This suggests replacing control of the FWE by a more liberal measure. To this end, we review a number of recent proposals from the statistical literature. We briefly discuss how these procedures apply to the general problem of model selection. A simulation study and two empirical applications illustrate the methods.
Step-up procedures for control of generalizations of the familywise error rate
 Ann. Statist., 2006
Abstract
Cited by 31 (9 self)
Consider the multiple testing problem of testing null hypotheses H1,...,Hs. A classical approach to dealing with the multiplicity problem is to restrict attention to procedures that control the familywise error rate (FWER), the probability of even one false rejection. But if s is large, control of the FWER is so stringent that the ability of a procedure that controls the FWER to detect false null hypotheses is limited. It is therefore desirable to consider other measures of error control. This article considers two generalizations of the FWER. The first is the k-FWER, in which the probability of k or more false rejections is controlled for some fixed k ≥ 1. The second is based on the false discovery proportion (FDP), defined to be the number of false rejections divided by the total number of rejections (and defined to be 0 if there are no rejections). Benjamini and Hochberg [J. Roy. Statist. Soc. Ser. B 57 (1995) 289–300] proposed control of the false discovery rate (FDR), by which they meant that, for fixed α, E(FDP) ≤ α. Here, we consider control of the FDP in the sense that, for fixed γ and α, P{FDP > γ} ≤ α. Beginning with any nondecreasing sequence of constants and p-values for the individual tests, we derive step-up procedures that control each of these two measures of error control without imposing any assumptions on the dependence structure of the p-values. We use our results to point out a few interesting connections with some closely related step-down procedures. We then compare and contrast two FDP-controlling procedures obtained using our results with the step-up procedure for control of the FDR of Benjamini and Yekutieli [Ann. Statist. 29 (2001) 1165–1188].
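For contrast with the FDP criterion P{FDP > γ} ≤ α, the Benjamini-Hochberg step-up rule that controls E(FDP) ≤ α (under independence or positive regression dependence) can be sketched as follows (p-values illustrative only):

```python
import numpy as np

def bh_stepup(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure: with sorted p-values
    p_(1) <= ... <= p_(s), find the largest i with p_(i) <= i*alpha/s
    and reject the i hypotheses with the smallest p-values."""
    p = np.asarray(pvals, dtype=float)
    s = len(p)
    order = np.argsort(p)
    sorted_p = p[order]
    # indices where the sorted p-value is below its step-up threshold
    below = np.nonzero(sorted_p <= np.arange(1, s + 1) * alpha / s)[0]
    reject = np.zeros(s, dtype=bool)
    if below.size:
        reject[order[: below[-1] + 1]] = True
    return reject

print(bh_stepup([0.001, 0.008, 0.039, 0.041, 0.900], alpha=0.05))
```

A step-up procedure starts from the largest p-value and works down to the first threshold crossing, whereas the step-down procedures discussed in the abstract start from the smallest.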
Multiple testing and error control in Gaussian graphical model selection
 Statistical Science
Abstract
Cited by 28 (4 self)
Graphical models provide a framework for exploration of multivariate dependence patterns. The connection between graph and statistical model is made by identifying the vertices of the graph with the observed variables and translating the pattern of edges in the graph into a pattern of conditional independences that is imposed on the variables’ joint distribution. Focusing on Gaussian models, we review classical graphical models. For these models the defining conditional independences are equivalent to the vanishing of certain (partial) correlation coefficients associated with individual edges that are absent from the graph. Hence, Gaussian graphical model selection can be performed by multiple testing of hypotheses about vanishing (partial) correlation coefficients. We show and exemplify how this approach allows one to perform model selection while controlling error rates for incorrect edge inclusion. Key words and phrases: Acyclic directed graph, Bayesian network, bidirected graph, chain graph, concentration graph, covariance graph, DAG, graphical model, multiple testing, undirected graph.
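For a concentration (undirected) graph, the testing step described above reduces to testing whether each full-order partial correlation vanishes. A minimal sketch using the inverse sample covariance and Fisher's z-transform (the N(0,1) approximation assumes the sample size n well exceeds the number of variables; this is an illustration, not the authors' implementation):

```python
import math
import numpy as np

def partial_corr_pvalues(data):
    """Test vanishing full-order partial correlations in a Gaussian model.
    The partial correlation of variables i and j given all others equals
    -K_ij / sqrt(K_ii * K_jj), where K is the inverse sample covariance;
    Fisher's z-transform gives an approximate N(0,1) null statistic."""
    n, d = data.shape
    K = np.linalg.inv(np.cov(data, rowvar=False))
    pvals = {}
    for i in range(d):
        for j in range(i + 1, d):
            r = -K[i, j] / math.sqrt(K[i, i] * K[j, j])
            # conditioning set has d - 2 variables
            z = math.sqrt(n - (d - 2) - 3) * math.atanh(r)
            # two-sided p-value from the standard normal tail
            pvals[(i, j)] = math.erfc(abs(z) / math.sqrt(2))
    return pvals
```

The resulting dictionary of edgewise p-values is exactly the input that the multiple testing procedures reviewed elsewhere in this listing (FWER, k-FWER, FDR control) would operate on.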
Approximate nonlinear forecasting methods
 Handbook of Economic Forecasting, 2006
Abstract
Cited by 26 (8 self)
We review key aspects of forecasting using nonlinear models. Because economic models are typically misspecified, the resulting forecasts provide only an approximation to the best possible forecast. Although it is in principle possible to obtain superior approximations to the optimal forecast using nonlinear methods, there are some potentially serious practical challenges. Primary among these are computational difficulties, the danger of overfitting, and potential difficulties of interpretation. In this chapter we discuss these issues in detail. Then we propose and illustrate the use of a new family of methods (QuickNet) that achieves the benefits of using a forecasting model that is nonlinear in the predictors while avoiding or mitigating the other challenges to the use of nonlinear forecasting methods.
Quantile-function-based null distribution in resampling-based multiple testing
 Statistical Applications in Genetics and Molecular Biology, 5(1): Article 14, 2006. URL www.bepress.com/sagmb/vol5/iss1/art14
Abstract
Cited by 24 (12 self)
Simultaneously testing a collection of null hypotheses about a data-generating distribution based on a sample of independent and identically distributed observations is a fundamental and important statistical problem with many applications. Methods based on marginal null distributions (i.e., marginal p-values) are attractive since the marginal p-values can be based on a user-supplied choice of marginal null distributions and they are computationally trivial, but they, by necessity, are known either to be conservative or to rely on assumptions about the dependence structure between the test statistics. Resampling-based multiple testing (Westfall and Young, 1993) involves sampling from a joint null distribution of the test statistics, and controlling (possibly in, for example, a step-down fashion) the user-supplied type I error rate under this joint null distribution for the test statistics. A generally asymptotically valid null distribution avoiding the need for the subset pivotality condition for the vector of test statistics was proposed in Pollard and van der Laan (2003) for null hypotheses about general real-valued parameters. This null distribution was generalized in Dudoit, van der Laan and Pollard (2004) to general null hypotheses and test statistics. In ongoing recent work (van der Laan and Hubbard, 2005), we propose a new generally asymptotically ...
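The resampling idea of Westfall and Young cited above can be illustrated in its simplest single-step max-T form for two-sample mean comparisons: the permutation distribution of the maximum statistic supplies a joint null distribution that captures the dependence between the test statistics (the mean-difference statistic and all sizes here are illustrative choices, not the paper's generalized null distribution):

```python
import numpy as np

def maxt_adjusted_pvalues(x, y, n_perm=2000, seed=0):
    """Single-step max-T permutation adjustment in the spirit of
    Westfall and Young (1993): each variable's two-sample
    mean-difference statistic is compared against the permutation
    distribution of the maximum statistic over all variables,
    which accounts for their joint dependence (FWER control)."""
    rng = np.random.default_rng(seed)
    data = np.vstack([x, y])                    # rows: samples, cols: variables
    n_x = x.shape[0]
    def stats(d):
        # absolute difference in group means, per variable
        return np.abs(d[:n_x].mean(axis=0) - d[n_x:].mean(axis=0))
    observed = stats(data)
    max_null = np.empty(n_perm)
    for b in range(n_perm):
        perm = rng.permutation(data.shape[0])   # shuffle group labels
        max_null[b] = stats(data[perm]).max()
    # adjusted p-value: share of permutation maxima at least as large
    return (max_null[None, :] >= observed[:, None]).mean(axis=1)
```

Because the same permutation is applied to all variables at once, correlated test statistics yield a less extreme maximum than independent ones would, which is the source of the power gain over marginal-p-value methods.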
Step-up procedures controlling generalized FWER and generalized FDR
2007
Abstract
Cited by 18 (4 self)
In many applications of multiple hypothesis testing where more than one false rejection can be tolerated, procedures controlling error rates measuring at least k false rejections, instead of at least one, for some fixed k ≥ 1 can potentially increase the ability of a procedure to detect false null hypotheses. The k-FWER, a generalized version of the usual familywise error rate (FWER), is such an error rate that has recently been introduced in the literature, and procedures controlling it have been proposed. A further generalization of a result on the k-FWER is provided in this article. In addition, an alternative and less conservative notion of error rate, the k-FDR, is introduced in the same spirit as the k-FWER by generalizing the usual false discovery rate (FDR). A k-FWER procedure is constructed given any set of increasing constants by utilizing the kth-order joint null distributions of the p-values without assuming any specific form of dependence among all the p-values. Procedures controlling the k-FDR are also developed by using the kth-order joint null distributions of the p-values, first assuming that the sets of null and nonnull p-values are mutually independent or jointly positively dependent in the sense of being multivariate totally positive of order two (MTP2), and then discarding that assumption about the overall dependence among the p-values.
To how many simultaneous hypothesis tests can the normal, Student’s t, or bootstrap calibration be applied?
, 2007
Abstract
Cited by 17 (2 self)
In the analysis of microarray data, and in some other contemporary statistical problems, it is not uncommon to apply hypothesis tests in a highly simultaneous way. The number, N say, of tests used can be much larger than the sample sizes, n, to which the tests are applied, yet we wish to calibrate the tests so that the overall level of the simultaneous test is accurate. Often the sampling distribution is quite different for each test, so there may not be an opportunity for combining data across samples. In this setting, how large can N be, as a function of n, before level accuracy becomes poor? In the present paper we answer this question in cases where the statistic under test is of Student’s t type. We show that if either the normal or Student’s t distribution is used for calibration then the level of the simultaneous test is accurate provided log N increases at a strictly slower rate than n^{1/3} as n diverges. On the other hand, if bootstrap methods are used for calibration then we may choose log N almost as large as n^{1/2} and still achieve asymptotic level accuracy. The implications of these results are explored both theoretically and numerically. Keywords: Bonferroni’s inequality, Edgeworth expansion, genetic data, large-deviation expansion, level accuracy, microarray data, quantile estimation, skewness, Student’s t statistic.