Inference by eye: Confidence intervals and how to read pictures of data
American Psychologist, 2005
Cited by 107 (14 self)
Wider use in psychology of confidence intervals (CIs), especially as error bars in figures, is a desirable development. However, psychologists seldom use CIs and may not understand them well. The authors discuss the interpretation of figures with error bars and analyze the relationship between CIs and statistical significance testing. They propose 7 rules of eye to guide the inferential use of figures with error bars. These include general principles: Seek bars that relate directly to effects of interest, be sensitive to experimental design, and interpret the intervals. They also include guidelines for inferential interpretation of the overlap of CIs on independent group means. Wider use of interval estimation in psychology has the potential to improve research communication substantially. Inference by eye is the interpretation of graphically presented data. On first seeing Figure 1, what questions should spring to mind and what inferences are justified? We discuss figures with means and confidence intervals (CIs), and propose rules of eye to guide the interpretation of such figures. We believe it is timely to consider inference by eye because psychologists are now being encouraged to make greater use of CIs. Many who seek reform of psychologists' statistical practices advocate a change in emphasis from null hypothesis significance testing (NHST) to CIs, among other techniques
Null Hypothesis Significance Testing: A Review of an Old and Continuing Controversy
Psychological Methods, 2000
Cited by 88 (0 self)
Null hypothesis significance testing (NHST) is arguably the most widely used approach to hypothesis evaluation among behavioral and social scientists. It is also very controversial. A major concern expressed by critics is that such testing is misunderstood by many of those who use it. Several other objections to its use have also been raised. In this article the author reviews and comments on the claimed misunderstandings as well as on other criticisms of the approach, and he notes arguments that have been advanced in support of NHST. Alternatives and supplements to NHST are considered, as are several related recommendations regarding the interpretation of experimental data. The concluding opinion is that NHST is easily misunderstood and misused but that when applied with good judgment it can be an effective aid to the interpretation of experimental data. Null hypothesis statistical testing (NHST) is arguably the most widely used method of analysis of data collected in psychological experiments and has been so for about 70 years. One might think that a method that had been embraced by an entire research community would be well understood and noncontroversial after many decades of constant use. However, NHST is very controversial. Criticism of the method, which essentially began with the introduction of the technique (Pearce, 1992), has waxed and waned over the years; it has been intense in the recent past. Apparently, controversy regarding the idea of NHST more generally extends back more than two and a half
What future quantitative social science research could look like: Confidence intervals for effect sizes
Educational Researcher, 2002
Cited by 85 (2 self)
presents a self-canceling mixed message. To present an “encouragement” in the context of strict absolute standards regarding the esoterics of author note placement, pagination, and margins is to send the message, “these myriad requirements count, this encouragement doesn’t.”
Evaluating statistical difference, equivalence, and indeterminacy using inferential confidence intervals: An integrated alternative method of conducting null hypothesis statistical tests
Psychological Methods, 2001
Cited by 37 (0 self)
Null hypothesis statistical testing (NHST) has been debated extensively but always successfully defended. The technical merits of NHST are not disputed in this article. The widespread misuse of NHST has created a human factors problem that this article intends to ameliorate. This article describes an integrated, alternative inferential confidence interval approach to testing for statistical difference, equivalence, and indeterminacy that is algebraically equivalent to standard NHST procedures and therefore exacts the same evidential standard. The combined numeric and graphic tests of statistical difference, equivalence, and indeterminacy are designed to avoid common interpretive problems associated with NHST procedures. Multiple comparisons, power, sample size, test reliability, effect size, and cause-effect ratio are discussed. A section on the proper interpretation of confidence intervals is followed by a decision rule summary and caveats. The longstanding controversy surrounding null hypothesis statistical testing (NHST) has typically been argued on its technical merits, and they are not dis
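The abstract's claim that an interval-overlap check can be algebraically equivalent to a standard test can be illustrated with a small sketch. The abstract does not give the construction, so everything below is an assumption: each group's interval is shrunk by the factor sqrt(se1^2 + se2^2) / (se1 + se2), and a normal (z) critical value stands in for the t value a real analysis would likely use. The function names are hypothetical and this is not presented as the article's own procedure.

```python
import math
from statistics import NormalDist


def inferential_intervals(m1, se1, m2, se2, alpha=0.05):
    """Shrunken intervals for two independent means (illustrative assumption).

    The shrink factor is chosen so that the two half-widths sum to
    crit * sqrt(se1^2 + se2^2), the threshold used by a two-sided z test.
    """
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    shrink = se_diff / (se1 + se2)
    half1 = shrink * crit * se1
    half2 = shrink * crit * se2
    return (m1 - half1, m1 + half1), (m2 - half2, m2 + half2)


def intervals_overlap(a, b):
    """True when the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def z_test_significant(m1, se1, m2, se2, alpha=0.05):
    """Two-sided z test on the difference between two independent means."""
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(m1 - m2) > crit * math.sqrt(se1 ** 2 + se2 ** 2)


# Hypothetical summary statistics: (mean 1, SE 1, mean 2, SE 2).
for case in [(10.0, 1.0, 13.0, 1.0), (10.0, 1.0, 12.0, 1.0)]:
    ci1, ci2 = inferential_intervals(*case)
    print(case, "overlap:", intervals_overlap(ci1, ci2),
          "significant:", z_test_significant(*case))
```

Under these assumptions, the intervals fail to overlap exactly when the z test rejects, so the picture and the test lead to the same decision.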
Sample Size for Multiple Regression: Obtaining Regression Coefficients That Are Accurate, Not Simply Significant
Cited by 25 (8 self)
An approach to sample size planning for multiple regression is presented that emphasizes accuracy in parameter estimation (AIPE). The AIPE approach yields precise estimates of population parameters by providing necessary sample sizes in order for the likely widths of confidence intervals to be sufficiently narrow. One AIPE method yields a sample size such that the expected width of the confidence interval around the standardized population regression coefficient is equal to the width specified. An enhanced formulation ensures, with some stipulated probability, that the width of the confidence interval will be no larger than the width specified. Issues involving standardized regression coefficients and random predictors are discussed, as are the philosophical differences between AIPE and the power analytic approaches to sample size planning. Sample size estimation from a power analytic perspective is often performed by mindful researchers in order to have a reasonable probability of obtaining parameter estimates that are statistically significant. In general, the social sciences have slowly become more aware of the problems associated with underpowered studies and their corresponding Type II errors, which can yield misleading results in a given
Bilingualism, biliteracy, and learning to read: interactions among languages and writing systems
Scientific Studies of Reading, 2005
Cited by 21 (1 self)
Four groups of children in first grade were compared on early literacy tasks. Children in three of the groups were bilingual, each group representing a different combination of language and writing system, and children in the fourth group were monolingual speakers of English. All the bilingual children used both languages daily and were learning to read in both languages. The children solved decoding and phonological awareness tasks, and the bilinguals completed all tasks in both languages. Initial differences between the groups in factors that contribute to early literacy were controlled in an analysis of covariance, and the results showed a general increment in reading ability for all the bilingual children but a larger advantage for children learning two alphabetic systems. Similarly, bilinguals transferred literacy skills across languages only when both languages were written in the same system. Therefore, the extent of the bilingual facilitation for early reading depends on the relation between the two languages and writing systems. Learning to read is indisputably the premier academic achievement of early schooling. It prepares children for their educational futures and is the key to the possibilities that their futures hold for them. Thus, if knowing two languages at the time that literacy is introduced, or learning to read in a language that is not the child’s dominant one, or acquiring literacy simultaneously in two languages affects the outcome of literacy instruction, then it would be important to know that. These possibilities affect a sizable portion of the world’s children: A significant number are bilingual at the time they begin reading, many are instructed in a language they do not speak at home, and some number of those are expected to acquire this skill in two languages. Requests for reprints should be sent to Ellen Bialystok, Department of Psychology, York University,
Methods for the Behavioral, Educational, and Social Sciences (MBESS) [Computer software and manual]. Retrievable from www.cran.r-project.org, 2007
Cited by 19 (8 self)
package for R (R Development Core Team, 2007b), an open source statistical programming language and environment. MBESS implements methods that are not widely available elsewhere, yet are especially helpful for the idiosyncratic techniques used within the behavioral, educational, and social sciences. The major categories of functions are those that relate to confidence interval formation for noncentral t, F, and χ² parameters, confidence intervals for standardized effect sizes (which require noncentral distributions), and sample size planning issues from the power analytic and accuracy in parameter estimation perspectives. In addition, MBESS contains collections of other functions that should be helpful to substantive researchers and methodologists. MBESS is a long-term project that will continue to be updated and expanded so that important methods can continue to be made available to researchers in the behavioral, educational, and social sciences. R is an open source statistical programming language and environment for (essentially) all operating systems that has gained a widespread following in quantitative disciplines (R Development Core Team, 2007b). This following is perhaps most prevalent in the statistical sciences, where many published works now provide R routines
Problems with Null Hypothesis Significance Testing (NHST): What Do the Textbooks Say?
The Journal of Experimental Education, 2002
Cited by 16 (0 self)
ABSTRACT. The first of 3 objectives in this study was to address the major problem with Null Hypothesis Significance Testing (NHST) and 2 common misconceptions related to NHST that cause confusion for students and researchers. The misconceptions are (a) a smaller p indicates a stronger relationship and (b) statistical significance indicates practical importance. The second objective was to determine how this problem and the misconceptions were treated in 12 recent textbooks used in education research methods and statistics classes. The third objective was to examine how the textbooks' presentations relate to current best practices and how much help they provide for students. The results show that almost all of the textbooks fail to acknowledge that there is controversy surrounding NHST. Most of the textbooks dealt, at least minimally, with the alleged misconceptions of interest, but they provided relatively little help for students. Key words: effect size, NHST, practical importance, research and statistics textbooks THERE HAS BEEN AN INCREASE in resistance to null hypothesis significance testing (NHST) in the social sciences during recent years. The intensity of these objections to NHST has increased, especially within the disciplines of psy
The effects of nonnormal distributions on confidence intervals for the standardized mean difference: Bootstrapping as an alternative to parametric confidence intervals
Educational and Psychological Measurement, 2005
Cited by 16 (5 self)
The standardized group mean difference, Cohen’s d, is among the most commonly used and intuitively appealing effect sizes for group comparisons. However, reporting this point estimate alone does not reflect the extent to which sampling error may have led to an obtained value. A confidence interval expresses the uncertainty that exists between d and the population value, δ, it represents. A set of Monte Carlo simulations was conducted to examine the integrity of a noncentral approach analogous to that given by Steiger and Fouladi, as well as two bootstrap approaches in situations in which the normality assumption is violated. Because d is positively biased, a procedure given by Hedges and Olkin is outlined, such that an unbiased estimate of δ can be obtained. The bias-corrected and accelerated bootstrap confidence interval using the unbiased estimate of δ is proposed and recommended for general use, especially in cases in which the assumption of normality may be violated. Keywords: effect size; standardized effect size; confidence intervals; bootstrap methods; nonnormal data Methodological recommendations within the behavioral sciences have increasingly emphasized the importance and utility of confidence intervals (Cumming & Finch, 2001; Smithson, 2001), effect sizes (Olejnik & Algina, I would like to thank Ke-Hai Yuan and Joseph R. Rausch, both of the Department of Psychology at the University of Notre Dame, for helpful comments and suggestions during the preparation of this article. Correspondence concerning this article should be addressed to Ken Kelley,
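As a rough illustration of the ideas in this abstract, the sketch below computes Cohen's d, applies a small-sample correction, and forms a bootstrap confidence interval. It is a simplification under stated assumptions, not the article's procedure: it uses a plain percentile bootstrap rather than the bias-corrected and accelerated interval the abstract recommends, and it uses the common approximation 1 - 3 / (4 * df - 1) for the correction factor rather than an exact form. All function names and the simulated scores are hypothetical.

```python
import math
import random
from statistics import mean, variance


def cohens_d(x, y):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var)


def unbiased_d(x, y):
    """Cohen's d shrunk by an approximate small-sample correction factor."""
    df = len(x) + len(y) - 2
    return cohens_d(x, y) * (1.0 - 3.0 / (4.0 * df - 1.0))


def percentile_bootstrap_ci(x, y, n_boot=2000, conf_level=0.95, seed=0):
    """Percentile bootstrap interval; each group is resampled separately."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        bx = rng.choices(x, k=len(x))  # resample group 1 with replacement
        by = rng.choices(y, k=len(y))  # resample group 2 with replacement
        stats.append(unbiased_d(bx, by))
    stats.sort()
    alpha = 1.0 - conf_level
    lower = stats[math.floor((alpha / 2) * (n_boot - 1))]
    upper = stats[math.ceil((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper


# Hypothetical, deliberately skewed scores for two independent groups.
rng = random.Random(42)
group_a = [rng.expovariate(1.0) + 0.8 for _ in range(40)]
group_b = [rng.expovariate(1.0) for _ in range(40)]
print("unbiased d:", unbiased_d(group_a, group_b))
print("95% percentile bootstrap CI:", percentile_bootstrap_ci(group_a, group_b))
```

Resampling each group separately keeps the group sizes fixed, which matches the independent-groups design the abstract describes.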
Sample size planning for the standardized mean difference: Accuracy in parameter estimation via narrow confidence intervals
Psychological Methods, 2006
Cited by 16 (8 self)
Methods for planning sample size (SS) for the standardized mean difference so that a narrow confidence interval (CI) can be obtained via the accuracy in parameter estimation (AIPE) approach are developed. One method plans SS so that the expected width of the CI is sufficiently narrow. A modification adjusts the SS so that the obtained CI is no wider than desired with some specified degree of certainty (e.g., 99% certain the 95% CI will be no wider than ω). The rationale of the AIPE approach to SS planning is given, as is a discussion of the analytic approach to CI formation for the population standardized mean difference. Tables with values of necessary SS are provided. The freely available Methods for the Behavioral, Educational, and Social Sciences (K. Kelley, 2006a) R (R Development Core Team, 2006) software package easily implements the methods discussed.
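The first planning method described here, choosing a sample size so that the CI is expected to be narrow enough, can be sketched with a large-sample approximation. This is an assumption-laden illustration rather than the article's method: it replaces the noncentral t interval with a normal-theory interval based on the approximate variance 2/n + d^2/(4n) for equal group sizes, and it does not implement the modification that adds a stated degree of certainty. The function names and the `target_width` parameter are hypothetical.

```python
import math
from statistics import NormalDist


def approx_ci_width(d, n_per_group, conf_level=0.95):
    """Approximate full CI width for a standardized mean difference.

    Assumes two equal-sized independent groups and a large-sample
    normal approximation to the sampling distribution of d.
    """
    z = NormalDist().inv_cdf(0.5 + conf_level / 2)
    var_d = 2.0 / n_per_group + d * d / (4.0 * n_per_group)
    return 2.0 * z * math.sqrt(var_d)


def n_for_width(d, target_width, conf_level=0.95):
    """Smallest per-group n whose approximate CI width is at most target_width."""
    n = 2
    while approx_ci_width(d, n, conf_level) > target_width:
        n += 1
    return n


# Hypothetical planning values: anticipated d of 0.5, desired full width of 0.5.
n = n_for_width(0.5, 0.5)
print("per-group n:", n, "approximate width:", approx_ci_width(0.5, n))
```

Because the width shrinks only with the square root of n, halving the target width roughly quadruples the required sample size under this approximation.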