Results 1 - 6 of 6
The insignificance of statistical significance testing.
 Journal of Wildlife Management,
, 1999
Cited by 92 (0 self)
Abstract: Despite their wide use in scientific journals such as The Journal of Wildlife Management, statistical hypothesis tests add very little value to the products of research. Indeed, they frequently confuse the interpretation of data. This paper describes how statistical hypothesis tests are often viewed, and then contrasts that interpretation with the correct one. I discuss the arbitrariness of P-values, conclusions that the null hypothesis is true, power analysis, and distinctions between statistical and biological significance. Statistical hypothesis testing, in which the null hypothesis about the properties of a population is almost always known a priori to be false, is contrasted with scientific hypothesis testing, which examines a credible null hypothesis about phenomena in nature. More meaningful alternatives are briefly outlined, including estimation and confidence intervals for determining the importance of factors, decision theory for guiding actions in the face of uncertainty, and Bayesian approaches to hypothesis testing and other statistical practices.
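As an illustration of the estimation alternative mentioned in this abstract, the sketch below reports a difference in means together with a confidence interval instead of a bare significance verdict. It is a minimal sketch, not the paper's method: the function name `mean_difference_ci`, the two data lists, and the normal approximation (crude for samples this small) are all assumptions made for the example.

```python
import math
from statistics import NormalDist, mean, variance

def mean_difference_ci(a, b, level=0.95):
    """Difference in sample means with a normal-approximation confidence interval.

    Hypothetical helper for illustration; a t-based interval would be
    more appropriate for very small samples.
    """
    diff = mean(a) - mean(b)
    # Standard error of the difference between two independent sample means
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    # Two-sided critical value from the standard normal distribution
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return diff, diff - z * se, diff + z * se

# Made-up measurements, for illustration only
treated = [12.1, 13.4, 11.8, 14.0, 12.9, 13.3]
control = [11.2, 12.0, 11.5, 12.4, 11.9, 11.7]

diff, low, high = mean_difference_ci(treated, control)
print(f"difference = {diff:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

The interval conveys both the size of the estimated effect and its uncertainty, so a reader can judge whether the plausible range is biologically important.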
Psychology will be a much better science when we change the way we analyze data
 Current Directions in Psychological Science
, 1996
Cited by 78 (3 self)
because I believed that within it dwelt some of the most fundamental and challenging problems of the extant sciences. Who could not be intrigued, for example, by the relation between consciousness and behavior, or the rules guiding interactions in social situations, or the processes that underlie development from infancy to maturity? Today, in 1996, my fascination with these problems is undiminished. But I've developed a certain angst over the intervening thirtysomething years: a constant, nagging feeling that our field spends a lot of time spinning its wheels without really making all that much progress. This problem shows up in obvious ways, for instance, in the regularity with which findings seem not to replicate. It also shows up in subtler ways; for instance, one doesn't often hear psychologists saying, "Well, this problem is solved now; let's move on to the next one" (as, for example, Johannes Kepler must have said over three centuries ago, after he had cracked the problem of describing planetary motion). I've come to believe that at least part of this problem revolves around our tools, particularly the tools that we use in the critical domains of data analysis and data interpretation. What we do, I sometimes feel, is akin to trying to build a violin using a stone mallet and a chainsaw. The tool-to-task fit is not all that good, and as a result, we wind up building a lot of poor-quality violins. My purpose here is to elaborate on these issues. In what follows, I will summarize our major data-analysis and data-interpretation tools, and describe what I believe to be amiss with them. I will then offer some suggestions for change.
Theory-testing in psychology and physics: A methodological paradox.
 Meehl, Philosophy of Science
, 1967
Cited by 56 (6 self)
Because physical theories typically predict numerical values, an improvement in experimental precision reduces the tolerance range and hence increases corroborability. In most psychological research, improved power of a statistical design leads to a prior probability approaching 1/2 of finding a significant difference in the theoretically predicted direction. Hence the corroboration yielded by “success” is very weak, and becomes weaker with increased precision. “Statistical significance” plays a logical role in psychology precisely the reverse of its role in physics. This problem is worsened by certain unhealthy tendencies prevalent among psychologists, such as a premium placed on experimental “cuteness” and a free reliance upon ad hoc explanations to avoid refutation. The purpose of the present paper is not so much to propound a doctrine or defend a thesis (especially as I should be surprised if either psychologists or statisticians were to disagree with whatever in the nature of a “thesis” it advances), but to call the attention of logicians and philosophers of science to a puzzling state of affairs in the currently accepted methodology of the behavior sciences which I, a psychologist, have been unable to resolve to my satisfaction. The puzzle, sufficiently striking
Effect sizes and p values: What should be reported . . . ?
, 1996
Cited by 34 (0 self)
Despite publication of many well-argued critiques of null hypothesis testing (NHT), behavioral science researchers continue to rely heavily on this set of practices. Although we agree with most critics' catalogs of NHT's flaws, this article also takes the unusual stance of identifying virtues that may explain why NHT continues to be so extensively used. These virtues include providing results in the form of a dichotomous (yes/no) hypothesis evaluation and providing an index (p value) that has a justifiable mapping onto confidence in repeatability of a null hypothesis rejection. The most-criticized flaws of NHT can be avoided when the importance of a hypothesis, rather than the p value of its test, is used to determine that a finding is worthy of report, and when p = .05 is treated as insufficient basis for confidence in the replicability of an isolated non-null finding. Together with many recent critics of NHT, we also urge reporting of important hypothesis tests in enough descriptive detail to permit secondary uses such as meta-analysis.
Data Analysis Considerations in Producing ‘Comparable’ Information for Water Quality Management Purposes
Cited by 1 (0 self)
Water quality monitoring is being used at local, regional, and national scales to measure how water quality variables behave in the natural environment. A common problem arising from monitoring is how to relate the information contained in data to the information needed by water resource management for decision making. This is generally attempted through statistical analysis of the monitoring data. However, how the selection of methods used to routinely analyze the data affects the quality and comparability of the information produced is not as well understood as may first appear. To help understand the connection between the selection of methods for routine data analysis and the information produced to support management, the following three tasks were performed: (1) an examination of the methods that are currently being used to analyze water quality monitoring data, including published criticisms of them; (2) an exploration of how the selection of methods to analyze water quality data can impact the comparability of information used for water quality management purposes; (3) development of options by which data analysis methods employed in water quality
Some Aspects of Statistical Significance in Statistics Education
 Pranesh Kumar
Statistical significance in null hypothesis testing is the primary objective method for representing scientific data as evidence and for measuring the strength of that evidence. Statistical significance is measured by calculating the probability value (P-value) generated by the null hypothesis test of significance. Several interpretations of P-values are possible. For example, the P-value is interpreted as the probability that the results were obtained due to chance. A small P-value suggests that the null hypothesis is not supported by the sample data and that the research hypothesis is strongly favored by the data. Alternatively, effect size can be considered a measure of the extent to which the research hypothesis is true, or of the degree to which the findings have practical significance in the context of the study population. Effect size measures seem to have advantages over statistical significance because they are not affected by the sample size and are scale-free. The effect size measures can be uniquely interpreted in different studies regardless of the sample size and the original scales of the variables. In this paper we present some aspects of statistical significance and practical significance, and their computation. We consider statistical significance measures for some commonly used statistical parameters. In conclusion, we present discussions and remarks.
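The contrast this abstract draws between sample-size-dependent P-values and sample-size-independent effect sizes can be sketched numerically. The example below is an illustration under stated assumptions, not the paper's computation: it holds a standardized mean difference fixed at a hypothetical d = 0.2 and assumes a two-sample z-test with equal group sizes and known unit variance. The function name `two_sided_p_from_d` is invented for the example.

```python
import math

def two_sided_p_from_d(d: float, n_per_group: int) -> float:
    """Two-sided P-value of a two-sample z-test for a standardized mean difference d.

    Assumes equal group sizes and known unit variance in both groups
    (hypothetical simplification for illustration).
    """
    # Under these assumptions the test statistic is z = d * sqrt(n / 2)
    z = abs(d) * math.sqrt(n_per_group / 2.0)
    # P(|Z| > z) for a standard normal Z equals erfc(z / sqrt(2))
    return math.erfc(z / math.sqrt(2.0))

effect_size = 0.2  # assumed standardized difference, held fixed
p_values = {n: two_sided_p_from_d(effect_size, n) for n in (20, 200, 2000)}

for n, p in p_values.items():
    print(f"n per group = {n:5d}  effect size = {effect_size}  P-value = {p:.3g}")
```

The effect size is identical in every row while the P-value shrinks as the groups grow, which is the sense in which the standardized effect size does not depend on sample size.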