Results 1 - 10 of 275
False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant
- Psychological Science
, 2011
"... Downloaded from pss.sagepub.com ..."
(Show Context)
Statistical strategies for avoiding false discoveries in metabolomics and related experiments
, 2006
"... Many metabolomics, and other high-content or high-throughput, experiments are set up such that the primary aim is the discovery of biomarker metabolites that can discriminate, with a certain level of certainty, between nominally matched ‘case ’ and ‘control ’ samples. However, it is unfortunately ve ..."
Abstract
-
Cited by 61 (11 self)
- Add to MetaCart
(Show Context)
Many metabolomics, and other high-content or high-throughput, experiments are set up such that the primary aim is the discovery of biomarker metabolites that can discriminate, with a certain level of certainty, between nominally matched ‘case’ and ‘control’ samples. However, it is unfortunately very easy to find markers that are apparently persuasive but that are in fact entirely spurious, and there are well-known examples in the proteomics literature. The main types of danger are not entirely independent of each other, but include bias, inadequate sample size (especially relative to the number of metabolite variables and to the required statistical power to prove that a biomarker is discriminant), excessive false discovery rate due to multiple hypothesis testing, inappropriate choice of particular numerical methods, and overfitting (generally caused by the failure to perform adequate validation and cross-validation). Many studies fail to take these into account, and thereby fail to discover anything of true significance (despite their claims). We summarise these problems, and provide pointers to a substantial existing literature that should assist in the improved design and evaluation of metabolomics experiments, thereby allowing robust scientific conclusions to be drawn from the available data. We provide a list of some of the simpler checks that might improve one’s confidence that a candidate biomarker is not simply a statistical artefact, and suggest a series of preferred tests and visualisation tools that can assist readers and authors in assessing papers. These tools can be applied to individual metabolites by using multiple univariate tests performed in parallel across all metabolite peaks. They may also be applied to the validation of multivariate models. We stress in …
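The multiple-testing danger described here is easy to demonstrate. The following minimal Python sketch (hypothetical sample sizes and pure-noise data, not from the paper) runs univariate t-tests in parallel across simulated metabolite peaks, then applies Benjamini-Hochberg false-discovery-rate control; raw p < .05 thresholding "discovers" dozens of spurious markers, while the corrected procedure typically finds none.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_case, n_ctrl, n_peaks = 20, 20, 500   # hypothetical sample sizes and peak count
    # Pure-noise data: no real case/control difference in any metabolite peak
    case = rng.normal(size=(n_case, n_peaks))
    ctrl = rng.normal(size=(n_ctrl, n_peaks))

    # Univariate Welch t-tests run in parallel across all peaks
    _, p = stats.ttest_ind(case, ctrl, equal_var=False)
    print("raw p < .05:", int((p < 0.05).sum()))   # ~25 'discoveries' by chance alone

    def bh_reject(p, q=0.05):
        """Benjamini-Hochberg step-up procedure: controls FDR at level q."""
        m = len(p)
        order = np.argsort(p)
        below = p[order] <= q * np.arange(1, m + 1) / m
        k = below.nonzero()[0].max() + 1 if below.any() else 0
        reject = np.zeros(m, dtype=bool)
        reject[order[:k]] = True
        return reject

    print("BH discoveries:", int(bh_reject(p).sum()))  # typically 0 on null data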
Why Psychologists Must Change the Way They Analyze Their Data: The Case of Psi
"... Does psi exist? In a recent article, Dr. Bem conducted nine studies with over a thousand participants in an attempt to demonstrate that future events retroactively affect people’s responses. Here we discuss several limitations of Bem’s experiments on psi; in particular, we show that the data analysi ..."
Abstract
-
Cited by 52 (9 self)
- Add to MetaCart
Does psi exist? In a recent article, Dr. Bem conducted nine studies with over a thousand participants in an attempt to demonstrate that future events retroactively affect people’s responses. Here we discuss several limitations of Bem’s experiments on psi; in particular, we show that the data analysis was partly exploratory, and that one-sided p-values may overstate the statistical evidence against the null hypothesis. We reanalyze Bem’s data using a default Bayesian t-test and show that the evidence for psi is weak to nonexistent. We argue that in order to convince a skeptical audience of a controversial claim, one needs to conduct strictly confirmatory studies and analyze the results with statistical tests that are conservative rather than liberal. We conclude that Bem’s p-values do not indicate evidence in favor of precognition; instead, they indicate that experimental psychologists need to change the way they conduct their experiments and analyze their data.
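The "default Bayesian t-test" applied here is the JZS Bayes factor of Rouder et al. (2009). Below is a sketch of the one-sample version in Python using the standard integral form; the prior scale r and the illustrative t and n are assumptions for demonstration, not Bem's data.

    import numpy as np
    from scipy import integrate

    def jzs_bf10(t, n, r=np.sqrt(2) / 2):
        """One-sample JZS default Bayes factor BF10 (Rouder et al., 2009)."""
        nu = n - 1
        # Marginal likelihood under H0: standardized effect size delta = 0
        h0 = (1 + t**2 / nu) ** (-(nu + 1) / 2)
        # Under H1, a Cauchy(0, r) prior on delta, written as a normal prior
        # whose variance g has an inverse-chi-squared(1) mixing density
        def integrand(g):
            shrink = 1 + n * r**2 * g
            like = shrink**-0.5 * (1 + t**2 / (shrink * nu)) ** (-(nu + 1) / 2)
            prior = (2 * np.pi) ** -0.5 * g**-1.5 * np.exp(-1 / (2 * g))
            return like * prior
        h1, _ = integrate.quad(integrand, 0, np.inf)
        return h1 / h0

    # Illustrative values only: a result with two-sided p ~ .048 yields
    # BF10 near 1, i.e. essentially no evidence either way
    print(jzs_bf10(t=2.0, n=100))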
Negative results are disappearing from most disciplines and countries
- Scientometrics
, 2012
"... Abstract Concerns that the growing competition for funding and citations might distort science are frequently discussed, but have not been verified directly. Of the hypothesized problems, perhaps the most worrying is a worsening of positive-outcome bias. A system that disfavours negative results not ..."
Abstract
-
Cited by 42 (4 self)
- Add to MetaCart
(Show Context)
Concerns that the growing competition for funding and citations might distort science are frequently discussed, but have not been verified directly. Of the hypothesized problems, perhaps the most worrying is a worsening of positive-outcome bias. A system that disfavours negative results not only distorts the scientific literature directly, but might also discourage high-risk projects and pressure scientists to fabricate and falsify their data. This study analysed over 4,600 papers published in all disciplines between 1990 and 2007, measuring the frequency of papers that, having declared to have “tested” a hypothesis, reported a positive support for it. The overall frequency of positive supports has grown by over 22% between 1990 and 2007, with significant differences between disciplines and countries. The increase was stronger in the social and some biomedical disciplines. The United States had published, over the years, significantly fewer positive results than Asian countries (and particularly Japan) but more than European countries (and in particular the United Kingdom). Methodological artefacts cannot explain away these patterns, which support the hypotheses that research is becoming less pioneering and/or that the objectivity with which results are produced and published is decreasing.
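Note that the 22% figure is a relative change in the share of positive-outcome papers, not a 22-percentage-point jump. A one-line illustration with assumed shares (the actual per-year values are in the paper):

    # Assumed values for illustration: if positive results rose from ~70% of
    # papers in 1990 to ~86% in 2007, the relative growth is
    share_1990, share_2007 = 0.70, 0.86
    print((share_2007 - share_1990) / share_1990)  # ~0.23, i.e. growth of 22-23%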
The promise of Mechanical Turk: How online labor markets can help . . .
- Journal of Theoretical Biology
, 2012
"... ..."
Is the reliability crisis overblown? Three arguments examined
- Perspectives on Psychological Science
, 2012
"... ________________________________________ ABSTRACT -We discuss three arguments voiced by scientists who view the current outpouring of concern about replicability as overblown. The first idea is that the adoption of a low alpha level (e.g., 5%) puts reasonable bounds on the rate at which errors can ..."
Abstract
-
Cited by 33 (1 self)
- Add to MetaCart
(Show Context)
We discuss three arguments voiced by scientists who view the current outpouring of concern about replicability as overblown. The first idea is that the adoption of a low alpha level (e.g., 5%) puts reasonable bounds on the rate at which errors can enter the published literature, making false-positive effects rare enough to be considered a minor issue. This, we point out, rests on statistical misunderstanding: the alpha level imposes no limit on the rate at which errors may arise in the literature. …
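The point is easiest to see with standard positive-predictive-value arithmetic: the error rate in the published literature depends on statistical power and on the prior odds that a tested effect is real, not on alpha alone. A worked example with assumed, purely hypothetical values:

    alpha = 0.05   # significance threshold
    power = 0.35   # assumed average power of published studies
    prior = 0.10   # assumed share of tested hypotheses that are actually true

    false_pos = alpha * (1 - prior)  # null effects that reach significance
    true_pos = power * prior         # real effects that reach significance
    print(false_pos / (false_pos + true_pos))  # ~0.56: most 'significant' results false

Under these assumptions, more than half of the significant results entering the literature are errors, even though alpha is 5%.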
Cognitive enhancement: Methods, ethics, regulatory challenges
- Science and Engineering Ethics
, 2009
"... Abstract Cognitive enhancement takes many and diverse forms. Various methods of cognitive enhancement have implications for the near future. At the same time, these technologies raise a range of ethical issues. For example, they interact with notions of authenticity, the good life, and the role of ..."
Abstract
-
Cited by 33 (3 self)
- Add to MetaCart
(Show Context)
Cognitive enhancement takes many and diverse forms. Various methods of cognitive enhancement have implications for the near future. At the same time, these technologies raise a range of ethical issues. For example, they interact with notions of authenticity, the good life, and the role of medicine in our lives. Present and anticipated methods for cognitive enhancement also create challenges for public policy and regulation.
The spread of evidence-poor medicine via flawed social-network analyses
- Statistics, Politics and Policy
, 2011
"... ar ..."
(Show Context)
Statistical evidence in experimental psychology: An empirical comparison using 855 t tests
- Perspectives on Psychological Science
, 2011
"... Statistical inference in psychology has traditionally relied heavily on p-value significance testing. This approach to drawing conclusions from data, however, has been widely criticized, and two types of remedies have been advocated. The first proposal is to supplement p values with complementary me ..."
Abstract
-
Cited by 27 (4 self)
- Add to MetaCart
(Show Context)
Statistical inference in psychology has traditionally relied heavily on p-value significance testing. This approach to drawing conclusions from data, however, has been widely criticized, and two types of remedies have been advocated. The first proposal is to supplement p values with complementary measures of evidence, such as effect sizes. The second is to replace inference with Bayesian measures of evidence, such as the Bayes factor. The authors provide a practical comparison of p values, effect sizes, and default Bayes factors as measures of statistical evidence, using 855 recently published t tests in psychology. The comparison yields two main results. First, although p values and default Bayes factors almost always agree about what hypothesis is better supported by the data, the measures often disagree about the strength of this support; for 70% of the data sets for which the p value falls between .01 and .05, the default Bayes factor indicates that the evidence is only anecdotal. Second, effect sizes can provide additional evidence to p values and default Bayes factors. The authors conclude that the Bayesian approach is comparatively prudent, preventing researchers from overestimating the evidence in favor of an effect.
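Reusing the jzs_bf10 sketch given earlier (with illustrative numbers, not one of the 855 published tests), a result that is conventionally significant at p ≈ .03 lands in Jeffreys' "anecdotal" band of 1 < BF10 < 3:

    from scipy import stats

    t, n = 2.25, 40                         # assumed t value and sample size
    p = 2 * stats.t.sf(t, df=n - 1)         # two-sided p-value, ~.03
    bf = jzs_bf10(t, n)                     # from the sketch above
    print(f"p = {p:.3f}, BF10 = {bf:.2f}")  # BF10 between 1 and 3: anecdotal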
Recommendations for increasing replicability in psychology
"... Replicability of findings is at the heart of any empirical science. The aim of this article is to move the current replicability debate in psychology toward concrete recommendations for improvement. We focus on research practices, but also offer guidelines for reviewers, editors, journal management, ..."
Abstract
-
Cited by 26 (0 self)
- Add to MetaCart
(Show Context)
Replicability of findings is at the heart of any empirical science. The aim of this article is to move the current replicability debate in psychology toward concrete recommendations for improvement. We focus on research practices, but also offer guidelines for reviewers, editors, journal management, teachers, granting institutions, and university promotion committees, highlighting some of the emerging and existing practical solutions that can facilitate implementation of these recommendations. The challenges for improving replicability in psychological science are systemic. Improvement can occur only if changes are made at many levels of practice, evaluation, and reward.