Results 1-10 of 75
Resampling-Based Multiple Testing for Microarray Data Analysis
, 2003
Cited by 145 (3 self)
The burgeoning field of genomics has revived interest in multiple testing procedures by raising new methodological and computational challenges. For example, microarray experiments generate large multiplicity problems in which thousands of hypotheses are tested simultaneously. In their 1993 book, Westfall & Young propose resampling-based p-value adjustment procedures which are highly relevant to microarray experiments. This article discusses different criteria for error control in resampling-based multiple testing, including (a) the family-wise error rate of Westfall & Young (1993) and (b) the false discovery rate developed by Benjamini & Hochberg (1995), both from a frequentist viewpoint; and (c) the positive false discovery rate of Storey (2002), which has a Bayesian motivation. We also introduce our recently developed fast algorithm for implementing the minP adjustment to control family-wise error rate. Adjusted p-values for different approaches are applied to gene expression data from two recently published microarray studies. The properties of these procedures for multiple testing are compared.
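The contrast between family-wise error rate and false discovery rate control in the abstract above can be illustrated with a minimal sketch. It is not the resampling-based minP procedure the article describes: it assumes raw p-values are already available and applies two standard non-resampling adjustments, Holm's step-down method (family-wise error rate) and the Benjamini & Hochberg step-up method (false discovery rate). The function names and the four example p-values are hypothetical.

```python
def holm_adjust(pvalues):
    """Holm step-down adjusted p-values (controls the family-wise error rate)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order, start=1):
        # Multiply the rank-th smallest p-value by the number of hypotheses
        # still in play, cap at 1, and enforce monotonicity with a running max.
        running_max = max(running_max, min(1.0, (m - rank + 1) * pvalues[i]))
        adjusted[i] = running_max
    return adjusted


def bh_adjust(pvalues):
    """Benjamini-Hochberg step-up adjusted p-values (controls the false discovery rate)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / rank and
    # enforcing monotonicity with a running minimum.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
holm = holm_adjust(raw)
bh = bh_adjust(raw)
for p, h, b in zip(raw, holm, bh):
    print(f"raw={p:.3f}  holm={h:.3f}  bh={b:.3f}")
```

For any input, the Benjamini-Hochberg adjusted values are no larger than the Holm adjusted values, which reflects that false discovery rate control is the less stringent criterion.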
A Bayesian mixture model for differential gene expression
 Journal of the Royal Statistical Society C
, 2005
Cited by 58 (5 self)
We propose model-based inference for differential gene expression, using a nonparametric Bayesian probability model for the distribution of gene intensities under different conditions. The probability model is essentially a mixture of normals. The resulting inference is similar to the empirical Bayes approach proposed in Efron et al. (2001). The use of fully model-based inference mitigates some of the necessary limitations of the empirical Bayes method. However, the increased generality of our method comes at a price. Computation is not as straightforward as in the empirical Bayes scheme. But we argue that inference is no more difficult than posterior simulation in traditional nonparametric mixture of normal models. We illustrate the proposed method in two examples, including a simulation study and a microarray experiment to screen for genes with differential expression in colon cancer versus normal tissue (Alon et al., 1999).
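The mixture idea behind this kind of inference can be sketched in its simplest parametric form. The example below is not the nonparametric Bayesian model of the paper: it assumes a fixed two-component mixture of normals with known parameters and computes the posterior probability that a gene's score comes from the "differentially expressed" component. The function name, the mixing weight p1 and both component distributions are hypothetical choices for illustration.

```python
from statistics import NormalDist


def posterior_prob_de(z, p1=0.1, null=NormalDist(0.0, 1.0), alt=NormalDist(0.0, 3.0)):
    """Posterior probability that score z belongs to the alternative component.

    Assumed model (illustrative only):
        z ~ (1 - p1) * null + p1 * alt
    Bayes' rule then gives
        P(alt | z) = p1 * f1(z) / ((1 - p1) * f0(z) + p1 * f1(z)).
    """
    f0 = null.pdf(z)  # density under "no differential expression"
    f1 = alt.pdf(z)   # density under "differential expression"
    return p1 * f1 / ((1.0 - p1) * f0 + p1 * f1)


# Scores near zero look null; extreme scores look differentially expressed.
for z in (0.0, 2.0, 4.0):
    print(f"z={z:.1f}  P(DE | z)={posterior_prob_de(z):.3f}")
```

In a full model-based analysis the mixing weight and the component densities would themselves be unknown and inferred from the data rather than fixed in advance as they are here.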
Sample size for FDR-control in microarray data analysis
 Bioinformatics
, 2005
Cited by 34 (2 self)
We consider identifying differentially expressing genes between two patient groups using microarray experiment. We propose a sample size calculation method for a specified number of true rejections while controlling the false discovery rate at a desired level. Input parameters for the sample size calculation include the allocation proportion in each group, the number of genes in each array, the number of differentially expressing genes, and the effect sizes among the differentially expressing genes. We have a closed-form sample size formula if the projected effect sizes are equal among differentially expressing genes. Otherwise, our method requires a numerical method to solve an equation. Simulation studies are conducted to show that the calculated sample sizes are accurate in practical settings. The proposed method is demonstrated with a real study. Key words: Block compound symmetry, Family-wise error rate, Prognostic gene, True rejection, Two-sample t-test.
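A closed-form calculation of this kind can be sketched from first principles for the equal-effect-size case. The sketch is a derivation under stated assumptions, not necessarily the paper's exact formula: one-sided tests, a normal approximation to the two-sample test statistic, and the approximation FDR ≈ m0·α / (m0·α + r1), where m0 is the number of non-differentially expressing genes and r1 the desired number of true rejections. Solving for the per-gene level gives α* = r1·f / (m0·(1 − f)), the required per-gene power is r1/m1, and the usual normal-theory formula then gives the total sample size. The function name, argument names and example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def fdr_sample_size(m, m1, r1, f, delta, a1=0.5):
    """Total sample size n for r1 expected true rejections at FDR level f.

    m     : number of genes on the array
    m1    : number of differentially expressing genes (m0 = m - m1 are null)
    r1    : desired number of true rejections (must satisfy 0 < r1 < m1)
    f     : target false discovery rate
    delta : common standardized effect size among the m1 genes
    a1    : allocation proportion for group 1 (group 2 gets 1 - a1)
    """
    if not 0 < r1 < m1:
        raise ValueError("r1 must lie strictly between 0 and m1")
    m0 = m - m1
    # Per-gene significance level implied by FDR ~ m0*alpha / (m0*alpha + r1) = f.
    alpha_star = r1 * f / (m0 * (1.0 - f))
    # Per-gene type II error implied by requiring power r1 / m1.
    beta_star = 1.0 - r1 / m1
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha_star)
    z_beta = std_normal.inv_cdf(1.0 - beta_star)
    # Normal-approximation sample size for a one-sided two-sample comparison.
    n = (z_alpha + z_beta) ** 2 / (a1 * (1.0 - a1) * delta ** 2)
    return math.ceil(n)


# Illustrative inputs: 4000 genes, 40 truly differential, 24 desired true
# rejections, FDR of 1%, standardized effect size 1, balanced allocation.
print(fdr_sample_size(m=4000, m1=40, r1=24, f=0.01, delta=1.0))
```

When effect sizes differ across genes, no single power value applies to every gene, so the required n would have to be found numerically, as the abstract notes.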
Hybrid Dirichlet mixture models for functional data
Cited by 16 (0 self)
Summary. In functional data analysis, curves or surfaces are observed, up to measurement error, at a finite set of locations, for, say, a sample of n individuals. Often, the curves are homogeneous, except perhaps for individual-specific regions that provide heterogeneous behaviour (e.g. ‘damaged’ areas of irregular shape on an otherwise smooth surface). Motivated by applications with functional data of this nature, we propose a Bayesian mixture model, with the aim of dimension reduction, by representing the sample of n curves through a smaller set of canonical curves. We propose a novel prior on the space of probability measures for a random curve which extends the popular Dirichlet priors by allowing local clustering: non-homogeneous portions of a curve can be allocated to different clusters and the n individual curves can be represented as recombinations (hybrids) of a few canonical curves. More precisely, the prior proposed envisions a conceptual hidden factor with k levels that acts locally on each curve. We discuss several models incorporating this prior and illustrate its performance with simulated and real data sets. We examine theoretical properties of the proposed finite hybrid Dirichlet mixtures, specifically, their behaviour as the number of the mixture components goes to ∞ and their connection with Dirichlet process mixtures.
Microarray analysis of gene expression: considerations in data mining and statistical treatment
 PHYSIOL GENOMICS
, 2006
Cited by 8 (0 self)
DNA microarray represents a powerful tool in biomedical discoveries. Harnessing the potential of this technology depends on the development and appropriate use of data mining and statistical tools. Significant current advances have made microarray data mining more versatile. Researchers are no longer limited to default choices that generate suboptimal results. Conflicting results in repeated experiments can be resolved through attention to the statistical details. In the current dynamic environment, there are many choices and potential pitfalls for researchers who intend to incorporate microarrays as a research tool. This review is intended to provide a simple framework to understand the choices and identify the pitfalls. Specifically, this review article discusses the choice of microarray platform, preprocessing raw data, differential expression and validation, clustering, annotation and functional characterization of genes, and pathway construction in light of emergent concepts and tools.
Supplement to “Automated analysis of quantitative image data using isomorphic functional mixed models, with application to proteomics data”
, 2010
Cited by 8 (3 self)
Image data are increasingly encountered and are of growing importance in many areas of science. Much of these data are quantitative image data, which are characterized by intensities that represent some measurement of interest in the scanned images. The data typically consist of multiple images on the same domain and the goal of the research is to combine the quantitative information across images to make inference about populations or interventions. In this paper we present a unified analysis framework for the analysis of quantitative image data using a Bayesian functional mixed model approach. This framework is flexible enough to handle complex, irregular images with many local features, and can model the simultaneous effects of multiple factors on the image intensities and account for the correlation between images induced by the design. We introduce a general isomorphic modeling approach to fitting the functional mixed model, of which the wavelet-based functional mixed model is one special case. With suitable modeling choices, this approach leads to efficient calculations and can result in flexible
Power-Enhanced Multiple Decision Functions Controlling Family-Wise Error and False Discovery Rates
, 2009
Sample size for gene expression microarray experiments
 Bioinformatics
, 2005
doi:10.1093/bioinformatics/bti162