Results 11-20 of 210
Estimating the null and the proportion of nonnull effects in large-scale multiple comparisons
J. Amer. Statist. Assoc., 2007
Cited by 39 (6 self)
An important issue raised by Efron [7] in the context of large-scale multiple comparisons is that in many applications the usual assumption that the null distribution is known is incorrect, and seemingly negligible differences in the null may result in large differences in subsequent studies. This suggests that a careful study of estimation of the null is indispensable. In this paper, we consider the problem of estimating a null normal distribution, and a closely related problem, estimation of the proportion of nonnull effects. We develop an approach based on the empirical characteristic function and Fourier analysis. The estimators are shown to be uniformly consistent over a wide class of parameters. Numerical performance of the estimators is investigated using both simulated and real data. In particular, we apply our
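As a rough illustration of the building block named in this abstract, the empirical characteristic function of observations x_1, ..., x_n at frequency t is the average of exp(i t x_j). The sketch below computes only that quantity; it is not the authors' estimator of the null or of the proportion, and the function name and sample values are made up for illustration.

```python
import cmath


def empirical_characteristic_function(samples, t):
    """Return (1/n) * sum_j exp(i * t * x_j) for real-valued samples x_j."""
    n = len(samples)
    if n == 0:
        raise ValueError("samples must be non-empty")
    return sum(cmath.exp(1j * t * x) for x in samples) / n


# Hypothetical test statistics (illustrative values, not real data).
z_scores = [-0.3, 0.1, 0.4, -1.2, 2.5]
phi = empirical_characteristic_function(z_scores, 0.5)
print(abs(phi))  # the modulus never exceeds 1
```

At t = 0 the function returns 1 for any sample, and its modulus is bounded by 1 at every frequency, which gives two quick sanity checks for an implementation.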
Microarray image enhancement by denoising using stationary wavelet transform
IEEE Transactions on Nanobioscience, 2003
Cited by 38 (0 self)
VarMixt: efficient variance modelling for the differential analysis of replicated gene expression data
Bioinformatics, 2005
Empirical evaluation of data transformations and ranking statistics for microarray analysis
Nucleic Acids Res, 2004
Cited by 26 (0 self)
There are many options in handling microarray data that can affect study conclusions, sometimes drastically. Working with a two-color platform, this study uses ten spike-in microarray experiments to evaluate the relative effectiveness of some of these options for the experimental goal of detecting differential expression. We consider two data transformations, background subtraction and intensity normalization, as well as six different statistics for detecting differentially expressed genes. Findings support the use of an intensity-based normalization procedure and also indicate that local background subtraction can be detrimental for effectively detecting differential expression. We also verify that robust statistics outperform t-statistics in identifying differentially expressed genes when there are few replicates. Finally, we find that choice of image analysis software can also substantially influence experimental conclusions.
Flexible empirical Bayes models for differential gene expression
2007
Cited by 22 (2 self)
Motivation: Inference about differential expression is a typical objective when analyzing gene expression data. Recently, Bayesian hierarchical models have become increasingly popular for this type of problem. The two most common hierarchical models are the hierarchical Gamma–Gamma (GG) and Lognormal–Normal (LNN) models. However, to facilitate inference, some unrealistic assumptions have been made. One such assumption is that of a common coefficient of variation across genes, which can adversely affect the resulting inference. Results: In this paper, we extend both the GG and LNN modeling frameworks to allow for gene-specific variances and propose EM-based algorithms for parameter estimation. The proposed methodology is evaluated on three experimental datasets: one cDNA microarray experiment and two Affymetrix spike-in experiments. The two extended models significantly reduce the false positive rate while keeping a high sensitivity when compared to the originals. Finally, using a simulation study we show that the new frameworks are also more robust to model misspecification. Availability: The R code for implementing the proposed methodology can be downloaded at
GJ: Mixtures of common t-factor analyzers for clustering high-dimensional microarray data. Bioinformatics 2011, 27:1269–1276. Bazot et al
BMC Bioinformatics 2013, 14:99
Proportion of nonzero normal means: Universal oracle equivalences and uniformly consistent estimators
2007
Cited by 17 (8 self)
OnlineOpen: This article is available free online at www.blackwellsynergy.com Summary. Since James and Stein's seminal work, the problem of estimating n normal means has received plenty of enthusiasm in the statistics community. Recently, driven by the fast expansion of the field of large-scale multiple testing, there has been a resurgence of research interest in the n normal means problem. The new interest, however, is more or less concentrated on testing n normal means: to determine simultaneously which means are 0 and which are not. In this setting, the proportion of the nonzero means plays a key role. Motivated by examples in genomics and astronomy, we are particularly interested in estimating the proportion of nonzero means, i.e. given n independent normal random variables with individual means, $X_j \sim N(\mu_j, 1)$, $j = 1, \ldots, n$, to estimate the proportion $\epsilon_n = (1/n)\,\#\{j : \mu_j \neq 0\}$. We propose a general approach to construct the universal oracle equivalence of the proportion. The construction is based on the underlying characteristic function. The oracle equivalence reduces the problem of estimating the proportion to the problem of estimating the oracle, which is relatively easier to handle. In fact, the oracle equivalence naturally yields a family of estimators for the proportion, which are consistent under mild conditions, uniformly across a wide class of parameters. The approach compares favourably with recent works by Meinshausen and Rice, and Genovese and Wasserman. In particular, the consistency is proved for an unprecedentedly broad class of situations; the class is almost the largest that can be hoped for without further constraints on the model. We also discuss various extensions of the approach, report results on simulation experiments and make connections between the approach and several recent procedures in large-scale multiple testing, including the false discovery rate approach and the local false discovery rate approach.
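To make the target quantity in this abstract concrete, the sketch below simulates the stated setting (independent normal observations with unit variance and individual means) and computes the true proportion of nonzero means from a known mean vector. It illustrates only the definition of the proportion; it does not implement the oracle construction or any of the paper's estimators, and the function names and example values are hypothetical.

```python
import random


def nonzero_proportion(means):
    """Return (1/n) * #{j : mu_j != 0} for a known vector of means."""
    n = len(means)
    if n == 0:
        raise ValueError("means must be non-empty")
    return sum(1 for mu in means if mu != 0) / n


def simulate_observations(means, seed=0):
    """Draw one observation X_j ~ N(mu_j, 1) per mean, reproducibly."""
    rng = random.Random(seed)
    return [rng.gauss(mu, 1.0) for mu in means]


# Hypothetical example: 3 nonzero means out of 12.
true_means = [0.0] * 9 + [2.0, -1.5, 3.0]
observations = simulate_observations(true_means)
print(nonzero_proportion(true_means))
```

In practice only the observations are available, which is why the proportion has to be estimated rather than counted as it is here.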
Analysis of Microarray Gene Expression Data
in ‘Handbook of Statistical Genetics’, 2nd edn, 2003
Cited by 16 (1 self)
This article reviews the methods utilized in processing and analysis of gene expression data generated using DNA microarrays. This type of experiment allows to determine relative levels of mRNA abundance in a set of tissues or cell populations for thousands of genes simultaneously. Naturally, such an experiment requires computational and statistical analysis techniques. At the outset of the processing pipeline, the computational procedures are largely determined by the technology and experimental setup that are used. Subsequently, as more reliable intensity values for genes emerge, pattern discovery methods come into play. The most striking peculiarity of this kind of data is that one usually obtains measurements for thousands of genes for only a much smaller number of conditions. This is at the root of several of the statistical questions discussed here.
Normality of oligonucleotide microarray data and the implications for parametric statistical analyses
Bioinformatics, 2003
Cited by 12 (0 self)
Motivation: Experimental limitations have resulted in the popularity of parametric statistical tests as a method for identifying differentially regulated genes in microarray data sets. However, these tests assume that the data follow a normal distribution. To date, the assumption that replicate expression values for any gene are normally distributed, has not been critically addressed for Affymetrix GeneChip data. Results: The normality of the expression values calculated using four different commercial and academic software packages was investigated using a data set consisting of the same target RNA applied to 59 human Affymetrix U95A GeneChips using a combination of statistical tests and visualization techniques. For the majority of probe sets obtained from each analysis suite, the expression data showed a good correlation with normality. The exception was a large number of low-expressed genes in the data set produced using Affymetrix Microarray Suite 5.0, which showed a striking non-normal distribution. In summary, our data provide strong support for the application of parametric tests to GeneChip data sets without the need for data transformation. Contact:
Quality control and robust estimation for cDNA microarrays with replicates
J Am Stat Assoc
Cited by 11 (5 self)
We consider robust estimation of gene intensities from cDNA microarray data with replicates. Several statistical methods for estimating gene intensities from microarrays have been proposed, but there has been little work on robust estimation. This is particularly relevant for experiments with replicates, because even one outlying replicate can have a disastrous effect on the estimated intensity for the gene concerned. Because of the many steps involved in the experimental process from hybridization to image analysis, cDNA microarray data often contain outliers. For example, an outlying data value could occur because of scratches or dust on the surface, imperfections in the glass, or imperfections in the array production. We develop a Bayesian hierarchical model for robust estimation of cDNA microarray intensities. Outliers are modeled explicitly using a t-distribution, and our model also addresses classical issues such as design effects, normalization, transformation, and nonconstant variance. Parameter estimation is carried out using Markov chain Monte Carlo. By identifying potential
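The abstract's point that a single outlying replicate can badly distort an estimated intensity can be seen in a small sketch. The example below compares the sample mean with the sample median on made-up replicate values; the median is used here only as a simple stand-in for a robust estimator and is not the Bayesian hierarchical t-distribution model the abstract describes.

```python
from statistics import mean, median


def summarize_replicates(values):
    """Return the mean and median of replicate intensities for one gene."""
    return {"mean": mean(values), "median": median(values)}


# Hypothetical log-intensities for one gene across four replicates.
clean = [10.1, 9.9, 10.0, 10.2]
# Same gene with one replicate corrupted (e.g. by dust or a scratch).
contaminated = [10.1, 9.9, 10.0, 55.0]

print(summarize_replicates(clean))
print(summarize_replicates(contaminated))
```

With these values the mean roughly doubles once the outlier is introduced, while the median stays essentially where it was, which is the behaviour a robust estimator is meant to provide.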