Use of within-array replicate spots for assessing differential expression in microarray experiments
Bioinformatics, 2005
Cited by 239 (8 self)
Motivation. Spotted arrays are often printed with probes in duplicate or triplicate, but current methods for assessing differential expression are not able to make full use of the resulting information. Usual practice is to average the duplicate or triplicate results for each probe before assessing differential expression. This loses valuable information about genewise variability.
Results. A method is proposed for extracting more information from within-array replicate spots in microarray experiments by estimating the strength of the correlation between them. The method involves fitting separate linear models to the expression data for each gene but with a common value for the between-replicate correlation. The method greatly improves the precision with which the genewise variances are estimated and thereby improves inference methods designed to identify differentially expressed genes. The method may be combined with empirical Bayes methods for moderating the genewise variances between genes. The method is validated using data from a microarray experiment involving calibration and ratio control spots in conjunction with spiked-in RNA. Comparing results for calibration and ratio control spots shows that the common correlation method results in substantially better discrimination of differentially expressed genes from those which are not. The spike-in experiment also confirms that the results may be further improved by empirical Bayes smoothing of the variances when the sample size is small.
Availability. The methodology is implemented in the limma software package for R, available from the CRAN repository
Best subset selection, persistence in high-dimensional statistical learning and optimization under l1 constraint
Ann. Statist., 2006
Optimal shrinkage estimation of variances with applications to microarray data analysis
J. Am. Statist. Ass., 2006
Cited by 21 (8 self)
Microarray technology allows a scientist to study genome-wide patterns of gene expression. Thousands of individual genes are measured with a relatively small number of replications, which poses challenges to traditional statistical methods. In particular, the gene-specific estimators of variances are not reliable and gene-by-gene tests have low power. In this paper we propose a family of shrinkage estimators for variances raised to a fixed power. We derive optimal shrinkage parameters under both the Stein and the squared loss functions. Our results show that the standard sample variance is inadmissible under either loss function. We propose several estimators for the optimal shrinkage parameters and investigate their asymptotic properties under two scenarios: large number of replications and large number of genes. We conduct simulations to evaluate the finite sample performance of the data-driven optimal shrinkage estimators and compare them with some existing methods. We construct F-like statistics using these shrinkage variance estimators and apply them to detect differentially expressed genes in a microarray experiment. We also conduct simulations to evaluate the performance of these F-like statistics and compare them with some existing methods. Key words and phrases: F-like statistic, gene expression data, inadmissibility, James-Stein shrinkage estimator, loss function.
Gene Expression Profile Classification: A Review
Current Bioinformatics, 2006
Cited by 15 (1 self)
Abstract: In this review, we have discussed the class-prediction and class-discovery methods that are applied to gene expression data, along with the implications of the findings. We attempted to present a unified approach that considers both class-prediction and class-discovery. We devoted a substantial part of this review to an overview of pattern classification/recognition methods and discussed important issues such as preprocessing of gene expression data, curse of dimensionality, feature extraction/selection, and measuring or estimating classifier performance. We discussed and summarized important properties such as generalizability (sensitivity to overtraining), built-in feature selection, ability to report prediction strength, and transparency (ease of understanding of the operation) of different class-predictor design approaches to provide a quick and concise reference. We have also covered in detail the topic of biclustering, an emerging clustering method that processes the entries of the gene expression data matrix in both gene and sample directions simultaneously.
Variances are not always nuisance parameters
Biometrics, 2003
Cited by 12 (0 self)
Summary. In classical problems, e.g., comparing two populations, fitting a regression surface, etc., variability is a nuisance parameter. The term "nuisance parameter" is meant here in both the technical and the practical sense. However, there are many instances where understanding the structure of variability is just as central as understanding the mean structure. The purpose of this article is to review a few of these problems. I focus in particular on two issues: (a) the determination of the validity of an assay; and (b) the issue of the power for detecting health effects from nutrient intakes when the latter are measured by food frequency questionnaires. I will also briefly mention the problems of variance structure in generalized linear mixed models, robust parameter design in quality technology, and the signal in microarrays. In these and other problems, treating variance structure as a nuisance instead of a central part of the modeling effort not only leads to inefficient estimation of means, but also to misleading conclusions.
BAMarray™: Java software for Bayesian analysis of variance for microarray data
BMC Bioinformatics
Statistical analysis of adsorption models for oligonucleotide microarrays
Stat. Appl. Genet. Mol. Biol.
Cited by 8 (2 self)
Abstract: Recent analyses have shown that the relationship between intensity measurements from high-density oligonucleotide microarrays and known concentration is nonlinear. Thus many measurements of so-called gene expression are measures of neither transcript nor mRNA concentration, as might be expected. Intensity as measured in such microarrays is a measurement of fluorescent dye attached to probe-target duplexes formed during hybridization of a sample to the probes on the microarray. We develop several dynamic adsorption models relating fluorescent dye intensity to target RNA concentration, the simplest of which is the equilibrium Langmuir isotherm, or hyperbolic response function. Using data from the Affymetrix HG-U95A Latin Square experiment, we evaluate various physical models, including equilibrium and nonequilibrium models, by applying maximum likelihood methods. We show that for these data, equilibrium Langmuir isotherms with probe-dependent parameters are appropriate. We describe how probe sequence information may then be used to estimate the parameters of the Langmuir isotherm in order to provide an improved measure of absolute target concentration.
Challenges in bioinformatics for statistical data miners
Bulletin of the Swiss Statistical Society, 2003
Cited by 8 (0 self)
Abstract: Starting with possible definitions of statistical data mining and bioinformatics, this article will give a general side-by-side view of both fields, emphasising the statistical data mining part and its techniques, illustrate possible synergies, and discuss how statistical data miners may collaborate in bioinformatics' challenges in order to unlock the secrets of the cell.
Prediction of missing values in microarray and use of mixed models to evaluate the predictors
Stat. Appl. Genet. Mol. Biol., 2005