Results 1 - 10 of 471
Linear models and empirical Bayes methods for assessing differential expression in microarray experiments.
- Stat. Appl. Genet. Mol. Biol., 2004
- Cited by 1321 (24 self)
The problem of identifying differentially expressed genes in designed microarray experiments is considered. Lönnstedt and Speed (2002) derived an expression for the posterior odds of differential expression in a replicated two-color experiment using a simple hierarchical parametric model. The purpose of this paper is to develop the hierarchical model of Lönnstedt and Speed (2002) into a practical approach for general microarray experiments with arbitrary numbers of treatments and RNA samples. The model is reset in the context of general linear models with arbitrary coefficients and contrasts of interest. The approach applies equally well to both single-channel and two-color microarray experiments. Consistent, closed-form estimators are derived for the hyperparameters in the model. The estimators proposed have robust behavior even for small numbers of arrays and allow for incomplete data arising from spot filtering or spot quality weights. The posterior odds statistic is reformulated in terms of a moderated t-statistic in which posterior residual standard deviations are used in place of ordinary standard deviations. The empirical Bayes approach is equivalent to shrinkage of the estimated sample variances towards a pooled estimate, resulting in far more stable inference when the number of arrays is small. The use of moderated t-statistics has the advantage over the posterior odds that the number of hyperparameters which need to be estimated is reduced; in particular, knowledge of the non-null prior for the fold changes is not required. The moderated t-statistic is shown to follow a t-distribution with augmented degrees of freedom. The moderated t inferential approach extends to accommodate tests of composite null hypotheses through the use of moderated F-statistics. The performance of the methods is demonstrated in a simulation study. Results are presented for two publicly available data sets.
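The variance shrinkage behind the moderated t-statistic can be illustrated with a short Python sketch. It assumes the prior degrees of freedom (d0) and prior variance (s0_sq) are already known; the paper estimates these hyperparameters from all genes, and that step is not shown here. The function and argument names are illustrative only and are not taken from any package.

```python
import math


def moderated_t(beta_hat, s2, df, v, d0, s0_sq):
    """Moderated t-statistic for a single gene.

    beta_hat : estimated coefficient or contrast (e.g. a log fold change)
    s2       : ordinary residual sample variance for this gene
    df       : residual degrees of freedom for this gene
    v        : unscaled variance factor of beta_hat from the linear model
    d0, s0_sq: prior degrees of freedom and prior variance
               (assumed given here; estimating them is not shown)
    """
    # Posterior variance: the gene's own variance is shrunk towards the prior value.
    s2_post = (d0 * s0_sq + df * s2) / (d0 + df)
    t_mod = beta_hat / math.sqrt(s2_post * v)
    # Under the null, t_mod is referred to a t-distribution on d0 + df degrees of freedom.
    return t_mod, d0 + df, s2_post


def ordinary_t(beta_hat, s2, v):
    """Ordinary t-statistic using the unshrunk sample variance."""
    return beta_hat / math.sqrt(s2 * v)


if __name__ == "__main__":
    # A gene with few replicates and an unusually small sample variance.
    t_mod, df_total, s2_post = moderated_t(1.0, 0.01, 2, 0.5, 4.0, 0.1)
    print("ordinary t :", ordinary_t(1.0, 0.01, 0.5))
    print("moderated t:", t_mod, "on", df_total, "degrees of freedom")
```

In this made-up example the gene's sample variance is much smaller than the prior value, so shrinkage increases the variance used in the denominator and the moderated statistic is less extreme than the ordinary one.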
Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data
- ATHEY B, JONES EG, BUNNEY WE, MYERS RM, SPEED TP, AKIL H, WATSON SJ, MENG, 2005
An Empirical Bayes Approach to Inferring Large-Scale Gene Association Networks
- Bioinformatics, 2004
- Cited by 237 (6 self)
Motivation: Genetic networks are often described statistically by graphical models (e.g. Bayesian networks). However, inferring the network structure offers a serious challenge in microarray analysis where the sample size is small compared to the number of considered genes. This renders many standard algorithms for graphical models inapplicable, and inferring genetic networks an “ill-posed” inverse problem. Methods: We introduce a novel framework for small-sample inference of graphical models from gene expression data. Specifically, we focus on so-called graphical Gaussian models (GGMs) that are now frequently used to describe gene association networks and to detect conditionally dependent genes. Our new approach is based on (i) improved (regularized) small-sample point estimates of partial correlation, (ii) an exact test of edge inclusion with adaptive estimation of the degree of freedom, and (iii) a heuristic network search based on false discovery rate multiple testing. Steps (ii) and (iii) correspond to an empirical Bayes estimate of the network topology. Results: Using computer simulations we investigate the sensitivity (power) and specificity (true negative rate) of the proposed framework to estimate GGMs from microarray data. This shows that it is possible to recover the true network topology with high accuracy even for small-sample data sets. Subsequently, we analyze gene expression data from a breast cancer tumor study and illustrate our approach by inferring a corresponding large-scale gene association network for 3,883 genes. Availability: The authors have implemented the approach in the R package “GeneTS” that is freely available from
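To make step (i) concrete, here is a minimal Python sketch of how partial correlations in a graphical Gaussian model follow from the inverse of a correlation matrix. The regularizer is an assumption made for illustration: a ridge-style shrinkage towards the identity with a fixed, hand-picked weight lam. It is not claimed to be the estimator used in the paper or in GeneTS, and all function names are hypothetical.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def shrink_correlation(corr, lam):
    """Assumed regularizer: (1 - lam) * R + lam * I, with lam in [0, 1] chosen by hand."""
    n = len(corr)
    return [[(1.0 - lam) * corr[i][j] + (lam if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def partial_correlations(corr):
    """Partial correlation of i and j given all other variables: -w_ij / sqrt(w_ii * w_jj),
    where w is the inverse of the correlation matrix."""
    omega = invert(corr)
    n = len(omega)
    return [[1.0 if i == j else -omega[i][j] / math.sqrt(omega[i][i] * omega[j][j])
             for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # Chain A - B - C: A and C are correlated only through B.
    corr = [[1.0, 0.6, 0.36],
            [0.6, 1.0, 0.6],
            [0.36, 0.6, 1.0]]
    for row in partial_correlations(corr):
        print([round(x, 3) for x in row])
    # With few samples the empirical matrix can be singular; shrinking it first
    # keeps it invertible.
    for row in partial_correlations(shrink_correlation(corr, 0.2)):
        print([round(x, 3) for x in row])
```

In the chain example the marginal correlation between A and C is 0.36, but their partial correlation given B is zero, which is why a GGM would draw no edge between them.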
A Benchmark for Affymetrix GeneChip Expression Measures
- Bioinformatics, 2003
- Cited by 141 (10 self)
Motivation: The defining feature of oligonucleotide expression arrays is the use of several probes to assay each targeted transcript. This is a bonanza for the statistical geneticist, who can create probeset summaries with specific characteristics. There are now several methods available for summarizing probe level data from the popular Affymetrix GeneChips, but it is difficult to identify the best method for a given inquiry. Results: We have developed a graphical tool to evaluate summaries of Affymetrix probe level data. Plots and summary statistics offer a picture of how an expression measure performs in several important areas. This picture facilitates the comparison of competing expression measures and the selection of methods suitable for a specific investigation. The key is a benchmark data set consisting of a dilution study and a spike-in study. Because the truth is known for these data, we can identify statistical features of the data for which the expected outcome is known in advance. Those features highlighted in our suite of graphs are justified by questions of biological interest and motivated by the presence of appropriate data. Availability: In conjunction with the release of a graphics toolbox as part of the Bioconductor project
FoxA1 translates epigenetic signatures into enhancer-driven lineage-specific transcription
- Cell 132: 958–970, 2008
- Cited by 117 (13 self)
Complex organisms require tissue-specific transcriptional programs, yet little is known about how these are established. The transcription factor FoxA1 is thought to contribute to gene regulation through its ability to act as a pioneer factor binding to nucleosomal DNA. Through genome-wide positional analyses, we demonstrate that FoxA1 cell type-specific functions rely primarily on differential recruitment to chromatin predominantly at distant enhancers rather than proximal promoters. This differential recruitment leads to cell type-specific changes in chromatin structure and functional collaboration with lineage-specific transcription factors. Despite the ability of FoxA1 to bind nucleosomes, its differential binding to chromatin sites is dependent on the distribution of histone H3 lysine 4 dimethylation. Together, our results suggest that methylation of histone H3 lysine 4 is part of the epigenetic signature that defines lineage-specific FoxA1 recruitment sites in chromatin. FoxA1 translates this epigenetic signature into changes in chromatin structure thereby establishing lineage-specific transcriptional enhancers and programs.
Model-based Variance-stabilizing Transformation for Illumina Microarray Data
- Nucleic Acids Res, 2008
- doi:10.1093/nar/gkm1075
Genome-wide analysis of estrogen receptor binding sites
- 2006
- Cited by 82 (12 self)
The estrogen receptor is the master transcriptional regulator of breast cancer phenotype and the archetype of a molecular therapeutic target. We mapped all estrogen receptor and RNA polymerase II binding sites on a genome-wide scale, identifying the authentic cis binding sites and target genes in breast cancer cells. Combining this unique resource with gene expression data demonstrates distinct temporal mechanisms of estrogen-mediated gene regulation, particularly in the case of estrogen-suppressed genes. Furthermore, this resource has allowed the identification of cis-regulatory sites in previously unexplored regions of the genome and the cooperating transcription factors underlying estrogen signaling in breast cancer. Recent work has focused on identifying gene expression signatures in breast cancer subtypes that predict response to specific treatment regimes and improved disease outcome. Estrogen receptor-mediated transcription has been intensively studied on a small number of endogenous target promoters. RESULTS: The MCF-7 breast cancer cell line has been extensively used as a model of hormone-dependent breast cancer. We deprived MCF-7 cells of hormones for 3 d and then synchronously induced transcription by the addition of estrogen for a brief period of time (45 min) known to result in maximal estrogen receptor-chromatin binding.
Stochastic Models Inspired by Hybridization Theory for Short Oligonucleotide Arrays (Extended Abstract)
- J. Comput. Biol., 2004
- Cited by 80 (4 self)
Zhijin Wu (Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street, zwu@jhsph.edu) and Rafael A. Irizarry (Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street, rafa@jhu.edu)
ABSTRACT: High density oligonucleotide expression arrays are a widely used tool for the measurement of gene expression on a large scale. Affymetrix GeneChip arrays appear to dominate this market. These arrays use short oligonucleotides to probe for genes in an RNA sample. Due to optical noise, nonspecific hybridization, probe-specific effects, and measurement error, ad-hoc measures of expression that summarize probe intensities can lead to imprecise and inaccurate results. Various researchers have demonstrated that expression measures based on simple statistical models can provide great improvements over the ad-hoc procedure offered by Affymetrix. Recently, physical models based on molecular hybridization theory have been proposed as useful tools for prediction of, for example, non-specific hybridization. These physical models show great potential in terms of improving existing expression measures. In this paper we suggest that the system producing the measured intensities is too complex to be fully described with these relatively simple physical models, and we propose empirically motivated stochastic models that complement the above-mentioned molecular hybridization theory to provide a comprehensive description of the data. We discuss how the proposed model can be used to obtain improved measures of expression useful to data analysts.
Evaluation of gene expression measurements from commercial microarray platforms
- Nucleic Acids Res, 2003
- Cited by 72 (0 self)
Multiple commercial microarrays for measuring genome-wide gene expression levels are currently available, including oligonucleotide and cDNA, single- and two-channel formats. This study reports on the results of gene expression measurements generated from identical RNA preparations that were obtained using three commercially available microarray platforms. RNA was collected from PANC-1 cells grown in serum-rich medium and at 24 h following the removal of serum. Three biological replicates were prepared for each condition, and three experimental replicates were produced for the first biological replicate. RNA was labeled and hybridized to microarrays from three major suppliers according to manufacturers' protocols, and gene expression measurements were obtained using each platform's standard software. For each platform, gene targets from a subset of 2009 common genes were compared. Correlations in gene expression levels and comparisons for significant gene expression changes in this subset were calculated, and showed considerable divergence across the different platforms, suggesting the need for establishing industrial manufacturing standards, and further independent and thorough validation of the technology.
Regional and cellular gene expression changes in human Huntington's disease brain
- Hum Mol Genet
- Cited by 71 (7 self)
Huntington's disease (HD) pathology is well understood at a histological level but a comprehensive molecular analysis of the effect of the disease in the human brain has not previously been available. To elucidate the molecular phenotype of HD on a genome-wide scale, we compared mRNA profiles from 44 human HD brains with those from 36 unaffected controls using microarray analysis. Four brain regions were analyzed: caudate nucleus, cerebellum, prefrontal association cortex [Brodmann's area 9 (BA9)] and motor cortex [Brodmann's area 4 (BA4)]. The greatest number and magnitude of differentially expressed mRNAs were detected in the caudate nucleus, followed by motor cortex, then cerebellum. Thus, the molecular phenotype of HD generally parallels established neuropathology. Surprisingly, no mRNA changes were detected in prefrontal association cortex, thereby revealing subtleties of pathology not previously disclosed by histological methods. To establish that the observed changes were not simply the result of cell loss, we examined mRNA levels in laser-capture microdissected neurons from Grade 1 HD caudate compared to control. These analyses confirmed changes in expression seen in tissue homogenates; we thus conclude that mRNA changes are not attributable to cell loss alone. These data from bona fide HD brains comprise an important reference for hypotheses related to HD and other neurodegenerative diseases.