Results 1 - 10 of 1,321
Limma: linear models for microarray data
- Bioinformatics and Computational Biology Solutions using R and Bioconductor, 2005
Cited by 774 (13 self)
This free open-source software implements academic research by the authors and co-workers. If you use it, please support the project by citing the appropriate journal articles listed in Section 2.1.
NCBI GEO: archive for functional genomics data sets–10 years on
- Nucleic Acids Res, 2011
A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics
- 2005
Use of within-array replicate spots for assessing differential expression in microarray experiments
- Bioinformatics, 2005
Cited by 239 (8 self)
Motivation. Spotted arrays are often printed with probes in duplicate or triplicate, but current methods for assessing differential expression are not able to make full use of the resulting information. Usual practice is to average the duplicate or triplicate results for each probe before assessing differential expression. This loses valuable information about gene-wise variability. Results. A method is proposed for extracting more information from within-array replicate spots in microarray experiments by estimating the strength of the correlation between them. The method involves fitting separate linear models to the expression data for each gene but with a common value for the between-replicate correlation. The method greatly improves the precision with which the genewise variances are estimated and thereby improves inference methods designed to identify differentially expressed genes. The method may be combined with empirical Bayes methods for moderating the genewise variances between genes. The method is validated using data from a microarray experiment involving calibration and ratio control spots in conjunction with spiked-in RNA. Comparing results for calibration and ratio control spots shows that the common correlation method results in substantially better discrimination of differentially expressed genes from those which are not. The spike-in experiment also confirms that the results may be further improved by empirical Bayes smoothing of the variances when the sample size is small. Availability. The methodology is implemented in the limma software package for R, available from the CRAN repository
On testing the significance of sets of genes
- Annals of Applied Statistics
Cited by 166 (3 self)
This paper discusses the problem of identifying differentially expressed groups of genes from a microarray experiment. The groups of genes are externally defined, for example, sets of gene pathways derived from biological databases. Our starting point is the interesting Gene Set Enrichment Analysis (GSEA) procedure of Subramanian et al. (2005). We study the problem in some generality and propose two potential improvements to GSEA: the maxmean statistic for summarizing gene-sets, and restandardization for more accurate inferences. We discuss a variety of examples and extensions, including the use of gene-set scores for class predictions. We also describe a new R language package GSA that implements our ideas.
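As a rough illustration of the maxmean idea named in this abstract, the sketch below averages the positive parts and the negative parts of the gene scores in a set separately and keeps the larger of the two averages. The function name `maxmean` and the unsigned return value are assumptions made for this example; it is not the GSA package's implementation, and it omits restandardization.

```python
def maxmean(scores):
    """Unsigned maxmean summary of one gene set (illustrative sketch).

    scores: per-gene test statistics (e.g. z-scores) for the genes in the set.
    """
    n = len(scores)
    if n == 0:
        raise ValueError("gene set must contain at least one score")
    # Average of the positive parts: negative scores contribute zero.
    pos_mean = sum(max(z, 0.0) for z in scores) / n
    # Average of the negative parts, taken as magnitudes.
    neg_mean = sum(max(-z, 0.0) for z in scores) / n
    # Keep whichever direction carries the larger average.
    return max(pos_mean, neg_mean)


if __name__ == "__main__":
    print(maxmean([2.0, -1.0, 0.0, 3.0]))
```

Both averages divide by the full set size, so a few large scores in one direction are not diluted by cancellation against scores in the other direction, which is what a plain mean would do.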
Moderated Statistical Tests for Assessing Differences in Tag Abundance
Cited by 143 (7 self)
Motivation: Digital gene expression (DGE) technologies measure gene expression by counting sequence tags. They are sensitive technologies for measuring gene expression on a genomic scale, without the need for prior knowledge of the genome sequence. As the cost of sequencing DNA decreases, the number of DGE datasets is expected to grow dramatically. Various tests of differential expression have been proposed for replicated DGE data using binomial, Poisson, negative binomial or pseudo-likelihood (PL) models for the counts, but none of these are usable when the number of replicates is very small. Results: We develop tests using the negative binomial distribution to model overdispersion relative to the Poisson, and use conditional weighted likelihood to moderate the level of overdispersion across genes. Not only is our strategy applicable even with the smallest number of libraries, but it also proves to be more powerful than previous strategies when more libraries are available. The methodology is equally applicable to other counting technologies, such as proteomic spectral counts. Availability: An R package and supplementary materials can be accessed from
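To make "overdispersion relative to the Poisson" concrete: under a negative binomial model with mean mu and dispersion phi, the variance is mu + phi * mu^2, so phi = 0 recovers the Poisson. The sketch below computes a simple method-of-moments estimate of phi for one gene. It is not the conditional weighted likelihood approach described in the abstract; the function name `moment_dispersion` and the assumption of equal library sizes are choices made for this example only.

```python
from statistics import mean, variance


def moment_dispersion(counts):
    """Method-of-moments dispersion estimate for one gene (illustrative sketch).

    Assumes counts come from libraries of equal size and that at least
    two replicate counts are supplied.
    """
    mu = float(mean(counts))
    if mu == 0.0:
        return 0.0  # no counts observed, nothing to estimate
    var = float(variance(counts))  # unbiased sample variance
    # Solve var = mu + phi * mu^2 for phi, truncating at the Poisson case.
    return max(0.0, (var - mu) / (mu * mu))


if __name__ == "__main__":
    print(moment_dispersion([10, 20, 30]))
```

With only two or three replicates, a per-gene estimate like this is very noisy, which is the motivation the abstract gives for moderating the dispersion across genes.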
A Hilbert space embedding for distributions
- In Algorithmic Learning Theory: 18th International Conference, 2007
Cited by 113 (44 self)
We describe a technique for comparing distributions without the need for density estimation as an intermediate step. Our approach relies on mapping the distributions into a reproducing kernel Hilbert space. Applications of this technique can be found in two-sample tests, which are used for determining whether two sets of observations arise from the same distribution, covariate shift correction, local learning, measures of independence, and density estimation. Kernel methods are widely used in supervised learning [1, 2, 3, 4]; however, they are much less established in the areas of testing, estimation, and analysis of probability distributions, where information theoretic approaches [5, 6] have long been dominant. Recent examples include [7] in the context of construction of graphical models, [8] in the context of feature extraction, and [9] in the context of independent component analysis. These methods have by and large a common issue: to compute quantities such as the mutual information, entropy, or Kullback-Leibler divergence, we require sophisticated space partitioning and/or
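One common way to compare two samples through their kernel mean embeddings is the squared maximum mean discrepancy (MMD), the squared distance between the two embeddings in the reproducing kernel Hilbert space. The sketch below computes the biased (V-statistic) estimate for one-dimensional samples. The Gaussian kernel, the bandwidth `sigma`, and the function names are assumptions made for this example, not details taken from the paper.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two real numbers."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))


def mmd_squared(xs, ys, sigma=1.0):
    """Biased estimate of the squared MMD between samples xs and ys.

    Expands ||mean embedding of xs - mean embedding of ys||^2 into three
    averages of kernel evaluations; no density estimate is needed.
    """
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, sigma) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


if __name__ == "__main__":
    same = mmd_squared([0.0, 1.0, 2.0], [0.0, 1.0, 2.0])
    apart = mmd_squared([0.0, 0.1, 0.2], [5.0, 5.1, 5.2])
    print(same, apart)
```

The estimate is zero for identical samples and grows as the samples separate. Turning it into a two-sample test additionally requires a null distribution or threshold, which this sketch does not provide.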
Model-based Variance-stabilizing Transformation for Illumina Microarray Data
- Nucleic Acids Res, 2008
doi:10.1093/nar/gkm1075
Microarrays, empirical Bayes and the two-groups model
- Statist. Sci., 2006
Cited by 75 (10 self)
The classic frequentist theory of hypothesis testing developed by Neyman, Pearson, and Fisher has a claim to being the Twentieth Century’s most influential piece of applied mathematics. Something new is happening in the Twenty-First Century: high throughput devices, such as microarrays, routinely require simultaneous hypothesis tests for thousands of individual cases, not at all what the classical theory had in mind. In these situations empirical Bayes information begins to force itself upon frequentists and Bayesians alike. The two-groups model is a simple Bayesian construction that facilitates empirical Bayes analysis. This article concerns the interplay of Bayesian and frequentist ideas in the two-groups setting, with particular attention focussed on Benjamini and Hochberg’s False Discovery Rate method. Topics include the choice and meaning of the null hypothesis in large-scale testing situations, power considerations, the limitations of permutation methods, significance testing for groups of cases (such as pathways in microarray studies), correlation effects, multiple confidence intervals, and Bayesian competitors to the two-groups model.
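For readers unfamiliar with the Benjamini and Hochberg False Discovery Rate method mentioned in this abstract, the step-up procedure can be sketched as follows: sort the m p-values, find the largest rank k whose p-value is at most k * q / m, and reject the k hypotheses with the smallest p-values. The function name `benjamini_hochberg` and the default level `q=0.05` are choices made for this example.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return the sorted indices of hypotheses rejected by the BH step-up rule."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        # Remember the largest rank whose p-value falls under its threshold.
        if p_values[idx] <= rank * q / m:
            k = rank
    # Reject every hypothesis ranked at or below k, even those that
    # individually exceeded their own threshold.
    return sorted(order[:k])


if __name__ == "__main__":
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p, q=0.05))
```

Because the rule is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value further down the sorted list meets its threshold.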
Regional and cellular gene expression changes in human Huntington's disease brain
- Hum Mol Genet
Cited by 71 (7 self)
Huntington's disease (HD) pathology is well understood at a histological level but a comprehensive molecular analysis of the effect of the disease in the human brain has not previously been available. To elucidate the molecular phenotype of HD on a genome-wide scale, we compared mRNA profiles from 44 human HD brains with those from 36 unaffected controls using microarray analysis. Four brain regions were analyzed: caudate nucleus, cerebellum, prefrontal association cortex [Brodmann's area 9 (BA9)] and motor cortex [Brodmann's area 4 (BA4)]. The greatest number and magnitude of differentially expressed mRNAs were detected in the caudate nucleus, followed by motor cortex, then cerebellum. Thus, the molecular phenotype of HD generally parallels established neuropathology. Surprisingly, no mRNA changes were detected in prefrontal association cortex, thereby revealing subtleties of pathology not previously disclosed by histological methods. To establish that the observed changes were not simply the result of cell loss, we examined mRNA levels in laser-capture microdissected neurons from Grade 1 HD caudate compared to control. These analyses confirmed changes in expression seen in tissue homogenates; we thus conclude that mRNA changes are not attributable to cell loss alone. These data from bona fide HD brains comprise an important reference for hypotheses related to HD and other neurodegenerative diseases.