### Citations

8731 | Controlling the false discovery rate: a practical and powerful approach to multiple testing
- Benjamini, Hochberg
- 1995
Citation Context ... n− x1 N − n− x2 N − x1 − x2 Total n N − n N Multiple comparison procedures (MCPs) are then applied to the resulting p-values to prevent excessive false-positive rates. The false discovery rate (FDR) [3] is frequently used to control the expected proportion of incorrectly rejected null hypotheses in gene enrichment studies [22, 25, 29] because it has lower false-negative rates than the Bonferroni cor...

2261 | KEGG: Kyoto encyclopedia of genes and genomes
- Kanehisa, Goto
- 2000
Citation Context ...t, is termed the feature enrichment problem. The biological information term may be, for instance, a Gene Ontology (GO) category [1] or a pathway in the Kyoto Encyclopedia of Genes and Genomes (KEGG) [20]. This problem has been addressed using a number of high-throughput enrichment tools, including DAVID [10], MAPPFinder [11], Onto-Express [21], and GoMiner [31]. Huang et al. [18] reviewed 68 distinct...

879 | Theory of probability
- Jeffreys
- 1961
Citation Context ...y the NML ratio with that estimated by the MLE2 (left) and MLE3 (right). The integers are defined in Figure 1. The grey dashed lines mark commonly used thresholds for strong and overwhelming evidence [19, 7]. sentation (“enrichment”) is equal to the number with underrepresentation (“depletion”), we assessed the sensitivity of the MLE to that symmetry assumption by using strongly asymmetric log odds ratio...

337 | The positive false discovery rate: a Bayesian interpretation and the q-value.
- Storey
- 2003
Citation Context ...hment studies [22, 25, 29] because it has lower false-negative rates than the Bonferroni correction and other methods of controlling the family-wise error rate. Methods of FDR control assign q-values [28] to biological categories, but q-values are too low to reliably estimate the probability that the biological category has equivalent representation among the preselected features. Thus, we study appli...

300 | Large-scale simultaneous hypothesis testing: the choice of a null hypothesis
- Efron
Citation Context ...7] used an LFDR estimator to solve a GSEA problem and pointed out that this was less biased than the q-value for estimating the LFDR, the posterior probability that the null hypothesis is true. Efron [12, 13] devised reliable LFDR estimators for a range of applications in microarray gene expression analysis and other problems of large-scale inference. However, whereas microarray gene expression analysis t...

288 | Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists.
- Huang, Sherman, et al.
- 2009
Citation Context ...s and Genomes (KEGG) [20]. This problem has been addressed using a number of high-throughput enrichment tools, including DAVID [10], MAPPFinder [11], Onto-Express [21], and GoMiner [31]. Huang et al. [18] reviewed 68 distinct feature enrichment analysis tools. These authors further classified feature enrichment analysis tools into 3 categories: singular enrichment analysis (SEA), gene set enrichment a...

272 | The Minimum Description Length Principle
- Grünwald
- 2007
Citation Context ...$0, 1, \ldots, s_i\}$. The regret is defined as $\operatorname{reg}(\bar{f}, t_i \mid s_i; \Theta) = \log \frac{f_{\hat{\theta}_i(t_i \mid s_i)}(t_i \mid s_i)}{\bar{f}(t_i \mid s_i)}$ (19) where $\hat{\theta}_i(t_i \mid s_i)$ is the Type I MLE with respect to the $\Theta$ under the observed values $t_i$ given $s_i$ [6, 16]. For all members of $E_i$, the optimal predictive conditional probability mass function of GO category $i$, denoted as $f_i^{\dagger}$, minimizes the maximal regret in the sample space $\{0, 1, \ldots, s_i\}$ in the sen...

232 | GoMiner: a resource for biological interpretation of genomic and proteomic data.
- Zeeberg
- 2003
Citation Context ...ncyclopedia of Genes and Genomes (KEGG) [20]. This problem has been addressed using a number of high-throughput enrichment tools, including DAVID [10], MAPPFinder [11], Onto-Express [21], and GoMiner [31]. Huang et al. [18] reviewed 68 distinct feature enrichment analysis tools. These authors further classified feature enrichment analysis tools into 3 categories: singular enrichment analysis (SEA), ge...

175 | MAPPFinder: using gene ontology and GenMAPP to create a global gene-expression profile from microarray data
- Doniger, Salomonis, et al.
- 2003
Citation Context ...egory [1] or a pathway in the Kyoto Encyclopedia of Genes and Genomes (KEGG) [20]. This problem has been addressed using a number of high-throughput enrichment tools, including DAVID [10], MAPPFinder [11], Onto-Express [21], and GoMiner [31]. Huang et al. [18] reviewed 68 distinct feature enrichment analysis tools. These authors further classified feature enrichment analysis tools into 3 categories: s...

165 | Inference and asymptotics
- Barndorff-Nielsen, Cox
- 1994
Citation Context ...of the statistic S(X1, X2) equals the nuisance parameter. Hence, from the observation of S(X1, X2) alone, the distribution of the statistic S(X1, X2) contains little information about the parameter θ [2]. The statistic S(X1, X2) satisfies the other 3 conditions of an ancillary statistic defined by Barndorff-Nielsen and Cox [2]: parameters θ and λ are variation independent; the statistic (T (X1, X2), ...

152 | Genetic mapping in human disease.
- Altshuler, Daly, et al.
- 2008
Citation Context ...rrepresented or underrepresented), compared to the reference feature set, is termed the feature enrichment problem. The biological information term may be, for instance, a Gene Ontology (GO) category [1] or a pathway in the Kyoto Encyclopedia of Genes and Genomes (KEGG) [20]. This problem has been addressed using a number of high-throughput enrichment tools, including DAVID [10], MAPPFinder [11], Ont...

113 | Profiling gene expression using Onto-Express. Genomics
- Khatri, Draghici, et al.
- 2002
Citation Context ...way in the Kyoto Encyclopedia of Genes and Genomes (KEGG) [20]. This problem has been addressed using a number of high-throughput enrichment tools, including DAVID [10], MAPPFinder [11], Onto-Express [21], and GoMiner [31]. Huang et al. [18] reviewed 68 distinct feature enrichment analysis tools. These authors further classified feature enrichment analysis tools into 3 categories: singular enrichment ...

106 | Large-scale inference: empirical Bayes methods for estimation, testing, and prediction, volume 1.
- Efron
- 2010
Citation Context ...7] used an LFDR estimator to solve a GSEA problem and pointed out that this was less biased than the q-value for estimating the LFDR, the posterior probability that the null hypothesis is true. Efron [12, 13] devised reliable LFDR estimators for a range of applications in microarray gene expression analysis and other problems of large-scale inference. However, whereas microarray gene expression analysis t...

92 | The Estimation of Probabilities
- Good
- 1965
Citation Context ...i = $\begin{cases} \min\left(\frac{m\,p_{2r_i}}{2r_i},\, 1\right), & r_i \le \frac{m}{2} \\ 1, & r_i > \frac{m}{2} \end{cases}$ (12) It is conservative in the sense that it tends to overestimate the LFDR [5]. 7 Type II maximum likelihood estimator Bickel [5] follows Good [15] in calling the maximization of likelihood over a hyperparameter Type II maximum likelihood to distinguish it from the usual Type I maximum likelihood, which pertains only to models that lack random p...

89 | Likelihood Methods in Statistics
- Severini
- 2000
Citation Context ... probability mass function of $T(x_1, x_2) = t$ evaluated at $S(x_1, x_2) = x_1 + x_2 = s$, say $\Pr(T = t \mid S = s; \theta, \lambda, N, n)$, does not depend on the nuisance parameter $\lambda$ [4]. See also Example 5 8.47 of Severini [27]. Thus, we derive the conditional probability mass function $f_\theta(t \mid s) = \Pr(T = t \mid S = s; \theta, n, N) = \frac{\binom{n}{t}\binom{N-n}{s-t}\, e^{t\theta}}{\sum_{j=\max(0,\, s+n-N)}^{\min(s,\, n)} \binom{n}{j}\binom{N-n}{s-j}\, e^{j\theta}}$ (6) understood as a function of $t$....

55 | DAVID: database for annotation, visualization, and integrated discovery. Genome Biol.
- Dennis, Sherman, et al.
- 2003
Citation Context ...Ontology (GO) category [1] or a pathway in the Kyoto Encyclopedia of Genes and Genomes (KEGG) [20]. This problem has been addressed using a number of high-throughput enrichment tools, including DAVID [10], MAPPFinder [11], Onto-Express [21], and GoMiner [31]. Huang et al. [18] reviewed 68 distinct feature enrichment analysis tools. These authors further classified feature enrichment analysis tools int...

29 | Bias in the estimation of false discovery rate in microarray studies
- Pawitan
- 2005
Citation Context ...sual Type I maximum likelihood, which pertains only to models that lack random parameters. Type II maximum likelihood has been applied to parametric mixture models for the analysis of microarray data [24, 23], proteomics data [9], and genetic association data [30]. In this section, we adapt the approach to the gene enrichment problem by using the conditional probability mass function defined above. Let G(...

19 | The strength of statistical evidence for composite hypotheses: Inference to the best explanation
- Bickel
- 2012
Citation Context ...y the NML ratio with that estimated by the MLE2 (left) and MLE3 (right). The integers are defined in Figure 1. The grey dashed lines mark commonly used thresholds for strong and overwhelming evidence [19, 7]. sentation (“enrichment”) is equal to the number with underrepresentation (“depletion”), we assessed the sensitivity of the MLE to that symmetry assumption by using strongly asymmetric log odds ratio...

17 | An empirical Bayes mixture method for effect size and false discovery rate estimation. The Annals of Applied Statistics
- Muralidharan
- 2010
Citation Context ...sual Type I maximum likelihood, which pertains only to models that lack random parameters. Type II maximum likelihood has been applied to parametric mixture models for the analysis of microarray data [24, 23], proteomics data [9], and genetic association data [30]. In this section, we adapt the approach to the gene enrichment problem by using the conditional probability mass function defined above. Let G(...

11 | Estimating the null distribution to adjust observed confidence levels for genome-scale screening
- Bickel
- 2010
Citation Context ...e microarray. The LFDR of every gene is estimated using the theoretical null hypothesis method of Efron [12, 13]; empirical null hypotheses can lead to excessive bias due to deviations from normality [8]. When we compared gene expression data for the presence and absence of estrogen after 10 hours of exposure, we obtained 74 DE genes. Defining unrelated pairs of GO categories as those that do not sha...

10 | Analyzing factorial designed microarray experiments
- Scholtens, Miron, et al.
- 2004
Citation Context ...0 (26) as an MLE-based estimator of BFi. 10 Results Breast cancer data analysis The data set used here is from an experiment applying an estrogen treatment to cells of a human breast cancer cell line [26]. The data, which is available from the Bioconductor project, contains 8 Affymetrix HG-U95Av2 CEL files from an estrogen receptor-positive breast cancer cell line. (For further information concerning ...

9 | Small-scale inference: empirical Bayes and confidence methods for as few as a single comparison. Tech Rep, Ottawa Inst Syst Biol. arXiv:1104.0341
- Bickel
- 2011
Citation Context ...ent problem typically concerns a much smaller number of GO categories. While those methods are appropriate for microarray-scale inference, they are less reliable for enrichment-scale inference Bickel [4, 9]. Thus, we will specifically adapt three types of LFDR estimators that are appropriate for smaller-scale inference to address the SEA problem. Here we will focus on genes and GO categories. Nevertheles...

9 | A comprehensive analysis of prognostic signatures reveals the high predictive capacity of the proliferation, immune response and RNA splicing modules in breast cancer
- Reyal
- 2008
Citation Context ... to prevent excessive false-positive rates. The false discovery rate (FDR) [3] is frequently used to control the expected proportion of incorrectly rejected null hypotheses in gene enrichment studies [22, 25, 29] because it has lower false-negative rates than the Bonferroni correction and other methods of controlling the family-wise error rate. Methods of FDR control assign q-values [28] to biological categor...

8 | Minimum description length methods of medium-scale simultaneous inference
- Bickel
- 2010
Citation Context ...ent problem typically concerns a much smaller number of GO categories. While those methods are appropriate for microarray-scale inference, they are less reliable for enrichment-scale inference Bickel [4, 9]. Thus, we will specifically adapt three types of LFDR estimators that are appropriate for smaller-scale inference to address the SEA problem. Here we will focus on genes and GO categories. Nevertheles...

8 | Variability of gene expression profiles in human blood and lymphoblastoid cell lines
- Min, Barrett, et al.
- 2010
Citation Context ... to prevent excessive false-positive rates. The false discovery rate (FDR) [3] is frequently used to control the expected proportion of incorrectly rejected null hypotheses in gene enrichment studies [22, 25, 29] because it has lower false-negative rates than the Bonferroni correction and other methods of controlling the family-wise error rate. Methods of FDR control assign q-values [28] to biological categor...

7 | Simple estimators of false discovery rates given as few as one or two p-values without strong parametric assumptions
- Bickel
- 2013
Citation Context ...semiparametric estimator (SPE) of LFDR of the GO category $i$ is $\widehat{\mathrm{LFDR}}_i = \begin{cases} \min\left(\frac{m\,p_{2r_i}}{2r_i},\, 1\right), & r_i \le \frac{m}{2} \\ 1, & r_i > \frac{m}{2} \end{cases}$ (12) It is conservative in the sense that it tends to overestimate the LFDR [5]. 7 Type II maximum likelihood estimator Bickel [5] follows Good [15] in calling the maximization of likelihood over a hyperparameter Type II maximum likelihood to distinguish it from the usual Type I...

7 | Minimum description length and empirical Bayes methods of identifying SNPs associated with disease
- Yang, Bickel
- 2010
Citation Context ...s that lack random parameters. Type II maximum likelihood has been applied to parametric mixture models for the analysis of microarray data [24, 23], proteomics data [9], and genetic association data [30]. In this section, we adapt the approach to the gene enrichment problem by using the conditional probability mass function defined above. Let G(s) = {gθ(•|s); θ ≥ 0} be a parametric family of probabil...

6 | A predictive approach to measuring the strength of statistical evidence for single and multiple comparisons
- Bickel
- 2011
Citation Context ...$0, 1, \ldots, s_i\}$. The regret is defined as $\operatorname{reg}(\bar{f}, t_i \mid s_i; \Theta) = \log \frac{f_{\hat{\theta}_i(t_i \mid s_i)}(t_i \mid s_i)}{\bar{f}(t_i \mid s_i)}$ (19) where $\hat{\theta}_i(t_i \mid s_i)$ is the Type I MLE with respect to the $\Theta$ under the observed values $t_i$ given $s_i$ [6, 16]. For all members of $E_i$, the optimal predictive conditional probability mass function of GO category $i$, denoted as $f_i^{\dagger}$, minimizes the maximal regret in the sample space $\{0, 1, \ldots, s_i\}$ in the sen...

3 | Local false discovery rate facilitates comparison of different microarray experiments
- Hong, Tibshirani, et al.
- 2009
Citation Context ...nt representation among the preselected features. Thus, we study application of better estimators of that probability, which is technically known as the local false discovery rate (LFDR). Hong et al. [17] used an LFDR estimator to solve a GSEA problem and pointed out that this was less biased than the q-value for estimating the LFDR, the posterior probability that the null hypothesis is true. Efron [1...

1 | Transcriptional regulatory dynamics of the hypothalamic-pituitary-gonadal axis and its peripheral pathways as impacted by the 3-beta HSD inhibitor trilostane in zebrafish (Danio rerio). Ecotoxicology and Environmental Safety
- Wang, Bencic, et al.
Citation Context ... to prevent excessive false-positive rates. The false discovery rate (FDR) [3] is frequently used to control the expected proportion of incorrectly rejected null hypotheses in gene enrichment studies [22, 25, 29] because it has lower false-negative rates than the Bonferroni correction and other methods of controlling the family-wise error rate. Methods of FDR control assign q-values [28] to biological categor...