Gene-set approach for expression pattern analysis
- Briefings in Bioinformatics 9, 2008. Cited by 92 (7 self)
Recently developed gene set analysis methods evaluate differential expression patterns of gene groups instead of those of individual genes. This approach especially targets gene groups whose constituents show subtle but coordinated expression changes, which might not be detected by the usual individual gene analysis. The approach has been quite successful in deriving new information from expression data, and a number of methods and tools have been developed intensively in recent years. We review those methods and currently available tools, classify them according to the statistical methods employed, and discuss their pros and cons. We also discuss several interesting extensions to the methods.
GeneCodis: interpreting gene lists through enrichment analysis and integration of diverse biological information
- Nucleic Acids Research 37: W317, 2009. Cited by 49 (3 self)
Assessing the Biological Significance of Gene Expression Signatures and Co-Expression Modules by Studying Their Network Properties
- Cited by 11 (2 self)
Microarray experiments have been extensively used to define signatures, which are sets of genes that can be considered markers of experimental conditions (typically diseases). Paradoxically, in spite of the apparent functional role that might be attributed to such gene sets, signatures do not seem to be reproducible across experiments. Given the close relationship between function and protein interaction, network properties can be used to study the extent to which signatures are composed of genes whose resulting proteins show a considerable level of interaction (and consequently a putative common functional role). We have analysed 618 signatures and 507 modules of co-expression in cancer, looking for significant values of four main protein-protein interaction (PPI) network parameters: connection degree, cluster coefficient, betweenness and number of components. A total of 3904 gene ontology (GO) modules, 146 KEGG pathways, and 263 Biocarta pathways have been used as functional modules of reference. Co-expression modules found in microarray experiments display a high level of connectivity, similar to the one shown by conventional modules based on functional definitions (GO, KEGG and Biocarta). A general observation for all the classes studied is that the networks formed by the modules improve their topological parameters when an external protein is allowed to be introduced within the paths (up to 70% of GO modules show network parameters beyond the random expectation). This suggests that functional definitions are incomplete and some genes might still be missing. Conversely, signatures are clearly not capturing the altered functions in the
ProbCD: enrichment analysis accounting for categorization uncertainty
- BMC Bioinformatics, 2007. Cited by 10 (1 self)
As in many other areas of science, systems biology makes extensive use of statistical association and significance estimates in contingency tables, a type of categorical data analysis known in this field as enrichment (also over-representation or enhancement) analysis. In spite of efforts to create probabilistic annotations, especially in the Gene Ontology context, or to deal with uncertainty in high throughput-based datasets, current enrichment methods largely ignore this probabilistic information since they are mainly based on variants of the Fisher Exact Test. We developed an open-source R package to deal with probabilistic categorical data analysis, ProbCD, that does not require a static contingency table. The contingency table for the enrichment problem is built using the expectation of a Bernoulli Scheme stochastic process given the categorization probabilities. An on-line interface was created to allow usage by non-programmers and is available at:
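The idea of building a contingency table from categorization probabilities can be illustrated with a minimal Python sketch. It is one plausible reading of the abstract, not ProbCD's actual implementation. The function name, the argument names, and the assumption that category membership and list membership are independent Bernoulli variables for each gene are all hypothetical.

```python
def expected_contingency_table(p_category, p_selected):
    """Expected 2x2 table for an enrichment problem with uncertain labels.

    p_category[i]: probability that gene i belongs to the category (assumed input).
    p_selected[i]: probability that gene i belongs to the gene list (assumed input).
    Assumes the two memberships are independent for each gene.
    Rows: in category / not in category. Columns: selected / not selected.
    """
    if len(p_category) != len(p_selected):
        raise ValueError("both probability lists must have one entry per gene")
    table = [[0.0, 0.0], [0.0, 0.0]]
    for p, q in zip(p_category, p_selected):
        if not (0.0 <= p <= 1.0 and 0.0 <= q <= 1.0):
            raise ValueError("probabilities must lie in [0, 1]")
        # Each gene contributes its cell probabilities to the expected counts.
        table[0][0] += p * q
        table[0][1] += p * (1.0 - q)
        table[1][0] += (1.0 - p) * q
        table[1][1] += (1.0 - p) * (1.0 - q)
    return table


if __name__ == "__main__":
    # Three illustrative genes: one certain member, one certain non-member,
    # one with a 50% annotation probability.
    print(expected_contingency_table([1.0, 0.0, 0.5], [1.0, 1.0, 0.0]))
```

With crisp labels (all probabilities 0 or 1) the table reduces to the ordinary integer counts used by a Fisher exact test. With fractional probabilities the cells are generally non-integer, so a test that accepts real-valued expected counts would be needed.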
GeneCodis3: a non-redundant and modular enrichment analysis tool for functional genomics
- Nucleic Acids Res. 2012; 40:W478–W483. [PubMed: 22573175]. Cited by 9 (0 self)
Since its first release in 2007, GeneCodis has become a valuable tool to functionally interpret results from experimental techniques in genomics. This web-based application integrates different sources of information to find groups of genes with similar biological meaning. This process, known as enrichment analysis, is essential in the interpretation of high-throughput experiments. Frequent user feedback and the natural evolution of genomics and bioinformatics have allowed the growth of the tool and the development of this third release. In this version, a special effort has been made to remove noisy and redundant output from the enrichment results with the inclusion of a recently reported algorithm that summarizes significantly enriched terms and generates functionally coherent modules of genes and terms. A new comparative analysis has been added to allow the differential analysis of gene sets. To expand the scope of the application, new sources of biological information have been included, such as genetic diseases, drug-gene interactions and PubMed information, among others. Finally, the graphic section has been renewed with the inclusion of new interactive graphics and filtering options. The application is freely available at
Fuzzy ensemble clustering for DNA microarray data analysis
- Cited by 6 (1 self)
Two major problems in the unsupervised analysis of gene expression data are the accuracy and reliability of the discovered clusters, and the biological fact that classes of examples or classes of functionally related genes are sometimes not clearly defined. To address these problems, we propose a fuzzy ensemble clustering approach that both improves the accuracy of clustering results and takes into account the inherent fuzziness of biological and biomedical gene expression data. Preliminary results with DNA microarray data from lymphoma and adenocarcinoma patients show the effectiveness of the proposed approach.
Discovering multi-level structures in bio-molecular data through the Bernstein inequality
- BMC Bioinformatics, 2008. Cited by 6 (1 self)
Background: The unsupervised discovery of structures (i.e. clusterings) underlying data is a central issue in several branches of bioinformatics. Methods based on the concept of stability have recently been proposed to assess the reliability of a clustering procedure and to estimate the "optimal" number of clusters in bio-molecular data. A major problem with stability-based methods is the detection of multi-level structures (e.g. hierarchical functional classes of genes), and the assessment of their statistical significance. In this context, a chi-square based statistical test of hypothesis has been proposed; however, to ensure the correctness of this technique some assumptions about the distribution of the data are needed. Results: To assess the statistical significance and to discover multi-level structures in bio-molecular data, a new method based on Bernstein's inequality is proposed. This approach makes no assumptions about the distribution of the data, thus ensuring a reliable application to a large range of bioinformatics problems. Results with synthetic and DNA microarray data show the effectiveness of the proposed method. Conclusions: The Bernstein test, due to its loose assumptions, is more sensitive than the chi-square test to the detection of multiple structures simultaneously present in the data. It is, however, less selective, that is, subject to more false positives; a more selective variant of the Bernstein inequality-based test, obtained by adding independence assumptions, is also presented. The proposed methods can be applied to discover multiple structures and to assess their significance in different types of bio-molecular data.
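For orientation, the standard textbook form of Bernstein's inequality can be evaluated with a short Python sketch. This is only the generic tail bound for a sum of independent, zero-mean, bounded random variables. It is not the test statistic or procedure of the paper, and the function and parameter names are hypothetical.

```python
import math


def bernstein_tail_bound(t, variances, bound):
    """Upper bound on P(X_1 + ... + X_n >= t) from Bernstein's inequality.

    Assumes the X_i are independent, have zero mean, and satisfy |X_i| <= bound.
    variances[i] is Var(X_i). No further distributional assumption is needed.
    """
    if t <= 0:
        raise ValueError("t must be positive")
    if bound <= 0:
        raise ValueError("bound must be positive")
    total_variance = sum(variances)
    # exp( - (t^2 / 2) / (sum of variances + bound * t / 3) )
    return math.exp(-(t * t / 2.0) / (total_variance + bound * t / 3.0))


if __name__ == "__main__":
    # 100 centered fair-coin variables: variance 0.25 each, |X_i| <= 0.5.
    print(bernstein_tail_bound(10.0, [0.25] * 100, 0.5))
```

Because the bound depends only on the variances and a uniform bound on the variables, it holds without a parametric model of the data, which matches the abstract's remark that the Bernstein-based approach makes no distributional assumptions.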
Gene set internal coherence in the context of functional profiling
- BMC Genomics, 2009
GeneBrowser: an approach for integration and functional classification of genomic data
- Journal of Integrative Bioinformatics. Cited by 2 (2 self)
The achievements coming from genome analysis depend greatly on the quality of computational and processing methods. Tools for functional mRNA profiling and for gene information integration have become essential to this task. We have developed GeneBrowser as a novel approach that combines the advantages of mRNA profiling tools for genome-scale experiments with the features provided by data integration systems. For a given set of genes, GeneBrowser integrates bibliography information with functional annotations, using Gene Ontology, Entrez Gene, KEGG Orthology and KEGG Pathways. The result is a comprehensive and easy-to-use web application that helps researchers to extract knowledge from large data sets and to speed up the discovery process. Availability: GeneBrowser is freely available at