Results 1-10 of 19
Network inference from co-occurrences
2008
Cited by 23 (1 self)
The discovery of networks is a fundamental problem
Identification and Evaluation of Functional Modules in Gene Coexpression Networks
Cited by 6 (1 self)
Identifying gene functional modules is an important step towards elucidating gene functions at a global scale. In this paper, we introduce a simple method to construct gene coexpression networks from microarray data, and then propose an efficient spectral clustering algorithm to identify natural communities, which are relatively densely connected subgraphs, in the network. To assess the effectiveness of our approach and its advantage over existing methods, we develop a novel method to measure the agreement between the gene communities and the modular structures in other reference networks, including protein-protein interaction networks, transcriptional regulatory networks, and gene networks derived from gene annotations. We evaluate the proposed methods on two large-scale gene expression data sets in budding yeast and Arabidopsis thaliana. The results show that the clusters identified by our method are functionally more coherent than the clusters from several standard clustering algorithms, such as k-means, self-organizing maps, and spectral clustering, and have high agreement with the modular structures in the reference networks.
Inferring network structure from co-occurrences
2006
Cited by 3 (1 self)
We consider the problem of inferring the structure of a network from co-occurrence data: observations that indicate which nodes occur in a signaling pathway but do not directly reveal node order within the pathway. This problem is motivated by network inference problems arising in computational biology and communication systems, in which it is difficult or impossible to obtain precise time ordering information. Without order information, every permutation of the activated nodes leads to a different feasible solution, resulting in combinatorial explosion of the feasible set. However, physical principles underlying most networked systems suggest that not all feasible solutions are equally likely. Intuitively, nodes that co-occur more frequently are probably more closely connected. Building on this intuition, we model path co-occurrences as randomly shuffled samples of a random walk on the network. We derive a computationally efficient network inference algorithm and, via novel concentration inequalities for importance sampling estimators, prove that a polynomial-complexity Monte Carlo version of the algorithm converges with high probability.
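The modeling intuition in this abstract can be illustrated with a small sketch. It is not the paper's inference algorithm: it only simulates the assumed data model (short random walks whose visited nodes are shuffled so the order is lost) and tallies pairwise co-occurrence counts. The four-node chain graph, the walk length of 3, the sample size, and all function names are hypothetical choices made for illustration.

```python
import random
from collections import Counter
from itertools import combinations


def random_walk(adjacency, start, length, rng):
    """Return a path of `length` nodes produced by a simple random walk."""
    path = [start]
    while len(path) < length:
        path.append(rng.choice(adjacency[path[-1]]))
    return path


def shuffled_cooccurrence(path, rng):
    """Keep the distinct nodes on the path and discard their order."""
    nodes = list(dict.fromkeys(path))
    rng.shuffle(nodes)
    return nodes


def pair_counts(observations):
    """Count how often each unordered pair of nodes appears together."""
    counts = Counter()
    for obs in observations:
        for a, b in combinations(sorted(obs), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical toy network: a chain a - b - c - d.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
rng = random.Random(0)
nodes = list(adjacency)

observations = [
    shuffled_cooccurrence(random_walk(adjacency, rng.choice(nodes), 3, rng), rng)
    for _ in range(500)
]
counts = pair_counts(observations)

for pair in sorted(counts):
    print(pair, counts[pair])
```

In this toy setting, directly linked pairs such as b and c should appear together more often than pairs two hops apart such as a and c, and the endpoints a and d never appear together because a three-node walk cannot span the chain. That is the frequency signal the abstract appeals to.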
Bayesian Hierarchical Model for Large-Scale Covariance Matrix Estimation
Cited by 2 (2 self)
Many bioinformatics problems implicitly depend on estimating a large-scale covariance matrix. The traditional approaches tend to give rise to high variance and low accuracy due to “overfitting.” We cast the large-scale covariance matrix estimation problem into the Bayesian hierarchical model framework, and introduce dependency between covariance parameters. We demonstrate the advantages of our approaches over the traditional approaches using simulations and OMICS data analysis. Key words: algorithms, computational molecular biology, statistics.
An effective method for network module extraction from microarray data
BMC Bioinformatics, 2012
Cited by 2 (1 self)
Background: The development of high-throughput microarray technologies has provided various opportunities to systematically characterize diverse types of computational biological networks. Coexpression networks have become popular in the analysis of microarray data, such as for detecting functional gene modules. Results: This paper presents a method to build a coexpression network (CEN) and to detect network modules from the built network. We use an effective gene expression similarity measure called NMRS (normalized mean residue similarity) to construct the CEN. We have tested our method on five publicly available benchmark microarray datasets. The network modules extracted by our algorithm have been biologically validated in terms of Q value and p value. Conclusions: Our results show that the technique is capable of detecting biologically significant network modules from the coexpression network. Biologists can use this technique to find groups of genes with similar functionality based on their expression information.
Differential coexpression analysis of obesity-associated networks in human subcutaneous adipose tissue
Cited by 1 (0 self)
Objective: To use a unique obesity-discordant sib-pair study design to combine differential expression analysis, expression quantitative trait loci (eQTLs) mapping, and a coexpression regulatory network approach in subcutaneous human adipose tissue to identify genes relevant to the obese state. Study design: Genome-wide transcript expression in subcutaneous human adipose tissue was measured using Affymetrix U133+2.0 microarrays and genome-wide genotyping data was obtained using an Applied Biosystems SNPlex linkage panel. Subjects: 154 Swedish families ascertained through an obese proband (Body Mass Index > 30 kg/m2) with a discordant sibling (BMI > 10 kg/m2 less than proband). Results: Approximately one-third of the transcripts were differentially expressed between lean and obese siblings. The cellular adhesion molecules (CAMs) KEGG grouping contained the largest number of differentially expressed genes under cis-acting genetic control. By using a novel approach to contrast CAMs coexpression networks between lean and obese siblings, a subset of differentially regulated genes was identified, with the previously GWAS obesity-associated NEGR1 as a central hub. Independent analysis using mouse data demonstrated that this finding for
A generalized multivariate approach to pattern discovery from replicated and incomplete genome-wide measurements
IEEE/ACM Trans Comput Biol Bioinform, 2011
Cited by 1 (1 self)
Estimation of pairwise correlation from incomplete and replicated molecular profiling data is a ubiquitous problem in pattern discovery analysis, such as clustering and networking. However, existing methods solve this problem by ad hoc data imputation, followed by aveGation coefficient type approaches, which might annihilate important patterns present in the molecular profiling data. Moreover, these approaches do not consider and exploit the underlying experimental design information that specifies the replication mechanisms. We develop an Expectation-Maximization (EM) type algorithm to estimate the correlation structure using incomplete and replicated molecular profiling data with a priori known replication mechanism. The approach is sufficiently generalized to be applicable to any known replication mechanism. In case of unknown replication mechanism, it is reduced to the parsimonious model introduced previously. The efficacy of our approach was first evaluated by comprehensively comparing various bivariate and multivariate imputation approaches using simulation studies. Results from real-world data analysis further confirmed the superior performance of the proposed approach to the commonly used approaches, where we assessed the robustness of the method using data sets with up to 30 percent missing values. Index Terms: Replicated data, pairwise correlation, pattern recognition, unsupervised learning, missing value.
Inferring network structure for . . .
We consider the problem of inferring the structure of a network from co-occurrence data: observations that indicate which nodes occur in a signaling pathway but do not directly reveal node order within the pathway. This problem is motivated by network inference problems arising in computational biology and communication systems, in which it is difficult or impossible to obtain precise time ordering information. Without order information, every permutation of the activated nodes leads to a different feasible solution, resulting in combinatorial explosion of the feasible set. However, physical principles underlying most networked systems suggest that not all feasible solutions are equally likely. Intuitively, nodes that co-occur more frequently are probably more closely connected. Building on this intuition, we model path co-occurrences as randomly shuffled samples of a random walk on the network. We derive a computationally efficient network inference algorithm and, via novel concentration inequalities for importance sampling estimators, prove that a polynomial-complexity Monte Carlo version of the algorithm converges with high probability.
METHODOLOGY ARTICLE Open Access
A general coexpression network-based approach to gene expression analysis: comparison and applications