SCOPE: a web server for practical de novo motif discovery
- Nucleic Acids Res, 2007
GimmeMotifs: a de novo motif prediction pipeline for ChIP-sequencing experiments
- Bioinformatics, 2011
Cited by 13 (1 self)
Summary: Accurate prediction of transcription factor binding motifs that are enriched in a collection of sequences remains a computational challenge. Here we report on GimmeMotifs, a pipeline that incorporates an ensemble of computational tools to predict motifs de novo from ChIP-sequencing (ChIP-seq) data. Similar redundant motifs are compared using the weighted information content (WIC) similarity score and clustered using an iterative procedure. A comprehensive output report is generated with several different evaluation metrics to compare and evaluate the results. Benchmarks show that the method performs well on human and mouse ChIP-seq datasets. GimmeMotifs consists of a suite of command-line scripts that can be easily implemented in a ChIP-seq analysis pipeline. Availability: GimmeMotifs is implemented in Python and runs on Linux. The source code is freely available for download at
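As a rough illustration of the quantity that motif similarity scores of this kind build on, the sketch below computes the plain information content of a single motif column. It is not the WIC score from the abstract, whose definition is not given here; the function name and the uniform background over four bases are assumptions made for this example.

```python
import math

def column_information_content(freqs):
    """Information content (bits) of one motif column.

    Assumes a uniform background over A, C, G, T, so the maximum is 2 bits.
    This is the textbook measure, not the WIC score named in the abstract.
    """
    entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
    return 2.0 - entropy

# Hypothetical columns, made up for illustration.
conserved = {"A": 0.0, "C": 0.0, "G": 0.0, "T": 1.0}
mixed = {"A": 0.0, "C": 0.75, "G": 0.25, "T": 0.0}
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

for name, col in [("conserved", conserved), ("mixed", mixed), ("uniform", uniform)]:
    print(name, round(column_information_content(col), 3))
```

A fully conserved column carries the maximum of 2 bits and a uniform column carries none, which is why comparisons that weight columns by information content emphasize the conserved core of a motif.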
Z (2009) Genome-wide de novo prediction of cis-regulatory binding sites in prokaryotes. Nucleic Acids Res 37: e72
Apples to apples: improving the performance of motif finders and their significance analysis in the Twilight Zone
- Bioinformatics, 2006
doi:10.1093/bioinformatics/btl245
Regulatory Motif Discovery Using a Population Clustering Evolutionary Algorithm
Cited by 7 (1 self)
Abstract: This paper describes a novel evolutionary algorithm for regulatory motif discovery in DNA promoter sequences. The algorithm uses data clustering to logically distribute the evolving population across the search space. Mating then takes place within local regions of the population, promoting overall solution diversity and encouraging discovery of multiple solutions. Experiments using synthetic data sets have demonstrated the algorithm’s capacity to find position frequency matrix models of known regulatory motifs in relatively long promoter sequences. These experiments have also shown the algorithm’s ability to maintain diversity during search and discover multiple motifs within a single population. The utility of the algorithm for discovering motifs in real biological data is demonstrated by its ability to find meaningful motifs within muscle-specific regulatory sequences. Index Terms: Evolutionary computation, population-based data clustering, motif discovery, transcription factor binding sites, muscle-specific gene expression.
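For readers unfamiliar with the position frequency matrix models mentioned in this abstract, a minimal sketch follows. It only shows how such a matrix is built from already-aligned sites; it does not implement the evolutionary search described in the paper. The function names and the example sites are hypothetical.

```python
from collections import Counter

ALPHABET = "ACGT"

def position_frequency_matrix(sites):
    """Return one {base: frequency} dict per position of equal-length aligned sites."""
    if not sites:
        raise ValueError("need at least one site")
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for column in zip(*sites):  # iterate over alignment columns
        counts = Counter(column)
        matrix.append({b: counts.get(b, 0) / len(sites) for b in ALPHABET})
    return matrix

def consensus(matrix):
    """Most frequent base at each position (ties resolved in ALPHABET order)."""
    return "".join(max(ALPHABET, key=lambda b: col[b]) for col in matrix)

# Hypothetical aligned binding sites, made up for illustration.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TTACTCA"]
pfm = position_frequency_matrix(sites)
print(consensus(pfm))
```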
RegAnalyst: a web interface for the analysis of regulatory motifs, networks and pathways
- Nucleic Acids Res, 2009
Identification of weak motifs in multiple biological sequences using genetic algorithm
- In GECCO ’06: Proceedings of the 8th annual conference on Genetic and evolutionary computation, 2006
Cited by 5 (0 self)
Recognition of motifs in multiple unaligned sequences provides insight into protein structure and function. Discovering these motifs is very challenging because most of them occur in different sequences as different mutated forms of the original consensus motif and thus have weakly conserved regions. Different score metrics and algorithms have been proposed for motif recognition. In this paper, we propose a new genetic-algorithm-based method for identifying multiple motif instances in multiple biological sequences. The experimental results on simulated and real data show that our algorithm can identify multiple occurrences of a weak motif in single sequences as well as in multiple sequences. Moreover, it can identify weakly conserved regions more accurately than other genetic-algorithm-based motif discovery methods.
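To make the notion of a weakly conserved motif instance concrete, the sketch below lists every window of a sequence that differs from a consensus by at most a given number of mismatches. This is a brute-force scan for illustration only, not the genetic algorithm proposed in the paper; the function names, sequences, and mismatch threshold are assumptions.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def approximate_occurrences(sequence, consensus, max_mismatches):
    """Return (start, window) pairs within max_mismatches of the consensus."""
    width = len(consensus)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if hamming(window, consensus) <= max_mismatches:
            hits.append((start, window))
    return hits

# Hypothetical sequences, made up for illustration.
sequences = ["AATGACTCAGG", "CCTGAGTCATT"]
hits = [approximate_occurrences(s, "TGACTCA", 1) for s in sequences]
print(hits)
```

The second sequence contains only a mutated copy of the consensus, which an exact-match search would miss.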
Recent Advances in the Computational Discovery of Transcription Factor Binding Sites
IEM: AN ALGORITHM FOR ITERATIVE ENHANCEMENT OF MOTIFS USING COMPARATIVE GENOMICS DATA
Cited by 4 (3 self)
Understanding gene regulation is a key step to investigating gene functions and their relationships. Many algorithms have been developed to discover transcription factor binding sites (TFBS); these sites are predominantly located in upstream regions of genes and contribute to transcription regulation if they are bound by a specific transcription factor. However, traditional methods focusing on finding motifs have shortcomings, which can be overcome by using comparative genomics data that is now increasingly available. Traditional methods to score motifs also have their limitations. In this paper, we propose a new algorithm called IEM to refine motifs using comparative genomics data. We show the effectiveness of our techniques with several data sets. Two sets of experiments were performed with comparative genomics data on five strains of P. aeruginosa. One set of experiments was performed with similar data on four species of yeast. The weighted conservation score proposed in this paper is an improvement over existing motif scores.