A comprehensive library of DNA-binding site matrices for 55 proteins applied to the complete Escherichia coli K-12 genome (1998)

by K Robison, A M McGuire, G M Church
Venue: J Mol Biol

Results 1 - 10 of 146

Combining phylogenetic data with co-regulated genes to identify regulatory motifs

by Ting Wang, Gary D. Stormo - BIOINFORMATICS , 2003
Cited by 136 (11 self)
Motivation: Discovery of regulatory motifs in unaligned DNA sequences remains a fundamental problem in computational biology. Two categories of algorithms have been developed to identify common motifs from a set of DNA sequences. The first can be called a ‘multiple genes, single species’ approach. It proposes that a degenerate motif is embedded in some or all of the otherwise unrelated input sequences and tries to describe a consensus motif and identify its occurrences. It is often used for co-regulated genes identified through experimental approaches. The second approach can be called ‘single gene, multiple species’. It requires orthologous input sequences and tries to identify unusually well conserved regions by phylogenetic footprinting. Both approaches perform well, but each has some limitations. It is tempting to combine the knowledge of co-regulation among different genes and conservation among orthologous genes to improve our ability to identify motifs. Results: Based on the Consensus algorithm previously established by our group, we introduce a new algorithm called PhyloCon (Phylogenetic Consensus) that takes into account both conservation among orthologous genes and co-regulation of genes within a species. This algorithm first aligns conserved regions of orthologous sequences into multiple sequence alignments, or profiles, then compares profiles representing non-orthologous sequences. Motifs emerge as common regions in these profiles. Here we present a novel statistic to compare profiles of DNA sequences and a greedy approach to search for common subprofiles. We demonstrate that PhyloCon performs well on both synthetic and biological data. Availability: Software available upon request from the authors.

Citation Context

...um LT2 and Vibrio cholerae were used (K. Tan, personal communication). PhyloCon identified the top site as AGACRTCYRGACGKCTA, which is almost identical to the documented metJ site (AGACGTYYAGAYGTCY) (Robison et al., 1998) except for one extra base. As demonstrated by simulation, PhyloCon works well even with a limited number of genes or species. Caenorhabditis elegans genes F44E5.5, T27E4.2 and M01B12.1 are regulated...
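The "almost identical ... except for one extra base" comparison in the context above can be checked mechanically. The following is a minimal sketch, not the procedure used by the PhyloCon authors: it treats both strings as IUPAC degenerate consensus sequences and slides one along the other without gaps, counting positions whose allowed base sets overlap. The helper names `compatible` and `best_overlap` are hypothetical.

```python
# IUPAC nucleotide codes mapped to the bases each symbol allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def compatible(a, b):
    """True if two IUPAC symbols share at least one base."""
    return bool(set(IUPAC[a]) & set(IUPAC[b]))

def best_overlap(x, y):
    """Slide y along x without gaps.

    Returns (offset, compatible positions, overlap length) for the
    ungapped alignment with the most compatible positions.
    """
    best = (0, -1, 0)
    for offset in range(-(len(y) - 1), len(x)):
        lo, hi = max(0, offset), min(len(x), offset + len(y))
        pairs = [(x[i], y[i - offset]) for i in range(lo, hi)]
        score = sum(compatible(a, b) for a, b in pairs)
        if score > best[1]:
            best = (offset, score, len(pairs))
    return best

predicted = "AGACRTCYRGACGKCTA"   # PhyloCon top site quoted above (17 symbols)
documented = "AGACGTYYAGAYGTCY"   # documented metJ site quoted above (16 symbols)
offset, matches, overlap = best_overlap(predicted, documented)
print(offset, matches, overlap)
```

Under this compatibility rule, all 16 positions of the documented site agree with the first 16 symbols of the predicted site, leaving the final A of the prediction as the one extra base.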

Gibbs recursive sampler: finding transcription factor binding sites

by William Thompson, Eric C. Rouchka, Charles E. Lawrence - Nucleic Acids Res , 2003
Cited by 92 (7 self)
The Gibbs Motif Sampler is a software package for locating common elements in collections of biopolymer sequences. In this paper we describe a new variation of the Gibbs Motif Sampler, the Gibbs Recursive Sampler, which has been developed specifically for locating multiple transcription factor binding sites for multiple transcription factors simultaneously in unaligned DNA sequences that may be heterogeneous in DNA composition. Here we describe the basic operation of the web-based version of this sampler. The sampler may be accessed at

Citation Context

...ate common elements in collections of biopolymer sequences. It has been applied to the analysis of protein sequences (1,2). Gibbs sampling has also been used extensively in the identification of TFBS (3,4) and an earlier version of this software has been available at this web location for some time. In this paper we describe a new variation, the Gibbs Recursive Sampler, designed to search for multiple ...
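For readers unfamiliar with Gibbs sampling for binding-site discovery, the following is a minimal sketch of a basic site sampler that assumes exactly one site per sequence and a uniform background. It is not the Gibbs Recursive Sampler described above, which handles multiple sites and multiple factors. The function name, parameters, and toy sequences are all hypothetical.

```python
import random

def gibbs_site_sampler(seqs, width, iterations=200, seed=0, pseudocount=1.0):
    """Sample one motif start per sequence; return the list of starts."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        for held_out, seq in enumerate(seqs):
            # Position-specific counts from every sequence except the held-out one.
            counts = [{b: pseudocount for b in "ACGT"} for _ in range(width)]
            for j, other in enumerate(seqs):
                if j == held_out:
                    continue
                for k in range(width):
                    counts[k][other[starts[j] + k]] += 1
            denom = (len(seqs) - 1) + 4 * pseudocount
            # Weight each candidate start by its motif-to-background likelihood ratio.
            weights = []
            for start in range(len(seq) - width + 1):
                w = 1.0
                for k in range(width):
                    w *= (counts[k][seq[start + k]] / denom) / 0.25
                weights.append(w)
            # Resample the held-out start in proportion to those weights.
            starts[held_out] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts

# Toy input: each sequence contains the planted word TTGACA.
toy_seqs = ["ACGTTGACATCG", "TTGACAGGCTAA", "CCATTGACATGC", "GGTTGACACCTA"]
starts = gibbs_site_sampler(toy_seqs, width=6)
print([seq[s:s + 6] for s, seq in zip(starts, toy_seqs)])
```

Because the procedure is stochastic, the sampled sites depend on the seed and the number of iterations; practical tools typically run several restarts and report the highest-scoring alignment.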

STAMP: a web tool for exploring DNA-binding motif similarities

by Shaun Mahony, Panayiotis V. Benos - Nucleic Acids Res, 35(Web Server issue), 2007
Cited by 72 (1 self)
doi:10.1093/nar/gkm272

Citation Context

...ifs [predicted by Harbison et al. (13) and MacIsaac et al. (14)], (iv) Drosophila motifs [DNase I footprinting data from (15), motifs generated by Dan Pollard], (v) DPInteract Escherichia coli motifs (16) and (vi) RegTransBase prokaryotic motifs (17). Alternatively, users may upload their own dataset of motifs to query the input motifs against. Users may choose to get listings of 1, 5 or 10 of the bes...

Sigma70 promoters in Escherichia coli: specific transcription in dense regions of overlapping promoter-like signals

by Araceli M. Huerta, Julio Collado-Vides - J Mol Biol , 2003
Cited by 68 (11 self)
We present here a computational analysis showing that sigma70 house-keeping promoters are located within zones with high densities of promoter-like signals in Escherichia coli, and we introduce strategies that allow for the correct computer prediction of sigma70 promoters. Based on 599 experimentally verified promoters of E. coli K-12, we generated and evaluated more than 200 weight matrices optimizing different criteria to obtain the best recognition matrices. The alignments generating the best statistical models did not fully correspond with the canonical sigma70 model. However, matrices that correspond to such a canonical model performed better as tools for prediction. We tested the predictive capacity of these matrices on 250 bp long regions upstream of gene starts, where 90% of the known promoters occur. The computational matrix models generated an average of 38 promoter-like signals within each 250 bp region. In more than 50% of the cases, the true promoter does not have the best score within the region. We observed, in fact, that real promoters

Ab initio prediction of transcription factor targets using structural knowledge

by Tommy Kaplan, Nir Friedman, Hanah Margalit - PLoS Comput Biol , 2005
Cited by 68 (1 self)
Current approaches for identification and detection of transcription factor binding sites rely on an extensive set of known target genes. Here we describe a novel structure-based approach applicable to transcription factors with no prior binding data. Our approach combines sequence data and structural information to infer context-specific amino acid–nucleotide recognition preferences. These are used to predict binding sites for novel transcription factors from the same structural family. We demonstrate our approach on the Cys2His2 Zinc Finger protein family, and show that the learned DNA-recognition preferences are compatible with experimental results. We use these preferences to perform a genome-wide scan for direct targets of Drosophila melanogaster Cys2His2 transcription factors. By analyzing the predicted targets along with gene annotation and expression data we infer the function and activity of these proteins. Citation: Kaplan T, Friedman N, Margalit H (2005) Ab initio prediction of transcription factor targets using structural knowledge. PLoS Comp Biol 1(1): e1.

Citation Context

...ences derived by the other computational approaches [5,15]. In addition, previous studies showed that there are discrepancies between SELEX-derived motifs and those derived from natural binding sites [30,31]. Indeed, our method yielded inferior predictions when information on artificial binding sequences was included in our training data. Figure 4C shows that our set of recognition preferences is superio...

Protein–DNA binding specificity predictions with structural models

by Alexandre V. Morozov, James J. Havranek, David Baker, Eric D. Siggia, 2005
Cited by 64 (3 self)
Abstract not found

Citation Context

...(2.8) 16 c D.melanogaster – Prd(homeo) 1fjl X-ray (2.0) 15 c D.melanogaster (37) Ubx/Exd 1b8i X-ray (2.4) 4 b D.melanogaster – Trl 1yui NMR (NA) 5 c D.melanogaster – MetJ 1mj2 X-ray (2.4) 16 c E.coli (32) TrpR 1tro X-ray (1.9) 15 c E.coli (32,33) PhoB 1gxp X-ray (2.5) 16 c E.coli (32) Ihf 1ihf X-ray (2.5) 27 c E.coli (32) DnaA 1j1v X-ray (2.1) 9 c E.coli (32) PurR 2puc X-ray (2.7) 23 c E.coli (32) Crp...

Virtual Footprint and PRODORIC: an integrative framework for regulon prediction in prokaryotes

by Richard Münch, Karsten Hiller, Andreas Grote, Maurice Scheer, Johannes Klein, Max Schobert, Dieter Jahn, 2005
Cited by 63 (11 self)
A new online framework for the accurate and integrative prediction of transcription factor binding sites (TFBSs) in prokaryotes was developed. The system consists of three interconnected modules: 1. The PRODORIC database as a comprehensive data source and extensive collection of TFBSs with corresponding position weight matrices (PWMs). 2. The pattern matching tool Virtual Footprint for the prediction of genome based regulons and for the analysis of individual promoter regions. 3. The interactive genome browser GBPro for the visualization of TFBS search results in their genomic context and links to gene and regulator-specific information in PRODORIC. The aim of this service is to provide researchers a free and easy to use collection of interconnected tools in the field of molecular microbiology, infection and systems biology.
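As an illustration of what matching a position weight matrix against a promoter region involves, here is a minimal sketch. It is not the Virtual Footprint implementation: the log-odds scoring against a uniform background, the pseudocount, the threshold, and the toy sites are all assumptions chosen for the example, and the function names are hypothetical.

```python
import math

def build_log_odds(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM (one dict per position) from equal-length aligned sites."""
    pwm = []
    for pos in range(len(sites[0])):
        counts = {b: pseudocount for b in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background) for b in "ACGT"})
    return pwm

def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

# Toy aligned sites and a toy promoter region (illustrative only).
toy_sites = ["TATAAT", "TATGAT", "TACAAT", "TATACT"]
toy_pwm = build_log_odds(toy_sites)
hits = scan("GGCGTATAATGCGC", toy_pwm, threshold=5.0)
print(hits)
```

The threshold trades sensitivity against false positives; with short matrices, many windows in a real genome will exceed a permissive cutoff, which is why tools of this kind usually combine matches with genomic context.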

What is bioinformatics? A proposed definition and overview of the field.

by N M Luscombe, D Greenbaum, M Gerstein - Methods of Information in Medicine, 2001
Cited by 58 (2 self)
Abstract not found

Computational identification of transcriptional regulatory elements in DNA sequence

by Debraj GuhaThakurta, 2006
Cited by 55 (0 self)
Identification and annotation of all the functional elements in the genome, including genes and the regulatory sequences, is a fundamental challenge in genomics and computational biology. Since regulatory elements are frequently short and variable, their identification and discovery using computational algorithms is difficult. However, significant advances have been made in the computational methods for modeling and detection of DNA regulatory elements. The availability of complete genome sequence from multiple organisms, as well as mRNA profiling and high-throughput experimental methods for mapping protein-binding sites in DNA, have contributed to the development of methods that utilize these auxiliary data to inform the detection of transcriptional regulatory elements. Progress is also being made in the identification of cis-regulatory modules and higher order structures of the regulatory sequences, which is essential to the understanding of transcription regulation in the metazoan genomes. This article reviews the computational approaches for modeling and identification of genomic regulatory elements, with an emphasis on the recent developments, and current challenges.

Bayesian sparse hidden components analysis for transcription regulation networks

by Chiara Sabatti, Gareth M. James, 2005
Cited by 40 (1 self)
Motivation: In systems like E. coli, the abundance of sequence information, gene expression array studies, and small-scale experiments allows one to reconstruct the regulatory network and to quantify the effects of transcription factors on gene expression. However, this goal can only be achieved if all information sources are used in concert. Results: Our method integrates literature information, DNA sequences, and expression arrays. A set of relevant transcription factors is defined on the basis of literature. Sequence data is used to identify potential target genes and the results are used to define a prior distribution on the topology of the regulatory network. A Bayesian hidden component model for the expression array data allows us to identify which of the potential binding sites are actually used by the regulatory proteins in the studied cell conditions, the strength of their control, and their activation profile in a series of experiments. We apply our methodology to 35 expression studies in E. coli with convincing results. Availability: www.genetics.ucla.edu/labs/sabatti/software.html