Results 1 - 8 of 8
Eukaryotic Promoter Recognition
- Genome Res, 1997
Cited by 53 (0 self)
... http://gnomic.stanford.edu/~chris/GENSCANW.html). Because the signals that control the start and stop of transcription and translation, and the location of splicing, are still not very well understood, it is not uncommon for a gene-finding algorithm to confuse internal with initial and terminal exons, thus wrongly partitioning the exons. The problem is compounded by our incomplete understanding of alternative splicing control elements. Another line of development in gene identification is based on homology (e.g., Gish and States 1993; Gelfand et al. 1996). If there is a close homolog in the databases to one of the genes in the sequence under analysis, sequence similarity will usually group the exons for this gene correctly. Still, in many cases there is no close homolog and no guarantee when there is some homolog that the encoded protein lacks insertions/deletions. Clearly, some means of recognizing the beginnings of genes, probably via the promoter, or the ends, probabl
Identification of human gene core promoters in silico
- Genome Research, 1998
Cited by 18 (2 self)
Identification of the 5’-end of human genes requires identification of functional promoter elements. In silico identification of those elements is difficult because of the hierarchical and modular nature of promoter architecture. To address this problem, I propose a new stepwise strategy based on initial localization of a functional promoter into a 1-2 kb (extended-promoter) region from within a large genomic DNA sequence of 100 kb or larger, and further localization of a Transcriptional Start Site (TSS) into a 50-100 bp (core-promoter) region. Using position-dependent 5-tuple measures, a Quadratic Discriminant Analysis (QDA) method has been implemented in a new program, CorePromoter. Our experiments indicate that when given a 1-2 kb extended promoter, CorePromoter will correctly localize the TSS to a 100 bp interval approximately 60% of the time.
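The position-dependent 5-tuple measures mentioned in this abstract can be illustrated with a minimal sketch. The function names, window size, and step below are illustrative assumptions and are not taken from CorePromoter; the QDA classification step is omitted entirely.

```python
from collections import Counter

def five_tuple_counts(seq, k=5):
    """Count overlapping k-tuples (k=5 by default) in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def windowed_counts(seq, window=30, step=30, k=5):
    """Yield (start, Counter) per window so the counts stay tied to position.

    The window and step sizes are hypothetical defaults, not values
    reported in the abstract.
    """
    for start in range(0, len(seq) - window + 1, step):
        yield start, five_tuple_counts(seq[start:start + window], k)
```

Counts per window could then serve as the feature vector for a discriminant classifier; how the original program weights or normalizes them is not stated in the abstract.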
The Eukaryotic Promoter Database EPD: the impact of in silico primer extension
- Nucleic Acids Res, 2004
Cited by 16 (3 self)
The Eukaryotic Promoter Database (EPD) is an annotated non-redundant collection of eukaryotic POL II promoters, experimentally defined by a transcription start site (TSS). There may be multiple promoter entries for a single gene. The underlying experimental evidence comes from journal articles and, starting from release 73, from 5′ ESTs of full-length cDNA clones used for so-called in silico primer extension. Access to promoter sequences is provided by pointers to TSS positions in nucleotide sequence entries. The annotation part of an EPD entry includes a description of the type and source of the initiation site mapping data, links to other biological databases and bibliographic references. EPD is structured in a way that facilitates dynamic extraction of biologically meaningful promoter subsets for comparative sequence analysis. Web-based interfaces have been developed that enable the user to view EPD entries in different formats, to select and extract promoter sequences according to a variety of criteria and to navigate to related databases exploiting different cross-references. Tools for analysing sequence motifs around TSSs defined in EPD are provided by the signal search analysis server. EPD can be accessed at
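As a rough illustration of what a pointer to a TSS position in a nucleotide sequence entry allows, the sketch below extracts a promoter window around a TSS index. The function name, the 0-based coordinate convention, and the default window sizes are assumptions made for this example; they do not describe EPD's actual file format or tools.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def promoter_window(seq, tss, upstream=10, downstream=5, strand="+"):
    """Return bases from `upstream` before to `downstream` after the TSS.

    `tss` is a 0-based index into `seq` (a hypothetical convention).
    For strand "-", the window is reverse-complemented so the result
    always reads 5' to 3' relative to the transcript.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream + 1
    else:
        start, end = tss - downstream, tss + upstream + 1
    if start < 0 or end > len(seq):
        raise ValueError("window extends past the sequence")
    window = seq[start:end]
    if strand == "-":
        window = window.translate(COMPLEMENT)[::-1]
    return window
```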
Novel Neural Network Prediction Systems for Human Promoters and Splice Sites
- In Gene-Finding and Gene Structure Prediction Workshop, 1995
Cited by 6 (1 self)
We present a detailed theoretical study of the organization and structure of landmark sequences like promoters and splice junctions in Human DNA. An improved detection of these landmark sequences in genomic DNA is important for exon detection and gene assembly. The function of eukaryotic promoters as initiators for transcription and of splice sites as signals for RNA assembly are among the most complex processes in molecular biology. Both consist of multiple functional sites in primary DNA that are involved in the polymerase binding and splicing process, respectively. We analyzed the structure of the individual elements within promoters and splice sites using a novel technique that combines neural networks with weight pruning. For a complete promoter site prediction we combine these single predictions for each element using time-delay neural networks (TDNN). TDNNs are appropriate for recognizing promoter elements because they are able to combine multiple features, even those that ap...
A Discrimination Study Of Human Core-Promoters
- Proc. Pacific Symp. Biocomputing 1998. World Scientific, Singapore, 1998
Cited by 6 (2 self)
Introduction It is no secret that computational identification of eukaryotic RNAP II promoters is notoriously difficult. The field is still in its infancy and current algorithms are quite primitive (reviewed in Fickett & Hatzigeorgiou 1997). This is mainly due to our limited understanding about underlying molecular recognition mechanism of transcription initiation (e.g. Kornberg 1996; Nikolov & Burley 1997). Recent advances in molecular genetics, biochemistry and structural biology have shown that (1) Promoter has a modular structure consisting of multiple short sequence elements, mostly transcription factor (TF) binding-sites. They can be dispersed or overlapped, largely populating in about 1 Kb region upstream and surrounding TSS. They can be either positive or negative and their functions are often context-dependent. Most of the distal elements are activational or regulatory, and their pattern of organization is often gene or pat
A Statistical Model for Locating Regulatory Regions in Genomic DNA
- 1997
including in non-coding sequence (introns) of genes and in areas that might be many kilobases upstream or downstream from the transcription initiation sites. In addition, because the control elements are relatively short, they occur not only in the regulatory regions but also elsewhere in the DNA sequence, probably by chance. See, for example, Figure 1, which shows the complex distribution pattern of transcription factor binding elements in a long DNA segment (73,308 bp) that defines the human β-globin locus on chromosome 11. Our hypothesis is that, despite its complexity, the distribution of control elements or "words" in genomic DNA is non-random. Here, we develop a Bayesian model for locating control regions in genomic DNA. The statistical model is known as a Hidden Markov Chain, a class of models that has been used with remarkable success in the genomic literature (e.g. see Churchill, 1989; Kruglyak et al., 1996). Criteria for defining control elements in DNA Our statistical mode
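To make the hidden Markov idea concrete, here is a minimal sketch of standard Viterbi decoding for a two-state chain. It is not the authors' Bayesian model: the state names, the "hit"/"none" observation alphabet, and every probability below are hypothetical values chosen only for illustration.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for the observations."""
    # Work in log space to avoid underflow on long sequences.
    score = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]])
              for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        score.append({})
        back.append({})
        for s in states:
            # Best predecessor state for landing in state s at time t.
            prev, best = max(
                ((p, score[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            score[t][s] = best + math.log(emit_p[s][obs[t]])
            back[t][s] = prev
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]

# Hypothetical parameters: control-element hits are assumed to be more
# frequent inside regulatory regions than in background sequence.
STATES = ("background", "regulatory")
START_P = {"background": 0.9, "regulatory": 0.1}
TRANS_P = {
    "background": {"background": 0.9, "regulatory": 0.1},
    "regulatory": {"background": 0.1, "regulatory": 0.9},
}
EMIT_P = {
    "background": {"hit": 0.1, "none": 0.9},
    "regulatory": {"hit": 0.7, "none": 0.3},
}
```

Each observation here stands for one sequence position, labeled by whether a control element ("word") matches there. Runs of hits are then decoded as a regulatory region, while isolated chance matches tend to stay in the background state.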
104 No. 54 Tsunoda and Takagi Algorithm Determining All Cut-off Values of TF-DNA Binding Score Calculated Using PWMs in TRANSFAC
"... transcription factor binding sites ..."

