Abstract:
Stochastic regular motifs are evolved for protein sequences using genetic programming. The motif language, SRE-DNA, is a stochastic regular expression language suitable for denoting biosequences. Three restricted versions of SRE-DNA are used as target languages for evolved motifs. The genetic programming experiments are implemented in DCTG-GP, a genetic programming system that uses logic-based attribute grammars to define the target language for evolved programs. Earlier preliminary work tested SRE-DNA's viability as a representation language for aligned protein sequences. This work establishes that SRE-DNA is also suitable for evolving motifs for unaligned sets of sequences.