length encoding. We have developed a method for predicting the common secondary structure of large RNA multiple alignments using only the information in the alignment. The method iteratively applies a series of progressively more sensitive searches to discover regions of base pairing; the first pass examines the entire multiple alignment. Each search finds base pairings in two ways: mutual information measures covariation between pairs of columns in the multiple alignment, and a minimum length encoding method detects column pairs with high potential to base pair. Dynamic programming then recovers the optimal tree made up of the best potential base pairs and creates a stochastic context-free grammar. The information in the tree guides the next iteration of searching. The method is similar to the traditional comparative sequence analysis technique. It correctly identifies most of the common secondary structure in 16S and 23S rRNA.
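As a rough illustration of two of the steps named in the abstract, the sketch below computes mutual information between alignment columns and then uses a simple dynamic program to pick the highest-scoring set of properly nested (tree-like) column pairs. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`mutual_information`, `mi_matrix`, `best_nested_pairs`) are hypothetical, gap characters are treated as ordinary symbols, the dynamic program is a generic Nussinov-style recurrence swapped in for illustration, and the minimum length encoding search, the iteration, and the grammar construction are omitted.

```python
import math
from collections import Counter


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two equal-length alignment columns.

    Assumption: every character, including a gap, counts as a symbol.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_i)
    if n == 0:
        return 0.0
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a,b) / (p(a) * p(b)) simplifies to c * n / (count(a) * count(b)).
        mi += (c / n) * math.log2(c * n / (count_i[a] * count_j[b]))
    return mi


def mi_matrix(alignment):
    """Upper-triangular matrix of MI scores for all column pairs i < j."""
    cols = list(zip(*alignment))
    n = len(cols)
    score = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            score[i][j] = mutual_information(cols[i], cols[j])
    return score


def best_nested_pairs(score):
    """Pick non-crossing column pairs with maximum total score.

    best[i][j] holds the best total for columns i..j-1 (half-open interval).
    Only pairs with a positive score are ever chosen.
    """
    n = len(score)
    best = [[0.0] * (n + 1) for _ in range(n + 1)]
    pick = [[None] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            val, choice = best[i + 1][j], None  # leave column i unpaired
            for k in range(i + 1, j):           # or pair column i with k
                s = score[i][k]
                if s > 0:
                    cand = s + best[i + 1][k] + best[k + 1][j]
                    if cand > val:
                        val, choice = cand, k
            best[i][j], pick[i][j] = val, choice
    # Trace back the chosen pairs without recursion.
    pairs, stack = [], [(0, n)]
    while stack:
        i, j = stack.pop()
        if j - i < 2:
            continue
        k = pick[i][j]
        if k is None:
            stack.append((i + 1, j))
        else:
            pairs.append((i, k))
            stack.append((i + 1, k))
            stack.append((k + 1, j))
    return sorted(pairs)


# Toy alignment: columns 0 and 3 covary as complementary bases,
# columns 1 and 2 are constant and so carry no covariation signal.
toy_alignment = ["GAAC", "CAAG", "AAAU", "UAAA"]
toy_pairs = best_nested_pairs(mi_matrix(toy_alignment))
```

In this toy case only columns 0 and 3 covary, so they are the only pair the dynamic program can select. Real alignments would need many more sequences for the mutual information estimates to be meaningful, and a threshold above zero would likely be needed to suppress noise.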