Abstract:
, a public-domain suite of programs for generating multiple alignments of a set of genomic sequences. We allow the use of either of the two popular objectives, Tree Alignment or Sum-of-Pairs. The main distinguishing feature of our method is that the alignment is obtained via a tree in which the internal nodes (ancestors) are labeled by Steiner sequences for triples of the input sequences. Given lists of candidate labels for the ancestral sequences, we use dynamic programming to choose an optimal labeling under either objective function. Finally, the fully labeled tree of sequences is turned into a multiple alignment. Enhancements in our implementation include the traditional space-saving ideas of Hirschberg as well as new data-packing techniques. The running-time bottleneck of computing exact Steiner sequences is handled by a much faster yet highly effective heuristic alternative. In addition, other modules in the suite allow automatic generation of linear-program input files that can be used to compute novel lower bounds on the optimal values. We also report on some preliminary computational experiments with SALSA.
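The labeling step described in the abstract can be illustrated with a minimal sketch. This is not SALSA's implementation: the function names (`edit_distance`, `best_labeling`), the dictionary-based tree representation, and the unit-cost edit distance used as the edge cost are all assumptions made for illustration, and only the Tree Alignment objective (sum of edge costs over the tree) is covered. Given a rooted tree and a list of candidate sequences per node (a leaf has exactly one candidate, its input sequence), a bottom-up dynamic program picks one candidate per internal node so that the total edge cost is minimized.

```python
def edit_distance(a, b):
    """Unit-cost edit distance (assumed edge cost, for illustration only)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]


def best_labeling(children, candidates, root):
    """Choose one candidate label per node minimizing the sum of edge costs.

    children:   dict mapping a node to its list of child nodes
                (leaves may be absent or map to an empty list)
    candidates: dict mapping every node to its list of candidate sequences
    Returns (total_cost, labels), where labels maps node -> chosen sequence.
    """
    cost = {}    # cost[v][i]: best cost of v's subtree if v gets candidate i
    choice = {}  # choice[v][i]: {child: index of the child's best candidate}

    def solve(node):
        kids = children.get(node, [])
        for kid in kids:
            solve(kid)
        cost[node], choice[node] = [], []
        for label in candidates[node]:
            total, picks = 0, {}
            for kid in kids:
                # Cost of each child candidate: its subtree plus the edge.
                options = [cost[kid][j] + edit_distance(label, seq)
                           for j, seq in enumerate(candidates[kid])]
                j_best = min(range(len(options)), key=options.__getitem__)
                total += options[j_best]
                picks[kid] = j_best
            cost[node].append(total)
            choice[node].append(picks)

    def trace(node, i, labels):
        labels[node] = candidates[node][i]
        for kid, j in choice[node][i].items():
            trace(kid, j, labels)

    solve(root)
    best = min(range(len(candidates[root])), key=cost[root].__getitem__)
    labels = {}
    trace(root, best, labels)
    return cost[root][best], labels
```

With k candidates per node, each edge is examined for k × k label pairs, so the work is dominated by the pairwise distance computations; caching those distances would be a natural refinement of this sketch.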
Citations
634 | A general method applicable to the search for similarities in the amino acid sequence of two proteins – Needleman, Wunsch, 1970
165 | A linear space algorithm for computing maximal common subsequences – Hirschberg, 1975
114 | On the complexity of multiple sequence alignment – Wang, Jiang, 1994
101 | The multiple sequence alignment problem in biology – Carrillo, Lipman, 1988
99 | Progressive sequence alignment as a prerequisite to correct phylogenetic trees – Feng, Doolittle, 1987
78 | A tool for Multiple Sequence Alignment – Lipman, Altschul, et al., 1989
73 | Efficient methods for multiple sequence alignment with guaranteed error bounds – Gusfield, 1993
62 | A survey of multiple sequence comparison methods – Chan, Wong, et al., 1992
59 | Minimal mutation trees of sequences – Sankoff, 1975
42 | Comparative analysis of multiple protein-sequence alignment methods – McClure, Vasi, et al., 1994
41 | Approximation algorithms for multiple sequence alignment – Bafna, Lawler, et al., 1994
34 | Improved approximation algorithms for tree alignment – Wang, Gusfield, 1996
31 | Aligning sequences via an evolutionary tree: complexity and approximation – Jiang, Lawler, et al., 1994
30 | CLUSTAL V: improved software for multiple sequence alignment – Higgins, Bleasby, et al., 1992
29 | Optimal alignment between groups of sequences and its application to multiple sequence alignment – Gotoh, 1993
26 | Simultaneous comparisons of three or more sequences related by a tree – Sankoff, Cedergren, 1983
23 | Frequency of insertion-deletion, transversion, and transition in the evolution of 5S ribosomal RNA – Cedergren, Lapalme, 1976
23 | A fast and sensitive multiple sequence alignment algorithm – Vingron, Argos, 1989
17 | Approximation algorithms for multiple sequence alignment under a fixed evolutionary tree – Ravi, Kececioglu
11 | Making the shortest-paths approach to sum-of-pairs multiple sequence alignment more space efficient in practice – Gupta, Kececioglu, et al.
7 | Stars and Multiple Sequence Alignment – Altschul, Lipman, et al., 1989
4 | New Uses for Uniform Lifted Alignments – Gusfield, Wang, 1996
2 | Tree Alignment And Reconstruction application software, Version 1.0 – Jiang, Liu, 1998
2 | Deriving an Amino Acid Distance – Taylor, Jones, 1993
1 | Analytical approaches to genomic evolution, Biochimie 75 – Sankoff, 1993