Inferring evolutionary trees is an interesting and important problem in biology, but it is computationally very difficult because most of the associated optimization problems are NP-hard. Although many methods are known to be statistically consistent (i.e., the probability of recovering the correct tree converges to 1 as the sequence length increases), the actual rate of convergence of the different methods is not well understood. In a recent paper we introduced a new method for reconstructing evolutionary trees, the Dyadic Closure Method (DCM), and showed that it has a very fast convergence rate. DCM runs in O(n^5 log n) time, where n is the number of sequences; although this is polynomial, its computational requirements may be too large for practical use. In this paper we present another tree reconstruction method, the Witness-Antiwitness Method (WAM). WAM is significantly faster than DCM, especially on random trees, and converges at the same rate as DCM. We also compare WAM to other tree reconstruction methods, including Neighbor Joining (possibly the most popular method among molecular biologists) and new methods introduced in the computer science literature.
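To give a rough sense of why a polynomial bound of order n^5 log n may still be impractical, the short sketch below evaluates that growth term for a few input sizes. It is an illustration only: the function name `dcm_cost` is hypothetical, constant factors are ignored, and the base-2 logarithm is an assumption, since the abstract does not specify a base.

```python
import math


def dcm_cost(n: int) -> float:
    """Growth term n^5 * log2(n) for n sequences.

    Assumptions: constant factors are dropped and the logarithm is
    taken base 2; neither is stated in the abstract.
    """
    return n ** 5 * math.log2(n)


if __name__ == "__main__":
    # Compare the growth term across a few numbers of sequences.
    for n in (10, 100, 1000):
        print(f"n = {n:>4}: n^5 log n is about {dcm_cost(n):.2e}")
```

Under these assumptions, going from 10 to 100 sequences multiplies the term by 2 x 10^5: a factor of 10^5 from n^5 and a factor of 2 from the logarithm.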