Results 1  10
of
68
A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood
, 2003
"... The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximumlikelihood principle, which clearly satisfies these requirements. The ..."
Abstract

Cited by 2182 (27 self)
 Add to MetaCart
The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximumlikelihood principle, which clearly satisfies these requirements. The core of this method is a simple hillclimbing algorithm that adjusts tree topology and branch lengths simultaneously. This algorithm starts from an initial tree built by a fast distancebased method and modifies this tree to improve its likelihood at each iteration. Due to this simultaneous adjustment of the topology and branch lengths, only a few iterations are sufficient to reach an optimum. We used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximumlikelihood programs and much higher than the performance of distancebased and parsimony approaches. The reduction of computing time is dramatic in comparison with other maximumlikelihood packages, while the likelihood maximization ability tends to be higher. For example, only 12 min were required on a standard personal computer to analyze a data set consisting of 500 rbcL sequences with 1,428 base pairs from plant plastids, thus reaching a speed of the same order as some popular distancebased and parsimony algorithms. This new method is implemented in the PHYML program, which is freely available on our web page:
Raxmliii: a fast program for maximum likelihoodbased inference of large phylogenetic trees
 Bioinformatics
, 2005
"... Motivation: The computation of large phylogenetic trees with statistical models such as maximum likelihood or bayesian inference is computationally extremely intensive. It has repeatedly been demonstrated that these models are able to recover the true tree or a tree which is topologically closer to ..."
Abstract

Cited by 259 (17 self)
 Add to MetaCart
Motivation: The computation of large phylogenetic trees with statistical models such as maximum likelihood or bayesian inference is computationally extremely intensive. It has repeatedly been demonstrated that these models are able to recover the true tree or a tree which is topologically closer to the true tree more frequently than less elaborate methods such as parsimony or neighbor joining. Due to the combinatorial and computational complexity the size of trees which can be computed on a Biologist’s PC workstation within reasonable time is limited to trees containing approximately 100 taxa. Results: In this paper we present the latest release of our program RAxMLIII for rapid maximum likelihoodbased inference of large evolutionary trees which allows for computation of 1.000taxon trees in less than 24 hours on a single PC processor. We compare RAxMLIII to the currently fastest implementations for maximum likelihood and bayesian inference: PHYML and MrBayes. Whereas RAxMLIII performs worse than PHYML and MrBayes on synthetic data it clearly outperforms both programs on all real data alignments used in terms of speed and final likelihood values. Availability & Supplementary Information: RAxMLIII including all alignments and final trees mentioned in this paper is freely available as open source code at
Phylogenetic Tree Construction Using Markov Chain Monte Carlo
, 1999
"... We describe a Bayesian method based on Markov chain simulation to study the phylogenetic relationship in a group of DNA sequences. Under simple models of mutational events, our method produces a Markov chain whose stationary distribution is the conditional distribution of the phylogeny given the obs ..."
Abstract

Cited by 86 (0 self)
 Add to MetaCart
We describe a Bayesian method based on Markov chain simulation to study the phylogenetic relationship in a group of DNA sequences. Under simple models of mutational events, our method produces a Markov chain whose stationary distribution is the conditional distribution of the phylogeny given the observed sequences. Our algorithm strikes a reasonable balance between the desire to move globally through the space of phylogenies and the need to make computationally feasible moves in areas of high probability. Since phylogenetic information is described by a tree, we have created new diagnostics to handle this type of data structure. An important byproduct of the Markov chain Monte Carlo phylogeny building technique is that it provides estimates and corresponding measures of variability for any aspect of the phylogeny under study.
PHYML Onlinea web server for fast maximum likelihoodbased phylogenetic inference
 Nucleic Acids Res
, 2005
"... PHYML Online is a web interface to PHYML, a software that implements a fast and accurate heuristic for estimating maximum likelihood phylogenies from DNA and protein sequences. This tool provides the user with a number of options, e.g. nonparametric bootstrap and estimation of various evolutionary p ..."
Abstract

Cited by 84 (0 self)
 Add to MetaCart
(Show Context)
PHYML Online is a web interface to PHYML, a software that implements a fast and accurate heuristic for estimating maximum likelihood phylogenies from DNA and protein sequences. This tool provides the user with a number of options, e.g. nonparametric bootstrap and estimation of various evolutionary parameters, in order to perform comprehensive phylogenetic analyses on large datasets in reasonable computing time. The server and its documentation are available at
A structural EM algorithm for phylogenetic inference
 Journal of Computational Biology
, 2002
"... Head of the school for Engineering and Computer Science ..."
Abstract

Cited by 58 (8 self)
 Add to MetaCart
Head of the school for Engineering and Computer Science
Reconstructing Phylogenies from GeneContent and GeneOrder Data
 MATHEMATICS OF EVOLUTION AND PHYLOGENY, OLIVIER GASCUEL (ED.)
"... ..."
PHYML Online  a web server for fast maximum likelihoodbased phylogenetic inference
"... PHYML Online is a web interface to PHYML, a software that implements a fast and accurate heuristic for estimating maximum likelihood phylogenies from DNA and protein sequences. This tool provides the user with a number of options, e.g. nonparametric bootstrap and estimation of various evolutionary p ..."
Abstract

Cited by 39 (0 self)
 Add to MetaCart
PHYML Online is a web interface to PHYML, a software that implements a fast and accurate heuristic for estimating maximum likelihood phylogenies from DNA and protein sequences. This tool provides the user with a number of options, e.g. nonparametric bootstrap and estimation of various evolutionary parameters, in order to perform comprehensive phylogenetic analyses on large data sets in reasonable computing time. The server and its documentation are available from http://atgc.lirmm.fr/phyml.
Exploring amongsite rate variation models in a maximum likelihood framework using empirical data: effects of model assumptions on estimates of topology, branch lengths, and bootstrap support
 Syst. Biol
, 2001
"... Abstract.—We have investigated the effects of different amongsite rate variation models on the estimation of substitution model parameters, branch lengths, topology, and bootstrap proportions under minimum evolution (ME) and maximum likelihood (ML). Speci�cally, we examined equal rates, invariable ..."
Abstract

Cited by 36 (4 self)
 Add to MetaCart
Abstract.—We have investigated the effects of different amongsite rate variation models on the estimation of substitution model parameters, branch lengths, topology, and bootstrap proportions under minimum evolution (ME) and maximum likelihood (ML). Speci�cally, we examined equal rates, invariable sites, gammadistributed rates, and sitespeci�c rates (SSR) models, using mitochondrial DNA sequence data from three proteincoding genes and one tRNA gene from species of the New Zealand cicada genus Maoricicada. Estimates of topology were relatively insensitive to the substitution model used; however, estimates of bootstrap support, branch lengths, and Rmatrices (underlying relative substitution rate matrix) were strongly in�uenced by the assumptions of the substitution model. We identi�ed one situation where ME and ML tree building became inaccurate when implemented with an inappropriate amongsite rate variation model. Despite the fact the SSR models often have a better �t to the data than do invariable sites and gamma rates models, SSR models have some serious weaknesses. First, SSR rate parameters are not comparable across data sets, unlike the proportion of invariable sites or the alpha shape parameter of the gamma distribution. Second, the extreme amongsite rate variation within codon positions is problematic for SSR models, which explicitly assume rate homogeneity within each rate class. Third, the SSR models appear to give severe underestimates of
Improving the efficiency of SPR moves in phylogenetic tree search methods based on maximum likelihood
 BIOINFORMATICS
, 2005
"... Motivation: Maximum likelihood methods have become very popular for constructing phylogenetic trees from sequence data. However, despite noticeable recent progress, with large and difficult data sets (e.g. multiple genes with conflicting signals) current ML programs still require huge computing time ..."
Abstract

Cited by 31 (9 self)
 Add to MetaCart
(Show Context)
Motivation: Maximum likelihood methods have become very popular for constructing phylogenetic trees from sequence data. However, despite noticeable recent progress, with large and difficult data sets (e.g. multiple genes with conflicting signals) current ML programs still require huge computing times and can become trapped in bad local optima of the likelihood function. When this occurs, the resulting trees may still show some of the defects (e.g. long branch attraction) of starting trees obtained using fast distance or parsimony programs. Methods: Subtree Pruning and Regrafting (SPR) topological rearrangements are usually sufficient to intensively search the tree space. Here, we propose two new methods to make SPR moves more efficient. The first method uses a fast distancebased approach to detect the least promising candidate SPR moves, which are then simply discarded. The second method locally estimates the change in likelihood for any remaining potential SPRs, as opposed to globally evaluating the entire tree for each possible move. These two methods are implemented in a new algorithm with a sophisticated filtering strategy, which efficiently selects potential SPRs and concentrates most of the likelihood computation on the promising moves. Results: Experiments with real data sets comprising 35 to 250 taxa show that, while indeed greatly reducing the amount of computation, our approach provides likelihood values at least as good as those of the best known ML methods so far, and is very robust to poor starting trees. Furthermore, combining our new SPR algorithm with local moves such as PHYML’s nearest neighbor interchanges, the time needed to find good solutions can sometimes be reduced even more. Availability: Executables of our SPR program and the used data sets are available for download at
An Investigation of Phylogenetic Likelihood Methods
, 2003
"... We analyze the performance of likelihoodbased approaches used to reconstruct phylogenetic trees. Unlike other techniques such as NeighborJoining (NJ) and Maximum Parsimony (MP), relatively little is known regarding the behavior of algorithms founded on the principle of likelihood. ..."
Abstract

Cited by 16 (0 self)
 Add to MetaCart
(Show Context)
We analyze the performance of likelihoodbased approaches used to reconstruct phylogenetic trees. Unlike other techniques such as NeighborJoining (NJ) and Maximum Parsimony (MP), relatively little is known regarding the behavior of algorithms founded on the principle of likelihood.