A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood
, 2003
Cited by 2176 (27 self)
The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximum-likelihood principle, which clearly satisfies these requirements. The core of this method is a simple hill-climbing algorithm that adjusts tree topology and branch lengths simultaneously. This algorithm starts from an initial tree built by a fast distance-based method and modifies this tree to improve its likelihood at each iteration. Due to this simultaneous adjustment of the topology and branch lengths, only a few iterations are sufficient to reach an optimum. We used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches. The reduction of computing time is dramatic in comparison with other maximum-likelihood packages, while the likelihood maximization ability tends to be higher. For example, only 12 min were required on a standard personal computer to analyze a data set consisting of 500 rbcL sequences with 1,428 base pairs from plant plastids, thus reaching a speed of the same order as some popular distance-based and parsimony algorithms. This new method is implemented in the PHYML program, which is freely available on our web page:
TREEVIEW: an application to display phylogenetic trees on personal computers.
 Computer Applications in the Biosciences
, 1996
Model Selection and Model Averaging in Phylogenetics: Advantages of Akaike Information Criterion and Bayesian Approaches Over Likelihood Ratio Tests
, 2004
Cited by 404 (8 self)
Model selection is a topic of special relevance in molecular phylogenetics that affects many, if not all, stages of phylogenetic inference. Here we discuss some fundamental concepts and techniques of model selection in the context of phylogenetics. We start by reviewing different aspects of the selection of substitution models in phylogenetics from a theoretical, philosophical and practical point of view, and summarize this comparison in table format. We argue that the most commonly implemented model selection approach, the hierarchical likelihood ratio test, is not the optimal strategy for model selection in phylogenetics, and that approaches like the Akaike Information Criterion (AIC) and Bayesian methods offer important advantages. In particular, the latter two methods are able to simultaneously compare multiple nested or non-nested models, assess model selection uncertainty, and allow for the estimation of phylogenies and model parameters using all available models (model-averaged inference or multimodel inference). We also describe how the relative importance of the different parameters included in substitution models can be depicted. To illustrate some of these points, we have applied AIC-based model averaging to 37 mitochondrial DNA sequences from the subgenus Ohomopterus (genus Carabus) ground beetles described by Sota and Vogler (2001).
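As a rough illustration of the AIC-based comparison this abstract refers to, the sketch below computes AIC scores and Akaike weights from the standard definitions (AIC = 2k - 2 lnL; weights proportional to exp(-ΔAIC/2)). The function name, the model names, and the numbers in the usage note are hypothetical and are not taken from the paper.

```python
import math

def akaike_weights(log_likelihoods, num_params):
    """Return (AIC scores, Akaike weights) for a set of candidate models.

    log_likelihoods: maximized log-likelihood of each model
    num_params:      number of free parameters of each model
    """
    # AIC = 2k - 2 lnL for each candidate model
    aic = [2 * k - 2 * lnl for lnl, k in zip(log_likelihoods, num_params)]
    best = min(aic)
    # Relative likelihood of each model given its AIC difference to the best one
    rel = [math.exp(-0.5 * (a - best)) for a in aic]
    total = sum(rel)
    # Normalize so the weights sum to 1
    return aic, [r / total for r in rel]
```

For example, `akaike_weights([-1000.0, -995.0], [1, 5])` scores two hypothetical substitution models; the weights can then serve as the coefficients in a model-averaged estimate of a shared parameter.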
Neighbor-Net: An agglomerative method for the construction of phylogenetic networks
, 2003
RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees
 Bioinformatics
, 2005
Cited by 258 (17 self)
Motivation: The computation of large phylogenetic trees with statistical models such as maximum likelihood or Bayesian inference is computationally extremely intensive. It has repeatedly been demonstrated that these models are able to recover the true tree or a tree which is topologically closer to the true tree more frequently than less elaborate methods such as parsimony or neighbor joining. Due to the combinatorial and computational complexity, the size of trees which can be computed on a biologist’s PC workstation within reasonable time is limited to trees containing approximately 100 taxa. Results: In this paper we present the latest release of our program RAxML-III for rapid maximum likelihood-based inference of large evolutionary trees which allows for computation of 1,000-taxon trees in less than 24 hours on a single PC processor. We compare RAxML-III to the currently fastest implementations for maximum likelihood and Bayesian inference: PHYML and MrBayes. Whereas RAxML-III performs worse than PHYML and MrBayes on synthetic data, it clearly outperforms both programs on all real data alignments used in terms of speed and final likelihood values. Availability & Supplementary Information: RAxML-III including all alignments and final trees mentioned in this paper is freely available as open source code at
Bayesian phylogenetic analysis of combined data
 Syst. Biol
, 2004
Cited by 198 (11 self)
The recent development of Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) techniques has facilitated the exploration of parameter-rich evolutionary models. At the same time, stochastic models have become more realistic (and complex) and have been extended to new types of data, such as morphology. Based on this foundation, we developed a Bayesian MCMC approach to the analysis of combined data sets and explored its utility in inferring relationships among gall wasps based on data from morphology and four genes (nuclear and mitochondrial, ribosomal and protein coding). Examined models range in complexity from those recognizing only a morphological and a molecular partition to those having complex substitution models with independent parameters for each gene. Bayesian MCMC analysis deals efficiently with complex models: convergence occurs faster and more predictably for complex models, mixing is adequate for all parameters even under very complex models, and the parameter update cycle is virtually unaffected by model partitioning across sites. Morphology contributed only 5% of the characters in the data set but nevertheless influenced the combined-data tree, supporting the utility of morphological data in multigene analyses. We used Bayesian criteria (Bayes factors) to show that process heterogeneity across data partitions is a significant model component, although not as important as among-site rate variation. More complex evolutionary models are associated with more topological uncertainty and less conflict between morphology and molecules. Bayes factors sometimes favor simpler models over considerably more
A concern for evidence and a phylogenetic hypothesis of relationships among Epicrates (Boidae
, 1989
Cited by 185 (6 self)
Character congruence, the principle of using all the relevant data, and character independence are important concepts in phylogenetic inference, because they relate directly to the evidence on which hypotheses are based. Taxonomic congruence, which is agreement among patterns of taxonomic relationships, is less important, because its connection to the underlying character evidence is indirect and often imperfect. Also, taxonomic congruence is difficult to justify, because of the arbitrariness involved in choosing a consensus method and index with which to estimate agreement. High levels of character congruence were observed among 89 biochemical and morphological synapomorphies scored on 10 species of Epicrates. Such agreement is consistent with the phylogenetic interpretation attached to the resulting hypothesis, which is a consensus of two equally parsimonious cladograms: (cenchria (angulifer (striatus ((chrysogaster, exsul) (inornatus, subflavus) (gracilis (fordii, monensis)))))). Relatively little (11.4%) of the character incongruence was due to the disparity between the biochemical and morphological data sets. Each of the clades in the consensus cladogram was confirmed by two or more unique and unreversed novelties, and six of the eight clades were corroborated by biochemical and morphological evidence. Such com
Inferring phylogeny despite incomplete lineage sorting
 Syst. Biol
, 2006
Cited by 170 (4 self)
It is now well known that incomplete lineage sorting can cause serious difficulties for phylogenetic inference, but little attention has been paid to methods that attempt to overcome these difficulties by explicitly considering the processes that produce them. Here we explore approaches to phylogenetic inference designed to consider retention and sorting of ancestral polymorphism. We examine how the reconstructability of a species (or population) phylogeny is affected by (a) the number of loci used to estimate the phylogeny and (b) the number of individuals sampled per species. Even in difficult cases with considerable incomplete lineage sorting (times between divergences less than 1 Ne generations), we found the reconstructed species trees matched the "true" species trees in at least three out of five partitions, as long as a reasonable number of individuals per species were sampled. We also studied the tradeoff between sampling more loci versus more individuals. Although increasing the number of loci gives more accurate trees for a given sampling effort with deeper species trees (e.g., total depth of 10 Ne generations), sampling more individuals often gives better results than sampling more loci with shallower species trees (e.g., depth = 1 Ne). Taken together, these results demonstrate that gene sequences retain enough signal to achieve an accurate estimate of phylogeny despite widespread incomplete lineage sorting. Continued improvement in our methods to reconstruct phylogeny near the species level will require a shift to a compound model that considers not only nucleotide or character state substitutions, but also the population genetics processes of lineage sorting. [Coalescence; divergence; population; speciation.]
A likelihood approach to estimating phylogeny from discrete morphological character data
 Systematic Biology
, 2001
Cited by 155 (0 self)
Evolutionary biologists have adopted simple likelihood models for purposes of estimating ancestral states and evaluating character independence on specified phylogenies; however, for purposes of estimating phylogenies by using discrete morphological data, maximum parsimony remains the only option. This paper explores the possibility of using standard, well-behaved Markov models for estimating morphological phylogenies (including branch lengths) under the likelihood criterion. An important modification of standard Markov models involves making the likelihood conditional on characters being variable, because constant characters are absent in morphological data sets. Without this modification, branch lengths are often overestimated, resulting in potentially serious biases in tree topology selection. Several new avenues of research are opened by an explicitly model-based approach to phylogenetic analysis of discrete morphological data, including combined-data likelihood analyses (morphology + sequence data), likelihood ratio tests, and Bayesian analyses. [Discrete morphological character; Markov model; maximum likelihood; phylogeny.] The increased availability of nucleotide and protein sequences from a diversity of both organisms and genes has stimu
Bayes or bootstrap? A simulation study comparing the performance of Bayesian Markov chain Monte Carlo sampling and bootstrapping in assessing phylogenetic confidence.
 Mol. Biol. Evol.
, 2003
Cited by 139 (5 self)
Bayesian Markov chain Monte Carlo sampling has become increasingly popular in phylogenetics as a method both for estimating the maximum likelihood topology and for assessing nodal confidence. Despite the growing use of posterior probabilities, the relationship between the Bayesian measure of confidence and the most commonly used confidence measure in phylogenetics, the nonparametric bootstrap proportion, is poorly understood. We used computer simulation to investigate the behavior of three phylogenetic confidence methods: Bayesian posterior probabilities calculated via Markov chain Monte Carlo sampling (BMCMCPP), maximum likelihood bootstrap proportion (MLBP), and maximum parsimony bootstrap proportion (MPBP). We simulated the evolution of DNA sequence on 17-taxon topologies under 18 evolutionary scenarios and examined the performance of these methods in assigning confidence to correct monophyletic and incorrect monophyletic groups, and we examined the effects of increasing character number on support value. BMCMCPP and MLBP were often strongly correlated with one another but could provide substantially different estimates of support on short internodes. In contrast, BMCMCPP correlated poorly with MPBP across most of the simulation conditions that we examined. For a given threshold value, more correct monophyletic groups were supported by BMCMCPP than by either MLBP or MPBP. When threshold values were chosen that fixed the rate of accepting incorrect monophyletic relationships as true at 5%, all three methods recovered most of the correct relationships on the simulated topologies, although BMCMCPP and MLBP performed better than MPBP. BMCMCPP was usually a less biased predictor of phylogenetic accuracy than either bootstrapping method. BMCMCPP provided high support values for correct topological bipartitions with fewer characters than were needed for nonparametric bootstrap.
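For readers unfamiliar with the nonparametric bootstrap proportion mentioned in this abstract, the following is a minimal sketch of the general idea: alignment columns are resampled with replacement, and support is the fraction of pseudoreplicates in which a group is recovered. The function names are hypothetical, and `supports_group` is an assumed user-supplied callable that would infer a tree from a pseudoreplicate and report whether the group of interest is present; tree inference itself is not shown, and this is not the study's own code.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Build one pseudoreplicate by resampling alignment columns with replacement.

    alignment: dict mapping taxon name -> aligned sequence (all the same length)
    rng:       a random.Random instance, so results are reproducible
    """
    n_sites = len(next(iter(alignment.values())))
    # Draw n_sites column indices with replacement
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    # Apply the same column sample to every taxon so sites stay intact
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}

def bootstrap_support(alignment, supports_group, n_reps=100, seed=0):
    """Fraction of pseudoreplicates for which supports_group(replicate) is True.

    supports_group is a hypothetical callable standing in for tree inference
    plus a check for the monophyletic group of interest.
    """
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(n_reps)
        if supports_group(bootstrap_alignment(alignment, rng))
    )
    return hits / n_reps
```

In practice `supports_group` would wrap a maximum likelihood or parsimony tree search, which corresponds to the MLBP and MPBP measures compared in the abstract.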