Results 1 - 10 of 122
A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood
2003
Abstract

Cited by 2182 (27 self)
The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximum-likelihood principle, which clearly satisfies these requirements. The core of this method is a simple hill-climbing algorithm that adjusts tree topology and branch lengths simultaneously. This algorithm starts from an initial tree built by a fast distance-based method and modifies this tree to improve its likelihood at each iteration. Due to this simultaneous adjustment of the topology and branch lengths, only a few iterations are sufficient to reach an optimum. We used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches. The reduction of computing time is dramatic in comparison with other maximum-likelihood packages, while the likelihood maximization ability tends to be higher. For example, only 12 min were required on a standard personal computer to analyze a data set consisting of 500 rbcL sequences with 1,428 base pairs from plant plastids, thus reaching a speed of the same order as some popular distance-based and parsimony algorithms. This new method is implemented in the PHYML program, which is freely available on our web page:
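As a deliberately simplified illustration of likelihood hill climbing, the sketch below optimizes a single branch length between two aligned sequences under the Jukes-Cantor (JC69) model. It is not the PHYML algorithm, which also rearranges tree topology; the function names, the step-halving schedule, and the example counts are assumptions made for this sketch only.

```python
import math


def jc69_log_likelihood(d, n_sites, n_diff):
    """Log-likelihood of branch length d (substitutions per site) under JC69.

    Assumes 0 < n_diff < 0.75 * n_sites so that the optimum is finite.
    """
    # Probability that a site differs between the two sequences.
    p = 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))
    return n_diff * math.log(p) + (n_sites - n_diff) * math.log(1.0 - p)


def hill_climb_branch_length(n_sites, n_diff, start=0.1, step=0.05, tol=1e-8):
    """Greedy hill climbing: accept any move that raises the likelihood,
    and halve the step size whenever neither direction improves it."""
    d = start
    best = jc69_log_likelihood(d, n_sites, n_diff)
    while step > tol:
        improved = False
        for candidate in (d + step, d - step):
            if candidate <= 0.0:
                continue  # branch lengths must stay positive
            score = jc69_log_likelihood(candidate, n_sites, n_diff)
            if score > best:
                d, best = candidate, score
                improved = True
                break
        if not improved:
            step /= 2.0
    return d


# Hypothetical data: 100 differing sites out of 1000 aligned sites.
estimate = hill_climb_branch_length(1000, 100)
# JC69 has a closed-form optimum, which gives a check on the search.
closed_form = -0.75 * math.log(1.0 - 4.0 * 100 / (3.0 * 1000))
print(estimate, closed_form)
```

Because the JC69 likelihood has a single peak in the branch length, the greedy search and the closed-form estimate should agree closely; on a real tree the search space also includes topology, and no closed form is available.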
Bayesian phylogenetic analysis of combined data
Syst. Biol., 2004
Abstract

Cited by 203 (12 self)
The recent development of Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) techniques has facilitated the exploration of parameter-rich evolutionary models. At the same time, stochastic models have become more realistic (and complex) and have been extended to new types of data, such as morphology. Based on this foundation, we developed a Bayesian MCMC approach to the analysis of combined data sets and explored its utility in inferring relationships among gall wasps based on data from morphology and four genes (nuclear and mitochondrial, ribosomal and protein coding). Examined models range in complexity from those recognizing only a morphological and a molecular partition to those having complex substitution models with independent parameters for each gene. Bayesian MCMC analysis deals efficiently with complex models: convergence occurs faster and more predictably for complex models, mixing is adequate for all parameters even under very complex models, and the parameter update cycle is virtually unaffected by model partitioning across sites. Morphology contributed only 5% of the characters in the data set but nevertheless influenced the combined-data tree, supporting the utility of morphological data in multigene analyses. We used Bayesian criteria (Bayes factors) to show that process heterogeneity across data partitions is a significant model component, although not as important as among-site rate variation. More complex evolutionary models are associated with more topological uncertainty and less conflict between morphology and molecules. Bayes factors sometimes favor simpler models over considerably more
Species trees from gene trees: Reconstructing Bayesian posterior distributions of a species phylogeny using estimated gene tree distributions
Systematic Biology, 2007
Abstract

Cited by 109 (11 self)
The estimation of species trees has become popular as a considerable amount of multilocus molecular data is available for inferring the evolutionary history of species. However, the current phylogenetic paradigm, which reconstructs gene trees to represent the species tree, suggests that commonly used methods such as the concatenation method, the consensus tree method, or the gene tree parsimony method may be either inconsistent or highly biased. In this paper, we propose a Bayesian hierarchical model to estimate the phylogeny of a group of species using multiple estimated gene tree distributions such as those that arise in a Bayesian analysis of DNA sequence data. Our model employs substitution models used in traditional phylogenetics, but also uses coalescent theory to explain genealogical signals from species trees to gene trees and from gene trees to sequence data, thereby forming a stochastic model to estimate gene trees, species trees, ancestral population sizes and species divergence times simultaneously. Our model is founded on the assumption that gene trees, even of unlinked loci, are correlated due to being derived from a single species tree and therefore should be estimated jointly. We apply the method to two multilocus DNA sequence data sets. The estimates of the
Frequentist properties of Bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models
Syst. Biol., 2004
Abstract

Cited by 101 (7 self)
What does the posterior probability of a phylogenetic tree mean? This simulation study shows that Bayesian posterior probabilities have the meaning that is typically ascribed to them; the posterior probability of a tree is the probability that the tree is correct, assuming that the model is correct. At the same time, the Bayesian method can be sensitive to model misspecification, and the sensitivity of the Bayesian method appears to be greater than the sensitivity of the nonparametric bootstrap method (using maximum likelihood to estimate trees). Although the estimates of phylogeny obtained by use of the method of maximum likelihood or the Bayesian method are likely to be similar, the assessment of the uncertainty of inferred trees via either bootstrapping (for maximum likelihood estimates) or posterior probabilities (for Bayesian estimates) is not likely to be the same. We suggest that the Bayesian method be implemented with the most complex models of those currently available, as this should reduce the chance that the method will concentrate too much probability on too few trees. [Bayesian estimation; Markov chain Monte Carlo; posterior probability; prior probability.] Quantifying the uncertainty of a phylogenetic estimate is at least as important a goal as obtaining the phylogenetic estimate itself. Measures of phylogenetic reliability not only point out what parts of a tree can be trusted when interpreting the evolution of a group, but can guide
Molecular systematics of the eastern fence lizard (Sceloporus undulatus): A comparison of parsimony, likelihood, and Bayesian approaches
Syst. Biol., 2002
Abstract

Cited by 91 (8 self)
Phylogenetic analysis of large datasets using complex nucleotide substitution models under a maximum likelihood framework can be computationally infeasible, especially when attempting to infer confidence values by way of nonparametric bootstrapping. Recent developments in phylogenetics suggest the computational burden can be reduced by using Bayesian methods of phylogenetic inference. However, few empirical phylogenetic studies exist that explore the efficiency of Bayesian analysis of large datasets. To this end, we conducted an extensive phylogenetic analysis of the wide-ranging and geographically variable Eastern Fence Lizard (Sceloporus undulatus). Maximum parsimony, maximum likelihood, and Bayesian phylogenetic analyses were performed on a combined mitochondrial DNA dataset (12S and 16S rRNA, ND1 protein-coding gene, and associated tRNA; 3,688 bp total) for 56 populations of S. undulatus (78 total terminals including other S. undulatus group species and outgroups). Maximum parsimony analysis resulted in numerous equally parsimonious trees (82,646 from equally weighted parsimony and 335 from weighted parsimony). The majority rule consensus tree derived from the Bayesian analysis was topologically identical to the single best phylogeny inferred from the maximum likelihood analysis, but required ~80% less computational time. The mtDNA data provide strong support for the monophyly of the S. undulatus group and
Multiple Sequence Alignment Accuracy and Phylogenetic Inference
Abstract

Cited by 54 (1 self)
Phylogenies are often thought to be more dependent upon the specifics of the sequence alignment rather than on the method of reconstruction. Simulation of sequences containing insertion and deletion events was performed in order to determine the role that alignment accuracy plays during phylogenetic inference. Data sets were simulated for pectinate, balanced, and random tree shapes under different conditions (ultrametric equal branch length, ultrametric random branch length, nonultrametric random branch length). Comparisons between hypothesized alignments and true alignments enabled determination of two measures of alignment accuracy, that of the total data set and that of individual branches. In general, our results indicate that as alignment error increases, topological accuracy decreases. This trend was much more pronounced for data sets derived from more pectinate topologies. In contrast, for balanced, ultrametric, equal branch length tree shapes, alignment inaccuracy had little average effect on tree reconstruction. These conclusions are based on average trends of many analyses under different conditions, and any one specific analysis, independent of the alignment accuracy, may recover very accurate or inaccurate topologies. Maximum likelihood and Bayesian, in general, outperformed neighbor joining and maximum parsimony in terms of tree reconstruction accuracy. Results also indicated that as the length of the branch and of the neighboring branches increase, alignment accuracy decreases, and the length of the neighboring branches is the major factor in topological accuracy. Thus, multiple-sequence alignment can be an important factor in downstream effects on topological reconstruction. [Bayesian; maximum likelihood; maximum parsimony; multiple sequence alignment; neighbor
Stochastic mapping of morphological characters
Syst. Biol., 2003
Cited by 36 (3 self)
Accurate branch length estimation in partitioned Bayesian analyses requires accommodation of among-partition rate variation and attention to branch length priors
Syst. Biol., 2006
Abstract

Cited by 33 (0 self)
Molecular phylogenetic studies are making increasing use of partitioned Bayesian analyses via software tools like MrBayes, version 3 (Ronquist and Huelsenbeck, 2003). Data partitioning is important because, as long as the same topology/history underlies all of the partitions, it addresses some of the problems associated with the combination of data sets with heterogeneous rates (Bull et al., 1993) and eliminates the need to argue the validity of tests that have been used to judge data combinability (e.g., Huelsenbeck et al., 1994; Huelsenbeck
Can Incomplete Taxa Rescue Phylogenetic Analyses from Long-Branch Attraction?
2005
Abstract

Cited by 30 (4 self)
Taxon sampling may be critically important for phylogenetic accuracy because adding taxa can help to subdivide misleading long branches. Although the idea that added taxa can break up long branches was exemplified by a study of “incomplete” fossil taxa, the issue of taxon completeness (i.e., proportion of missing data) has been largely ignored in most subsequent discussions of taxon sampling and long-branch attraction. In this article, I use simulations to test the ability of incomplete taxa to subdivide long branches and improve phylogenetic accuracy in situations of potential long-branch attraction. The results show that for most methods and conditions examined, adding taxa that are only 50% complete may provide similar benefits to adding the same number of complete taxa (suggesting that the advantages of increased taxon sampling may be obtained with less data than previously considered). For parsimony, taxa that are less complete (5% to 25% complete) may often have limited ability to rescue analyses from long-branch attraction. In contrast, highly incomplete taxa can be surprisingly beneficial when using model-based methods. The results also suggest the importance of model-based methods in phylogenetic analyses that combine molecular and fossil data.
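To make the notion of taxon completeness concrete, the following sketch replaces a chosen proportion of a taxon's sites with the missing-data symbol "?". It only illustrates the definition quoted above; the function name, the random (rather than block-wise) placement of missing sites, and the simulated sequence are assumptions of this sketch, not the procedure used in the article.

```python
import random


def mask_taxon(sequence, completeness, rng):
    """Return a copy of `sequence` in which a proportion (1 - completeness)
    of the sites, chosen at random, is replaced by '?' (missing data)."""
    if not 0.0 <= completeness <= 1.0:
        raise ValueError("completeness must lie between 0 and 1")
    n_missing = round(len(sequence) * (1.0 - completeness))
    # Sample distinct site indices so exactly n_missing sites are masked.
    missing_sites = set(rng.sample(range(len(sequence)), n_missing))
    return "".join(
        "?" if i in missing_sites else base
        for i, base in enumerate(sequence)
    )


rng = random.Random(0)  # fixed seed so the example is reproducible
full_taxon = "".join(rng.choice("ACGT") for _ in range(200))
half_complete = mask_taxon(full_taxon, 0.5, rng)
print(half_complete.count("?"), "of", len(half_complete), "sites missing")
```

A simulation along the lines described in the abstract would add such masked taxa to a data set and compare the accuracy of the resulting trees with that obtained from complete taxa.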
Assessing calibration uncertainty in molecular dating: The assignment of fossils to alternative calibration points
Systematic Biology, 2007
Abstract

Cited by 27 (0 self)
Although recent methodological advances have allowed the incorporation of rate variation in molecular dating analyses, the calibration procedure, performed mainly through fossils, remains resistant to improvements. One source of uncertainty pertains to the assignment of fossils to specific nodes in a phylogeny, especially when alternative possibilities exist that can be equally justified on morphological grounds. Here we expand on a recently developed fossil cross-validation method to evaluate whether alternative nodal assignments of multiple fossils produce calibration sets that differ in their internal consistency. We use an enlarged Crypteroniaceae-centered phylogeny of Myrtales, six fossils, and 72 combinations of calibration points, termed calibration sets, to identify (i) the fossil assignments that produce the most internally consistent calibration sets and (ii) the mean ages, derived from these calibration sets, for the split of the Southeast Asian Crypteroniaceae from their West Gondwanan sister clade (node X). We found that a correlation exists between s values, devised to measure the consistency among the calibration points of a calibration set (Near and Sanderson, 2004), and nodal distances among calibration points. By ranking all sets according to the percent deviation of s from the regression line with nodal distance, we identified the sets with the highest level of corrected calibration-set consistency. These sets generated lower standard deviations associated with the ages of node X than sets characterized by lower corrected consistency. The three calibration sets with the highest corrected consistencies produced mean age estimates for node X of 79.70, 79.14, and 78.15