Approximate likelihood ratio test for branches: a fast, accurate and powerful alternative
SYSTEMATIC BIOLOGY, 2006
Cited by 275 (9 self)
We revisit statistical tests for branches of evolutionary trees reconstructed upon molecular data. A new, fast, approximate likelihood-ratio test (aLRT) for branches is presented here as a competitive alternative to nonparametric bootstrap and Bayesian estimation of branch support. The aLRT is based on the idea of the conventional LRT, with the null hypothesis corresponding to the assumption that the inferred branch has length 0. We show that the LRT statistic is asymptotically distributed as a maximum of three random variables drawn from the $\frac{1}{2}\chi^2_0 + \frac{1}{2}\chi^2_1$ distribution.
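As an illustration of the null distribution mentioned in this abstract, the following minimal sketch computes tail probabilities under an equal-weight mixture of a point mass at zero (chi-square with 0 degrees of freedom) and a chi-square with 1 degree of freedom. It assumes that the formula cut off in the listing is this mixture and that the three draws are independent. The function names and the example statistic are hypothetical, and this is not presented as the authors' exact procedure.

```python
import math


def chi2_1_sf(x: float) -> float:
    """P(X > x) for a chi-square variable with 1 degree of freedom."""
    if x <= 0.0:
        return 1.0
    # For 1 d.f., X = Z^2 with Z standard normal, so P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


def mixture_pvalue(x: float) -> float:
    """P(X >= x) under the assumed mixture 1/2 chi2_0 + 1/2 chi2_1.

    chi2_0 is a point mass at 0, so for x > 0 only the chi2_1 half contributes.
    """
    if x <= 0.0:
        return 1.0
    return 0.5 * chi2_1_sf(x)


def max_of_three_pvalue(x: float) -> float:
    """P(max of three draws >= x), ASSUMING the three draws are independent."""
    p = mixture_pvalue(x)
    return 1.0 - (1.0 - p) ** 3


if __name__ == "__main__":
    statistic = 5.0  # hypothetical LRT statistic for one branch
    print("single-draw tail probability:", mixture_pvalue(statistic))
    print("max-of-three tail probability:", max_of_three_pvalue(statistic))
```

A statistic of zero or less gives a tail probability of 1, which matches a branch whose estimated length does not improve the likelihood at all.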
A review of long-branch attraction
CLADISTICS, 2005
Cited by 137 (1 self)
The history of long-branch attraction, and in particular methods suggested to detect and avoid the artifact to date, is reviewed. Methods suggested to avoid LBA artifacts include excluding long-branch taxa, excluding faster evolving third codon positions, using inference methods less sensitive to LBA such as likelihood, the Aguinaldo et al. approach, sampling more taxa to break up long branches and sampling more characters especially of another kind, and the pros and cons of these are discussed. Methods suggested to detect LBA are numerous and include methodological disconcordance, RASA, separate partition analyses, parametric simulation, random outgroup sequences, long-branch extraction, split decomposition and spectral analysis. Less than 10 years ago it was doubted if LBA occurred in real datasets. Today, examples are numerous in the literature and it is argued that the development of methods to deal with the problem is warranted. A 16 kbp dataset of placental mammals and a morphological and molecular combined dataset of gall wasps are used to illustrate the particularly common problem of LBA of problematic ingroup taxa to outgroups. The preferred methods of separate partition analysis, methodological disconcordance, and long-branch extraction are used to demonstrate detection methods. It is argued that since outgroup taxa almost always represent long branches and are as such a hazard towards misplacing long-branched ingroup taxa, phylogenetic analyses should always be run with and without the outgroups included. This will detect whether only the outgroup roots the ingroup or if it simultaneously alters the ingroup topology, in which case previous studies have shown that the latter is most often the worse. Apart from that LBA to outgroups is the major
Partitioned Bayesian analyses, partition choice, and the phylogenetic relationships of scincid lizards
Syst, 2005
Cited by 112 (7 self)
Abstract. Partitioned Bayesian analyses of ∼2.2 kb of nucleotide sequence data (mtDNA) were used to elucidate phylogenetic relationships among 30 scincid lizard genera. Few partitioned Bayesian analyses exist in the literature, resulting in a lack of methods to determine the appropriate number of and identity of partitions. Thus, a criterion, based on the Bayes factor, for selecting among competing partitioning strategies is proposed and tested. Improvements in both mean −lnL and estimated posterior probabilities were observed when specific models and parameter estimates were assumed for partitions of the total data set. This result is expected given that the 95% credible intervals of model parameter estimates for numerous partitions do not overlap and it reveals that different data partitions may evolve quite differently. We further demonstrate that how one partitions the data (by gene, codon position, etc.) is a greater concern than simply the overall number of partitions. Using the criterion of the 2 ln Bayes factor > 10, the phylogenetic analysis employing the largest number of partitions was decisively better than all other strategies. Strategies that partitioned the ND1 gene by codon position performed better than other partition strategies, regardless of the overall number of partitions. Scincidae, Acontinae, Lygosominae, east Asian and North American "Eumeces" + Neoseps; North African Eumeces, Scincus, and Scincopus, and a large group primarily from sub-Saharan Africa, Madagascar, and neighboring islands are monophyletic. Feylinia, a limbless group of previously uncertain relationships, is nested within a "scincine" clade from sub-Saharan Africa. We reject the hypothesis that the nearly limbless dibamids are derived from within the Scincidae, but cannot reject the hypothesis that they represent the sister taxon to skinks. Amphiglossus, Chalcides, the acontines Acontias and Typhlosaurus, and Scincinae are paraphyletic. The globally widespread "Eumeces" is polyphyletic and we make necessary taxonomic changes.
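The Bayes factor criterion quoted in this abstract can be sketched in a few lines. The example assumes each partitioning strategy comes with an estimated log marginal likelihood. How those estimates are obtained is not stated in the listing and is outside the sketch. The strategy names and numbers are hypothetical.

```python
from typing import Dict, Optional

DECISIVE_THRESHOLD = 10.0  # the "2 ln Bayes factor > 10" criterion from the abstract


def two_ln_bayes_factor(ln_ml_a: float, ln_ml_b: float) -> float:
    """Return 2 ln BF for strategy A over strategy B from log marginal likelihoods."""
    return 2.0 * (ln_ml_a - ln_ml_b)


def decisively_best(ln_ml: Dict[str, float]) -> Optional[str]:
    """Name the strategy that beats every other by 2 ln BF > 10, or None if none does."""
    best = max(ln_ml, key=ln_ml.get)
    for name, value in ln_ml.items():
        if name == best:
            continue
        if two_ln_bayes_factor(ln_ml[best], value) <= DECISIVE_THRESHOLD:
            return None
    return best


if __name__ == "__main__":
    # Hypothetical log marginal likelihoods, for illustration only.
    strategies = {
        "unpartitioned": -25400.0,
        "by_gene": -25100.0,
        "by_gene_and_codon": -24800.0,
    }
    print(decisively_best(strategies))
```

Comparing the top strategy against every alternative, rather than only the runner-up, mirrors the abstract's wording that one analysis was decisively better than all other strategies.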
Frequentist properties of Bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models
SYST. BIOL, 2004
Cited by 101 (7 self)
What does the posterior probability of a phylogenetic tree mean? This simulation study shows that Bayesian posterior probabilities have the meaning that is typically ascribed to them; the posterior probability of a tree is the probability that the tree is correct, assuming that the model is correct. At the same time, the Bayesian method can be sensitive to model misspecification, and the sensitivity of the Bayesian method appears to be greater than the sensitivity of the nonparametric bootstrap method (using maximum likelihood to estimate trees). Although the estimates of phylogeny obtained by use of the method of maximum likelihood or the Bayesian method are likely to be similar, the assessment of the uncertainty of inferred trees via either bootstrapping (for maximum likelihood estimates) or posterior probabilities (for Bayesian estimates) is not likely to be the same. We suggest that the Bayesian method be implemented with the most complex models of those currently available, as this should reduce the chance that the method will concentrate too much probability on too few trees. [Bayesian estimation; Markov chain Monte Carlo; posterior probability; prior probability.] Quantifying the uncertainty of a phylogenetic estimate is at least as important a goal as obtaining the phylogenetic estimate itself. Measures of phylogenetic reliability not only point out what parts of a tree can be trusted when interpreting the evolution of a group, but can guide
Comparing Bootstrap and Posterior Probability Values in the Four-Taxon Case
2003
Cited by 61 (4 self)
Assessment of the reliability of a given phylogenetic hypothesis is an important step in phylogenetic analysis. Historically, the nonparametric bootstrap procedure has been the most frequently used method for assessing the support for specific phylogenetic relationships. The recent employment of Bayesian methods for phylogenetic inference problems has resulted in clade support being expressed in terms of posterior probabilities. We used simulated data and the four-taxon case to explore the relationship between nonparametric bootstrap values (as inferred by maximum likelihood) and posterior probabilities (as inferred by Bayesian analysis). The results suggest a complex association between the two measures. Three general regions of tree space can be identified: (1) the neutral zone, where differences between mean bootstrap and mean posterior probability values are not significant, (2) near the two-branch corner, and (3) deep in the two-branch corner. In the last two regions, significant differences occur between mean bootstrap and mean posterior probability values. Whether bootstrap or posterior probability values are higher depends on the data in support of alternative topologies. Examination of star topologies revealed that both bootstrap and posterior probability values differ significantly from theoretical expectations;
K (2005) Polytomies and Bayesian phylogenetic inference. Syst Biol 54
Cited by 60 (0 self)
Abstract. Bayesian phylogenetic analyses are now very popular in systematics and molecular evolution because they allow the use of much more realistic models than currently possible with maximum likelihood methods. There is, however, a growing number of examples in which large Bayesian posterior clade probabilities are associated with very short edge lengths and low values for non-Bayesian measures of support such as nonparametric bootstrapping. For the four-taxon case when the true tree is the star phylogeny, Bayesian analyses become increasingly unpredictable in their preference for one of the three possible
The Importance of Proper Model Assumption in Bayesian Phylogenetics
2004
Cited by 50 (4 self)
We studied the importance of proper model assumption in the context of Bayesian phylogenetics by examining >5,000 Bayesian analyses and six nested models of nucleotide substitution. Model misspecification can strongly bias bipartition posterior probability estimates. These biases were most pronounced when rate heterogeneity was ignored. The type of bias seen at a particular bipartition appeared to be strongly influenced by the lengths of the branches surrounding that bipartition. In the Felsenstein zone, posterior probability estimates of bipartitions were biased when the assumed model was underparameterized but were unbiased when the assumed model was overparameterized. For the inverse Felsenstein zone, however, both underparameterization and overparameterization led to biased bipartition posterior probabilities, although the bias caused by overparameterization was less pronounced and disappeared with increased sequence length. Model parameter estimates were also affected by model misspecification. Underparameterization caused a bias in some parameter estimates, such as branch lengths and the gamma shape parameter, whereas overparameterization caused a decrease in the precision of some parameter estimates. We caution researchers to assure that the most appropriate model is assumed by employing both a priori model choice methods and a posteriori model adequacy tests. [Bayesian phylogenetic inference; convergence; Markov chain Monte Carlo; maximum likelihood; model choice; posterior probability.] Model choice is becoming a critical issue as the number of available models of nucleotide evolution increases rapidly. Recent studies have shown that adequate
Can Incomplete Taxa Rescue Phylogenetic Analyses from Long-Branch Attraction?
2005
Cited by 30 (4 self)
Taxon sampling may be critically important for phylogenetic accuracy because adding taxa can help to subdivide misleading long branches. Although the idea that added taxa can break up long branches was exemplified by a study of “incomplete” fossil taxa, the issue of taxon completeness (i.e., proportion of missing data) has been largely ignored in most subsequent discussions of taxon sampling and long-branch attraction. In this article, I use simulations to test the ability of incomplete taxa to subdivide long branches and improve phylogenetic accuracy in situations of potential long-branch attraction. The results show that for most methods and conditions examined, adding taxa that are only 50% complete may provide similar benefits to adding the same number of complete taxa (suggesting that the advantages of increased taxon sampling may be obtained with less data than previously considered). For parsimony, taxa that are less complete (5% to 25% complete) may often have limited ability to rescue analyses from long-branch attraction. In contrast, highly incomplete taxa can be surprisingly beneficial when using model-based methods. The results also suggest the importance of model-based methods in phylogenetic analyses that combine molecular and fossil data.
Data partitions and complex models in Bayesian analysis: the phylogeny of gymnophthalmid lizards
Syst. Biol, 2004
Cited by 29 (2 self)
Abstract. Phylogenetic studies incorporating multiple loci, and multiple genomes, are becoming increasingly common. Coincident with this trend in genetic sampling, model-based likelihood techniques including Bayesian phylogenetic methods continue to gain popularity. Few studies, however, have examined model fit and sensitivity to such potentially heterogeneous data partitions within combined data analyses using empirical data. Here we investigate the relative model fit and sensitivity of Bayesian phylogenetic methods when alternative site-specific partitions of among-site rate variation (with and without autocorrelated rates) are considered. Our primary goal in choosing a best-fit model was to employ the simplest model that was a good fit to the data while optimizing topology and/or Bayesian posterior probabilities. Thus, we were not interested in complex models that did not practically affect our interpretation of the topology under study. We applied these alternative models to a four-gene data set including one protein-coding nuclear gene (c-mos), one protein-coding mitochondrial gene (ND4), and two mitochondrial rRNA genes (12S and 16S) for the diverse yet poorly known lizard family Gymnophthalmidae. Our results suggest that the best-fit model partitioned among-site rate variation separately among the c-mos, ND4, and 12S + 16S gene regions. We found this model yielded identical topologies to those from analyses based on the GTR+I+G model, but significantly changed posterior probability estimates of clade support. This partitioned model also produced more precise (less variable) estimates of posterior probabilities across generations of long Bayesian runs, compared to runs employing a GTR+I+G model estimated for the combined data. We use this three-way gamma partitioning in Bayesian analyses to
Analysis and visualization of tree space
Systematic Biology, 2005
Cited by 26 (2 self)
Abstract. We explored the use of multidimensional scaling (MDS) of tree-to-tree pairwise distances to visualize the relationships among sets of phylogenetic trees. We found the technique to be useful for exploring “tree islands” (sets of topologically related trees among larger sets of near-optimal trees), for comparing sets of trees obtained from bootstrapping and Bayesian sampling, for comparing trees obtained from the analysis of several different genes, and for comparing multiple Bayesian analyses. The technique was also useful as a teaching aid for illustrating the progress of a Bayesian analysis and as an exploratory tool for examining large sets of phylogenetic trees. We also identified some limitations to the method, including distortions of the multidimensional tree space into two dimensions through the MDS technique, and the definition of the MDS-defined space based on a limited sample of trees. Nonetheless, the technique is a useful approach for the analysis of large sets of phylogenetic trees. [Bayesian analysis; multidimensional scaling; phylogenetic analysis; tree space; visualization.] Systematists are often faced with the need to analyze a large collection of phylogenetic trees. These trees may represent a collection of equally parsimonious solutions to a phylogenetic problem, or a set of trees of similar likelihood, or a sampled set of trees from a Markov
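To make the MDS step in this abstract concrete, here is a minimal sketch of classical (Torgerson) MDS applied to a small symmetric distance matrix, using only the Python standard library. The listing does not say which MDS variant or which tree-to-tree distance the authors used, so classical MDS with power iteration is an assumption chosen for brevity. The input matrix stands in for any precomputed pairwise tree distances and is hypothetical.

```python
import math
from typing import List


def classical_mds(dist: List[List[float]], dims: int = 2, iters: int = 500) -> List[List[float]]:
    """Embed n items in `dims` dimensions from a symmetric n x n distance matrix."""
    n = len(dist)
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        # Power iteration for the dominant eigenpair of the (deflated) matrix.
        v = [math.sin(i + 1.0 + d) for i in range(n)]  # deterministic start vector
        lam = 0.0
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                lam = 0.0
                break
            v = [x / norm for x in w]
            lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
        # Negative eigenvalues (non-Euclidean distances) are clamped to zero.
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][d] = scale * v[i]
        # Deflate so the next pass finds the next eigenpair.
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords


if __name__ == "__main__":
    # Hypothetical pairwise distances among three trees (a 3-4-5 triangle).
    distances = [[0.0, 3.0, 4.0], [3.0, 0.0, 5.0], [4.0, 5.0, 0.0]]
    for point in classical_mds(distances):
        print(point)
```

When the distances are not exactly Euclidean, the two-dimensional picture distorts them, which is the limitation the abstract itself points out.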