PAML: a program package for phylogenetic analysis by maximum likelihood. (1997)

by Z Yang
Venue: Comput. Appl. Biosci.

Results 1 - 10 of 1,459

Approximate likelihood ratio test for branches: a fast, accurate and powerful alternative

by Maria Anisimova, Olivier Gascuel - SYSTEMATIC BIOLOGY, 2006
Cited by 275 (9 self)
We revisit statistical tests for branches of evolutionary trees reconstructed upon molecular data. A new, fast, approximate likelihood-ratio test (aLRT) for branches is presented here as a competitive alternative to nonparametric bootstrap and Bayesian estimation of branch support. The aLRT is based on the idea of the conventional LRT, with the null hypothesis corresponding to the assumption that the inferred branch has length 0. We show that the LRT statistic is asymptotically distributed as a maximum of three random variables drawn from the $\frac{1}{2}\chi^2_0 + \frac{1}{2}\chi^2_1$ distribution.
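
As a rough illustration of the null distribution mentioned in this abstract, the sketch below computes upper-tail probabilities for an equal mixture of a chi-square with 0 degrees of freedom (a point mass at zero) and a chi-square with 1 degree of freedom. The function names are hypothetical. Treating the three variables as independent when taking their maximum is an assumption made here purely for illustration; the abstract does not state it, and this is not the authors' implementation.

```python
import math


def chi2_1_sf(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with 1 d.f."""
    if x <= 0.0:
        return 1.0
    # For 1 d.f., X = Z**2 with Z standard normal, so P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


def mixture_sf(x: float) -> float:
    """P(X > x) under the 50:50 mixture of chi2_0 (point mass at 0) and chi2_1."""
    if x < 0.0:
        return 1.0
    # The point mass at zero never exceeds a non-negative x,
    # so only the chi2_1 half of the mixture contributes to the tail.
    return 0.5 * chi2_1_sf(x)


def max_of_three_sf(x: float) -> float:
    """P(max of three draws > x), ASSUMING the three draws are independent."""
    cdf = 1.0 - mixture_sf(x)
    return 1.0 - cdf ** 3


if __name__ == "__main__":
    statistic = 5.0  # hypothetical LRT statistic, not taken from the paper
    print(mixture_sf(statistic), max_of_three_sf(statistic))
```

Because half of the mixture's mass sits at zero, the single-variable tail probability is half of the usual chi-square (1 d.f.) tail for any positive statistic.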

Likelihood-based tests of topologies in phylogenetics

by Nick Goldman, Jon P. Anderson, Allen G. Rodrigo - Syst. Biol, 2000
Cited by 225 (3 self)
Abstract.—Likelihood-based statistical tests of competing evolutionary hypotheses (tree topologies) have been available for approximately a decade. By far the most commonly used is the Kishino–Hasegawa test. However, the assumptions that have to be made to ensure the validity of the Kishino–Hasegawa test place important restrictions on its applicability. In particular, it is only valid when the topologies being compared are specified a priori. Unfortunately, this means that the Kishino–Hasegawa test may be severely biased in many cases in which it is now commonly used: for example, in any case in which one of the competing topologies has been selected for testing because it is the maximum likelihood topology for the data set at hand. We review the theory of the Kishino–Hasegawa test and contend that for the majority of popular applications this test should not be used. Previously published results from invalid applications of the Kishino–Hasegawa test should be treated extremely cautiously, and future applications should use appropriate alternative tests instead. We review such alternative tests, both nonparametric and parametric, and give two examples which illustrate the importance of our contentions. [Kishino–Hasegawa test; maximum likelihood; phylogeny; Shimodaira–Hasegawa test; statistical tests; tree topology.] Hasegawa and Kishino (1989) and Kishino and Hasegawa (1989) developed methods for estimating the standard error and confidence intervals for the difference in log-likelihoods between two topologically distinct phylogenetic trees representing hypotheses that might explain particular aligned sequence data sets. The method initially was introduced to compute confidence intervals on posterior probabilities for topologies in a

The Sorcerer II Global Ocean Sampling expedition: Expanding the universe of protein families

by Shibu Yooseph, Granger Sutton, Douglas B. Rusch, Aaron L. Halpern, Shannon J. Williamson, Karin Remington, Jonathan A. Eisen, Karla B. Heidelberg, Gerard Manning, Weizhong Li, Lukasz Jaroszewski, Piotr Cieplak, Christopher S. Miller, Huiying Li, Susan T. Mashiyama, Marcin P. Joachimiak, Christopher Van Belle, John-marc Ch, David A. Soergel, Yufeng Zhai, Kannan Natarajan, Shaun Lee, Benjamin J. Raphael, Vineet Bafna, Robert Friedman, Steven E. Brenner, Adam Godzik, David Eisenberg, Jack E. Dixon, Susan S. Taylor, Robert L. Strausberg, Marvin Frazier, J. Craig Venter - PLoS Biol 5: e16, 2007
Cited by 151 (6 self)
Metagenomics projects based on shotgun sequencing of populations of micro-organisms yield insight into protein families. We used sequence similarity clustering to explore proteins with a comprehensive dataset consisting of sequences from available databases together with 6.12 million proteins predicted from an assembly of 7.7 million Global Ocean Sampling (GOS) sequences. The GOS dataset covers nearly all known prokaryotic protein families. A total of 3,995 medium- and large-sized clusters consisting of only GOS sequences are identified, out of which 1,700 have no detectable homology to known families. The GOS-only clusters contain a higher than expected proportion of sequences of viral origin, thus reflecting a poor sampling of viral diversity until now. Protein domain distributions in

Citation Context

... and for proteins that are under strong positive selection, Ka/Ks > 1. A Ka/Ks value close to 1 is an indication that sequences are under no selective pressure and hence are unlikely to encode proteins [134,135]. Weakly selected but legitimate coding sequences can have a Ka/Ks value close to 1. These were identified to some extent by using a model in which different partitions of the codons experience differ...

Combining phylogenetic and hidden Markov models in biosequence analysis

by Adam Siepel - J. Comput. Biol, 2004
Cited by 135 (13 self)
A few models have appeared in recent years that consider not only the way substitutions occur through evolutionary history at each site of a genome, but also the way the process changes from one site to the next. These models combine phylogenetic models of molecular evolution, which apply to individual sites, and hidden Markov models, which allow for changes from site to site. Besides improving the realism of ordinary phylogenetic models, they are potentially very powerful tools for inference and prediction—for gene finding, for example, or prediction of secondary structure. In this paper, we review progress on combined phylogenetic and hidden Markov models and present some extensions to previous work. Our main result is a simple and efficient method for accommodating higher-order states in the HMM, which allows for context-sensitive models of substitution— that is, models that consider the effects of neighboring bases on the pattern of substitution. We present experimental results indicating that higher-order states, autocorrelated rates, and multiple functional categories all lead to significant improvements in the fit of a combined phylogenetic and hidden Markov model, with the effect of higher-order states being particularly pronounced.

Citation Context

...y from it by “contaminating” a portion of the alignment. The method is not novel—indeed, it was briefly mentioned in Felsenstein’s 1973 paper [10] and has been implemented in PHYLIP [12] and PAML [45], among other packages—but we will describe it in some detail because it turns out to be useful in the extension of Section 2.6. Consider a single column of an alignment, Xi, some elements of which are...

Selecting the best-fit model of nucleotide substitution

by David Posada, Keith A. Crandall - Syst, 2001
Cited by 135 (2 self)
Abstract.—Despite the relevant role of models of nucleotide substitution in phylogenetics, choosing among different models remains a problem. Several statistical methods for selecting the model that best fits the data at hand have been proposed, but their absolute and relative performance has not yet been characterized. In this study, we compare under various conditions the performance of different hierarchical and dynamic likelihood ratio tests, and of Akaike and Bayesian information methods, for selecting best-fit models of nucleotide substitution. We specifically examine the role of the topology used to estimate the likelihood of the different models and the importance of the order in which hypotheses are tested. We do this by simulating DNA sequences under a known model of nucleotide substitution and recording how often this true model is recovered by the different methods. Our results suggest that model selection is reasonably accurate and indicate that some likelihood ratio test methods perform overall better than the Akaike or Bayesian information criteria. The tree used to estimate the likelihood scores does not influence model selection unless it is a randomly chosen tree. The order in which hypotheses are tested, and the complexity of the initial model in the sequence of tests, influence model selection in some cases. Model fitting in phylogenetics has been suggested for many years, yet many authors still arbitrarily choose their models, often using the default models implemented
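
The criteria compared in this abstract can be sketched with their standard textbook definitions: AIC = 2k - 2 ln L, BIC = k ln n - 2 ln L, and the likelihood ratio statistic 2(ln L1 - ln L0) for nested models. The log-likelihoods, parameter counts, and sequence length below are made-up illustrative numbers, not results from the paper, and the helper names are hypothetical. Parameter counts here cover only the substitution model and ignore branch lengths, which is a simplifying assumption.

```python
import math
from typing import Dict, Tuple


def aic(log_lik: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_lik


def bic(log_lik: float, n_params: int, n_sites: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_sites) - 2.0 * log_lik


def lrt_statistic(log_lik_null: float, log_lik_alt: float) -> float:
    """Likelihood ratio statistic 2 (ln L_alt - ln L_null) for nested models."""
    return 2.0 * (log_lik_alt - log_lik_null)


# Hypothetical fits: model name -> (log-likelihood, free substitution parameters).
FITS: Dict[str, Tuple[float, int]] = {
    "JC69": (-3120.5, 0),
    "HKY85": (-3050.2, 4),
    "GTR": (-3048.9, 8),
}
N_SITES = 1000  # hypothetical alignment length


def best_model(criterion: str) -> str:
    """Return the model with the lowest AIC or BIC among the hypothetical fits."""
    if criterion == "aic":
        scores = {m: aic(ll, k) for m, (ll, k) in FITS.items()}
    elif criterion == "bic":
        scores = {m: bic(ll, k, N_SITES) for m, (ll, k) in FITS.items()}
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    return min(scores, key=scores.get)


if __name__ == "__main__":
    print(best_model("aic"), best_model("bic"))
    print(lrt_statistic(FITS["HKY85"][0], FITS["GTR"][0]))
```

In a hierarchical likelihood ratio test, the statistic would be compared against a chi-square distribution whose degrees of freedom equal the difference in free parameters between the two nested models.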

Algebraic Statistics for Computational Biology

by Lior Pachter, Bernd Sturmfels (eds.), 2005
Cited by 130 (22 self)
Abstract not found

HyPhy: hypothesis testing using phylogenies

by Sergei L. Kosakovsky Pond, Spencer V. Muse - BIOINFORMATICS, 2005
Cited by 110 (4 self)
Abstract not found

Citation Context

...for addressing questions about the evolutionary process tends to take the form of stand-alone programs that answer only one or two quite specific problems. There are a few exceptions, including PAML (Yang, 1997), Mr Bayes (http://morphbank.ebc.uu.se/mrbayes/info.php) and library-oriented projects such as PAL (Drummond and Strimmer, 2001) and PyEvolve (Butterfield et al., 2004). HyPhy was developed as a unifi...

Ancestral polyploidy in seed plants and angiosperms.

by Yuannian Jiao, Norman J Wickett, Saravanaraj Ayyampalayam, André S Chanderbali, Lena Landherr, Paula E Ralph, Lynn P Tomsho, Yi Hu, Haiying Liang, Pamela S Soltis, Douglas E Soltis, Sandra W Clifton, Scott E Schlarbaum, Stephan C Schuster, Hong Ma, Jim Leebens-Mack, Claude W Depamphilis - Nature, 2011
Cited by 95 (1 self)
Abstract not found

Choosing BLAST options for better detection of orthologs as reciprocal best hits

by Gabriel Moreno-Hagelsieb, Kristen Latimer - 2008
Cited by 83 (6 self)
Motivation: The analyses of the increasing number of genome sequences requires shortcuts for the detection of orthologs, such as Reciprocal Best Hits (RBH), where orthologs are assumed if two genes each in a different genome find each other as the best hit in the other genome. Two BLAST options seem to affect alignment scores the most, and thus the choice of a best hit: the filtering of low information sequence segments and the algorithm used to produce the final alignment. Thus, we decided to test whether such options would help better detect orthologs. Results: Using Escherichia coli K12 as an example, we compared the number and quality of orthologs detected as RBH. We tested four different conditions derived from two options: filtering of low-information segments, hard (default) versus soft; and alignment algorithm, default (based on matching words) versus Smith-Waterman. All options resulted in significant differences in the number of orthologs detected, with the highest numbers obtained with the combination of soft filtering with Smith-Waterman alignments. We compared these results with those of Reciprocal Shortest Distances (RSD), supposed to be superior to RBH because it uses an evolutionary measure of distance, rather than BLAST statistics, to rank homologs and thus detect orthologs. RSD barely increased the number of orthologs detected over those found with RBH. Error estimates, based on analyses of conservation of gene order, found small differences in the quality of orthologs detected using RBH. However, RSD showed the highest error rates. Thus, RSD have no advantages over RBH.
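
The Reciprocal Best Hits definition in this abstract (two genes, one per genome, that each find the other as the best hit) can be sketched as below. This is a minimal sketch under stated assumptions: the hit tuples, gene names, and function names are hypothetical, hits are ranked by a single score where higher is better, and ties keep the first hit seen. It does not reproduce the BLAST filtering or alignment options tested in the paper.

```python
from typing import Dict, Iterable, Set, Tuple

# (query gene, subject gene, score); higher score means a better hit.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query gene to its highest-scoring subject gene."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        # Strict '>' means the first hit seen wins when scores tie.
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(
    a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]
) -> Set[Tuple[str, str]]:
    """Return (gene in A, gene in B) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return {(a, b) for a, b in best_ab.items() if best_ba.get(b) == a}


if __name__ == "__main__":
    # Hypothetical search results in both directions.
    a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0), ("a2", "b2", 90.0)]
    b_vs_a = [("b1", "a1", 210.0), ("b2", "a1", 140.0), ("b2", "a2", 95.0)]
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In the hypothetical data, a2's best hit is b2, but b2's best hit is a1, so that pair is not reciprocal and is excluded.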

Datamonkey: rapid detection of selective pressure on individual sites of codon alignments

by Sergei L. Kosakovsky Pond, Simon D. W. Frost - Bioinformatics, 2005
Cited by 71 (7 self)