Results 1–10 of 124
Model Selection and Model Averaging in Phylogenetics: Advantages of Akaike Information Criterion and Bayesian Approaches Over Likelihood Ratio Tests
, 2004
Abstract

Cited by 407 (8 self)
Model selection is a topic of special relevance in molecular phylogenetics that affects many, if not all, stages of phylogenetic inference. Here we discuss some fundamental concepts and techniques of model selection in the context of phylogenetics. We start by reviewing different aspects of the selection of substitution models in phylogenetics from a theoretical, philosophical and practical point of view, and summarize this comparison in table format. We argue that the most commonly implemented model selection approach, the hierarchical likelihood ratio test, is not the optimal strategy for model selection in phylogenetics, and that approaches like the Akaike Information Criterion (AIC) and Bayesian methods offer important advantages. In particular, the latter two methods are able to simultaneously compare multiple nested or non-nested models, assess model selection uncertainty, and allow for the estimation of phylogenies and model parameters using all available models (model-averaged or multimodel inference). We also describe how the relative importance of the different parameters included in substitution models can be depicted. To illustrate some of these points, we have applied AIC-based model averaging to 37 mitochondrial DNA sequences from the subgenus Ohomopterus (genus Carabus) ground beetles described by Sota and Vogler (2001).
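The AIC-based model averaging this abstract describes can be sketched in a few lines: compute each model's AIC from its maximized log-likelihood and parameter count, then turn AIC differences into Akaike weights for model-averaged inference. The model names and log-likelihood values below are illustrative stand-ins, not figures from the Ohomopterus data set.

```python
import math

def akaike_weights(models):
    """models: dict name -> (max log-likelihood, n free parameters).
    Returns (AIC per model, Akaike weight per model)."""
    aic = {m: 2 * k - 2 * lnL for m, (lnL, k) in models.items()}
    best = min(aic.values())
    rel = {m: math.exp(-(a - best) / 2) for m, a in aic.items()}  # relative likelihoods
    total = sum(rel.values())
    return aic, {m: r / total for m, r in rel.items()}

# Hypothetical fits of three substitution models (numbers invented for illustration)
models = {"JC69": (-2450.0, 0), "HKY85": (-2410.0, 4), "GTR": (-2408.0, 8)}
aic, w = akaike_weights(models)
```

Here HKY85 gets most of the weight, but GTR keeps a non-negligible share; that residual spread is exactly the model selection uncertainty that model averaging carries forward instead of discarding.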
Key Concepts in Model Selection: Performance and Generalizability
 Journal of Mathematical Psychology
, 2000
Abstract

Cited by 72 (13 self)
methods of model selection, and how do they work? Which methods perform better than others, and in what circumstances? These questions rest on a number of key concepts in a relatively underdeveloped field. The aim of this essay is to explain some background concepts, highlight some of the results in this special issue, and to add my own. The standard methods of model selection include classical hypothesis testing, maximum likelihood, Bayes method, minimum description length, cross-validation and Akaike’s information criterion. They all provide an implementation of Occam’s razor, in which parsimony or simplicity is balanced against goodness-of-fit. These methods primarily take account of the sampling errors in parameter estimation, although their relative success at this task depends on the circumstances. However, the aim of model selection should also include the ability of a model to generalize to predictions in a different domain. Errors of extrapolation, or generalization, are different from errors of parameter estimation. So, it seems that simplicity and parsimony may be an additional factor in managing these errors, in which case the standard methods of model selection are incomplete implementations of Occam’s razor.

1. WHAT IS MODEL SELECTION? William of Ockham (1285–1347/49) will always be remembered for his famous postulation of Ockham’s razor (also spelled ‘Occam’), which states that entities are not to be multiplied beyond necessity. In a similar vein, Sir Isaac Newton’s first rule of hypothesizing instructs us that we are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances. (This paper is derived from a presentation at the Methods of Model Selection symposium at Indiana University.)
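The distinction the essay draws between errors of parameter estimation and errors of extrapolation can be seen in a toy regression: a straight line (two parameters) never fits a training sample worse than a constant (one parameter), yet when the truth really is constant, the line's spurious slope makes its far-out predictions drift. The data-generating process below is invented purely for illustration.

```python
import random

random.seed(7)
xs = [i / 10 for i in range(30)]                  # training inputs in [0, 2.9]
ys = [2.0 + random.gauss(0, 0.5) for _ in xs]     # truth: constant y = 2, plus noise

# Least-squares line y = a + b*x (the constant model is the special case b = 0)
mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = my - b * mx

# In-sample (parameter-estimation) error: the line can only do at least as well,
# because the constant model is nested inside it.
mse_const = sum((y - my) ** 2 for y in ys) / len(ys)
mse_line = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(ys)

# Extrapolation error: far outside the training range, the spurious slope b is
# amplified, while the constant model stays near the true value of 2.
pred_line_far, pred_const_far = a + b * 100.0, my
```

The in-sample inequality holds by construction; the extrapolation gap grows with distance from the training range, which is the generalization failure that fit-based scores alone do not penalize.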
Studies in Bayesian Confirmation Theory
, 2001
Abstract

Cited by 34 (8 self)
According to Bayesian confirmation theory, evidence E (incrementally) confirms (or supports) a hypothesis H (roughly) just in case E and H are positively probabilistically correlated (under an appropriate probability function Pr). There are many logically equivalent ways of saying that E and H are correlated under Pr. Surprisingly, this leads to a plethora of non-equivalent quantitative measures of the degree to which E confirms H (under Pr). In fact, many non-equivalent Bayesian measures of the degree to which E confirms (or supports) H have been proposed and defended in the literature on inductive logic. I provide a thorough historical survey of the various proposals, and a detailed discussion of the philosophical ramifications of the differences between them. I argue that the set of candidate
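The non-equivalence of confirmation measures is easy to see numerically: two standard measures, the difference and the log-ratio, agree that E confirms H whenever Pr(H|E) > Pr(H), yet can disagree about which of two pieces of evidence confirms more strongly. A minimal sketch with made-up probabilities:

```python
import math

def difference(pH, pH_given_E):
    return pH_given_E - pH               # d(H, E) = Pr(H|E) - Pr(H)

def log_ratio(pH, pH_given_E):
    return math.log(pH_given_E / pH)     # r(H, E) = log[Pr(H|E) / Pr(H)]

case1 = (0.50, 0.90)   # Pr(H) rises from 0.50 to 0.90 given E (illustrative)
case2 = (0.01, 0.20)   # Pr(H) rises from 0.01 to 0.20 given E (illustrative)
d1, d2 = difference(*case1), difference(*case2)
r1, r2 = log_ratio(*case1), log_ratio(*case2)
# All four values are positive, so both measures agree qualitatively that E
# confirms H; but d ranks case 1 as the stronger confirmation while r ranks
# case 2 higher -- the ordinal disagreement the abstract describes.
```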
Bayes not Bust! Why Simplicity is no Problem for Bayesians
, 2007
Abstract

Cited by 21 (10 self)
The advent of formal definitions of the simplicity of a theory has important implications for model selection. But what is the best way to define simplicity? Forster and Sober ([1994]) advocate the use of Akaike’s Information Criterion (AIC), a non-Bayesian formalisation of the notion of simplicity. This forms an important part of their wider attack on Bayesianism in the philosophy of science. We defend a Bayesian alternative: the simplicity of a theory is to be characterised in terms of Wallace’s Minimum Message Length (MML). We show that AIC is inadequate for many statistical problems where MML performs well. Whereas MML is always defined, AIC can be undefined. Whereas MML is not known ever to be statistically inconsistent, AIC can be. Even when defined and consistent, AIC performs worse than MML on small sample sizes. MML is statistically invariant under 1-to-1 reparametrisation, thus avoiding a common criticism of Bayesian approaches. We also show that MML provides answers to many of Forster’s objections to Bayesianism. Hence an important part of the attack on
The Contest Between Parsimony and Likelihood
Abstract

Cited by 18 (0 self)
In a “classic” phylogenetic inference problem, the observed taxa are assumed to be the leaves of a bifurcating tree and the goal is to infer just the “topology” of the tree (i.e., the formal tree structure linking the extant taxa at the tips), not the amount of time between branching events, the amount of evolution that has taken place on branches, or the character states of interior vertices. Two of the main methods that biologists now use to solve such problems are maximum likelihood (ML) and maximum parsimony (MP); distance methods constitute a third approach, which will not be discussed here. ML seeks to find the tree topology that confers the highest probability on the observed characteristics of tip species. MP seeks to find the tree topology that requires the fewest changes in character state to produce the characteristics of those tip species. Besides saying what the “best” tree is for a given data set, both methods also provide an ordering of trees, from best to worst. The two methods sometimes disagree about this ordering—most vividly, when they disagree about which tree is best supported by the evidence. For this reason, biologists have had to address this methodological dispute head on, rather than setting it aside as a merely “philosophical” dispute of dubious relevance to scientists “in the trenches.” The main objection that has been made against ML is that it requires the adoption of a model of the evolutionary process that one has scant reason to think is true. ML requires a process model because hypotheses that specify a tree topology (and nothing more) do not, by themselves, confer probabilities on the observations. The situation here is familiar to philosophers as an instance of “Duhem’s Thesis.” Pierre Duhem was a French philosopher of science who contended that physical theories do not entail claims about observations unless they are supplemented with auxiliary assumptions (Duhem,
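As a concrete illustration of the MP criterion, the Fitch small-parsimony algorithm counts the minimum number of state changes a fixed topology requires to explain the tip states. A minimal sketch for a single binary character on four hypothetical taxa (not an example from the paper):

```python
def fitch_score(tree):
    """tree: a leaf state (str) or a 2-tuple of subtrees.
    Returns (possible ancestral state set, minimum change count)."""
    if isinstance(tree, str):                    # leaf: its observed state
        return {tree}, 0
    (ls, lc), (rs, rc) = fitch_score(tree[0]), fitch_score(tree[1])
    inter = ls & rs
    if inter:                                    # children can agree: no change
        return inter, lc + rc
    return ls | rs, lc + rc + 1                  # disagreement costs one change

# Character states: taxon A = "0", B = "0", C = "1", D = "1"
t1 = (("0", "0"), ("1", "1"))   # topology ((A,B),(C,D))
t2 = (("0", "1"), ("0", "1"))   # topology ((A,C),(B,D))
s1 = fitch_score(t1)[1]          # needs only 1 change
s2 = fitch_score(t2)[1]          # needs 2 changes, so MP prefers t1
```

Repeating this count over every character and every candidate topology yields exactly the best-to-worst ordering of trees that the passage attributes to MP.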
Why Likelihood?
 The Nature of Scientific Evidence
, 1980
Abstract

Cited by 14 (6 self)
The Likelihood Principle has been defended on Bayesian grounds, on the grounds that it coincides with and systematizes intuitive judgments about example problems, and by appeal to the fact that it generalizes what is true when hypotheses have deductive consequences about observations. Here we divide the Principle into two parts, one qualitative and the other quantitative, and evaluate each in the light of the Akaike information criterion. Both turn out to be correct in a special case (when the competing hypotheses have the same number of adjustable parameters), but not otherwise.
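The special case is easy to verify numerically: when two hypotheses have the same number of adjustable parameters, the AIC ordering reduces to the likelihood ordering, so the Likelihood Principle and AIC agree; with unequal parameter counts the complexity penalty can reverse it. All log-likelihood and parameter values below are invented for illustration.

```python
def aic(lnL, k):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * lnL

# Same number of adjustable parameters: AIC agrees with the likelihood ordering.
k = 3
lnL_H1, lnL_H2 = -100.0, -105.0
same_k_agrees = (aic(lnL_H1, k) < aic(lnL_H2, k)) == (lnL_H1 > lnL_H2)

# Different parameter counts: a slightly better fit bought with many more
# parameters can lose on AIC despite winning on likelihood.
lnL_H3, k3 = -99.0, 10
order_reversed = (lnL_H3 > lnL_H1) and (aic(lnL_H3, k3) > aic(lnL_H1, k))
```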
Model selection in seismic hazard analysis: An information‐theoretic perspective
 Bull. Seism. Soc. Am.
, 2009
Abstract

Cited by 13 (4 self)
Although the methodological framework of probabilistic seismic hazard analysis is well established, the selection of models to predict the ground motion at the sites of interest remains a major challenge. Information theory provides a powerful theoretical framework that can guide this selection process in a consistent way. From an information-theoretic perspective, the appropriateness of models can be expressed in terms of their relative information loss (Kullback–Leibler distance) and hence in physically meaningful units (bits). In contrast to hypothesis testing, information-theoretic model selection does not require ad hoc decisions regarding significance levels nor does it require the models to be mutually exclusive and collectively exhaustive. The key ingredient, the Kullback–Leibler distance, can be estimated from the statistical expectation of log-likelihoods of observations for the models under consideration. In the present study, data-driven ground-motion model selection based on Kullback–Leibler-distance differences is illustrated for a set of simulated observations of response spectra and macroseismic intensities. Information theory allows for a unified treatment of both quantities. The application of Kullback–Leibler-distance-based model selection to real data using the model generating data set for the Abrahamson and Silva (1997) ground-motion model demonstrates the superior performance of the information-theoretic perspective in comparison to earlier attempts at data-driven model selection (e.g., Scherbaum et al., 2004).
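The key estimation step the abstract names, recovering Kullback–Leibler-distance differences from mean log-likelihoods of observations, can be sketched with simple normal densities standing in for ground-motion models (the generator and both candidates below are invented): the unknown entropy of the true process cancels when two models are compared, so the difference of mean log-likelihoods estimates the difference of their KL distances to the truth.

```python
import math
import random

def log_normal_pdf(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)

random.seed(1)
data = [random.gauss(0.0, 1.0) for _ in range(20000)]   # "true" process: N(0, 1)

# Mean log-likelihood under each candidate model; higher mean log-likelihood
# means smaller KL distance to the truth.
ll_a = sum(log_normal_pdf(x, 0.1, 1.0) for x in data) / len(data)  # close model
ll_b = sum(log_normal_pdf(x, 2.0, 1.0) for x in data) / len(data)  # distant model

kl_diff_nats = ll_a - ll_b                 # estimates KL(truth||b) - KL(truth||a)
kl_diff_bits = kl_diff_nats / math.log(2)  # expressed in bits, as in the paper
```

For normals with unit variance, KL(N(0,1)||N(mu,1)) = mu^2/2, so the analytic difference here is 2.0 - 0.005 = 1.995 nats, which the sample estimate approaches as the number of observations grows.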