Results 1–10 of 27
Bayesian Statistics in WWW
 Computing Science and Statistics
, 1989
Abstract

Cited by 33 (1 self)
This dissertation presents two topics from opposite disciplines: one is from a parametric realm and the other is based on nonparametric methods. The first topic is a jackknife maximum likelihood approach to statistical model selection, and the second is a convex hull peeling depth approach to nonparametric massive multivariate data analysis. The second topic includes simulations and applications on massive astronomical data. First, we present a model selection criterion that minimizes the Kullback-Leibler distance by using the jackknife method. Various model selection methods have been developed to choose a model of minimum Kullback-Leibler distance to the true model, such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC), minimum description length (MDL), and the bootstrap information criterion. Likewise, the jackknife method chooses a model of minimum Kullback-Leibler distance through bias reduction. This bias, which is inevitable in model ...
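The jackknife idea described in this abstract can be illustrated with a toy sketch: score each candidate model by its leave-one-out log-likelihood, a bias-reduced stand-in for the model's Kullback-Leibler risk, and keep the model with the larger score. The Gaussian candidates and the data below are our own illustration, not the dissertation's setup.

```python
# Leave-one-out (jackknife) model-selection sketch: larger leave-one-out
# log-likelihood corresponds to smaller estimated Kullback-Leibler distance.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=50)   # illustrative data

def loo_loglik_gaussian(x, fix_mean_zero=False):
    """Sum of leave-one-out predictive log-densities under a Gaussian fit."""
    total = 0.0
    for i in range(len(x)):
        xi, rest = x[i], np.delete(x, i)
        mu = 0.0 if fix_mean_zero else rest.mean()
        sigma = rest.std(ddof=1)
        total += norm.logpdf(xi, loc=mu, scale=sigma)
    return total

score_full = loo_loglik_gaussian(x)         # mean and variance both free
score_zero = loo_loglik_gaussian(x, True)   # mean fixed at 0 (misspecified)
print(score_full > score_zero)
```

Since the data were drawn with a nonzero mean, the jackknife score should favor the correctly specified model here.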
A Large-Sample Model Selection Criterion Based on Kullback's Symmetric Divergence
 Statistics &amp; Probability Letters
, 1999
Abstract

Cited by 26 (2 self)
The Akaike information criterion, AIC, is a widely known and extensively used tool for statistical model selection. AIC serves as an asymptotically unbiased estimator of a variant of Kullback's directed divergence between the true model and a fitted approximating model. The directed divergence is an asymmetric measure of separation between two statistical models, meaning that an alternate directed divergence may be obtained by reversing the roles of the two models in the definition of the measure. The sum of the two directed divergences is Kullback's symmetric divergence. Since the symmetric divergence combines the information in two related though distinct measures, it functions as a gauge of model disparity which is arguably more sensitive than either of its individual components. With this motivation, we propose a model selection criterion which serves as an asymptotically unbiased estimator of a variant of the symmetric divergence between the true model and a fitted approximating model. We examine the performance of the criterion relative to other well-known criteria in a simulation study. Keywords: AIC, Akaike information criterion, I-divergence, J-divergence, Kullback-Leibler information, relative entropy. Correspondence: Joseph E. Cavanaugh, Department of Statistics, 222 Math Sciences Bldg., University of Missouri, Columbia, MO 65211. † This research was supported by NSF grant DMS-9704436.
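The criterion proposed in this paper is often written KIC, and its large-sample form replaces AIC's penalty 2k with 3k to track the symmetric rather than the directed divergence. A minimal sketch, with a polynomial-regression setup of our own choosing:

```python
# AIC vs. a symmetric-divergence criterion (KIC-style): same goodness-of-fit
# term, heavier penalty per parameter. The data and models are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = np.linspace(-2, 2, n)
y = 1.0 + 0.5 * x + rng.normal(scale=1.0, size=n)   # true model: degree 1

def gaussian_loglik(resid, n):
    sigma2 = np.mean(resid ** 2)                    # ML variance estimate
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)

for degree in range(4):
    beta = np.polyfit(x, y, degree)
    resid = y - np.polyval(beta, x)
    k = degree + 2                                  # coefficients + variance
    ll = gaussian_loglik(resid, n)
    aic = -2 * ll + 2 * k
    kic = -2 * ll + 3 * k                           # symmetric-divergence penalty
    print(degree, round(aic, 1), round(kic, 1))
```

Because its penalty grows faster in k, the KIC-style criterion is more conservative than AIC about adding parameters.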
A Bootstrap Variant of AIC for State-Space Model Selection
 Statistica Sinica
, 1997
Abstract

Cited by 17 (6 self)
Following the recent work of Hurvich and Tsai (1989, 1991, 1993) and Hurvich, Shumway, and Tsai (1990), we propose a corrected variant of AIC developed for the purpose of small-sample state-space model selection. Our variant of AIC utilizes bootstrapping in the state-space framework (Stoffer and Wall (1991)) to provide an estimate of the expected Kullback-Leibler discrepancy between the model generating the data and a fitted approximating model. We present simulation results which demonstrate that in small-sample settings, our criterion estimates the expected discrepancy with less bias than traditional AIC and certain other competitors. As a result, our AIC variant serves as an effective tool for selecting a model of appropriate dimension. We present an asymptotic justification for our criterion in the Appendix.
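The full state-space machinery (Kalman filtering, Stoffer-Wall resampling) is too long for a sketch, but the underlying bootstrap idea can be shown on an i.i.d. Gaussian model: estimate the optimism of the maximized log-likelihood by a parametric bootstrap and penalize by twice that amount. This is an EIC-style correction in the same spirit; the paper's bootstrap AIC variant differs in detail.

```python
# Parametric-bootstrap estimate of log-likelihood optimism, as a stand-in
# for the fixed penalty k used by ordinary AIC. Model and data are ours.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
y = rng.normal(0.3, 1.0, size=30)

def fit(data):
    return data.mean(), data.std()          # ML estimates

def loglik(data, mu, sigma):
    return norm.logpdf(data, mu, sigma).sum()

mu_hat, sig_hat = fit(y)
B, optimism = 200, 0.0
for _ in range(B):
    y_star = rng.normal(mu_hat, sig_hat, size=len(y))   # parametric resample
    mu_b, sig_b = fit(y_star)
    # optimism: resample fit evaluated on its own data vs. on the original data
    optimism += loglik(y_star, mu_b, sig_b) - loglik(y, mu_b, sig_b)
optimism /= B

aic_boot = -2 * loglik(y, mu_hat, sig_hat) + 2 * optimism
aic      = -2 * loglik(y, mu_hat, sig_hat) + 2 * 2      # k = 2 parameters
print(round(aic_boot, 2), round(aic, 2))
```

For this toy model the bootstrap optimism should land near the parameter count k = 2, which is exactly the quantity AIC approximates asymptotically.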
Model Selection for Variable Length Markov Chains and Tuning the Context Algorithm
, 2000
Abstract

Cited by 15 (3 self)
We consider the model selection problem in the class of stationary variable length Markov chains (VLMC) on a finite space. The processes in this class are still Markovian of higher order, but with memory of variable length. Various aims in selecting a VLMC can be formalized with different non-equivalent risks, such as final prediction error or expected Kullback-Leibler information. We consider the asymptotic behavior of different risk functions and show how they can generally be estimated with the same resampling strategy. Such estimated risks then yield new model selection criteria. In particular, we obtain a data-driven tuning of Rissanen's tree-structured context algorithm, which is a computationally feasible procedure for selection and estimation of a VLMC. Key words and phrases: Bootstrap, zero-one loss, final prediction error, finite-memory source, FSMX model, Kullback-Leibler information, L2 loss, optimal tree pruning, resampling, tree model. Short title: Selecting variable length Mar...
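The flavor of the context algorithm described above can be sketched as follows: grow candidate contexts and prune a longer context whenever its empirical next-symbol distribution is close, in Kullback-Leibler divergence, to that of its parent. The cutoff below is an arbitrary illustration, not the tuned, resampling-based choice the paper develops.

```python
# Toy context-tree pruning for a binary chain. The chain is order-1, so
# length-2 contexts should add little beyond their length-1 parents.
import numpy as np
from collections import Counter

rng = np.random.default_rng(3)
seq, state = [], 0
for _ in range(2000):
    p1 = 0.8 if state == 1 else 0.2      # order-1 transition probabilities
    state = int(rng.random() < p1)
    seq.append(state)

def next_symbol_dist(seq, context):
    """Smoothed empirical distribution of the symbol following `context`."""
    k = len(context)
    counts = Counter(seq[i + k] for i in range(len(seq) - k)
                     if tuple(seq[i:i + k]) == context)
    n = sum(counts.values())
    return np.array([(counts.get(s, 0) + 0.5) / (n + 1.0) for s in (0, 1)])

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

CUTOFF = 0.01                            # illustrative threshold only
kept = []
for parent in [(0,), (1,)]:
    for child in [(0,) + parent, (1,) + parent]:   # extend memory one step back
        gain = kl(next_symbol_dist(seq, child), next_symbol_dist(seq, parent))
        if gain > CUTOFF:
            kept.append(child)           # longer context retained
print(kept)
```

Contexts whose divergence gain falls below the cutoff are pruned back to their parent, yielding a memory of variable length.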
Model Selection
 In The Handbook Of Financial Time Series
, 2008
Abstract

Cited by 14 (0 self)
Model selection has become a ubiquitous statistical activity in recent decades, not least due to the computational ease with which many statistical models can be fitted to data with the help of modern computing equipment. In this article we provide an introduction to the statistical aspects and implications of model selection and we review the relevant literature. 1.1 A General Formulation. When modeling data Y, a researcher often has available a menu of competing candidate models which could be used to describe the data. Let M denote the collection of these candidate models. Each model M, i.e., each element of M, can – from a mathematical point of view – be viewed as a collection of probability distributions for Y implied by the model. That is, M is given by M = {Pη : η ∈ H}, where Pη denotes a probability distribution for Y and H represents the 'parameter' space (which can be different across different models M). The 'parameter' space H need not be finite-dimensional. Often, the 'parameter' η will be partitioned into (η1, η2), where η1 is a finite-dimensional parameter whereas η2 is infinite-dimensional. In case the parameterization is identified, i.e., the map η → Pη is injective on H, we will often not distinguish between M and H and will use them synonymously. The model selection problem is now to select – based on the data Y – a model M̂ = M̂(Y) in M such that M̂ is a 'good' model for the data Y. Of course, the sense in which the selected model should be a 'good' model needs to be made precise and is a crucial point in the analysis. This is particularly important if – as is usually the case – selecting the model M̂ is not the final ...
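The general formulation can be made concrete: each candidate model M is a family of distributions {Pη}, and the selector returns M̂(Y) minimizing some criterion. The Gaussian families and the use of BIC below are our own illustrative choices:

```python
# Two candidate models for data Y, each a parametric family of Gaussians;
# the selected model M_hat minimizes BIC. Names and setup are illustrative.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
Y = rng.normal(1.5, 1.0, size=60)

def bic(loglik, k, n):
    return -2 * loglik + k * np.log(n)

def fit_free(Y):        # eta = (mu, sigma): two free parameters
    mu, s = Y.mean(), Y.std()
    return norm.logpdf(Y, mu, s).sum(), 2

def fit_zero_mean(Y):   # eta = sigma only: mean constrained to 0
    s = np.sqrt(np.mean(Y ** 2))
    return norm.logpdf(Y, 0.0, s).sum(), 1

models = {"free": fit_free, "zero-mean": fit_zero_mean}
n = len(Y)
M_hat = min(models, key=lambda name: bic(*models[name](Y), n))
print(M_hat)
```

Here the data have a clearly nonzero mean, so the unconstrained family wins despite BIC's extra penalty for its second parameter.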
Estimating a Difference of Kullback-Leibler Risks Using a Normalized Difference of AIC
 The Annals of Applied Statistics
Bootstrap Variants of the Akaike Information Criterion for Mixed Model Selection
 Comput. Stat. Data Anal.
, 2008
Abstract

Cited by 9 (2 self)
Two bootstrap-corrected variants of the Akaike information criterion are proposed for the purpose of small-sample mixed model selection. These two variants are asymptotically equivalent, and provide asymptotically unbiased estimators of the expected Kullback-Leibler discrepancy between the true model and a fitted candidate model. The performance of the criteria is investigated in a simulation study where the random effects and the errors for the true model are generated from a Gaussian distribution. The parametric bootstrap is employed. The simulation results suggest that both criteria provide effective tools for choosing a mixed model with an appropriate mean and covariance structure. A theoretical asymptotic justification for the variants is presented in the Appendix.
Tree-Structured GARCH Models
, 2000
Abstract

Cited by 8 (2 self)
We propose a new GARCH model with tree-structured multiple thresholds for volatility estimation in financial time series. The approach relies on the idea of a binary tree where every terminal node parameterizes a (local) GARCH model for a partition cell of the predictor space. Fitting of such trees is carried out within the likelihood framework for non-Gaussian observations: it is very different from the well-known CART procedure for regression, which is based on residual sums of squares. Our strategy includes the classical GARCH model as a special case and allows model complexity to be increased in a systematic and flexible way. We derive a consistency result and conclude with simulations and real data analysis that the new method has better predictive potential in comparison with other approaches. Keywords: Conditional variance; Financial time series; GARCH model; Maximum likelihood; Threshold model; Tree model; Volatility. 1 Introduction. We propose a new method for estimating volatility in ...
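A minimal flavor of the tree idea, assuming nothing beyond the abstract: a single split of the predictor space (here, the sign of the lagged return) with a separate GARCH(1,1) recursion in each cell. The parameter values are made up; the paper estimates them by maximum likelihood and selects the tree structure systematically.

```python
# Depth-1 threshold tree: each leaf carries its own GARCH(1,1) parameters
# (omega, alpha, beta), applied depending on the sign of the lagged return.
import numpy as np

rng = np.random.default_rng(5)
n = 500
r = np.zeros(n)               # simulated returns
sigma2 = np.ones(n)           # conditional variances
params = {"neg": (0.05, 0.15, 0.80),   # leaf for negative lagged return
          "pos": (0.02, 0.05, 0.90)}   # leaf for non-negative lagged return
for t in range(1, n):
    leaf = "neg" if r[t - 1] < 0 else "pos"
    omega, alpha, beta = params[leaf]
    sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
    r[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
print(round(sigma2.mean(), 3))
```

With a single leaf and equal parameters this collapses to the classical GARCH(1,1) model, which is the special case the abstract mentions.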
In-Sample and Out-of-Sample Fit: Their Joint Distribution and Its Implications for Model Selection
 Unpublished manuscript
, 2008
Abstract

Cited by 7 (0 self)
Prepared for the 5th ECB Workshop on Forecasting Techniques. We consider the case where a parameter, θ, is estimated by maximizing a criterion function, Q(X; θ). The estimate is then used to evaluate the criterion function with the same data, X, as well as with an independent data set, Y. The in-sample fit and out-of-sample fit relative to that of θ0, the "true" parameter, are given by Tx,x = ...
In-Sample Fit and Out-of-Sample Fit: Their Joint Distribution and Its Implications for Model Selection
 mimeo
, 2009
Abstract

Cited by 6 (1 self)
We consider the case where a parameter, θ, is estimated by maximizing a criterion function, Q(X; θ). The estimate, θ̂ = θ̂(X), is then used to evaluate the criterion function with the same data, X, as well as with an independent data set, Y. The in-sample fit and out-of-sample fit relative to that of the true, or quasi-true, parameter, θ0, are defined by T = Q(X; θ̂) − Q(X; θ0) and T̃ = Q(Y; θ̂) − Q(Y; θ0), respectively. We derive the joint limit distribution of (T, T̃) for a broad class of criterion functions, and the joint distribution reveals that T and T̃ are strongly negatively related. The implication is that good in-sample fit translates into poor out-of-sample fit, one-to-one. The result exposes a winner's curse problem when multiple models are compared in terms of their in-sample fit. The winner's curse has important implications for model selection by standard information criteria such as AIC and BIC.
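The negative relation between in-sample and out-of-sample fit claimed here is easy to reproduce by Monte Carlo. The setup below (Gaussian mean with known variance, Q the log-likelihood) is our own choice of criterion function:

```python
# Monte Carlo: in-sample advantage of the MLE over the true parameter vs.
# its out-of-sample counterpart, across independent replications.
import numpy as np

rng = np.random.default_rng(6)
n, reps = 50, 2000
mu0 = 0.0

def Q(data, mu):                    # Gaussian log-likelihood, sigma = 1
    return -0.5 * np.sum((data - mu) ** 2)

ins, outs = [], []
for _ in range(reps):
    X = rng.normal(mu0, 1.0, n)
    Y = rng.normal(mu0, 1.0, n)
    mu_hat = X.mean()               # maximizes Q(X, .)
    ins.append(Q(X, mu_hat) - Q(X, mu0))    # in-sample advantage (>= 0)
    outs.append(Q(Y, mu_hat) - Q(Y, mu0))   # out-of-sample counterpart
corr = np.corrcoef(ins, outs)[0, 1]
print(round(np.mean(ins), 3), round(np.mean(outs), 3), round(corr, 3))
```

The in-sample gains average out positive while the out-of-sample ones average out negative, and the two are clearly negatively correlated across replications, which is the winner's-curse mechanism in miniature.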