Results 1–10 of 351
Econometric methods for fractional response variables with an application to 401(k) plan participation rates
1996
"... We develop attractive functional forms and simple quasilikelihood estimation methods for regression models with a fractional dependent variable. Compared with logodds type procedures, there is no difficulty in recovering the regression function for the fractional variable, and there is no need to ..."
Cited by 442 (8 self)
Abstract:
We develop attractive functional forms and simple quasi-likelihood estimation methods for regression models with a fractional dependent variable. Compared with log-odds type procedures, there is no difficulty in recovering the regression function for the fractional variable, and there is no need to use ad hoc transformations to handle data at the extreme values of zero and one. We also offer some new, robust specification tests by nesting the logit or probit function in a more general functional form. We apply these methods to a data set of employee participation rates in 401(k) pension plans.
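The Papke–Wooldridge estimator can be sketched as a Bernoulli quasi-likelihood (QMLE) logit fitted by Newton steps; the simulated Beta-distributed shares below are illustrative, not the paper's 401(k) data.

```python
import numpy as np

def fractional_logit(X, y, iters=50):
    """Bernoulli quasi-likelihood logit for a fractional outcome y in [0, 1]."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1 - p)
        # Newton step: (X'WX)^{-1} X'(y - p)
        step = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([0.3, -0.8])
mu = 1.0 / (1.0 + np.exp(-X @ true_beta))
# fractional outcomes (shares, rates): Beta draws with conditional mean mu
y = rng.beta(5 * mu, 5 * (1 - mu))
beta_hat = fractional_logit(X, y)
```

The QMLE is consistent whenever the conditional mean is correctly specified as a logit, regardless of the Beta noise assumed here for the simulation.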
Bayesian measures of model complexity and fit
Journal of the Royal Statistical Society, Series B, 2002
"... [Read before The Royal Statistical Society at a meeting organized by the Research ..."
Abstract

Cited by 436 (4 self)
 Add to MetaCart
[Read before The Royal Statistical Society at a meeting organized by the Research
Flexible smoothing with B-splines and penalties
Statistical Science, 1996
"... Bsplines are attractive for nonparametric modelling, but choosing the optimal number and positions of knots is a complex task. Equidistant knots can be used, but their small and discrete number allows only limited control over smoothness and fit. We propose to use a relatively large number of knots ..."
Cited by 395 (6 self)
Abstract:
B-splines are attractive for nonparametric modelling, but choosing the optimal number and positions of knots is a complex task. Equidistant knots can be used, but their small and discrete number allows only limited control over smoothness and fit. We propose to use a relatively large number of knots and a difference penalty on coefficients of adjacent B-splines. We show connections to the familiar spline penalty on the integral of the squared second derivative. A short overview of B-splines, their construction, and penalized likelihood is presented. We discuss properties of penalized B-splines and propose various criteria for the choice of an optimal penalty parameter. Nonparametric logistic regression, density estimation and scatterplot smoothing are used as examples. Some details of the computations are presented.
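A minimal P-spline sketch in the spirit of this proposal: an equally spaced cubic B-spline basis (Cox–de Boor recursion) with a second-order difference penalty on adjacent coefficients. The knot count and penalty value are illustrative choices, not the paper's.

```python
import numpy as np

def bspline_basis(x, xl, xr, nseg=20, deg=3):
    """Equally spaced B-spline design matrix via the Cox-de Boor recursion."""
    dx = (xr - xl) / nseg
    t = xl + dx * np.arange(-deg, nseg + deg + 1)          # knot sequence
    B = ((t[:-1] <= x[:, None]) & (x[:, None] < t[1:])).astype(float)
    for d in range(1, deg + 1):
        Bn = np.zeros((len(x), B.shape[1] - 1))
        for i in range(Bn.shape[1]):
            Bn[:, i] = ((x - t[i]) / (t[i + d] - t[i]) * B[:, i]
                        + (t[i + d + 1] - x) / (t[i + d + 1] - t[i + 1]) * B[:, i + 1])
        B = Bn
    return B

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 200))
y_true = np.sin(2 * np.pi * x)
y = y_true + 0.1 * rng.normal(size=200)

Bmat = bspline_basis(x, 0.0, 1.0 + 1e-9, nseg=20, deg=3)   # 23 basis functions
D = np.diff(np.eye(Bmat.shape[1]), n=2, axis=0)            # 2nd-order differences
lam = 1.0                                                  # penalty parameter
alpha = np.linalg.solve(Bmat.T @ Bmat + lam * D.T @ D, Bmat.T @ y)
y_hat = Bmat @ alpha
```

Increasing `lam` shrinks the fit toward a straight line (the null space of second differences); `lam` would normally be chosen by a criterion such as AIC or cross-validation, as the abstract suggests.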
Finding Approximate POMDP Solutions Through Belief Compression
2003
"... Standard value function approaches to finding policies for Partially Observable Markov Decision Processes (POMDPs) are generally considered to be intractable for large models. The intractability of these algorithms is to a large extent a consequence of computing an exact, optimal policy over the ent ..."
Cited by 82 (3 self)
Abstract:
Standard value function approaches to finding policies for Partially Observable Markov Decision Processes (POMDPs) are generally considered to be intractable for large models. The intractability of these algorithms is to a large extent a consequence of computing an exact, optimal policy over the entire belief space. However, in real-world POMDP problems, computing the optimal policy for the full belief space is often unnecessary for good control, even for problems with complicated policy classes. The beliefs experienced by the controller often lie near a structured, low-dimensional manifold embedded in the high-dimensional belief space. Finding a good approximation to the optimal value function for only this manifold can be much easier than computing the full value function. We introduce a new method for solving large-scale POMDPs by reducing the dimensionality of the belief space. We use Exponential family Principal Components Analysis (Collins, Dasgupta, & Schapire, 2002) to represent sparse, high-dimensional belief spaces using low-dimensional sets of learned features of the belief state. We then plan only in terms of the low-dimensional belief features. By planning in this low-dimensional space, we can find policies for POMDP models that are orders of magnitude larger than models that can be handled by conventional techniques. We demonstrate the use of this algorithm on a synthetic problem and on mobile robot navigation tasks.
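As a simplified stand-in for the paper's Exponential family PCA, ordinary PCA already illustrates the belief-compression idea: beliefs that in fact vary along one latent parameter compress to a handful of learned features. The Gaussian-bump beliefs below are a synthetic illustration, not the paper's robot data.

```python
import numpy as np

rng = np.random.default_rng(2)
states = np.arange(100)                        # a 100-state world
centers = rng.uniform(10, 90, size=500)        # one latent parameter per belief
B = np.exp(-0.5 * ((states[None, :] - centers[:, None]) / 5.0) ** 2)
B /= B.sum(axis=1, keepdims=True)              # rows are beliefs (sum to 1)

# ordinary PCA as a simplified stand-in for E-PCA
mu = B.mean(axis=0)
U, s, Vt = np.linalg.svd(B - mu, full_matrices=False)
k = 10
Z = (B - mu) @ Vt[:k].T                        # low-dimensional belief features
B_hat = Z @ Vt[:k] + mu                        # reconstruction from the features
explained = (s[:k] ** 2).sum() / (s ** 2).sum()
```

Planning would then operate on the 10-dimensional `Z` rather than the 100-dimensional simplex; E-PCA additionally keeps the reconstructions non-negative and normalized, which linear PCA does not guarantee.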
Mapping spatial pattern in biodiversity for regional conservation planning: Where from here?
Systematic Biology 51:331–363
"... Abstract.—Vast gaps in available information on the spatial distribution of biodiversity pose a major challenge for regional conservation planning in many parts of the world. This problem is often addressed by basing such planning on various biodiversity surrogates. In some situations, distribution ..."
Cited by 61 (6 self)
Abstract.—Vast gaps in available information on the spatial distribution of biodiversity pose a major challenge for regional conservation planning in many parts of the world. This problem is often addressed by basing such planning on various biodiversity surrogates. In some situations, distributional data for selected taxa may be used as surrogates for biodiversity as a whole. However, this approach is less effective in data-poor regions, where there may be little choice but to base conservation planning on some form of remote environmental mapping, derived, for example, from interpretation of satellite imagery or from numerical classification of abiotic environmental layers. Although this alternative approach confers obvious benefits in terms of cost-effectiveness and rapidity of application, problems may arise if congruence is poor between mapped land-classes and actual biological distributions. I propose three strategies for making more effective use of available biological data and knowledge to alleviate such problems by (1) more closely integrating biological and environmental data through predictive modeling, with increased emphasis on modeling collective properties of biodiversity rather than individual entities; (2) making more rigorous use of remotely mapped surrogates in conservation planning by incorporating knowledge of heterogeneity within land-classes, and of varying levels of distinctiveness between classes, into measures of conservation priority and achievement;
On the Misuses of Artificial Neural Networks for Prognostic and Diagnostic Classification in Oncology
Statistics in Medicine, 2000
"... The application of artificial neural networks (ANNs) for prognostic and diagnostic classification in clinical medicine has become very popular. Some indications might be derived from a recent "miniseries" in the Lancet 7,23,30,94 with three more or less enthusiastic review articles and ..."
Cited by 53 (0 self)
Abstract:
The application of artificial neural networks (ANNs) for prognostic and diagnostic classification in clinical medicine has become very popular. Some indications might be derived from a recent "mini-series" in the Lancet 7,23,30,94 with three more or less enthusiastic review articles and an additional commentary expressing at least some scepticism. In this paper, the essentials of feed-forward neural networks and their statistical counterparts (e.g. logistic regression models) are reviewed. We point to serious problems of ANNs, such as the fitting of implausible functions to describe the probability of class membership and the underestimation of misclassification probabilities. In applications of ANNs to survival data, many suggested procedures result in predicted survival probabilities which are not necessarily monotone functions of time and lack a proper incorporation of censored observations. Finally, the results of a search in the medical literature from 1991 to 1995 on applications of A...
Bayesian Modelling of Inseparable Space-Time Variation in Disease Risk
1998
"... This paper proposes a unified framework for the analysis of incidence or mortality data in space and time. The problem with such analysis is that the number of cases and the corresponding population at risk in any single unit of space \Theta time are too small to produce a reliable estimate of the u ..."
Cited by 53 (2 self)
Abstract:
This paper proposes a unified framework for the analysis of incidence or mortality data in space and time. The problem with such analysis is that the number of cases and the corresponding population at risk in any single unit of space × time are too small to produce a reliable estimate of the underlying disease risk without "borrowing strength" from neighbouring cells. The goal here could be described as one of smoothing, in which both spatial and non-spatial considerations may arise, and spatio-temporal interactions may become an important feature. Based on an extended version of the main effects model proposed in Knorr-Held and Besag (1998), four generic types of space × time interactions are introduced. Each type implies a certain degree of prior (in)dependence for interaction parameters, and corresponds to the product of one of the two spatial main effects with one of the two temporal main effects. Data analysis is implemented via Markov chain Monte Carlo methods. The methodology is illustrated by an analysis of Ohio lung cancer data 1968–88. We compare the fit and the complexity of each model by the DIC criterion, recently proposed in Spiegelhalter et al. (1998).
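The four interaction types can be written as Kronecker products of spatial and temporal structure matrices. The sketch below uses a first-order random-walk structure for time and, as a stand-in for a real map's adjacency, the same chain structure for space; the resulting rank deficiencies reflect the prior (in)dependence each type implies.

```python
import numpy as np

def rw1_structure(n):
    """Structure (precision) matrix of a first-order random walk: D'D."""
    D = np.diff(np.eye(n), axis=0)                # first-difference matrix
    return D.T @ D

T, S = 6, 5
R_time = rw1_structure(T)                         # temporal main-effect structure
R_space = rw1_structure(S)                        # chain graph standing in for a map
I_T, I_S = np.eye(T), np.eye(S)

# the four space x time interaction types as Kronecker products
interaction = {
    "I":   np.kron(I_T, I_S),                     # fully exchangeable interaction
    "II":  np.kron(R_time, I_S),                  # temporally structured
    "III": np.kron(I_T, R_space),                 # spatially structured
    "IV":  np.kron(R_time, R_space),              # inseparable space-time structure
}
ranks = {k: np.linalg.matrix_rank(v) for k, v in interaction.items()}
```

Since rank(A ⊗ B) = rank(A)·rank(B), type IV has rank (T−1)(S−1) = 20 here, which is the number of identifiable interaction contrasts once the main effects absorb the rest.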
Bayesian Deviance, the Effective Number of Parameters, and the Comparison of Arbitrarily Complex Models
1998
"... We consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined. We follow Dempster in examining the posterior distribution of the loglikelihood under each model, from which we derive measures of fit and complexity (the effective number of p ..."
Cited by 51 (8 self)
Abstract:
We consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined. We follow Dempster in examining the posterior distribution of the log-likelihood under each model, from which we derive measures of fit and complexity (the effective number of parameters). These may be combined into a Deviance Information Criterion (DIC), which is shown to have an approximate decision-theoretic justification. Analytic and asymptotic identities reveal the measure of complexity to be a generalisation of a wide range of previous suggestions, with particular reference to the neural network literature. The contributions of individual observations to fit and complexity can give rise to a diagnostic plot of deviance residuals against leverages. The procedure is illustrated in a number of examples, and throughout it is emphasised that the required quantities are trivial to compute in a Markov chain Monte Carlo analysis, and require no analytic work for new...
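From MCMC output, DIC needs only the deviance evaluated at each draw: the effective number of parameters is pD = mean deviance minus deviance at the posterior mean, and DIC = mean deviance + pD. A toy conjugate-normal sketch, where the exact posterior stands in for an MCMC sample:

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(1.0, 1.0, size=50)                 # data, known sigma = 1
n = len(y)
# exact posterior for mu under a flat prior: N(ybar, 1/n), standing in for MCMC draws
mu_draws = rng.normal(y.mean(), 1 / np.sqrt(n), size=4000)

def deviance(mu):
    # -2 * log-likelihood of N(mu, 1)
    return np.sum((y - mu) ** 2) + n * np.log(2 * np.pi)

D = np.array([deviance(m) for m in mu_draws])
D_bar = D.mean()                                  # posterior mean deviance: fit
p_D = D_bar - deviance(mu_draws.mean())           # effective number of parameters
DIC = D_bar + p_D
```

With one free parameter, pD comes out close to 1, matching the "effective number of parameters" interpretation; in hierarchical models pD is typically non-integer.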
Piecewise-polynomial regression trees
Statistica Sinica, 1994
"... A nonparametric function 1 estimation method called SUPPORT (“Smoothed and Unsmoothed PiecewisePolynomial Regression Trees”) is described. The estimate is typically made up of several pieces, each piece being obtained by fitting a polynomial regression to the observations in a subregion of the data ..."
Cited by 49 (8 self)
Abstract:
A nonparametric function estimation method called SUPPORT ("Smoothed and Unsmoothed Piecewise-Polynomial Regression Trees") is described. The estimate is typically made up of several pieces, each piece being obtained by fitting a polynomial regression to the observations in a subregion of the data space. Partitioning is carried out recursively as in a tree-structured method. If the estimate is required to be smooth, the polynomial pieces may be glued together by means of weighted averaging. The smoothed estimate is thus obtained in three steps. In the first step, the regressor space is recursively partitioned until the data in each piece are adequately fitted by a polynomial of a fixed order. Partitioning is guided by analysis of the distributions of residuals and cross-validation estimates of prediction mean square error. In the second step, the data within a neighborhood of each partition are fitted by a polynomial. The final estimate of the regression function is obtained by averaging the polynomial pieces, using smooth weight functions each of which diminishes rapidly to zero outside its associated partition. Estimates of derivatives of the regression function may be
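A toy univariate version of the recursive-partitioning step (without SUPPORT's smoothing by weighted averaging): split until each piece is adequately fitted by a line. The function names, the median split, and the stopping rule are illustrative simplifications, not the paper's residual-analysis criteria.

```python
import numpy as np

def fit_tree(x, y, tol=0.05, min_n=20):
    """Recursively partition [min(x), max(x)] until a linear fit is adequate."""
    coef = np.polyfit(x, y, 1)
    rmse = np.sqrt(np.mean((y - np.polyval(coef, x)) ** 2))
    if rmse < tol or len(x) < 2 * min_n:
        return [(x.min(), x.max(), coef)]          # leaf: interval + linear piece
    split = np.median(x)
    left = x <= split
    return (fit_tree(x[left], y[left], tol, min_n)
            + fit_tree(x[~left], y[~left], tol, min_n))

def predict(tree, xq):
    # nearest-interval lookup (covers small gaps between adjacent leaves)
    lo, hi, coef = min(tree, key=lambda leaf: max(leaf[0] - xq, xq - leaf[1], 0.0))
    return np.polyval(coef, xq)

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 1, 400))
y = np.sin(2 * np.pi * x) + 0.02 * rng.normal(size=400)
tree = fit_tree(x, y)
resid = np.array([predict(tree, xi) for xi in x]) - np.sin(2 * np.pi * x)
```

SUPPORT's second and third steps would refit each piece on an enlarged neighborhood and blend the pieces with smooth weight functions, which removes the jumps this unsmoothed version has at partition boundaries.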
Loss Functions for Binary Class Probability Estimation and Classification: Structure and Applications. Manuscript, available at www-stat.wharton.upenn.edu/~buja
2005
"... What are the natural loss functions or fitting criteria for binary class probability estimation? This question has a simple answer: socalled “proper scoring rules”, that is, functions that score probability estimates in view of data in a Fisherconsistent manner. Proper scoring rules comprise most ..."
Cited by 48 (1 self)
Abstract:
What are the natural loss functions or fitting criteria for binary class probability estimation? This question has a simple answer: so-called "proper scoring rules", that is, functions that score probability estimates in view of data in a Fisher-consistent manner. Proper scoring rules comprise most loss functions currently in use: log-loss, squared error loss, boosting loss, and, as limiting cases, cost-weighted misclassification losses. Proper scoring rules have a rich structure:
• Every proper scoring rule is a mixture (limit of sums) of cost-weighted misclassification losses. The mixture is specified by a weight function (or measure) that describes which misclassification cost weights are most emphasized by the proper scoring rule.
• Proper scoring rules permit Fisher scoring and Iteratively Reweighted LS algorithms for model fitting. The weights are derived from a link function and the above weight function.
• Proper scoring rules are in a 1-1 correspondence with information measures for tree-based classification.
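Propriety can be checked numerically: under a Bernoulli(p) outcome, the expected score of a proper scoring rule is minimized at the estimate q = p. A short sketch for log-loss and squared error (Brier) loss:

```python
import numpy as np

def log_loss(q, y):
    return -(y * np.log(q) + (1 - y) * np.log(1 - q))

def brier(q, y):
    return (y - q) ** 2

p = 0.3                                   # true P(class = 1)
qs = np.linspace(0.01, 0.99, 981)         # candidate probability estimates
minimizers = {}
for name, score in [("log", log_loss), ("brier", brier)]:
    # expected score under y ~ Bernoulli(p), as a function of the estimate q
    risk = p * score(qs, 1) + (1 - p) * score(qs, 0)
    minimizers[name] = qs[np.argmin(risk)]
```

Both expected-score curves bottom out at q = 0.3, the true probability; an improper rule (e.g. absolute error |y − q|) would instead be minimized at 0 or 1.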