Results 1  10
of
79
Strictly Proper Scoring Rules, Prediction, and Estimation
, 2007
"... Scoring rules assess the quality of probabilistic forecasts, by assigning a numerical score based on the predictive distribution and on the event or value that materializes. A scoring rule is proper if the forecaster maximizes the expected score for an observation drawn from the distribution F if he ..."
Abstract

Cited by 373 (28 self)
 Add to MetaCart
(Show Context)
Scoring rules assess the quality of probabilistic forecasts, by assigning a numerical score based on the predictive distribution and on the event or value that materializes. A scoring rule is proper if the forecaster maximizes the expected score for an observation drawn from the distribution F if he or she issues the probabilistic forecast F, rather than G ̸ = F. It is strictly proper if the maximum is unique. In prediction problems, proper scoring rules encourage the forecaster to make careful assessments and to be honest. In estimation problems, strictly proper scoring rules provide attractive loss and utility functions that can be tailored to the problem at hand. This article reviews and develops the theory of proper scoring rules on general probability spaces, and proposes and discusses examples thereof. Proper scoring rules derive from convex functions and relate to information measures, entropy functions, and Bregman divergences. In the case of categorical variables, we prove a rigorous version of the Savage representation. Examples of scoring rules for probabilistic forecasts in the form of predictive densities include the logarithmic, spherical, pseudospherical, and quadratic scores. The continuous ranked probability score applies to probabilistic forecasts that take the form of predictive cumulative distribution functions. It generalizes the absolute error and forms a special case of a new and very general type of score, the energy score. Like many other scoring rules, the energy score admits a kernel representation in terms of negative definite functions, with links to inequalities of Hoeffding type, in both univariate and multivariate settings. Proper scoring rules for quantile and interval forecasts are also discussed. We relate proper scoring rules to Bayes factors and to crossvalidation, and propose a novel form of crossvalidation known as randomfold crossvalidation. A case study on probabilistic weather forecasts in the North American Pacific Northwest illustrates the importance of propriety. We note optimum score approaches to point and quantile
Probabilistic forecasts, calibration and sharpness
 Journal of the Royal Statistical Society Series B
, 2007
"... Summary. Probabilistic forecasts of continuous variables take the form of predictive densities or predictive cumulative distribution functions. We propose a diagnostic approach to the evaluation of predictive performance that is based on the paradigm of maximizing the sharpness of the predictive dis ..."
Abstract

Cited by 116 (23 self)
 Add to MetaCart
Summary. Probabilistic forecasts of continuous variables take the form of predictive densities or predictive cumulative distribution functions. We propose a diagnostic approach to the evaluation of predictive performance that is based on the paradigm of maximizing the sharpness of the predictive distributions subject to calibration. Calibration refers to the statistical consistency between the distributional forecasts and the observations and is a joint property of the predictions and the events that materialize. Sharpness refers to the concentration of the predictive distributions and is a property of the forecasts only. A simple theoretical framework allows us to distinguish between probabilistic calibration, exceedance calibration and marginal calibration. We propose and study tools for checking calibration and sharpness, among them the probability integral transform histogram, marginal calibration plots, the sharpness diagram and proper scoring rules. The diagnostic approach is illustrated by an assessment and ranking of probabilistic forecasts of wind speed at the Stateline wind energy centre in the US Pacific Northwest. In combination with crossvalidation or in the time series context, our proposal provides very general, nonparametric alternatives to the use of information criteria for model diagnostics and model selection.
Probabilistic Quantitative Precipitation Forecasting Using Bayesian Model Averaging
, 2007
"... Bayesian model averaging (BMA) is a statistical way of postprocessing forecast ensembles to create predictive probability density functions (PDFs) for weather quantities. It represents the predictive PDF as a weighted average of PDFs centered on the individual biascorrected forecasts, where the wei ..."
Abstract

Cited by 57 (25 self)
 Add to MetaCart
Bayesian model averaging (BMA) is a statistical way of postprocessing forecast ensembles to create predictive probability density functions (PDFs) for weather quantities. It represents the predictive PDF as a weighted average of PDFs centered on the individual biascorrected forecasts, where the weights are posterior probabilities of the models generating the forecasts and reflect the forecasts ’ relative contributions to predictive skill over a training period. It was developed initially for quantities whose PDFs can be approximated by normal distributions, such as temperature and sea level pressure. BMA does not apply in its original form to precipitation, because the predictive PDF of precipitation is nonnormal in two major ways: it has a positive probability of being equal to zero, and it is skewed. In this study BMA is extended to probabilistic quantitative precipitation forecasting. The predictive PDF corresponding to one ensemble member is a mixture of a discrete component at zero and a gamma distribution. Unlike methods that predict the probability of exceeding a threshold, BMA gives a full probability distribution for future precipitation. The method was applied to daily 48h forecasts of 24h accumulated precipitation in the North American Pacific Northwest in 2003–04 using the University of Washington mesoscale ensemble. It yielded predictive distributions that were calibrated and sharp. It also gave probability of precipitation forecasts that were much better calibrated than those based on consensus voting of the ensemble members. It gave better estimates of the probability of highprecipitation events than logistic regression on the cube root of the ensemble mean.
Bayesian Modeling of Uncertainty in Ensembles of Climate Models
, 2008
"... Projections of future climate change caused by increasing greenhouse gases depend critically on numerical climate models coupling the ocean and atmosphere (GCMs). However, different models differ substantially in their projections, which raises the question of how the different models can best be co ..."
Abstract

Cited by 43 (7 self)
 Add to MetaCart
(Show Context)
Projections of future climate change caused by increasing greenhouse gases depend critically on numerical climate models coupling the ocean and atmosphere (GCMs). However, different models differ substantially in their projections, which raises the question of how the different models can best be combined into a probability distribution of future climate change. For this analysis, we have collected both current and future projected mean temperatures produced by nine climate models for 22 regions of the earth. We also have estimates of current mean temperatures from actual observations, together with standard errors, that can be used to calibrate the climate models. We propose a Bayesian analysis that allows us to combine the different climate models into a posterior distribution of future temperature increase, for each of the 22 regions, while allowing for the different climate models to have different variances. Two versions of the analysis are proposed, a univariate analysis in which each region is analyzed separately, and a multivariate analysis in which the 22 regions are combined into an overall statistical model. A crossvalidation approach is proposed to confirm the reasonableness of our Bayesian predictive distributions. The results of this analysis allow for a quantification of the uncertainty of climate model projections as a Bayesian posterior distribution, substantially extending previous approaches to uncertainty in climate models.
Calibrated probabilistic forecasting at the Stateline wind energy center: The regimeswitching spacetime (RST) method
 Journal of the American Statistical Association
, 2004
"... With the global proliferation of wind power, accurate shortterm forecasts of wind resources at wind energy sites are becoming paramount. Regimeswitching spacetime (RST) models merge meteorological and statistical expertise to obtain accurate and calibrated, fully probabilistic forecasts of wind s ..."
Abstract

Cited by 35 (14 self)
 Add to MetaCart
With the global proliferation of wind power, accurate shortterm forecasts of wind resources at wind energy sites are becoming paramount. Regimeswitching spacetime (RST) models merge meteorological and statistical expertise to obtain accurate and calibrated, fully probabilistic forecasts of wind speed and wind power. The model formulation is parsimonious, yet takes account of all the salient features of wind speed: alternating atmospheric regimes, temporal and spatial correlation, diurnal and seasonal nonstationarity, conditional heteroscedasticity, and nonGaussianity. The RST method identifies forecast regimes at the wind energy site and fits a conditional predictive model for each regime. Geographically dispersed meteorological observations in the vicinity of the wind farm are used as offsite predictors. The RST technique was applied to 2hour ahead forecasts of hourly average wind speed at the Stateline wind farm in the US Pacific Northwest. In July 2003, for instance, the RST forecasts had rootmeansquare error (RMSE) 28.6 % less than the persistence forecasts. For each month in the test period, the RST forecasts had lower RMSE than forecasts using stateoftheart vector time series techniques. The RST method provides probabilistic forecasts in the form of
Probabilistic forecasts of wind speed: ensemble model output statistics by using heteroscedastic censored regression,”
, 2009
"... ..."
(Show Context)
Combining spatial statistical and ensemble information in probabilistic weather forecasts
 Monthly Weather Review
, 2007
"... Forecast ensembles typically show a spreadskill relationship, but they are also often underdispersive, and therefore uncalibrated. Bayesian model averaging (BMA) is a statistical postprocessing method for forecast ensembles that generates calibrated probabilistic forecast products for weather quant ..."
Abstract

Cited by 12 (8 self)
 Add to MetaCart
Forecast ensembles typically show a spreadskill relationship, but they are also often underdispersive, and therefore uncalibrated. Bayesian model averaging (BMA) is a statistical postprocessing method for forecast ensembles that generates calibrated probabilistic forecast products for weather quantities at individual sites. This paper introduces the Spatial BMA technique, which combines BMA and the geostatistical output perturbation (GOP) method, and extends BMA to generate calibrated probabilistic forecasts of whole weather fields simultaneously, rather than just weather events at individual locations. At any site individually, Spatial BMA reduces to the original BMA technique. The Spatial BMA method provides statistical ensembles of weather field forecasts that take the spatial structure of observed fields into account and honor the flowdependent information contained in the dynamical ensemble. The members of the Spatial BMA ensemble are obtained by dressing the weather field forecasts from the dynamical ensemble with simulated spatially correlated error fields, in proportions that correspond to the BMA weights for the member models in the dynamical ensemble. Statistical ensembles of any size can be
Ensemble Model Output Statistics for Wind Vectors
 University of Heidelberg
, 2011
"... ar ..."
(Show Context)
TM (2007). “Comparison of EnsembleMOS Methods Using
 GFS Reforecasts.” Monthly Weather Review
"... Three recently proposed and promising methods for postprocessing ensemble forecasts based on their historical error characteristics (i.e., ensemblemodel output statistics methods) are compared using a multidecadal reforecast dataset. Logistic regressions and nonhomogeneous Gaussian regressions are ..."
Abstract

Cited by 11 (0 self)
 Add to MetaCart
Three recently proposed and promising methods for postprocessing ensemble forecasts based on their historical error characteristics (i.e., ensemblemodel output statistics methods) are compared using a multidecadal reforecast dataset. Logistic regressions and nonhomogeneous Gaussian regressions are generally preferred for daily temperature, and for mediumrange (6–10 and 8–14 day) temperature and precipitation forecasts. However, the better sharpness of mediumrange ensembledressing forecasts sometimes yields the best Brier scores even though their calibration is somewhat worse. Using the long (15 or 25 yr) training samples that are available with these reforecasts improves the accuracy and skill of these probabilistic forecasts to levels that are approximately equivalent to gains of 1 day of lead time, relative to using short