Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection. J. Econometrics, 75, 317--344.
....density; otherwise it still has a positive probability of being accepted. The acceptance probability is governed by the reversible jump algorithm. The algorithm of [6] for logspline density estimation is similar to algorithms for univariate regression using polynomial splines proposed by [2] and [11]. To make (pointwise) 95% credible intervals about the logspline density estimate obtained from this Bayesian procedure, we use as endpoints the 2.5th and 97.5th percentiles of all Markov chain Monte Carlo simulations. Credible intervals have a different interpretation from (frequentist) con ....
Smith, M. and Kohn, R. (1996), "Nonparametric regression using Bayesian variable selection," J. Econometrics, 75, 317-344.
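The percentile rule in the snippet above can be sketched as follows. This is a minimal illustration, not the cited authors' code: the function name `pointwise_credible_band` and the simulated `draws` array (a stand-in for real MCMC output, one row per simulation and one column per grid point) are assumptions made for the example.

```python
import numpy as np

def pointwise_credible_band(draws, level=0.95):
    """Pointwise credible band from MCMC draws of a curve.

    draws: array of shape (n_draws, n_grid); row j holds the j-th simulated
    curve (e.g. a density estimate) evaluated on a fixed grid.
    Returns the lower and upper percentile curves, each of length n_grid.
    """
    draws = np.asarray(draws, dtype=float)
    alpha = 1.0 - level
    lower = np.percentile(draws, 100.0 * alpha / 2.0, axis=0)
    upper = np.percentile(draws, 100.0 * (1.0 - alpha / 2.0), axis=0)
    return lower, upper

# Hypothetical stand-in for MCMC output: noisy copies of a normal density.
rng = np.random.default_rng(0)
grid = np.linspace(-3.0, 3.0, 7)
true_density = np.exp(-0.5 * grid**2) / np.sqrt(2.0 * np.pi)
draws = true_density * rng.lognormal(0.0, 0.1, size=(2000, grid.size))

lower, upper = pointwise_credible_band(draws)
```

The band is pointwise: each grid point is treated separately, so it makes no claim about simultaneous coverage of the whole curve.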
....are most appropriate for the problem at hand. Keywords: Bayesian model averaging, generalized ridge regression, prediction, regression splines, shrinkage. 1 Introduction Regression modelling using Bayesian methods has been the subject of many recent papers (e.g. George and McCulloch, 1993; Smith and Kohn, 1996; Denison, Mallick and Smith, 1998; Hoeting et al., 1999). The main aim of many of these papers has been to provide good predictions of the response variable at points x within some domain of interest X, given the data D. The data is typically of the form $D = \{y_i, x_i\}_{1}^{n}$ where $y_i$ is known ....
...., $\gamma_i = 1)$ $p(D \mid \gamma_{-i}, \gamma_i = 0)$ $p(D \mid \gamma_{-i}, \gamma_i = 1)$ analytically, where $\gamma_{-i}$ refers to all the components of $\gamma$ except the $i$th one. This approach has been applied successfully to simple linear regression (Hoeting et al., 1999) as well as nonlinear regression approaches such as spline fitting (Smith and Kohn, 1996) and wavelets (Clyde et al., 1998). Both Hoeting et al. (1999) and Smith and Kohn (1996) adopt the g-prior specification suggested by Zellner (1986), taking $p(\beta_\gamma, \sigma^2) = N(\beta_\gamma \mid 0, c\sigma^2 (X_\gamma' X_\gamma)^{-1})\,\mathrm{Gamma}(\sigma^{-2} \mid 0, 0)$ where the subscripts denote the design matrix made up of only the ....
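A g-prior makes the marginal likelihood of each subset model available in closed form, which is what allows the model indicators to be updated analytically. The sketch below assumes the standard setup beta | sigma^2 ~ N(0, c sigma^2 (X'X)^{-1}) with p(sigma^2) proportional to 1/sigma^2; whether this matches the exact prior of any cited paper is an assumption, and the function name `log_marginal_g_prior` and the simulated data are hypothetical.

```python
import numpy as np

def log_marginal_g_prior(y, X, c):
    """Log marginal likelihood of y for one subset model, up to a constant.

    Assumes beta | sigma^2 ~ N(0, c * sigma^2 * (X'X)^{-1}) and
    p(sigma^2) proportional to 1 / sigma^2. X holds only the columns
    included in the model and must have full column rank.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n, q = X.shape
    if q == 0:
        fitted_ss = 0.0
    else:
        beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
        fitted_ss = float(y @ (X @ beta_hat))  # y' X (X'X)^{-1} X' y
    # Residual-type quadratic form shrunk by the factor c / (1 + c).
    s = float(y @ y) - c / (1.0 + c) * fitted_ss
    return -0.5 * q * np.log1p(c) - 0.5 * n * np.log(s)

# Hypothetical data: only the first of two candidate predictors matters.
rng = np.random.default_rng(1)
n = 100
X = rng.normal(size=(n, 2))
y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=n)

with_relevant = log_marginal_g_prior(y, X[:, [0]], c=100.0)
with_irrelevant = log_marginal_g_prior(y, X[:, [1]], c=100.0)
```

Differences of these values between two models that differ in one indicator give the log odds needed for a one-at-a-time update, once the prior odds are added.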
[Article contains additional citation context not shown here]
Smith, M. and Kohn, R. (1996) Nonparametric regression using Bayesian variable selection. J. Econometrics, 75, 317-344.
....A Matlab version of the algorithm is available from the first author on request. A key feature of our approach is that we accommodate uncertainty in the number of spline basis functions that make up the MARS model. Related work in the literature for regression problems includes the papers of Smith and Kohn (1996), Holmes and Mallick (1998) and Andrieu, de Freitas and Doucet (2000). For classification, the use of smoothing splines with latent variables and probit link function is reported in Wood and Kohn (1998). The rest of the paper is as follows. In Section 2 we present the Bayesian formulation of MARS ....
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection, J. Econometrics 75: 317-344.
....simple variance components model, standard mixed model software such as PROC MIXED in SAS or lme() in S-PLUS can be called upon to obtain a fully automatic fit. Even in the additive model context, fully automatic smoothing parameter choice is quite rare. The Markov Chain Monte Carlo approaches of Smith and Kohn (1996), and Shively, Kohn and Wood (1999) produce automatic additive model fits, and the S-PLUS function step.gam() allows for some automation in smoothing spline-based additive models. However, the more common approach is to use simple rules such as three degrees of freedom per additive component ....
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection. J. Econometrics, 75, 317--344.
....to these algorithms. Regression splines can be fit by ordinary least squares once the knots have been selected, but knot selection requires sophisticated algorithms that can be computationally intensive; see, for example, Friedman and Silverman's (1989) TURBO, Friedman's (1991) MARS algorithm, and Smith and Kohn's (1996) Bayesian knot selector based on Gibbs sampling. In this paper, we combine features of smoothing splines and regression splines. Our models often have far fewer parameters than a smoothing spline, but unlike MARS and other approaches to regression splines, the location of the knots is not as ....
....than 40. The reason is that 80 knots allows the spline to track the rapid oscillations on the left, but only if a local penalty is used. Also, comparing the results in Figure 8 to the results in Wand (1997) for j = 6, the local penalty approach is somewhat better than the Bayesian method of Smith and Kohn (1996) and the stepwise selection method of Stone, Hansen, Kooperberg, and Truong (1997). We have also looked at moderate spatial variability (j = 4 or 5). There the local penalty estimator is better than the global penalty estimator, and again the local penalty estimator is as good as the Bayesian and ....
Smith, M., and Kohn, R. (1996), "Nonparametric regression using Bayesian variable selection," Journal of Econometrics, 75, 317--344.
....as measured by chlorophyll a. We use regression splines and product spline bases to allow for flexible nonlinear functions and interactions in these variables. Because of the large number of possible models, we use Stochastic Search Variable Selection (SSVS) (George and McCulloch 1993, 1997; Smith and Kohn 1996) to identify important models, and use Bayesian Model Averaging (BMA) (Raftery, Madigan, and Hoeting 1997; Clyde 1999) to incorporate uncertainty about which predictor variables should be incorporated into the model. Rather than selecting a single model to make predictions, as is common practice, ....
....of these predictor variables, and $\varepsilon$ is an $n \times 1$ vector of independent normal errors with mean 0 and variance $\sigma^2$. In general, the curse of dimensionality prevents us from fitting nonparametric models with high-order interactions. The model developed in this paper is an extension of the work of Smith and Kohn (1996), inspired by the MARS (multivariate adaptive regression splines) models of Friedman (1991), and allows flexible two-way interactions among the variables. Smith and Kohn presented a framework for semiparametric multiple regression using additive splines in a Bayesian hierarchical model for variable ....
[Article contains additional citation context not shown here]
Smith, M. and Kohn, R. (1996), "Nonparametric Regression Using Bayesian Variable Selection," Journal of Econometrics 75, 317-367.
....misspecification of the classification boundary introduced by decision stumps by integrating out the parameters that define them. This is known as Bayesian model averaging (Raftery, Madigan and Volinsky 1996) and it has been shown to improve predictive performance in many modelling situations (e.g. Smith and Kohn 1996, Denison, Mallick and Smith 1998a, Holmes, Denison and Mallick 1999). This is because averaging over all possible models minimises the posterior predictive squared error (Bernardo and Smith 1994), so it is a natural approach when prediction is of prime importance. In the next section we review ....
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection, J. Econometrics 75: 317-344.
....(1980) use this prior when considering model selection based on Bayes factors where a = 0 and c = c(n) is a deterministic function of sample size. These authors were motivated by a calibration between Bayes factors and penalized selection criteria in the form of BIC and AIC (see also Smith, 1996; and Smith and Kohn, 1996). Finally, Peterson (1986) builds on the work of Smith and Spiegelhalter (1980) by first choosing $\Sigma = (X^t X)^{-1}$ and then suggesting that c be estimated via (marginal) maximum likelihood based on the same mixture (31). This is essentially Rissanen's (1989) ....
....use this prior when considering model selection based on Bayes factors where a = 0 and c = c(n) is a deterministic function of sample size. These authors were motivated by a calibration between Bayes factors and penalized selection criteria in the form of BIC and AIC (see also Smith, 1996; and Smith and Kohn, 1996). Finally, Peterson (1986) builds on the work of Smith and Spiegelhalter (1980) by first choosing $\Sigma = (X^t X)^{-1}$ and then suggesting that c be estimated via (marginal) maximum likelihood based on the same mixture (31). This is essentially Rissanen's (1989) prescription. Throughout ....
[Article contains additional citation context not shown here]
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection. J. Econometrics, 75, 317--344.
....(1980) use this prior when considering model selection based on Bayes factors where a = 0 and c = c(n) is a deterministic function of sample size. These authors were motivated by a calibration between Bayes factors and penalized selection criteria in the form of BIC and AIC (see also Smith, 1996; and Smith and Kohn, 1996). Finally, Peterson (1986) builds on the work of Smith and Spiegelhalter (1980) by first choosing $\Sigma = (X^t X)^{-1}$ and then suggesting that c be estimated via (marginal) maximum likelihood based on the same mixture (31). This is essentially Rissanen's (1989) prescription. ....
....use this prior when considering model selection based on Bayes factors where a = 0 and c = c(n) is a deterministic function of sample size. These authors were motivated by a calibration between Bayes factors and penalized selection criteria in the form of BIC and AIC (see also Smith, 1996; and Smith and Kohn, 1996). Finally, Peterson (1986) builds on the work of Smith and Spiegelhalter (1980) by first choosing $\Sigma = (X^t X)^{-1}$ and then suggesting that c be estimated via (marginal) maximum likelihood based on the same mixture (31). This is essentially Rissanen's (1989) prescription. Throughout our development ....
[Article contains additional citation context not shown here]
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection. J. Econometrics, 75, 317--344.
....respectively interpreted as the expected signal to noise ratios and the expected number of radial basis (the parameters may not be uniquely estimated from the data; see Equation (2)). The prior for the coefficients has been previously advocated by various authors (George and Foster 1997, Smith and Kohn 1996). It corresponds to the popular g-prior distribution (Zellner 1986) and can be derived using a maximum entropy approach (Andrieu 1998). An important property of this prior is that it penalises for basis functions being too close as, in this situation, the determinant of $\Sigma_i^{-1}$ tends to zero. We ....
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection, Journal of Econometrics 75(2): 317-343.
....available or even nonexistent for more complex models. Regression splines can be fit by ordinary least squares once the knots have been selected, but knot selection requires sophisticated algorithms that can be computationally intensive; see, for example, Friedman's (1991) MARS algorithm and Smith and Kohn's (1995) Bayesian knot selector based on Gibbs sampling. In this paper, we combine features of smoothing splines and regression splines. Our models have far fewer parameters than a smoothing spline, but unlike MARS and other approaches to regression splines, the location of the knots is not crucial since ....
Smith, M., and Kohn, R. (1995), "Nonparametric regression using Bayesian Variable Selection," Preprint.
....is taken into consideration and the estimation of the regression splines in each iteration of the algorithm is based on different knot settings. The final estimator then is built as the mean of the estimators in each iteration, resulting in a great flexibility of the estimated spline function. Smith and Kohn (1996) proposed a Bayesian approach for univariate curve fitting and additive models with normal response using Gibbs sampling. In each iteration of their algorithm significant knots are chosen from a set of candidate knots by Bayesian variable selection. A Bayesian approach for univariate curve fitting ....
Smith, M. and Kohn, R. (1996). Nonparametric Regression using Bayesian Variable Selection, Journal of Econometrics 75: 317-343.
....data which makes the method highly resilient to irrelevant predictors. By examining the number of times each covariate enters into the model we gain a global measure of their relevance. Other Bayesian approaches to nonlinear regression using random splines have been considered by, amongst others, Smith and Kohn (1996), Denison, Mallick and Smith (1998a) and Holmes and Mallick (1998, 2000). Both Smith and Kohn (1996) and Denison et al. (1998a) restrict their attention to univariate splines. Denison, Mallick and Smith (1998b) provides a multivariate extension. Smith and Kohn (1996) adopt binary indicator ....
....times each covariate enters into the model we gain a global measure of their relevance. Other Bayesian approaches to nonlinear regression using random splines have been considered by, amongst others, Smith and Kohn (1996), Denison, Mallick and Smith (1998a) and Holmes and Mallick (1998, 2000). Both Smith and Kohn (1996) and Denison et al. (1998a) restrict their attention to univariate splines. Denison, Mallick and Smith (1998b) provides a multivariate extension. Smith and Kohn (1996) adopt binary indicator variables on the set of possible spline knot locations, which requires the specification of the full model ....
[Article contains additional citation context not shown here]
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection, J. Econometrics 75: 317-344.
....thereby allowing the model greater flexibility in those parts. In a Bayesian framework this spatial adaption is introduced by assuming a joint prior distribution on both the number and location of knots as well as on the regression coefficients. Examples of this approach include the papers of Smith & Kohn (1996), Denison et al. (1998), Holmes & Denison (1999) and ) who all consider Bayesian nonparametric basis models for Gaussian data. All of these papers consider Gaussian response data where conjugate priors exist for $\beta$, which leaves the posterior distribution in standard form and hence an analytic ....
Smith, M. & Kohn, R. (1996). Nonparametric regression using Bayesian variable selection. J. Econometrics, 75, 317--344.
....Also, in the original study it was hypothesised that when considering white blood cell count patients might be effectively dichotomised into two groups determined by a cut point of CD45 at 20. Recent work on Bayesian unconstrained curve fitting with random structure can be found in the papers of Smith and Kohn (1996), Chipman et al. (1998) and Denison et al. (1998). Work on constrained curve fitting by non-Bayesian methods has been reported in Ramsey et al. (1984) and Friedman and Tibshirani (1984) as well as Schell and Singh (1997). In Section 2 we introduce the Bayesian generalised monotonic regression method ....
Smith, M. and Kohn, R. (1996) Nonparametric regression using Bayesian variable selection. J. Econometrics, 75, 2, 317-343.
.... is to approximate $p(X \mid Y)$ with samples drawn from $p(X \mid Y)$. This can be achieved using Markov chain Monte Carlo (MCMC) methods whereby a Markov chain is simulated whose steady state distribution matches the density of interest, i.e. $p(X \mid Y)$. A number of papers have utilized this approach (Smith and Kohn, 1996; Denison et al., 1998; Holmes and Mallick, 1998; Clyde et al., 1998). However, a problem with all of these methods is that the convergence of the Markov chain cannot be assured. This paper aims to address this issue for the important and special case of orthogonal models. As a further motivation ....
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection. Journal of Econometrics. 75, pp. 317-344.
....choice of $\Sigma$, Smith and Spiegelhalter (1980) consider model selection based on Bayes factors where c = c(n) is a deterministic function of sample size. These authors were motivated by a calibration between Bayes factors and penalized selection criteria in the form of BIC and AIC (see also Smith, 1996; and Smith and Kohn, 1996). Zellner (1986) also discusses this prior specification, christening it the g-prior, and derives expressions for various posterior quantities (although Zellner's discussion does not specifically touch on model selection). Finally, Peterson (1986) builds on the work ....
....Smith and Spiegelhalter (1980) consider model selection based on Bayes factors where c = c(n) is a deterministic function of sample size. These authors were motivated by a calibration between Bayes factors and penalized selection criteria in the form of BIC and AIC (see also Smith, 1996; and Smith and Kohn, 1996). Zellner (1986) also discusses this prior specification, christening it the g-prior, and derives expressions for various posterior quantities (although Zellner's discussion does not specifically touch on model selection). Finally, Peterson (1986) builds on the work of Smith and Spiegelhalter ....
[Article contains additional citation context not shown here]
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection. J. Econometrics, 75, 317--344.
....models without any inclusion of unstructured or spatial random effects, or Hastie and Tibshirani (1998) who derive the Gibbs sampler as a Bayesian version of backfitting. Bayesian basis function approaches with regression splines or a more general class of piecewise polynomials are proposed by Smith and Kohn (1996) and Denison et al. (1998), again without random effects. For fundamentally non-Gaussian models, more general MCMC techniques than Gibbs sampling are needed, and there is still a lack of methods and of practical experience with existing suggestions. Hastie and Tibshirani (1998) sketch a ....
Smith, M., Kohn, R. (1996) Nonparametric regression using Bayesian variable selection. Journal of Econometrics, 75, 317-343.
....defined as its minimizer and (ii) a knot deletion algorithm which attempts to locate this minimizer. Various non-MDL-based regression spline smoothing procedures have been proposed in the literature. They are chiefly based on cross validation or Bayesian approaches: Friedman & Silverman (1989), Smith & Kohn (1996), Luo & Wahba (1997) and Denison, Mallick & Smith (1998). Notice that most of these procedures fix the order of the spline a priori. 2 Nonparametric Regression as Model Selection Suppose that n pairs of measurements $\{x_i, y_i\}_{i=1}^{n}$, $y_i = f(x_i) + \epsilon_i$, $\epsilon_i \sim \text{iid } N(0, \sigma^2)$, are ....
.... ) Such a knot deletion strategy is often called the greedy strategy (e.g. see Hastie 1989). The knot deletion algorithm continues until all initial knots are removed. One typical strategy for placing initial knots is to place a knot at every s sorted values of the $x_i$'s. As mentioned in Smith & Kohn (1996), this initial knot placement strategy permits the initial knots to follow the density of the $x_i$'s. In our implementation s is taken as between three and five. However, one referee pointed out that this strategy often fails in capturing high frequency signals, such as the left tail of ....
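The initial knot placement described in the snippet above can be sketched as follows. This is one plausible reading, taking every s-th order statistic of x as a knot; the function name `initial_knots` and the choice to start counting at the s-th sorted value (so the largest observation may itself become a knot) are assumptions, not details given in the source.

```python
import numpy as np

def initial_knots(x, s=4):
    """Place a candidate knot at every s-th sorted value of x.

    Because knots are taken from the order statistics, regions where the
    x values are dense receive more knots than sparse regions.
    """
    xs = np.sort(np.asarray(x, dtype=float))
    return xs[s - 1::s]

# Hypothetical example: twelve design points, a knot at every fourth one.
knots = initial_knots([7, 3, 12, 1, 9, 5, 2, 11, 4, 8, 6, 10], s=4)
```

With the source's suggested range of three to five for s, a sample of n points yields roughly n/5 to n/3 initial knots.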
[Article contains additional citation context not shown here]
Smith, M. & Kohn, R. (1996), `Nonparametric regression using Bayesian variable selection', J.
....unify the two procedures into one. With the recent advances in computing technology, such methods are computationally feasible. A recent paper by Hoeting, Raftery, and Madigan (1995) proposes a simultaneous approach to variable and transformation selection based on change point transformations. Smith and Kohn (1996) propose a nonparametric approach to Bayesian variable selection automatically selecting independent variables and a power transformation for the response. In this article, we address variable selection and transformation selection from a predictive Bayesian viewpoint. We propose two methods for ....
Smith, M. and Kohn, R. (1996), "Nonparametric Regression using Bayesian Variable Selection," Journal of Econometrics, 75, 317-367.
.... 2 2 k k I k ( k ) IG 2 k ; 0 ; 0 (16) p ( kj ) c k k IK (k) 17) p 2 , IG 2 ; 2 ; 2 (18) p ( Ga ( 19) p 2 , IG 2 ; 2 ; 2 : 20) Similar to what was done for linear regression models before in e.g. [9, 22], the matrix k is here set to k = ak = X T k X k ) 1 if k = a k , or k = k = 2 k if k = k . For this choice of ak the prior distribution p a k ; 2 k k; 2 corresponds to the popular g-prior distribution (see [25] for a motivation). It can also be ....
M. Smith and R. Kohn. Nonparametric regression using Bayesian variable selection. Journal of Econometrics, 75(2):317-343, 1996.
....by penalized least squares estimation. Recently, there have appeared Bayesian methods that yield weighted averages of (essentially) least squares estimates. The averages are over the sets of possible knots, with a set's weight given by the posterior probability that the set is the true set (Smith & Kohn, 1996; Denison, Mallick, & Smith, 1998). In this paper we use a penalty approach similar to smoothing splines but with fewer knots. We allow K in (2) to be large but typically far less than n. Unlike knot selection techniques, we retain all candidate knots. As with smoothing splines, a roughness ....
....better than 40. The reason is that 80 knots allows the spline to track the rapid oscillations on the left, but only if a local penalty is used. Also, comparing the results in Figure 1 to the results in Wand (1997) for j = 6, the local penalty approach is somewhat better than the Bayesian method of Smith & Kohn (1996) and the stepwise selection method of Stone, Hansen, Kooperberg, & Truong (1997). However, Wand's simulations used code provided by Smith that had 35 knots hard-wired into it (Wand, personal communication). With more knots, the Smith and Kohn method could very well be competitive with the ....
Smith, M., & Kohn, R. (1996), "Nonparametric regression using Bayesian variable selection," Journal of Econometrics, 75, 317--344.
....regression models could also be considered to be a special case of the regression splines developed by Kooperberg et al. (1995). They use a linear spline approach to estimate the conditional hazard function of censored response data with one or more covariates. Our approach is also similar to Smith and Kohn's (1996) work on nonparametric regression. These authors suggest a Bayesian approach to select regression spline knots, variables and Box-Cox transformations of the response variable. None of these authors account for model uncertainty in their work, but our approach could be generalized to account for ....
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection. Journal of Econometrics, 75:317--343.
....involving Bayesian extensions to classical models. The general approach has been to embed classical nonparametric models within a Bayesian probabilistic framework. Examples of this type of approach include Chipman et al. (1998), Denison et al. (1998a,b, 1999), Holmes and Mallick (1998a) and Smith and Kohn (1996). These methods have been shown to have significant predictive power when the sample of generated models are averaged. This is partly due to the unavoidable model misspecification being countered by the model averaging (Hoeting et al. 1998). Bayesian model averaging also has the effect of ....
....priors to $p(\theta \mid T)$ and $p(T)$ and use the equality $p(\theta, T) = p(\theta \mid T)\,p(T)$. For the regression case with linear models within each region we choose independent normal prior distributions over each $\beta_i$ conditional on the noise variance, which is allocated an inverse gamma prior (p. 442, Bernardo and Smith, 1996), so $p(\beta_i \mid T, \sigma^2) = \mathrm{MVN}(\beta_i \mid 0, \sigma^2 V_i^{-1})$, $i = 1, \dots, M$ (6), $p(\sigma^{-2}) = \mathrm{Ga}(\sigma^{-2} \mid \gamma_1/2, \gamma_2/2)$, where $V_i$ is the prior precision matrix of $\beta_i$ and $\gamma_1, \gamma_2$ are hyperparameters. We then find from (5) that the marginal likelihood of the data given T is ....
[Article contains additional citation context not shown here]
Smith, M. and Kohn, R. (1996) Nonparametric regression using Bayesian variable selection.
....of the classification boundary introduced by the weak classifier by integrating out the parameters that define it. This is known as Bayesian model averaging (Raftery, Madigan and Volinsky 1996) and it has been shown to improve predictive performance in many modelling situations (e.g. Smith and Kohn 1996, Denison, Mallick and Smith 1998a, Holmes, Denison and Mallick 1999). In the next section we review boosting in more detail and in Section 3 we demonstrate how Bayesian methods relate to the boosting approaches given in Section 2. Section 4 contains comparisons of the proposed algorithm with ....
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection, J. Econometrics 75: 317--344.
....of outliers w is assumed to be small and will be estimated, while the variance inflation factor c is fixed a priori; see below for more comments on this. Model (4) is often known as the inflated variance model for outliers; see Titterington, Smith & Makov (1985, Ch. 4) and references given therein. Smith & Kohn (1996) also adopted a similar approach to make their Bayesian regression procedure robust. Suppose that B, b, m and $\{k_j\}_{j=1}^{B}$ are specified and that a legitimate guess of which $y_i$'s are contaminated by outliers is made. If there are $n_{\text{out}}$ of these suspected outliers, then w can be ....
....3. DMS: the Bayesian curve fitting procedure of Denison, Mallick & Smith (1998) with = 5; 4. sure: the SureShrink procedure of Donoho & Johnstone (1995); 5. Bayes: the BayesShrink procedure of Abramovich, Sapatinas & Silverman (1998); 6. SK: the Bayesian regression spline smoothing procedure of Smith & Kohn (1996); 7. RSW: local linear regression with the direct bandwidth plug-in choice of Ruppert, Sheather & Wand (1995); and 8. HST: nearest neighbour local polynomial estimator LOESS (Cleveland & Devlin 1988) with the $\mathrm{AIC}_c$ choice of smoothing parameter proposed by Hurvich, Simonoff & Tsai (1998) ....
Smith, M. & Kohn, R. (1996), `Nonparametric regression using Bayesian variable selection', J. Econometrics 75, 317--344.
....by penalized least squares estimation. Recently, there have appeared Bayesian methods that yield weighted averages of (essentially) least squares estimates. The averages are over the sets of possible knots, with a set's weight given by the posterior probability that the set is the true set (Smith and Kohn, 1996; Denison, Mallick, and Smith, 1998). In this paper we use a penalty approach similar to smoothing splines but with fewer knots. We allow K in (2) to be large but typically far less than n. Unlike knot selection techniques, we retain all candidate knots. As with smoothing splines, a roughness ....
....better than 40. The reason is that 80 knots allows the spline to track the rapid oscillations on the left, but only if a local penalty is used. Also, comparing the results in Figure 1 to the results in Wand (1997) for j = 6, the local penalty approach is somewhat better than the Bayesian method of Smith and Kohn (1996) and the stepwise selection method of Stone, Hansen, Kooperberg, and Truong (1997). However, Wand's simulations used code provided by Smith that had 35 knots hard-wired into it (Wand, personal communication). With more knots, the Smith and Kohn method could very well be competitive with the ....
Smith, M., and Kohn, R. (1996), "Nonparametric regression using Bayesian variable selection," Journal of Econometrics, 75, 317--344.
....effective, but a clearer understanding of their empirical and theoretical properties in this context would be valuable. Our approach to selecting a spline basis has been fairly crude: we have only allowed equally spaced breakpoints. The work of Stone, Hansen, Kooperberg and Truong (1997) and Smith and Kohn (1996) suggests possibilities for variable knot selection in our multiple curve problem, but since their methods only deal with a single curve, extending their work appropriately would require substantial further development. In our analyses we have explicitly decomposed the random trajectories into a ....
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection, Journal of Econometrics 75: 317--343.
....to determine which basis functions should be included in (8). If the basis has too many functions, estimates for $\beta_i$ will have high variability or may even produce interpolation in the extreme case. Conversely, if too few functions are in the basis, the resulting estimator will be severely biased. Smith and Kohn (1996) propose to approximate f(x) by cubic regression splines using the truncated power series basis $f(x) \approx \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \alpha_3 x^3 + \sum_{k=1}^{m} \beta_k (x - \xi_k)_+^3$, where $\xi_1, \dots, \xi_m$ is a large number of potential knots placed along the domain of the independent variable x, ....
....variable Y is, at least approximately, Gaussian. A well-known class of such transformations are Box-Cox type transformations. Given the transformation, regression techniques for Gaussian responses are applied. A Bayesian approach for a data-driven choice of transformations is suggested in Smith and Kohn (1996). This section deals with the second approach, where the distribution of Y is directly modelled by a non-Gaussian distribution. We distinguish between so-called conditionally Gaussian models, where the density of Y is Gaussian conditional upon a further variable, often a mixture variable, and ....
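A cubic truncated power basis of the kind described in the snippet above can be built and fitted by ordinary least squares once a knot set is fixed. The sketch below is illustrative only: the function name `truncated_power_basis`, the knot locations and the test curve are hypothetical, and the Bayesian step that chooses which knot terms to keep is not shown.

```python
import numpy as np

def truncated_power_basis(x, knots):
    """Design matrix with columns 1, x, x^2, x^3, (x - knot_k)_+^3."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    poly = np.vander(x, 4, increasing=True)  # 1, x, x^2, x^3
    # (x - knot)_+ is zero to the left of the knot, x - knot to the right.
    trunc = np.clip(x[:, None] - knots[None, :], 0.0, None) ** 3
    return np.hstack([poly, trunc])

# Hypothetical example: a noiseless curve that lies in the span of the basis.
x = np.linspace(0.0, 1.0, 50)
knots = np.array([0.25, 0.5, 0.75])
y = 1.0 + 2.0 * x - x**3 + 5.0 * np.clip(x - 0.5, 0.0, None) ** 3

B = truncated_power_basis(x, knots)
coef, *_ = np.linalg.lstsq(B, y, rcond=None)  # OLS fit for fixed knots
```

Each knot contributes one column, so dropping a knot is the same as dropping a regressor; this is what lets knot selection be treated as a variable selection problem.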
SMITH, M., KOHN, R. (1996): Nonparametric Regression using Bayesian variable selection. Journal of Econometrics 75, 317-343.
....with variance $\sigma_i^2 \Sigma_i$. The terms $\delta^2 \in (\mathbb{R}^+)^c$ and $\Lambda \in \mathbb{R}^+$ can be respectively interpreted as the expected signal to noise ratios and the expected number of radial basis. The prior for the coefficients has been previously advocated by various authors (George and Foster 1997, Smith and Kohn 1996). It corresponds to the popular g-prior distribution (Zellner 1986) and can be derived using a maximum entropy approach (Andrieu 1998). An important property of this prior is that it penalises for basis functions being too close as, in this situation, the determinant of $\Sigma_i^{-1}$ tends to ....
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection, Journal of Econometrics 75: 317--344.
....regression model that is estimated using a Gibbs sampling scheme. The Gibbs sampler is an estimation procedure similar in scope to maximum likelihood, with introductions given in Gelfand and Smith (1990) and Casella and George (1992). The hierarchical model in this paper was proposed in Smith and Kohn (1996) and improves and extends the work on such models by Mitchell and Beauchamp (1988) and George and McCulloch (1993). It makes it feasible to identify significant regressors from a large number of such variables, whereas this was not practical with previous approaches. Furthermore, previous ....
....Bayesian approach to semiparametric regression. It sets out the Bayesian hierarchical model, discusses the prior assumptions and briefly discusses estimation using the Gibbs sampler. However, for details on the implementation and further discussion of the sampling scheme the reader is referred to Smith and Kohn (1996; 1997). Section 4 reports the results of the application to the Australian print advertising data, while section 5 summarizes the implications of the methodology for marketing researchers. The appendix lists the variables used in the study. 2 Modeling the Starch print advertising data 2.1 ....
[Article contains additional citation context not shown here]
Smith, Michael and Kohn, Robert (1996), "Nonparametric regression using Bayesian variable selection," Journal of Econometrics, 75, 317-344.
....regression model that is estimated using a Gibbs sampling scheme. The Gibbs sampler is an estimation procedure similar in scope to maximum likelihood, with introductions given in Gelfand and Smith (1990) and Casella and George (1992). The hierarchical model in this paper was proposed in Smith and Kohn (1996) and improves and extends the work on such models by Mitchell and Beauchamp (1988) and George and McCulloch (1993). It makes it feasible to identify significant regressors from a large number of such variables, whereas this was not practical with previous approaches. Furthermore, previous ....
....Bayesian approach to semiparametric regression. It sets out the Bayesian hierarchical model, discusses the prior assumptions and briefly discusses estimation using the Gibbs sampler. However, for details on the implementation and further discussion of the sampling scheme the reader is referred to Smith and Kohn (1996; 1997). Section 4 reports the results of the application to the Australian print advertising data, while section 5 summarizes the implications of the methodology for marketing researchers. The appendix lists all the variables used in the study. 2 Modeling the Starch print advertising data 2.1 ....
[Article contains additional citation context not shown here]
Smith, Michael and Kohn, Robert (1996), "Nonparametric regression using Bayesian variable selection," Journal of Econometrics, 75, 317-344.
....knowledge there are currently no other approaches that simultaneously select variables, fit certain regressors nonparametrically, and select a suitable transformation for the data. e) The regression methodology can be made robust to outliers by modeling the errors as a mixture of normals as in Smith and Kohn (1996). The robustness properties of the resulting estimator are studied in detail by Smith, Sheather and Kohn (1995) and are shown to compare favorably to other robust estimators for linear regression. f) The methodology allows the researcher (and the manager) to introduce into the analysis prior ....
....by Mitchell and Beauchamp (1988) and George and McCulloch (1993). It makes it feasible to carry out variable selection with a large number of variables, whereas this was not practical with previous approaches. Furthermore, previous approaches did not consider data transformations nor robustness. Smith and Kohn (1996) explain the technical details of the approach and compare it to the work of George and McCulloch (1993). A good survey of current Bayesian approaches to variable selection is given by George and McCulloch (1995). We illustrate the methodology using Starch print advertising data from an Australian ....
[Article contains additional citation context not shown here]
Smith, Michael and Kohn, Robert (1996), "Nonparametric regression using Bayesian variable selection," Journal of Econometrics, in press.
....involving Bayesian extensions to classical models. The general approach has been to embed classical nonparametric models within a Bayesian probabilistic framework. Examples of this type of approach include Chipman et al. (1998), Denison et al. (1998a,b, 1999), Holmes and Mallick (1998a) and Smith and Kohn (1996). These methods have been shown to have significant predictive power when the sample of generated models are averaged. This is partly due to the unavoidable model misspecification being countered by the model averaging (Hoeting et al. 1998). Bayesian model averaging also has the effect of ....
....to p( jT ) and p(T ) and use the equality p( T ) p( jT )p(T ) For the regression case with linear models within each region we choose independent normal prior distributions over each i conditional on the noise variance which is allocated an inverse gamma prior (p. 442, Bernardo and Smith, 1996) so p( i jT ; oe 2 ) MVN(fi i j0; oe 2 Gamma1 i ) i = 1; M (6) p(oe Gamma2 ) Ga(oe Gamma2 j 1 2 fl 1 ; 1 2 fl 2 ) where i is the prior precision matrix of fi i and fl 1 ; fl 2 are hyperparameters. We then find from (5) that the marginal likelihood of the data ....
[Article contains additional citation context not shown here]
Smith, M. and Kohn, R. (1996) Nonparametric regression using Bayesian variable selection.
....unknown probability vector P and, in particular, eg A M depends on h A M the quantity that we are aiming to estimate in the first place. Therefore, to make this approach workable in practice some initial estimate for P is required. We decided to use the Bayesian regression spline smoother of Smith and Kohn (1996) because of its very good simulation performance using default values of the tuning parameters. A full description of the algorithm is : 1. Find P init initial estimate for P , by applying a Bayesian regression spline smoother to (x 1 ; P 1 ) x k ; P k ) 2. Set up a partition A = ....
Smith, M. and Kohn, R. (1996) Nonparametric regression using Bayesian variable selection, J. Econometrics, 75, 317--344.
....4.2 as inputs to a clustering algorithm also based on the MDL principle. The material in this section draws from a number of sources on MDL (see Rissanen, 1989, Speed and Yu, 1993, and Barron et al., 1998) as well as the current literature on Bayesian variable selection (see George and Foster 1998, Smith and Kohn 1996, and Wong, Hansen, Kohn and Smith 1997). By considering normal linear regression, we hope to highlight both the derivation and application of the MDL principle to model selection. 4.1.1 Several Forms of MDL for Model Selection Most of our readers will be familiar with two stage MDL or the use of ....
.... S = X 0 X) Gamma1 , so that S c = y 0 y cRSS 1 c ; and jI n Thetan cXSX 0 j = 1 c) k : Hence, the mixture form of MDL with Zellner s g prior for b is given by Gamma n Gamma k 2 log(1 c) n 2 log(y 0 y cRSS) 25) Rather than optimizing for a as we have done, Smith and Kohn (1996) began with an improper prior for t and obtain (25) as the (negative) log posterior of a model index g in the context of Bayesian variable selection for the normal linear model (see also George and Foster, 1998) While we arrive at the same expression, improper priors do not make sense from a ....
[Article contains additional citation context not shown here]
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection. Journal of Econometrics, 75, 317--344.
....to Markov chain Monte Carlo simulation methods to summarise features of interest, such as predictive distributions and marginal model orders. The use of variable dimension samplers (Green, 1995; George & McCulloch, 1993) are well recorded and have become increasingly popular in recent years (Smith & Kohn, 1996; Denison et al. 1998; Holmes & Mallick, 1998a; Holmes & Mallick, 1998b). We utilise a hybrid method that embeds a varying dimension sampler into a scheme that iterates through the following three proposals steps: • Sample p(X | Y, Σ) using a Metropolis-Hastings type kernel. • Draw β ....
....important task when fitting spline models is in determining the number and location of the knots. In a Bayesian framework the knots are treated as random and assigned prior probability densities. Hence X and β are varying both in value and dimension. For examples of Bayesian spline models, see Smith & Kohn (1996), Denison et al. (1998) and Holmes & Mallick (1998b). In this Section we are interested in fitting multiple curves where correlation is suspected in the response. Multiple families of curves exist in many situations, see, for example, Ramsay & Silverman (1997). As an example, we generated 10 ....
Smith, M. & Kohn, R. (1996). Nonparametric regression using Bayesian variable selection.
....: t m . Fitting too many knots can lead to approximations with a high degree of variability whereas too few can lead to oversmoothing. This task has the form of a variable selection problem in the set of possible knot points, which are commonly restricted to lie at the data points. Previously, Smith and Kohn (1996) used the Gibbs sampler and an indicator variable approach whereas Holmes and Mallick (1998) used a reversible jump sampler to select knot points. However, due to the high dimensionality of the problem and the high correlation between predictors these samplers can show extremely poor convergence ....
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection. Journal of Econometrics. 75, pp. 317-344.
....for the errors and the transformation of the dependent variable. Furthermore, the approach in this paper can be made robust to outliers and can accommodate missing values of the dependent variable as in Barnett, Kohn and Sheather (1996). This paper links two lines of research. The first is by Smith and Kohn (1996) who combine regression splines with Bayesian variable selection to nonparametrically estimate an additive regression model with independent errors. They show that in the univariate case their approach acts as a variable bandwidth smoother and compares favorably with modern kernel weighted local ....
....the two most important problems in approximating f(x) by (2) is how many knots to use and where to place them. If too few knots are used, or they are badly placed, then important features of the curve may be missed. If too many knots are used then the estimate of f will have high local variance. Smith and Kohn (1996) solve this problem in the independent error case by introducing many knots from which significant knots are selected. We show how to extend their approach to the autocorrelated error case by rewriting (1) as a linear model. Let $r = m + 3$, $\beta = (b_0, \ldots, b_r)'$, $x = (x_1, \ldots, x_n)'$ ....
Smith, M., and Kohn, R., (1996), `Nonparametric Regression using Bayesian Variable Selection,' J. Econometrics, in press.
....The solution advocated in this paper is to use a basis with many terms and adopt a hierarchical Bayesian model in which terms are allowed to be in or out of the regression. It is this procedure that makes the estimator nonparametric, rather than a fit of a linear combination of parametric terms. Smith and Kohn (1996) propose a Bayesian approach to univariate nonparametric regression using a univariate regression spline basis containing a large number of elements. They consider two estimators of the regression function. The first estimator is based on the subset of basis functions with the highest posterior ....
....two estimators of the regression function. The first estimator is based on the subset of basis functions with the highest posterior probability. The second is an estimator of the posterior mean of the regression curve, averaging over the distribution of all possible subsets of the basis functions. Smith and Kohn (1996) show that this approach works well on both real and simulated data, comparing favorably with a kernel based local linear approach using a direct plug in estimator for the bandwidth. They also extend their analysis to additive and robust nonparametric regression. This paper extends the approach of ....
[Article contains additional citation context not shown here]
Smith, M., and Kohn R., (1996), "Nonparametric regression using Bayesian variable selection," J. Econometrics, vol. 75, no. 2, 317-344
Smith, M., Sheather, S.J., and Kohn, R. (1996), "Finite sample performance of robust Bayesian regression," J.
....function is unknown then it is often more appropriate to estimate it nonparametrically. In nonparametric regression a robust estimator attempts to protect the user locally from outlying observations. The Bayesian approach to robust nonparametric regression considered in this paper is that of Smith and Kohn (1996). They model the regression function using cubic regression splines having many knots, and choose the significant knots by Bayesian variable selection. As in the linear case, the errors are modeled as a mixture of normals. This is a model based approach to robust nonparametric regression which is ....
....line in the middle of the data is not the same as on the boundaries. We note that there are no high leverage outliers in this example. This example is important because unknown nonlinear regression functions are often estimated by regression splines, e.g. Hastie and Tibshirani (1990, Ch. 9) and Smith and Kohn (1996). In particular, regression splines are used for robust nonparametric function estimation in section 4.1. • Case 3: High leverage linear regression. The independent variable x ∼ Uniform(0, 0.4) and clean observations are generated from the model y = 1 2x e, where e ∼ N(0, σ²). High ....
[Article contains additional citation context not shown here]
Smith, M. and Kohn, R. (1996), `Nonparametric regression using Bayesian variable selection', Journal of Econometrics, in press.
....for the errors and the transformation of the dependent variable. Furthermore, the approach in this paper can be made robust to outliers and can accommodate missing values of the dependent variable as in Barnett, Kohn and Sheather (1994). This paper links two lines of research. The first is by Smith and Kohn (1994) who combine regression splines with Bayesian variable selection to nonparametrically estimate an additive regression model with independent errors. They show that in the univariate case their approach acts as a variable bandwidth smoother and compares favorably with modern kernel weighted local ....
....local linear smoothing. The second line of research is by Barnett et al. (1995), who propose a Bayesian approach for robustly estimating an autoregressive model, simultaneously choosing the order of the model and estimating its parameters and any missing observations. We note that the work of Smith and Kohn (1994) is motivated by the Bayesian variable selection paper of George and McCulloch (1993), while Barnett et al. (1995) refine and extend the work of McCulloch and Tsay (1994). The paper is organized as follows. Section 2 describes the model and the prior assumptions and Section 3 discusses estimation ....
[Article contains additional citation context not shown here]
Smith, M., and Kohn, R. (1994) Nonparametric Regression using Bayesian Variable Selection.
No context found.
Smith, M. and Kohn, R. (1996). Nonparametric regression using Bayesian variable selection. J. Econometrics, 75, 317--344.
No context found.
Smith M, Kohn R, Nonparametric regression using Bayesian variable selection, J. Econometrics 75:317--344, 1997.
No context found.
Smith, M. & Kohn, R. (1994). Nonparametric Regression using Bayesian Variable Selection. Journal of Econometrics, to appear.
No context found.
Smith M. and Kohn R. (1996). Nonparametric Regression Using Bayesian Variable Selection. Journal of Econometrics, 75, 317--343.
No context found.
Smith M. and Kohn R. (1996), "Nonparametric Regression Using Bayesian Variable Selection", Journal of Econometrics, 75, 317--343.