Results 1–10 of 104
Bayesian Multivariate Time Series Methods for Empirical Macroeconomics
, 2009
Abstract

Cited by 56 (12 self)
Macroeconomic practitioners frequently work with multivariate time series models such as VARs, factor-augmented VARs, and time-varying parameter versions of these models (including variants with multivariate stochastic volatility). These models have a large number of parameters and, thus, overparameterization problems may arise. Bayesian methods have become increasingly popular as a way of overcoming these problems. In this monograph, we discuss VARs, factor-augmented VARs, and time-varying parameter extensions and show how Bayesian inference proceeds. Apart from the simplest of VARs, Bayesian inference requires the use of Markov chain Monte Carlo methods developed for state space models, and we describe these algorithms. The focus is on the empirical macroeconomist: we offer advice on how to use these models and methods in practice and include empirical illustrations. A website provides Matlab code for carrying out Bayesian inference in these models.
Analysis of population structure: a unifying framework and novel methods based on sparse factor analysis, PLoS Genetics 2010;6:e1001117
Abstract

Cited by 34 (4 self)
We consider the statistical analysis of population structure using genetic data. We show how the two most widely used approaches to modeling population structure, admixture-based models and principal components analysis (PCA), can be viewed within a single unifying framework of matrix factorization. Specifically, both can be interpreted as approximating an observed genotype matrix by a product of two lower-rank matrices, but with different constraints or prior distributions on these lower-rank matrices. This opens the door to a large range of possible approaches to analyzing population structure by considering other constraints or priors. In this paper, we introduce one such novel approach, based on sparse factor analysis (SFA). We investigate the effects of the different types of constraint in several real and simulated data sets. We find that SFA produces similar results to admixture-based models when the samples are descended from a few well-differentiated ancestral populations, and can recapitulate the results of PCA when the population structure is more "continuous," as in isolation-by-distance models.
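The matrix-factorization view described in this abstract can be sketched numerically: both PCA and admixture-style models approximate a genotype matrix by a product of two lower-rank matrices. A minimal sketch of the PCA variant on a simulated, unstructured genotype matrix; the dimensions, allele frequency, and number of factors are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n_ind, n_snp, K = 100, 500, 2          # individuals, SNPs, factors (arbitrary)
# toy genotype matrix of minor-allele counts in {0, 1, 2}; no real structure
G = rng.binomial(2, 0.3, size=(n_ind, n_snp)).astype(float)
Gc = G - G.mean(axis=0)                # column-center, as in PCA

U, s, Vt = np.linalg.svd(Gc, full_matrices=False)
scores = U[:, :K] * s[:K]              # individuals x K ancestry-like axes
loadings = Vt[:K]                      # K x SNPs
approx = scores @ loadings             # rank-K approximation of the genotype matrix
```

Admixture-based models and SFA fit the same two-matrix product but replace the SVD's orthogonality with simplex or sparsity constraints on `scores` and `loadings`.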
Sparse Statistical Modelling in Gene Expression Genomics
, 2006
Abstract

Cited by 31 (10 self)
The concept of sparsity is increasingly central to practical data analysis and inference with high-dimensional data. Gene expression genomics is a key example context. As part of a series of projects that has developed Bayesian methodology for large-scale regression, ANOVA, and latent factor models, we have extended traditional Bayesian "variable selection" priors and modelling ideas to new hierarchical sparsity priors that provide substantial practical gains in addressing false discovery and isolating significant gene-specific parameters/effects in highly multivariate studies involving thousands of genes. We discuss and review these developments in the contexts of multivariate regression, ANOVA, and latent factor models for multivariate gene expression data arising in either observational or designed experimental studies. The development includes the use of sparse regression components to provide gene-sample-specific normalisation/correction based on control and housekeeping factors, an important general issue and one that can be critically misleading if ignored in many gene expression studies. Two rich data sets are used to provide context and illustration. The first arises from a gene expression experiment designed to investigate the transcriptional response, in terms of responsive gene subsets and their expression signatures, to interventions that up-regulate a series of key oncogenes. The second is observational, breast cancer tumour-derived data evaluated using a sparse latent factor model to define and isolate factors underlying the hugely complex patterns of association in gene expression. We also mention software that implements these and other models and methods in one comprehensive framework.
Default prior distributions and efficient posterior computation in Bayesian factor analysis
 Journal of Computational and Graphical Statistics
, 2009
Abstract

Cited by 28 (6 self)
Factor analytic models are widely used in the social sciences. These models have also proven useful for sparse modeling of the covariance structure in multidimensional data. Normal prior distributions for factor loadings and inverse-gamma prior distributions for residual variances are a popular choice because of their conditionally conjugate form. However, such prior distributions require elicitation of many hyperparameters and tend to result in poorly behaved Gibbs samplers. In addition, one must choose an informative specification, as high-variance prior distributions face problems due to impropriety of the posterior distribution. This article proposes a default, heavy-tailed prior distribution specification, which is induced through parameter expansion while facilitating efficient posterior computation. We also develop an approach to allow uncertainty in the number of factors. The methods are illustrated through simulated examples and epidemiology and toxicology applications.
Bayesian latent variable models for mixed discrete outcomes
 Biostatistics
, 2005
Abstract

Cited by 20 (3 self)
In studies of complex health conditions, mixtures of discrete outcomes (event time, count, binary, ordered categorical) are commonly collected. For example, studies of skin tumorigenesis record the latency time prior to the first tumor, increases in the number of tumors at each week, and the occurrence of internal tumors at the time of death. Motivated by this application, we propose a general underlying Poisson variable framework for mixed discrete outcomes, accommodating dependency through an additive gamma frailty model for the Poisson means. The model has log-linear, complementary log-log, and proportional hazards forms for count, binary, and discrete event time outcomes, respectively. Simple closed-form expressions can be derived for the marginal expectations, variances, and correlations. Following a Bayesian approach to inference, conditionally conjugate prior distributions are chosen that facilitate posterior computation via an MCMC algorithm. The methods are illustrated using data from a Tg.AC mouse bioassay study.
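The mechanism in this abstract (a shared gamma frailty multiplying Poisson means) is what induces dependence across outcome types; thresholding an underlying Poisson count at zero gives the complementary log-log form for a binary outcome. A minimal simulation sketch, where the sample size, frailty parameters, and baseline means are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
# shared gamma frailty with mean 1; multiplying the Poisson means by it
# induces positive dependence across the outcomes of one subject
frailty = rng.gamma(shape=2.0, scale=0.5, size=n)
mu_count, mu_bin = 3.0, 0.7                     # baseline Poisson means (arbitrary)

counts = rng.poisson(mu_count * frailty)        # count outcome
binary = (rng.poisson(mu_bin * frailty) > 0).astype(int)  # underlying-Poisson binary
corr = np.corrcoef(counts, binary)[0, 1]        # positive: frailty is shared
```

Marginally, `binary` follows a complementary log-log model mixed over the gamma frailty, while `counts` is negative binomial; the shared frailty is the only source of their correlation.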
Model averaging and dimension selection for the singular value decomposition
 Journal of the American Statistical Association
, 2007
Abstract

Cited by 17 (2 self)
Many multivariate data analysis techniques for an m × n matrix Y are related to the model Y = M + E, where Y is an m × n matrix of full rank and M is an unobserved mean matrix of rank K < (m ∧ n). Typically the rank of M is estimated in a heuristic way and then the least-squares estimate of M is obtained via the singular value decomposition of Y, yielding an estimate that can have a very high variance. In this paper we suggest a model-based alternative to this approach by providing prior distributions and posterior estimation for the rank of M and the components of its singular value decomposition. In addition to providing more accurate inference, such an approach has the advantage of being extendable to more general data-analysis situations, such as inference in the presence of missing data and estimation in a generalized linear modeling framework.
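The heuristic approach this abstract criticizes can be sketched directly: fix a rank K, then the least-squares estimate of M keeps the top-K terms of the SVD of Y (the Eckart–Young theorem). The dimensions, true rank, and noise level below are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, K = 50, 30, 3
M = rng.normal(size=(m, K)) @ rng.normal(size=(K, n))  # true rank-K mean
Y = M + 0.5 * rng.normal(size=(m, n))                  # Y = M + E, full rank

U, s, Vt = np.linalg.svd(Y, full_matrices=False)
M_hat = (U[:, :K] * s[:K]) @ Vt[:K]    # rank-K least-squares estimate of M
err_trunc = np.linalg.norm(M_hat - M)  # truncation discards most of the noise
err_raw = np.linalg.norm(Y - M)        # error of using Y itself as the estimate
```

The paper's model-based alternative instead places priors on K and on the SVD components, rather than plugging in a heuristically chosen K as above.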
A factor analysis of bond risk premia. In
 Handbook of Empirical Economics and Finance. Chapman and
, 2011
Abstract

Cited by 17 (3 self)
This paper uses the factor-augmented regression framework to analyze the relation between bond excess returns and the macroeconomy. Using a panel of 131 monthly macroeconomic time series for the sample 1964:1–2007:12, we estimate 8 static factors by the method of asymptotic principal components. We also use Gibbs sampling to estimate dynamic factors from the 131 series reorganized into 8 blocks. Regardless of how the factors are estimated, macroeconomic factors are found to have statistically significant predictive power for excess bond returns. We show how a bias correction to the parameter estimates of factor-augmented regressions can be obtained. This bias is numerically trivial in our application. The predictive power of real activity for excess bond returns is robust even after accounting for finite-sample inference problems. Forecasts of excess bond returns (or bond risk premia) are countercyclical, implying that investors are compensated for risks associated with recessions.
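The method of asymptotic principal components mentioned in this abstract estimates the static factors as scaled eigenvectors of the T × T cross-product matrix of the standardized panel. A minimal sketch on synthetic data standing in for the macro panel; the panel dimensions and factor count match the abstract, everything else is an arbitrary assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, r = 200, 131, 8                  # months, series, static factors
X = rng.normal(size=(T, N))            # stand-in for the macro panel
X = (X - X.mean(0)) / X.std(0)         # standardize each series

# asymptotic principal components: factors are sqrt(T) times the top-r
# eigenvectors of X X' / (T N); loadings then follow by least squares
eigval, eigvec = np.linalg.eigh(X @ X.T / (T * N))
F_hat = np.sqrt(T) * eigvec[:, ::-1][:, :r]   # eigh sorts ascending; reverse
Lam_hat = X.T @ F_hat / T                     # N x r loadings
```

The normalization F'F/T = I is what identifies the factors up to rotation; the Gibbs-sampling alternative in the paper estimates dynamic factors block by block instead.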
Serial and parallel implementations of model-based clustering via parsimonious Gaussian mixture
, 2009
Dynamic Factor Process Convolution Models for Multivariate SpaceTime Data with Application to Air Quality Assessment
 Environmental and Ecological Statistics
Abstract

Cited by 13 (0 self)
We propose a Bayesian dynamic factor process convolution model for multivariate space-time processes and illustrate the utility of this approach in modeling large air quality monitoring data. Key advantages of this modeling framework are a descriptive parametrization of the cross-covariance structure of the space-time processes and dimension reduction features that allow full Bayesian inference procedures to remain computationally tractable for large data sets. These features result from modeling space-time data as realizations of linear combinations of underlying space-time fields. The underlying latent components are constructed by convolving temporally evolving processes defined on a grid covering the spatial domain and include both trend and cyclical components. We argue that mixtures of such components can realistically describe a variety of space-time environmental processes and are especially applicable to air pollution processes that have complex space-time dependencies. In addition to computational benefits that arise from the dimension reduction features of the model, the process convolution structure permits misaligned and missing data without the need for imputation when fitting the model. This advantage is especially useful when constructing models for data collected at monitoring stations that have misaligned sampling schedules and that are frequently out of service for long stretches of time. We illustrate the modeling approach using a multivariate pollution dataset taken from the EPA's CASTNet database.
Bayesian Gaussian copula factor models for mixed data, arXiv preprint arXiv:1111.0317
, 2011
Abstract

Cited by 12 (3 self)
Gaussian factor models have proven widely useful for parsimoniously characterizing dependence in multivariate data. There is a rich literature on their extension to mixed categorical and continuous variables, using latent Gaussian variables or through generalized latent trait models accommodating measurements in the exponential family. However, when generalizing to non-Gaussian measured variables, the latent variables typically influence both the dependence structure and the form of the marginal distributions, complicating interpretation and introducing artifacts. To address this problem we propose a novel class of Bayesian Gaussian copula factor models which decouple the latent factors from the marginal distributions. A semiparametric specification for the marginals based on the extended rank likelihood yields straightforward implementation and substantial computational gains. We provide new theoretical and empirical justifications for using this likelihood in Bayesian inference. We propose new default priors for the factor loadings and develop efficient parameter-expanded Gibbs sampling for posterior computation. The methods are evaluated through simulations and applied to a dataset in political science. The models in this paper are implemented in the R package bfa.
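The decoupling this abstract describes rests on a property of Gaussian copulas: for a continuous margin, the ranks of the observations determine the latent Gaussian scale regardless of the marginal distribution. A minimal sketch of the rank-to-normal-scores step (the copula correlation and the margins below are arbitrary assumptions; the stdlib NormalDist supplies the normal quantile function):

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)
n = 1000
# latent bivariate Gaussian with correlation 0.6
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=n)
x1 = np.exp(z[:, 0])                   # non-Gaussian continuous margin
x2 = (z[:, 1] > 0).astype(int)         # binary margin

# normal scores computed from the ranks of x1 recover the latent Gaussian
# scale without modeling its marginal distribution
ranks = x1.argsort().argsort() + 1
u = ranks / (n + 1)
z1_hat = np.array([NormalDist().inv_cdf(p) for p in u])
```

Conditioning on these orderings rather than the raw values is the extended rank likelihood idea; in the Bayesian model it is what separates inference on the factor structure from the marginals.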