Results 1–10 of 15
Constrained Bayesian Inference for Low Rank Multitask Learning
Cited by 5 (2 self)
Abstract
We present a novel approach for constrained Bayesian inference. Unlike current methods, our approach does not require convexity of the constraint set. We reduce constrained variational inference to a parametric optimization over the feasible set of densities and propose a general recipe for such problems. We apply the proposed constrained Bayesian inference approach to multitask learning subject to rank constraints on the weight matrix. Further, constrained parameter estimation is applied to recover the sparse conditional independence structure encoded by prior precision matrices. Our approach is motivated by reverse inference for high-dimensional functional neuroimaging, a domain where the high dimensionality and small number of examples require the use of constraints to ensure meaningful and effective models. For this application, we propose a model that jointly learns a weight matrix and the prior inverse covariance structure between different tasks. We present experimental validation showing that the proposed approach outperforms strong baseline models in terms of predictive performance and structure recovery.
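One concrete reason the rank constraint above is non-convex: the set of matrices with rank at most r is not convex, yet the Euclidean projection onto it has a closed form via truncated SVD. The sketch below (plain NumPy; an illustration of that projection, not the paper's inference procedure) shows how a multitask weight matrix can be projected onto the rank-r set:

```python
import numpy as np

def project_rank(W, r):
    """Project W onto the (non-convex) set of matrices with rank <= r
    via truncated SVD -- the Euclidean projection onto that set."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s[r:] = 0.0  # keep only the r largest singular values
    return (U * s) @ Vt

rng = np.random.default_rng(0)
W = rng.standard_normal((6, 8))   # e.g. 6 tasks, 8 features
W_low = project_rank(W, 2)
print(np.linalg.matrix_rank(W_low))  # 2
```

Projected-gradient-style schemes interleave such a projection with ordinary update steps; the projection is idempotent, so a matrix already in the feasible set is left unchanged.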
The Bigraphical Lasso
Cited by 2 (1 self)
Abstract
The i.i.d. assumption in machine learning is endemic, but often flawed. Complex data sets exhibit partial correlations between both instances and features. A model specifying both types of correlation can have a number of parameters that scales quadratically with the number of features and data points. We introduce the bigraphical lasso, an estimator for precision matrices of matrix normals based on the Cartesian product of graphs. A prominent product in spectral graph theory, this structure has appealing properties for regression, enhanced sparsity and interpretability. To deal with the parameter explosion we introduce ℓ1 penalties and fit the model through a flip-flop algorithm that results in a linear number of lasso regressions. We demonstrate the performance of our approach with simulations and an example from the COIL image data set.
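For concreteness, the Cartesian-product (Kronecker-sum) structure on the precision can be sketched in NumPy as follows. The full precision over vec(X) has (np)² entries but only n² + p² free parameters, which is the parameter saving the abstract alludes to (a sketch under one common vec convention; this is not the BiGLasso estimator itself):

```python
import numpy as np

def kronecker_sum(Psi, Theta):
    """Precision over vec(X) for an n x p matrix X whose rows couple
    through Psi (n x n) and columns through Theta (p x p):
    Omega = Psi (+) Theta = Psi (x) I_p + I_n (x) Theta."""
    n, p = Psi.shape[0], Theta.shape[0]
    return np.kron(Psi, np.eye(p)) + np.kron(np.eye(n), Theta)

Psi = np.array([[2.0, -1.0],               # row graph: a 2-node chain
                [-1.0, 2.0]])
Theta = np.array([[3.0, 0.0, -1.0],        # column graph: edge (1,3)
                  [0.0, 3.0, 0.0],
                  [-1.0, 0.0, 3.0]])
Omega = kronecker_sum(Psi, Theta)
print(Omega.shape)  # (6, 6)
```

A useful spectral-graph-theory fact: the eigenvalues of a Kronecker sum are all pairwise sums of the factors' eigenvalues, which is what makes this structure tractable.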
Fast Kronecker Inference in Gaussian Processes with Non-Gaussian Likelihoods
Cited by 2 (0 self)
Abstract
Gaussian processes (GPs) are a flexible class of methods with state-of-the-art performance on spatial statistics applications. However, GPs require O(n^3) computations and O(n^2) storage, and popular GP kernels are typically limited to smoothing and interpolation. To address these difficulties, Kronecker methods have been used to exploit structure in the GP covariance matrix for scalability, while allowing for expressive kernel learning (Wilson et al., 2014). However, fast Kronecker methods have been confined to Gaussian likelihoods. We propose new scalable Kronecker methods for Gaussian processes with non-Gaussian likelihoods, using a Laplace approximation which involves linear conjugate gradients for inference, and a lower bound on the GP marginal likelihood for kernel learning. Our approach has near-linear scaling, requiring O(Dn^((D+1)/D)) operations and O(Dn^(2/D)) storage, for n training data points on a dense D > 1 dimensional grid. Moreover, we introduce a log-Gaussian Cox process, with highly expressive kernels, for modelling spatiotemporal count processes, and apply it to a point pattern (n = 233,088) of a decade of crime events in Chicago. Using our model, we discover spatially varying multiscale seasonal trends and produce highly accurate long-range local-area forecasts.
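The underlying Kronecker trick in the Gaussian-likelihood case can be sketched directly: on a 2-D grid with a product kernel, K = K1 ⊗ K2, so (K + σ²I)⁻¹y reduces to eigendecompositions of the small per-dimension kernels (an illustrative sketch of the baseline trick only; the paper's contribution is extending such methods to non-Gaussian likelihoods via Laplace approximation and conjugate gradients):

```python
import numpy as np

def kron_solve(K1, K2, y, noise):
    """Solve (K1 (x) K2 + noise*I) alpha = y using eigendecompositions
    of the small per-dimension kernels, never forming the full matrix."""
    w1, Q1 = np.linalg.eigh(K1)
    w2, Q2 = np.linalg.eigh(K2)
    n1, n2 = K1.shape[0], K2.shape[0]
    Y = y.reshape(n1, n2)            # row-major vec convention
    T = Q1.T @ Y @ Q2                # apply (Q1 (x) Q2)^T
    T = T / (np.outer(w1, w2) + noise)  # divide by eigenvalues + noise
    return (Q1 @ T @ Q2.T).ravel()   # apply (Q1 (x) Q2)

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)); K1 = A @ A.T + 4 * np.eye(4)
B = rng.standard_normal((3, 3)); K2 = B @ B.T + 3 * np.eye(3)
y = rng.standard_normal(12)
alpha = kron_solve(K1, K2, y, 0.1)
```

Against a dense solve this agrees to machine precision, while only ever decomposing the n1 × n1 and n2 × n2 factors rather than the n1·n2 × n1·n2 product.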
Fast Near-GRID Gaussian Process Regression
Cited by 1 (0 self)
Abstract
Gaussian process regression (GPR) is a powerful nonlinear technique for Bayesian inference and prediction. One drawback is its O(N^3) computational complexity for both prediction and hyperparameter estimation for N input points, which has led to much work on sparse GPR methods. When the covariance function is expressible as a tensor product kernel (TPK) and the inputs form a multidimensional grid, it was shown that the cost of exact GPR can be reduced to a subquadratic function of N. We extend these exact fast algorithms to sparse GPR and remark on a connection to Gaussian process latent variable models (GPLVMs). In practice, the inputs may also violate the multidimensional grid constraints, so we pose and efficiently solve missing- and extra-data problems for both exact and sparse grid GPR. We demonstrate our method on synthetic, text scan, and magnetic resonance imaging (MRI) data reconstructions.
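The grid/TPK structure referenced here is typically exploited through the standard Kronecker matrix-vector product, which never materializes the full N × N kernel (a generic sketch of that primitive; the paper's missing- and extra-data handling is a layer on top of it):

```python
import numpy as np

def kron_mvm(Ks, x):
    """Compute (K_1 (x) K_2 (x) ... (x) K_D) @ x without forming the
    full tensor-product kernel: reshape, multiply by one small factor,
    cyclically transpose, repeat. Cost is sub-quadratic in N = prod(n_d)."""
    N = x.size
    for K in Ks:
        n = K.shape[0]
        X = x.reshape(n, N // n)  # fold out the current dimension
        x = (K @ X).T.ravel()     # apply factor, then cycle axes
    return x

rng = np.random.default_rng(2)
K1 = rng.standard_normal((3, 3))
K2 = rng.standard_normal((4, 4))
x = rng.standard_normal(12)
fast = kron_mvm([K1, K2], x)
```

After one pass over all D factors the cyclic transpositions return the axes to their original order, so the result matches the dense Kronecker product exactly.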
A Multivariate Time-Series Modeling Approach to Severity of Illness Assessment and Forecasting in the ICU with Sparse, Heterogeneous Clinical Data
Cited by 1 (0 self)
Abstract
The ability to determine patient acuity (or severity of illness) has immediate practical use for clinicians. We evaluate the use of multivariate time-series modeling with multi-task Gaussian process (GP) models using noisy, incomplete, sparse, heterogeneous and unevenly sampled clinical data, including both physiological signals and clinical notes. The learned multi-task GP (MTGP) hyperparameters are then used to assess and forecast patient acuity. Experiments were conducted with two real clinical data sets acquired from ICU patients: first, estimating cerebrovascular pressure reactivity, an important indicator of secondary damage for traumatic brain injury patients, by learning the interactions between intracranial pressure and mean arterial blood pressure signals; and second, mortality prediction using clinical progress notes. In both cases, MTGPs provided improved results: an MTGP model provided better results than single-task GP models for signal interpolation and forecasting (0.91 vs 0.69 RMSE), and the use of MTGP hyperparameters as additional classification features improved results (0.812 vs 0.788 AUC).
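A minimal sketch of the multi-task GP covariance structure may help: in the common intrinsic-coregionalization form, a task-similarity matrix Kf is Kronecker-multiplied with a shared kernel over time, so correlated channels borrow strength from each other. This is a hypothetical simplification (the paper's MTGPs additionally learn Kf and per-task noise from irregularly sampled data):

```python
import numpy as np

def mtgp_cov(Kf, times, lengthscale=1.0):
    """Joint covariance of a simple multi-task GP: task-similarity
    matrix Kf (M x M), Kronecker a shared squared-exponential kernel
    over the sample times. Illustrative minimal form only."""
    d = times[:, None] - times[None, :]
    Kt = np.exp(-0.5 * (d / lengthscale) ** 2)  # RBF gram over time
    return np.kron(Kf, Kt)

# e.g. two strongly coupled vital-sign channels on a common time grid
Kf = np.array([[1.0, 0.9],
               [0.9, 1.0]])
K = mtgp_cov(Kf, np.linspace(0.0, 4.0, 5))
```

Samples drawn with this joint covariance exhibit the cross-channel coupling encoded in Kf, which is what lets one signal fill gaps in another during interpolation.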
Fast Laplace Approximation for Gaussian Processes with a Tensor Product Kernel
In Proceedings of the 22nd Benelux Conference on Artificial Intelligence (BNAIC), 2014
Cited by 1 (0 self)
Constrained Relative Entropy Minimization with Applications to Multitask Learning, 2013
Learning the Dependency Structure of Latent Factors
In Advances in Neural Information Processing Systems 25, 2012
Cited by 1 (0 self)
Abstract
In this paper, we study latent factor models with dependency structure in the latent space. We propose a general learning framework which induces sparsity on the undirected graphical model imposed on the vector of latent factors. A novel latent factor model, SLFA, is then proposed as a matrix factorization problem with a special regularization term that encourages collaborative reconstruction. The main novelty of the model is that we can simultaneously learn the lower-dimensional representation of the data and explicitly model the pairwise relationships between latent factors. An online learning algorithm is devised to make the model feasible for large-scale learning problems. Experimental results on two synthetic and two real-world data sets demonstrate that the pairwise relationships and latent factors learned by our model provide a more structured way of exploring high-dimensional data, and the learned representations achieve state-of-the-art classification performance.
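The matrix-factorization view underlying SLFA can be sketched generically: data X ≈ B S with k latent factors, fit by alternating least squares. Here a plain ridge penalty stands in for the paper's structured sparsity term, which couples the rows of S through a learned factor precision matrix, so this is a baseline sketch rather than SLFA itself:

```python
import numpy as np

def factorize(X, k, lam=0.1, iters=50):
    """Alternating least-squares matrix factorization: X (d x n) ~ B @ S
    with k latent factors. The ridge term lam is a stand-in for SLFA's
    structured regularizer on the latent-factor dependencies."""
    d, n = X.shape
    rng = np.random.default_rng(0)
    B = rng.standard_normal((d, k))
    for _ in range(iters):
        # closed-form ridge updates for each factor in turn
        S = np.linalg.solve(B.T @ B + lam * np.eye(k), B.T @ X)
        B = np.linalg.solve(S @ S.T + lam * np.eye(k), S @ X.T).T
    return B, S

rng = np.random.default_rng(3)
X = rng.standard_normal((10, 3)) @ rng.standard_normal((3, 20))  # rank-3 data
B, S = factorize(X, 3, lam=1e-3)
```

On exactly low-rank data with a matching k and a small penalty, the reconstruction B @ S recovers X almost perfectly.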
Multiple Output Regression with Latent Noise, 2016
Abstract
In high-dimensional data, structured noise caused by observed and unobserved factors affecting multiple target variables simultaneously imposes a serious challenge for modeling by masking the often weak signal. Therefore, (1) explaining away the structured noise in multiple-output regression is of paramount importance. Additionally, (2) assumptions about the correlation structure of the regression weights are needed. We note that both can be formulated in a natural way in a latent variable model, in which both the interesting signal and the noise are mediated through the same latent factors. Under this assumption, the signal model then borrows strength from the noise model by encouraging similar effects on correlated targets. We introduce a hyperparameter for the latent signal-to-noise ratio, which turns out to be important for modeling weak signals, and an ordered infinite-dimensional shrinkage prior that resolves the rotational unidentifiability of reduced-rank regression models. Simulations and prediction experiments with metabolite, gene expression, fMRI, and macroeconomic time series data show that our model equals or exceeds state-of-the-art performance and, in particular, outperforms the standard approach of assuming independent noise and signal models.
Multivariate Temporal Symptomatic Characterization of Cardiac Arrest
Abstract
We model the temporal symptomatic characteristics of 171 cardiac arrest patients in intensive care units. The temporal and feature dependencies in the data are captured using a mixture of matrix normal distributions. We found that the cardiac arrest temporal signature is best summarized by the six hours of data prior to cardiac arrest events, and that its statistical descriptions differ significantly from measurements taken over the preceding two days. This matrix normal model classifies these patterns better than logistic regression with lagged features.
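The matrix normal density used in mixture models like this one is usually evaluated with a trace identity rather than the n·p × n·p Kronecker covariance, which keeps the cost at two small solves (a standard-formula sketch, not this paper's mixture fitting code):

```python
import numpy as np

def matrix_normal_logpdf(X, M, U, V):
    """Log-density of the matrix normal MN(M, U, V) for an n x p
    observation X, with row covariance U (n x n) and column covariance
    V (p x p). Uses tr(V^-1 R^T U^-1 R) instead of forming kron(U, V)."""
    n, p = X.shape
    R = X - M
    _, logdet_U = np.linalg.slogdet(U)
    _, logdet_V = np.linalg.slogdet(V)
    quad = np.trace(np.linalg.solve(V, R.T) @ np.linalg.solve(U, R))
    return -0.5 * (n * p * np.log(2 * np.pi)
                   + p * logdet_U + n * logdet_V + quad)

rng = np.random.default_rng(4)
n, p = 3, 4
A = rng.standard_normal((n, n)); U = A @ A.T + n * np.eye(n)
B = rng.standard_normal((p, p)); V = B @ B.T + p * np.eye(p)
X = rng.standard_normal((n, p))
lp = matrix_normal_logpdf(X, np.zeros((n, p)), U, V)
```

It agrees with the equivalent multivariate normal over vec(X) with covariance kron(U, V), while temporal and feature dependencies stay factored into U and V separately.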