Results 1–10 of 13
Fast variational Bayesian linear state-space model
ECML PKDD 2013, Part I, LNCS, 2013
Abstract

Cited by 2 (2 self)
Abstract. This paper presents a fast variational Bayesian method for linear state-space models. The standard variational Bayesian expectation-maximization (VBEM) algorithm is improved by a parameter expansion which optimizes the rotation of the latent space. With this approach, the inference is orders of magnitude faster than the standard method. The speed of the proposed method is demonstrated on an artificial dataset and a large real-world dataset, which shows that the standard VBEM algorithm is not suitable for large datasets because it converges extremely slowly. In addition, the paper estimates the temporal state variables using a smoothing algorithm based on the block LDL decomposition. This smoothing algorithm reduces the number of required matrix inversions and avoids a model augmentation compared to previous approaches.
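The rotation-based parameter expansion exploits the fact that a linear state-space model is invariant under an invertible transformation of the latent space: rotating the states while counter-transforming the dynamics and loading matrices leaves the observations unchanged, so the rotation can be chosen freely to speed up convergence. A minimal numpy sketch of this invariance (the matrix names `A`, `C`, `R` are illustrative, not the paper's notation):

```python
import numpy as np

rng = np.random.default_rng(0)
K, D, T = 3, 5, 10
A = 0.3 * rng.normal(size=(K, K))   # latent dynamics
C = rng.normal(size=(D, K))         # observation loadings
X = rng.normal(size=(K, T))         # latent state trajectory

# an invertible (here orthogonal) transformation of the latent space
R = np.linalg.qr(rng.normal(size=(K, K)))[0]
R_inv = np.linalg.inv(R)

# rotated parameterization: X -> R X, A -> R A R^{-1}, C -> C R^{-1}
Xr = R @ X
Ar = R @ A @ R_inv
Cr = C @ R_inv

# the model's outputs are unchanged by the rotation
assert np.allclose(Cr @ Xr, C @ X)        # same observations
assert np.allclose(Ar @ Xr, R @ (A @ X))  # same dynamics, in rotated coordinates
```

Because the likelihood is unchanged by the rotation, an extra step that optimizes it can only tighten the variational bound, which is what makes the speed-up safe.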
Manifold Alignment Determination
Abstract
We present Manifold Alignment Determination (MAD), an algorithm for learning alignments between data points from multiple views or modalities. The approach is capable of learning correspondences between views as well as correspondences between individual data points. The proposed method requires only a few aligned examples, from which it can recover a global alignment through a probabilistic model. The strong yet flexible regularization provided by the generative model is sufficient to align the views. We provide experiments on both synthetic and real data to highlight the benefit of the proposed approach.
Models for Data Fusion in the Behavioral Sciences
Abstract
This paper considers multi-block and factor-based methods for data fusion on coupled data matrices that all share the same data mode, and demonstrates how to capture both similarities and differences in the data matrices by varying parameters in the objective function.

By Iven Van Mechelen and Eva Ceulemans

ABSTRACT: We start from a few examples of coupled behavioral sciences data, along with associated research questions and data-analytic methods. Linking up with these, we introduce a few concepts and distinctions by means of which we specify the focus of this paper: 1) data that take the form of a collection of coupled matrices that are linked in either the experimental unit or the variable data mode; 2) associated with questions about the mechanisms underlying these data matrices; 3) which are to be addressed by data-analytic methods that rely on a submodel per data matrix, with a common parameterization of the shared data mode. Next, we outline the principles of two closely related families within this focus: the families of multi-block component and factor-based models for data fusion (while considering both deterministic and stochastic model variants). Then, we review developments within these families to capture both similarities and differences between the different data matrices under study. We follow with a discussion of recent attempts to address quite a few challenges in data fusion based on multi-block component and factor models, including whether and how to differentially weigh the different data matrices under study, and problems such as dealing with large numbers of variables, outliers, and missing values. While the focus of this paper is on data and modeling contributions from the behavioral sciences, we point in a concluding section at their relevance for other domains and at the importance of related methods developed in those domains.
Expectation Propagation for Likelihoods Depending on an Inner Product of Two Multivariate Random Variables
Abstract
We describe how a deterministic Gaussian posterior approximation can be constructed using expectation propagation (EP) for models where the likelihood function depends on an inner product of two multivariate random variables. The family of applicable models includes a wide variety of important linear latent variable models used in statistical machine learning, such as principal component and factor analysis, their linear extensions, and errors-in-variables regression. The EP computations are facilitated by an integral transformation of the Dirac delta function, which allows transforming the multidimensional integrals over the two multivariate random variables into an analytically tractable form, up to one-dimensional analytically intractable integrals that can be efficiently computed numerically. We study the resulting posterior approximations in sparse principal component analysis with Gaussian and probit likelihoods. Comparisons to Gibbs sampling and variational inference are presented.
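The integral transformation of the Dirac delta referred to above is presumably its Fourier representation; a sketch of the idea, with illustrative notation ($\mathbf{x}$ and $\mathbf{z}$ the two multivariate random variables, $f = \mathbf{x}^\top \mathbf{z}$ the inner product):

```latex
p(y \mid \mathbf{x}^\top \mathbf{z})
  = \int p(y \mid f)\, \delta\!\left(f - \mathbf{x}^\top \mathbf{z}\right) df,
\qquad
\delta\!\left(f - \mathbf{x}^\top \mathbf{z}\right)
  = \frac{1}{2\pi} \int_{-\infty}^{\infty}
    e^{\, i\omega \left(f - \mathbf{x}^\top \mathbf{z}\right)} d\omega .
```

Substituting the second identity into the first decouples $\mathbf{x}$ from $\mathbf{z}$ inside the exponential, so Gaussian expectations over both variables can be taken in closed form, leaving low-dimensional integrals over $\omega$ and $f$ for numerical quadrature.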
A Bayesian Framework for Multi-Modality Analysis of Mental Health
Abstract
We develop statistical methods for multi-modality assessment of mental health, based on four forms of data: (i) self-reported answers to a set of classical questionnaires, (ii) single-nucleotide polymorphism (SNP) data, (iii) fMRI data measured in response to visual stimuli, and (iv) scores for psychiatric disorders. The data were acquired from hundreds of college students. We utilize the data and model to ask a timely and novel clinical question: can one predict brain activity associated with risk for mental illness and treatment response based on knowledge of how the subject answers questionnaires, and using genetic (SNP) data? Also, in another direction: can one predict an individual's fundamental propensity for psychopathology based on observed self-report, SNP and fMRI data (separately or in combination)? The data are analyzed with a multi-modality factor model, with sparsity imposed on the factor loadings, linked to the particular type of data modality. The analysis framework encompasses a wide range of problems, such as matrix completion and clustering, leveraging information in all the data sources. We use an efficient variational inference algorithm to fit the model, which is especially flexible in dealing with ordinal-valued views (self-report answers and SNP data). The variational inference is validated with slower but rigorous sampling methods. We demonstrate the effectiveness of the model to perform accurate predictions for clinically relevant brain activity relative to baseline models, and to identify meaningful associations between data views.
Bayesian object matching
DOI 10.1007/s10994-013-5357-4
Abstract
Object matching refers to the problem of inferring an unknown co-occurrence or alignment between observations or samples in two data sets. Given two sets of equally many samples, the task is to find for each sample a representative sample in the other set, without prior knowledge of a distance measure between the sets. Given a distance measure, the problem would correspond to a linear assignment problem: finding a permutation that reorders the samples in one set to minimize the total distance. When no such measure is available, we need to consider more complex solutions. Typical approaches maximize statistical dependency between the two sets, whereas in this work we present a Bayesian solution that builds a joint model for the two sources. We learn a Bayesian canonical correlation analysis model that includes a permutation parameter for reordering the samples in one of the sets. We provide both variational and sampling-based inference for approximate Bayesian analysis, and demonstrate on three data sets that the resulting methods outperform earlier solutions.
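For intuition, when a distance measure is available the matching reduces to the linear assignment problem mentioned above. A minimal brute-force sketch (feasible only for tiny sets; practical implementations use e.g. the Hungarian algorithm):

```python
from itertools import permutations

def best_match(a, b, dist):
    """Linear assignment by exhaustive search: find the permutation p
    minimizing sum_i dist(a[i], b[p[i]])."""
    n = len(a)
    best_cost, best_p = float("inf"), None
    for p in permutations(range(n)):
        cost = sum(dist(a[i], b[p[i]]) for i in range(n))
        if cost < best_cost:
            best_cost, best_p = cost, p
    return best_p

a = [0.0, 1.0, 2.0]
b = [2.1, 0.1, 0.9]                      # a noisy, shuffled copy of a
p = best_match(a, b, lambda x, y: (x - y) ** 2)
# p == (1, 2, 0): reading b as b[1], b[2], b[0] matches a element-wise
```

The Bayesian approach of the paper replaces the fixed distance with a joint generative model, treating the permutation as a model parameter to be inferred.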
Probabilistic Partial Canonical Correlation Analysis
Abstract
Partial canonical correlation analysis (partial CCA) is a statistical method that estimates a pair of linear projections onto a low-dimensional space, where the correlation between two multidimensional variables is maximized after eliminating the influence of a third variable. Partial CCA is known to be closely related to a causality measure between two time series. However, partial CCA requires the inverses of covariance matrices, so the calculation is not stable. This is particularly the case for high-dimensional data or small sample sizes. Additionally, we cannot estimate the optimal dimension of the subspace in the model. In this paper, we address these problems by proposing a probabilistic interpretation of partial CCA and deriving a Bayesian estimation method based on the probabilistic model. Our numerical experiments demonstrate that our methods can stably estimate the model parameters, even in high dimensions or when there are a small number of samples.
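The elimination step in classical partial CCA can be written as least-squares residualization: regress each variable on the third variable and correlate the residuals. A short numpy sketch of the residualization (variable names illustrative):

```python
import numpy as np

def residualize(X, Z):
    """Remove the linear influence of Z from X: return X - Z @ B,
    where B solves the least-squares problem min_B ||X - Z B||_F."""
    B, *_ = np.linalg.lstsq(Z, X, rcond=None)
    return X - Z @ B

rng = np.random.default_rng(1)
n = 200
Z = rng.normal(size=(n, 2))                                  # variable to eliminate
X = Z @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(n, 4))
X_res = residualize(X, Z)

# the residuals carry no linear trace of Z (least-squares orthogonality)
assert np.max(np.abs(Z.T @ X_res)) < 1e-8
```

Classical partial CCA then runs ordinary CCA on the two residualized blocks; the paper's probabilistic formulation avoids the unstable explicit covariance inversions this route requires.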
Supplementary Material for: "A Bayesian Framework for Multi-Modality Analysis of Mental Health"
Abstract
This document presents supplementary materials to the submitted paper. It contains: (i) details about inference procedures; (ii) details on computational time; (iii) a comparison between VB and MCMC results; and (iv) additional results on neuroscience data.

1 Details about posterior inference

1.1 Variational Bayes derivation

Without loss of generality, throughout this derivation we only consider inference for the model without the regression component. Thus, we infer the variational distribution for the latent variables, collectively referred to as $\Theta = \{\{\tilde{W}^{(m)}, \alpha_m, \gamma_m, \tau_m\}_{m=1}^{M}, V\}$, along with $\{\{\mu_j\}_{j=1}^{J}, z\}$ for clustering. For the ordinal views, we denote the cutpoints as $G = \{g_m\}_{m=1}^{M_1}$ and the rotation matrix as $Q$. Sometimes, for brevity, we will use $\{\tilde{W}, \alpha, \gamma, \tau\}$ for $\{\tilde{W}^{(m)}, \alpha_m, \gamma_m, \tau_m\}_{m=1}^{M}$ and $\{\mu, z\}$ for $\{\{\mu_j\}_{j=1}^{J}, z\}$, respectively. The data from all views are collectively referred to as $Y$. We approximate the true posterior $p(\Theta \mid Y, G, Q)$ by its mean-field approximation:

$$ q(\Theta) = \prod_{i=1}^{N} q(v_i) \prod_{m=1}^{M} \left[ \prod_{k=1}^{K} q(\tilde{w}_k^{(m)}) \prod_{k=1}^{K} q(\alpha_{mk})\, q(\gamma_m)\, q(\tau_m) \right]. \qquad (1) $$

The goal here is to minimize the KL divergence $\mathrm{KL}(q(\Theta) \,\|\, p(\Theta \mid Y, G, Q))$, which is equivalent to maximizing the evidence lower bound (ELBO) given by

$$ \mathcal{L}(q(\Theta), G, Q) = \mathbb{E}_{q(\Theta)}\left[\log p(Y, \Theta \mid G, Q) - \log q(\Theta)\right] \qquad (2) $$
$$ = \langle \log p(Y, \Theta) - \log q(\Theta) \rangle_{q(\Theta)} = \langle \log p(Y \mid W, V, \gamma, \tau) \rangle + \left\langle \log \frac{p(W \mid \alpha)\, p(\alpha)\, p(V)\, p(\gamma)\, p(\tau)}{q(W)\, q(\alpha)\, q(V)\, q(\gamma)\, q(\tau)} \right\rangle. $$

Approximation for ordinal views. Directly maximizing $\mathcal{L}(q(\Theta), G, Q)$ is intractable, thus further approximation is needed for the first term of (2). Only the ordinal views are considered in this subsection; for real-valued views no such approximation is needed. The approximation for the ordinal views proceeds as follows:

$$ \langle \log p(Y \mid W, V, \gamma, \tau) \rangle_{q(\Theta)} = \sum_{i,j,m} \cdots $$
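The identity behind (2) — that maximizing the ELBO is equivalent to minimizing the KL divergence to the true posterior — can be checked on a two-state toy model; this sketch is purely illustrative and unrelated to the paper's actual model:

```python
import math

# toy model: theta in {0, 1}, a single observation y = 1
p_theta = [0.5, 0.5]                    # prior p(theta)
p_y_given = [0.8, 0.3]                  # likelihood p(y=1 | theta)
p_joint = [p_theta[t] * p_y_given[t] for t in (0, 1)]   # p(y=1, theta)
p_y = sum(p_joint)                      # evidence p(y=1)
posterior = [pj / p_y for pj in p_joint]

def elbo(q):
    """E_q[log p(y, theta)] - E_q[log q(theta)]."""
    return sum(q[t] * (math.log(p_joint[t]) - math.log(q[t])) for t in (0, 1))

# the ELBO equals log p(y) exactly when q is the true posterior ...
assert abs(elbo(posterior) - math.log(p_y)) < 1e-9
# ... and is strictly smaller otherwise (the gap is the KL divergence)
assert elbo([0.6, 0.4]) < elbo(posterior)
```

In the mean-field setting of (1), the same bound is maximized over a restricted, factorized family of distributions $q$, so the optimum lower-bounds the evidence rather than attaining it.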
Abstract
Multi-view learning leverages correlations between different sources of data to make predictions in one view based on observations in another view. A popular approach is to assume that both the correlations between the views and the view-specific covariances have a low-rank structure, leading to inter-battery factor analysis, a model closely related to canonical correlation analysis. We propose a convex relaxation of this model using structured norm regularization. Further, we extend the convex formulation to a robust version by adding an ℓ1-penalized matrix to our estimator, similarly to convex robust PCA. We develop and compare scalable algorithms for several convex multi-view models. We show experimentally that the view-specific correlations improve data imputation performance, as well as labeling accuracy, in real-world multi-label prediction tasks.
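Convex estimators combining a low-rank term with an ℓ1-penalized term, as described above, are typically solved by proximal splitting, since both penalties have closed-form proximal operators. A minimal numpy sketch of the two operators (an illustration of the standard operators, not the paper's algorithm):

```python
import numpy as np

def prox_l1(M, lam):
    """Prox of lam * ||M||_1: entrywise soft-thresholding
    (shrinks the sparse, outlier-absorbing component)."""
    return np.sign(M) * np.maximum(np.abs(M) - lam, 0.0)

def prox_nuclear(M, lam):
    """Prox of lam * ||M||_*: singular-value soft-thresholding
    (shrinks toward a low-rank component)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt

# small entries are zeroed, large ones shrunk toward zero
assert np.allclose(prox_l1(np.array([3.0, -0.5]), 1.0), [2.0, 0.0])
# all singular values of 2*I equal 2, so each shrinks to 1
assert np.allclose(prox_nuclear(2.0 * np.eye(3), 1.0), np.eye(3))
```

An iterative scheme alternating these operators (e.g. proximal gradient or ADMM) recovers the low-rank and sparse components jointly, as in convex robust PCA.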