Results 1–10 of 22
Simultaneously Structured Models with Application to Sparse and Low-Rank Matrices
, 2014
Abstract

Cited by 41 (5 self)
The topic of recovery of a structured model given a small number of linear observations has been well-studied in recent years. Examples include recovering sparse or group-sparse vectors, low-rank matrices, and the sum of sparse and low-rank matrices, among others. In various applications in signal processing and machine learning, the model of interest is known to be structured in several ways at the same time, for example, a matrix that is simultaneously sparse and low-rank. Often norms that promote each individual structure are known, and allow for recovery using an order-wise optimal number of measurements (e.g., ℓ1 norm for sparsity, nuclear norm for matrix rank). Hence, it is reasonable to minimize a combination of such norms. We show that, surprisingly, if we use multi-objective optimization with these norms, then we can do no better, order-wise, than an algorithm that exploits only one of the present structures. This result suggests that to fully exploit the multiple structures, we need an entirely new convex relaxation, i.e., not one that is a function of the convex relaxations used for each structure. We then specialize our results to the case of sparse and low-rank matrices. We show that a nonconvex formulation of the problem can recover the model from very few measurements, on the order of the degrees of freedom of the matrix, whereas the convex problem obtained from a combination of the ℓ1 and nuclear norms requires many more measurements. This proves an order-wise gap between the performance of the convex and nonconvex recovery problems in this case. Our framework applies to arbitrary structure-inducing norms as well as to a wide range of measurement ensembles. This allows us to give performance bounds for problems such as sparse phase retrieval and low-rank tensor completion.
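The two structure-promoting norms this abstract refers to are simple to compute; the NumPy sketch below (all names are illustrative, not taken from the paper) evaluates the ℓ1 norm, the nuclear norm, and a convex combination of the two on a simultaneously sparse and rank-one matrix versus a generic dense one:

```python
import numpy as np

def l1_norm(X):
    """Entrywise l1 norm: the convex surrogate for sparsity."""
    return np.abs(X).sum()

def nuclear_norm(X):
    """Sum of singular values: the convex surrogate for low rank."""
    return np.linalg.svd(X, compute_uv=False).sum()

def combined_norm(X, lam=0.5):
    """A convex combination of the two structure-promoting norms. The
    paper's point is that minimizing any such combination does no better,
    order-wise, than exploiting just one of the structures."""
    return lam * l1_norm(X) + (1.0 - lam) * nuclear_norm(X)

rng = np.random.default_rng(0)
# A simultaneously sparse and rank-one matrix: outer product of sparse vectors.
u = np.zeros(20); u[:3] = rng.standard_normal(3)
v = np.zeros(20); v[:3] = rng.standard_normal(3)
X_structured = np.outer(u, v)
X_dense = rng.standard_normal((20, 20))

print(combined_norm(X_structured), combined_norm(X_dense))
```

The structured matrix scores far lower under the combined norm, which is what makes such penalties plausible regularizers in the first place.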
Convex tensor decomposition via structured Schatten norm regularization
 In Advances in NIPS 26
, 2013
Abstract

Cited by 15 (2 self)
We study a new class of structured Schatten norms for tensors that includes two recently proposed norms (“overlapped” and “latent”) for convex-optimization-based tensor decomposition. We analyze the performance of the “latent” approach to tensor decomposition, which was empirically found to perform better than the “overlapped” approach in some settings. We show theoretically that this is indeed the case. In particular, when the unknown true tensor is low-rank in a specific unknown mode, this approach performs as well as knowing the mode with the smallest rank. Along the way, we show a novel duality result for structured Schatten norms, which is also interesting in the general context of structured sparsity. We confirm through numerical simulations that our theory can precisely predict the scaling behaviour of the mean squared error.
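The “overlapped” norm in this family is the sum over modes of the nuclear norm of each unfolding of the tensor; a minimal NumPy sketch follows (names are illustrative; the “latent” norm, an infimal convolution over mode-wise decompositions, requires solving an optimization problem and is omitted):

```python
import numpy as np

def unfold(T, mode):
    """Mode-k unfolding: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def overlapped_schatten(T):
    """Overlapped Schatten 1-norm: sum over modes of the nuclear norm
    of the corresponding unfolding."""
    return sum(np.linalg.svd(unfold(T, m), compute_uv=False).sum()
               for m in range(T.ndim))

# A tensor that is rank-one in mode 0 but generic in the other modes.
rng = np.random.default_rng(1)
a = rng.standard_normal(4)
B = rng.standard_normal((5, 6))
T = np.einsum('i,jk->ijk', a, B)
print(overlapped_schatten(T))
```

Note that the mode-0 unfolding of this example is exactly rank one, which is the “low-rank in a specific unknown mode” situation the abstract analyzes.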
Simple bounds for noisy linear inverse problems with exact side information. Available at arXiv.org/abs/1312.0641
, 2013
Equivariant and scale-free Tucker decomposition models
, 2013
Abstract

Cited by 4 (1 self)
Analyses of array-valued datasets often involve reduced-rank array approximations, typically obtained via least-squares or truncations of array decompositions. However, least-squares approximations tend to be noisy in high-dimensional settings, and may not be appropriate for arrays that include discrete or ordinal measurements. This article develops methodology to obtain low-rank model-based representations of continuous, discrete and ordinal data arrays. The model is based on a parameterization of the mean array as a multilinear product of a reduced-rank core array and a set of index-specific orthogonal eigenvector matrices. It is shown how orthogonally equivariant parameter estimates can be obtained from Bayesian procedures under invariant prior distributions. Additionally, priors on the core array are developed that act as regularizers, leading to improved inference over the standard least-squares estimator, and providing robustness to misspecification of the array rank. This model-based approach is extended to accommodate discrete or ordinal data arrays using a semiparametric transformation model. The resulting low-rank representation is scale-free, in the sense that it is invariant to monotonic transformations of the data array. In an example analysis of a multivariate discrete network dataset, this scale-free approach provides a more complete description of data patterns.
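The mean parameterization described here, a reduced-rank core array multiplied along each index by a matrix with orthonormal columns, is the standard Tucker multilinear product; a minimal NumPy sketch (shapes and names chosen purely for illustration):

```python
import numpy as np

def tucker_product(core, factors):
    """Multilinear product of a reduced-rank core array with one factor
    matrix per index (mode): M = core x_1 U1 x_2 U2 x_3 U3."""
    M = core
    for mode, U in enumerate(factors):
        Mm = np.moveaxis(M, mode, 0)       # bring the current mode forward
        Mm = np.tensordot(U, Mm, axes=1)   # multiply along that mode
        M = np.moveaxis(Mm, 0, mode)
    return M

rng = np.random.default_rng(2)
core = rng.standard_normal((2, 3, 2))      # reduced-rank core array
# Index-specific matrices with orthonormal columns (cf. the orthogonal
# eigenvector matrices in the abstract); QR supplies the orthonormal basis.
factors = [np.linalg.qr(rng.standard_normal((n, r)))[0]
           for n, r in zip((10, 12, 8), core.shape)]
M = tucker_product(core, factors)
print(M.shape)
```

Every unfolding of the resulting array inherits the corresponding rank of the core, which is what makes this a reduced-rank representation.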
Suykens. Learning tensors in reproducing kernel Hilbert spaces with multilinear spectral penalties
, 2013
Abstract

Cited by 3 (0 self)
We present a general framework to learn functions in tensor product reproducing kernel Hilbert spaces (TP-RKHSs). The methodology is based on a novel representer theorem suitable for existing as well as new spectral penalties for tensors. In particular, when the functions in the TP-RKHS are defined on the Cartesian product of finite discrete sets, our main problem formulation admits existing tensor completion problems as special cases. Other special cases include transfer learning with multimodal side information and multilinear multitask learning. For the latter case, our kernel-based view is instrumental in deriving nonlinear extensions of existing model classes. We give a novel algorithm and show in experiments the usefulness of the proposed extensions.
A Statistical Model for Tensor PCA
 In Neural Information Processing Systems (NIPS)
, 2014
Abstract

Cited by 2 (1 self)
We consider the Principal Component Analysis problem for large tensors of arbitrary order k under a single-spike (or rank-one plus noise) model. On the one hand, we use information theory, and recent results in probability theory, to establish necessary and sufficient conditions under which the principal component can be estimated using unbounded computational resources. It turns out that this is possible as soon as the signal-to-noise ratio β becomes larger than C k log k (and in particular β can remain bounded as the problem dimensions increase). On the other hand, we analyze several polynomial-time estimation algorithms, based on tensor unfolding, power iteration, and message passing ideas from graphical models. We show that, unless the signal-to-noise ratio diverges in the system dimensions, none of these approaches succeeds. This is possibly related to a fundamental limitation of computationally tractable estimators for this problem. We discuss various initializations for tensor power iteration, and show that a tractable initialization based on the spectrum of the matricized tensor significantly outperforms baseline methods, both statistically and computationally. Finally, we consider the case in which additional side information is available about the unknown signal. We characterize the amount of side information that allows the iterative algorithms to converge to a good estimate.
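The tensor power iteration and the matricization-based initialization this abstract discusses can be sketched in a few lines of NumPy under the single-spike model; the dimension and signal-to-noise ratio below are arbitrary illustration values, not the paper's thresholds:

```python
import numpy as np

rng = np.random.default_rng(3)
n, beta = 30, 10.0                     # dimension and signal-to-noise ratio
v_true = rng.standard_normal(n)
v_true /= np.linalg.norm(v_true)

# Single-spike model: a rank-one order-3 signal plus Gaussian noise.
noise = rng.standard_normal((n, n, n)) / np.sqrt(n)
T = beta * np.einsum('i,j,k->ijk', v_true, v_true, v_true) + noise

# Initialization from the spectrum of the matricized (unfolded) tensor.
U, _, _ = np.linalg.svd(T.reshape(n, n * n), full_matrices=False)
v = U[:, 0]

# Tensor power iteration: contract two modes against the current iterate,
# then renormalize.
for _ in range(50):
    v = np.einsum('ijk,j,k->i', T, v, v)
    v /= np.linalg.norm(v)

print(abs(v @ v_true))                 # correlation with the planted spike
```

At this (generous) signal-to-noise ratio the matricization gives an iterate inside the power method's basin of attraction; the abstract's point is that for bounded β no such polynomial-time scheme is known to succeed.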
Convergence rate of Bayesian tensor estimator: Optimal rate without restricted strong convexity
, 2014
Robust Tensor Decomposition with Gross Corruption
Abstract
In this paper, we study the statistical performance of robust tensor decomposition with gross corruption. The observations are noisy realizations of the superposition of a low-rank tensor W∗ and an entrywise-sparse corruption tensor V∗. Unlike conventional noise with bounded variance in previous convex tensor decomposition analyses, the magnitude of the gross corruption can be arbitrarily large. We show that under certain conditions, the true low-rank tensor as well as the sparse corruption tensor can be recovered simultaneously. Our theory yields non-asymptotic Frobenius-norm estimation error bounds for each tensor separately. We show through numerical experiments that our theory can precisely predict the scaling behavior in practice.
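The recovery problem described here, separating a low-rank tensor from entrywise-sparse gross corruption, can be illustrated with a simple alternating heuristic: hard-threshold the residual to find outliers, then project back to low multilinear rank. This is a sketch in the spirit of the problem, not a reproduction of the paper's convex estimator; all dimensions and thresholds below are arbitrary:

```python
import numpy as np

def truncate_multilinear_rank(T, ranks):
    """Project each unfolding onto its top-r singular subspace
    (a truncated-HOSVD-style low-rank projection)."""
    for mode, r in enumerate(ranks):
        Tm = np.moveaxis(T, mode, 0)
        shape = Tm.shape
        U, s, Vt = np.linalg.svd(Tm.reshape(shape[0], -1), full_matrices=False)
        low = (U[:, :r] * s[:r]) @ Vt[:r]
        T = np.moveaxis(low.reshape(shape), 0, mode)
    return T

rng = np.random.default_rng(4)
dims, ranks = (10, 10, 10), (2, 2, 2)
core = rng.standard_normal(ranks)
Us = [np.linalg.qr(rng.standard_normal((d, r)))[0] for d, r in zip(dims, ranks)]
W_true = np.einsum('abc,ia,jb,kc->ijk', core, *Us)      # low-rank tensor W*

V_true = np.zeros(dims)                                  # entrywise-sparse V*
idx = rng.choice(W_true.size, size=20, replace=False)
V_true.flat[idx] = rng.choice([-10.0, 10.0], size=20)    # gross corruption
Y = W_true + V_true + 0.01 * rng.standard_normal(dims)   # noisy observation

W = np.zeros(dims)
for _ in range(10):
    R = Y - W
    V = np.where(np.abs(R) > 3.0, R, 0.0)   # keep only the gross outliers
    W = truncate_multilinear_rank(Y - V, ranks)

print(np.linalg.norm(W - W_true))
```

Because the corruption magnitude dwarfs the clean entries, the threshold isolates the sparse component exactly, and both tensors are recovered up to the noise level, mirroring the simultaneous-recovery guarantee the abstract states.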