Results 1  10
of
706
From frequency to meaning : Vector space models of semantics
 Journal of Artificial Intelligence Research
, 2010
"... Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are begi ..."
Abstract

Cited by 347 (3 self)
 Add to MetaCart
(Show Context)
Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term–document, word–context, and pair–pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field. 1.
The Convex Geometry of Linear Inverse Problems
, 2010
"... In applications throughout science and engineering one is often faced with the challenge of solving an illposed inverse problem, where the number of available measurements is smaller than the dimension of the model to be estimated. However in many practical situations of interest, models are constr ..."
Abstract

Cited by 189 (20 self)
 Add to MetaCart
In applications throughout science and engineering one is often faced with the challenge of solving an illposed inverse problem, where the number of available measurements is smaller than the dimension of the model to be estimated. However in many practical situations of interest, models are constrained structurally so that they only have a few degrees of freedom relative to their ambient dimension. This paper provides a general framework to convert notions of simplicity into convex penalty functions, resulting in convex optimization solutions to linear, underdetermined inverse problems. The class of simple models considered are those formed as the sum of a few atoms from some (possibly infinite) elementary atomic set; examples include wellstudied cases such as sparse vectors (e.g., signal processing, statistics) and lowrank matrices (e.g., control, statistics), as well as several others including sums of a few permutations matrices (e.g., ranked elections, multiobject tracking), lowrank tensors (e.g., computer vision, neuroscience), orthogonal matrices (e.g., machine learning), and atomic measures (e.g., system identification). The convex programming formulation is based on minimizing the norm induced by the convex hull of the atomic set; this norm is referred to as the atomic norm. The facial
Hierarchical singular value decomposition of tensors
 SIAM Journal on Matrix Analysis and Applications
"... Abstract. We define the hierarchical singular value decomposition (SVD) for tensors of order d ≥ 2. This hierarchical SVD has properties like the matrix SVD (and collapses to the SVD in d = 2), and we prove these. In particular, one can find low rank (almost) best approximations in a hierarchical fo ..."
Abstract

Cited by 178 (11 self)
 Add to MetaCart
(Show Context)
Abstract. We define the hierarchical singular value decomposition (SVD) for tensors of order d ≥ 2. This hierarchical SVD has properties like the matrix SVD (and collapses to the SVD in d = 2), and we prove these. In particular, one can find low rank (almost) best approximations in a hierarchical format (HTucker) which requires only O((d − 1)k3 + dnk) parameters, where d is the order of the tensor, n the size of the modes and k the (hierarchical) rank. The HTucker format is a specialization of the Tucker format and it contains as a special case all (canonical) rank k tensors. Based on this new concept of a hierarchical SVD we present algorithms for hierarchical tensor calculations allowing for a rigorous error analysis. The complexity of the truncation (finding lower rank approximations to hierarchical rank k tensors) is in O((d−1)k4+dnk2) and the attainable accuracy is just 2–3 digits less than machine precision.
An Error Analysis of The Multiconfiguration Timedependent Hartree Method of Quantum Dynamics
 MATHEMATICAL MODELLING AND NUMERICAL ANALYSIS
, 2010
"... This paper gives an error analysis of the multiconfiguration timedependent Hartree (MCTDH) method for the approximation of multiparticle timedependent Schrödinger equations. The MCTDH method approximates the multivariate wave function by a linear combination of products of univariate functions a ..."
Abstract

Cited by 111 (0 self)
 Add to MetaCart
(Show Context)
This paper gives an error analysis of the multiconfiguration timedependent Hartree (MCTDH) method for the approximation of multiparticle timedependent Schrödinger equations. The MCTDH method approximates the multivariate wave function by a linear combination of products of univariate functions and replaces the highdimensional linear Schrödinger equation by a coupled system of ordinary differential equations and lowdimensional nonlinear partial differential equations. The main result of this paper yields an L 2 error bound of the MCTDH approximation in terms of a bestapproximation error bound in a stronger norm and of lower bounds of singular values of matrix unfoldings of the coefficient tensor. This result permits us to establish convergence of the MCTDH method to the exact wave function under appropriate conditions on the approximability of the wave function, and it points to reasons for possible failure in other cases.
Tensor decompositions for learning latent variable models
, 2014
"... This work considers a computationally and statistically efficient parameter estimation method for a wide class of latent variable models—including Gaussian mixture models, hidden Markov models, and latent Dirichlet allocation—which exploits a certain tensor structure in their loworder observable mo ..."
Abstract

Cited by 83 (7 self)
 Add to MetaCart
This work considers a computationally and statistically efficient parameter estimation method for a wide class of latent variable models—including Gaussian mixture models, hidden Markov models, and latent Dirichlet allocation—which exploits a certain tensor structure in their loworder observable moments (typically, of second and thirdorder). Specifically, parameter estimation is reduced to the problem of extracting a certain (orthogonal) decomposition of a symmetric tensor derived from the moments; this decomposition can be viewed as a natural generalization of the singular value decomposition for matrices. Although tensor decompositions are generally intractable to compute, the decomposition of these specially structured tensors can be efficiently obtained by a variety of approaches, including power iterations and maximization approaches (similar to the case of matrices). A detailed analysis of a robust tensor power method is provided, establishing an analogue of Wedin’s perturbation theorem for the singular vectors of matrices. This implies a robust and computationally tractable estimation approach for several popular latent variable models.
The Alternating Linear Scheme for Tensor Optimisation in the TT format
 IN THE TT FORMAT. PREPRINT 71, DFGSPP 1324
, 2010
"... ..."
Parallel stochastic gradient algorithms for largescale matrix completion
 MATHEMATICAL PROGRAMMING COMPUTATION
, 2013
"... This paper develops Jellyfish, an algorithm for solving dataprocessing problems with matrixvalued decision variables regularized to have low rank. Particular examples of problems solvable by Jellyfish include matrix completion problems and leastsquares problems regularized by the nuclear norm or ..."
Abstract

Cited by 74 (8 self)
 Add to MetaCart
This paper develops Jellyfish, an algorithm for solving dataprocessing problems with matrixvalued decision variables regularized to have low rank. Particular examples of problems solvable by Jellyfish include matrix completion problems and leastsquares problems regularized by the nuclear norm or γ2norm. Jellyfish implements a projected incremental gradient method with a biased, random ordering of the increments. This biased ordering allows for a parallel implementation that admits a speedup nearly proportional to the number of processors. On largescale matrix completion tasks, Jellyfish is orders of magnitude more efficient than existing codes. For example, on the Netflix Prize data set, prior art computes rating predictions in approximately 4 hours, while Jellyfish solves the same problem in under 3 minutes on a 12 core workstation.
A ThreeWay Model for Collective Learning on MultiRelational Data
"... Relational learning is becoming increasingly important in many areas of application. Here, we present a novel approach to relational learning based on the factorization of a threeway tensor. We show that unlike other tensor approaches, our method is able to perform collective learning via the laten ..."
Abstract

Cited by 67 (15 self)
 Add to MetaCart
Relational learning is becoming increasingly important in many areas of application. Here, we present a novel approach to relational learning based on the factorization of a threeway tensor. We show that unlike other tensor approaches, our method is able to perform collective learning via the latent components of the model and provide an efficient algorithm to compute the factorization. We substantiate our theoretical considerations regarding the collective learning capabilities of our model by the means of experiments on both a new dataset and a dataset commonly used in entity resolution. Furthermore, we show on common benchmark datasets that our approach achieves better or onpar results, if compared to current stateoftheart relational learning solutions, while it is significantly faster to compute. 1.
Scalable tensor decompositions for multiaspect data mining
 In ICDM 2008: Proceedings of the 8th IEEE International Conference on Data Mining
, 2008
"... Modern applications such as Internet traffic, telecommunication records, and largescale social networks generate massive amounts of data with multiple aspects and high dimensionalities. Tensors (i.e., multiway arrays) provide a natural representation for such data. Consequently, tensor decompositi ..."
Abstract

Cited by 64 (2 self)
 Add to MetaCart
(Show Context)
Modern applications such as Internet traffic, telecommunication records, and largescale social networks generate massive amounts of data with multiple aspects and high dimensionalities. Tensors (i.e., multiway arrays) provide a natural representation for such data. Consequently, tensor decompositions such as Tucker become important tools for summarization and analysis. One major challenge is how to deal with highdimensional, sparse data. In other words, how do we compute decompositions of tensors where most of the entries of the tensor are zero. Specialized techniques are needed for computing the Tucker decompositions for sparse tensors because standard algorithms do not account for the sparsity of the data. As a result, a surprising phenomenon is observed by practitioners: Despite the fact that there is enough memory to store both the input tensors and the factorized output tensors, memory overflows occur during the tensor factorization process. To address this intermediate blowup problem, we propose MemoryEfficient Tucker (MET). Based on the available memory, MET adaptively selects the right execution strategy during the decomposition. We provide quantitative and qualitative evaluation of MET on real tensors. It achieves over 1000X space reduction without sacrificing speed; it also allows us to work with much larger tensors that were too big to handle before. Finally, we demonstrate a data mining casestudy using MET. 1