Results 1 - 10
of
134
Indexing by latent semantic analysis
- JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE
, 1990
"... A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries. The p ..."
Abstract
-
Cited by 2168 (30 self)
- Add to MetaCart
A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term by document matrix is decomposed into a set of ca. 100 or-thogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca. 100 item vectors of factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are re-turned. initial tests find this completely automatic method for retrieval to be promising.
Tensor Decompositions and Applications
- SIAM REVIEW
, 2009
"... This survey provides an overview of higher-order tensor decompositions, their applications, and available software. A tensor is a multidimensional or N -way array. Decompositions of higher-order tensors (i.e., N -way arrays with N ⥠3) have applications in psychometrics, chemometrics, signal proce ..."
Abstract
-
Cited by 95 (13 self)
- Add to MetaCart
This survey provides an overview of higher-order tensor decompositions, their applications, and available software. A tensor is a multidimensional or N -way array. Decompositions of higher-order tensors (i.e., N -way arrays with N ⥠3) have applications in psychometrics, chemometrics, signal processing, numerical linear algebra, computer vision, numerical analysis, data mining, neuroscience, graph analysis, etc. Two particular tensor decompositions can be considered to be higher-order extensions of the matrix singular value decompo-
sition: CANDECOMP/PARAFAC (CP) decomposes a tensor as a sum of rank-one tensors, and the Tucker decomposition is a higher-order form of principal components analysis. There are many other tensor decompositions, including INDSCAL, PARAFAC2, CANDELINC, DEDICOM, and PARATUCK2 as well as nonnegative variants of all of the above. The N-way Toolbox and Tensor Toolbox, both for MATLAB, and the Multilinear Engine are examples of software packages for working with tensors.
Non-negative tensor factorization with applications to statistics and computer vision
- In Proceedings of the International Conference on Machine Learning (ICML
, 2005
"... We derive algorithms for finding a nonnegative n-dimensional tensor factorization (n-NTF) which includes the non-negative matrix factorization (NMF) as a particular case when n = 2. We motivate the use of n-NTF in three areas of data analysis: (i) connection to latent class models in statistics, (ii ..."
Abstract
-
Cited by 52 (4 self)
- Add to MetaCart
We derive algorithms for finding a nonnegative n-dimensional tensor factorization (n-NTF) which includes the non-negative matrix factorization (NMF) as a particular case when n = 2. We motivate the use of n-NTF in three areas of data analysis: (i) connection to latent class models in statistics, (ii) sparse image coding in computer vision, and (iii) model selection problems. We derive a ”direct ” positive-preserving gradient descent algorithm and an alternating scheme based on repeated multiple rank-1 problems. 1.
Beyond streams and graphs: Dynamic tensor analysis
- In KDD
, 2006
"... How do we find patterns in author-keyword associations, evolving over time? Or in DataCubes, with product-branchcustomer sales information? Matrix decompositions, like principal component analysis (PCA) and variants, are invaluable tools for mining, dimensionality reduction, feature selection, rule ..."
Abstract
-
Cited by 47 (11 self)
- Add to MetaCart
How do we find patterns in author-keyword associations, evolving over time? Or in DataCubes, with product-branchcustomer sales information? Matrix decompositions, like principal component analysis (PCA) and variants, are invaluable tools for mining, dimensionality reduction, feature selection, rule identification in numerous settings like streaming data, text, graphs, social networks and many more. However, they have only two orders, like author and keyword, in the above example. We propose to envision such higher order data as tensors, and tap the vast literature on the topic. However, these methods do not necessarily scale up, let alone operate on semi-infinite streams. Thus, we introduce the dynamic tensor analysis (DTA) method, and its variants. DTA provides a compact summary for high-order and high-dimensional data, and it also reveals the hidden correlations. Algorithmically, we designed DTA very carefully so that it is (a) scalable, (b) space efficient (it does not need to store the past) and (c) fully automatic with no need for user defined parameters. Moreover, we propose STA, a streaming tensor analysis method, which provides a fast, streaming approximation to DTA. We implemented all our methods, and applied them in two real settings, namely, anomaly detection and multi-way latent semantic indexing. We used two real, large datasets, one on network flow data (100GB over 1 month) and one from DBLP (200MB over 25 years). Our experiments show that our methods are fast, accurate and that they find interesting patterns and outliers on the real datasets. 1.
Parallel Factor Analysis in Sensor Array Processing
- IEEE TRANS. SIGNAL PROCESSING
, 2000
"... This paper links multiple invariance sensor array processing (MI-SAP) to parallel factor (PARAFAC) analysis, which is a tool rooted in psychometrics and chemometrics. PARAFAC is a common name for low-rank decomposition of three- and higher way arrays. This link facilitates the derivation of power ..."
Abstract
-
Cited by 41 (13 self)
- Add to MetaCart
This paper links multiple invariance sensor array processing (MI-SAP) to parallel factor (PARAFAC) analysis, which is a tool rooted in psychometrics and chemometrics. PARAFAC is a common name for low-rank decomposition of three- and higher way arrays. This link facilitates the derivation of powerful identifiability results for MI-SAP, shows that the uniqueness of single- and multiple-invariance ESPRIT stems from uniqueness of low-rank decomposition of three-way arrays, and allows tapping on the available expertise for fitting the PARAFAC model. The results are applicable to both data-domain and subspace MI-SAP formulations. The paper also includes a constructive uniqueness proof for a special PARAFAC model.
Higher-Order Web Link Analysis Using Multilinear Algebra
- IEEE INTERNATIONAL CONFERENCE ON DATA MINING
, 2005
"... Linear algebra is a powerful and proven tool in web search. Techniques, such as the PageRank algorithm of Brin and Page and the HITS algorithm of Kleinberg, score web pages based on the principal eigenvector (or singular vector) of a particular non-negative matrix that captures the hyperlink structu ..."
Abstract
-
Cited by 37 (16 self)
- Add to MetaCart
Linear algebra is a powerful and proven tool in web search. Techniques, such as the PageRank algorithm of Brin and Page and the HITS algorithm of Kleinberg, score web pages based on the principal eigenvector (or singular vector) of a particular non-negative matrix that captures the hyperlink structure of the web graph. We propose and test a new methodology that uses multilinear algebra to elicit more information from a higher-order representation of the hyperlink graph. We start by labeling the edges in our graph with the anchor text of the hyperlinks so that the associated linear algebra representation is a sparse, three-way tensor. The first two dimensions of the tensor represent the web pages while the third dimension adds the anchor text. We then use the rank-1 factors of a multilinear PARAFAC tensor decomposition, which are akin to singular vectors of the SVD, to automatically identify topics in the collection along with the associated authoritative web pages.
Blind PARAFAC receivers for DS-CDMA systems
- IEEE TRANS. SIGNAL PROCESSING
, 2000
"... This paper links the direct-sequence code-division multiple access (DS-CDMA) multiuser separation-equalization-detection problem to the parallel factor (PARAFAC) model, which is an analysis tool rooted in psychometrics and chemometrics. Exploiting this link, it derives a deterministic blind PARAFAC ..."
Abstract
-
Cited by 35 (12 self)
- Add to MetaCart
This paper links the direct-sequence code-division multiple access (DS-CDMA) multiuser separation-equalization-detection problem to the parallel factor (PARAFAC) model, which is an analysis tool rooted in psychometrics and chemometrics. Exploiting this link, it derives a deterministic blind PARAFAC DS-CDMA receiver with performance close to nonblind minimum mean-squared error (MMSE). The proposed PARAFAC receiver capitalizes on code, spatial, and temporal diversity-combining, thereby supporting small sample sizes, more users than sensors, and/or less spreading than users. Interestingly, PARAFAC does not require knowledge of spreading codes, the specifics of multipath (interchip interference), DOA-calibration information, finite alphabet/constant modulus, or statistical independence/whiteness to recover the information-bearing signals. Instead, PARAFAC relies on a fundamental result regarding the uniqueness of low-rank three-way array decomposition due to Kruskal (and generalized herein to the complex-valued case) that guarantees identifiability of all relevant signals and propagation parameters. These and other issues are also demonstrated in pertinent simulation experiments.
Sparse image coding using a 3D non-negative tensor factorization
- In: International Conference of Computer Vision (ICCV
, 2005
"... We introduce an algorithm for a non-negative 3D tensor factorization for the purpose of establishing a local parts feature decomposition from an object class of images. In the past such a decomposition was obtained using nonnegative matrix factorization (NMF) where images were vectorized before bein ..."
Abstract
-
Cited by 35 (2 self)
- Add to MetaCart
We introduce an algorithm for a non-negative 3D tensor factorization for the purpose of establishing a local parts feature decomposition from an object class of images. In the past such a decomposition was obtained using nonnegative matrix factorization (NMF) where images were vectorized before being factored by NMF. A tensor factorization (NTF) on the other hand preserves the 2D representations of images and provides a unique factorization (unlike NMF which is not unique). The resulting ”factors” from the NTF factorization are both sparse (like with NMF) but also separable allowing efficient convolution with the test image. Results show a superior decomposition to what an NMF can provide on all fronts — degree of sparsity, lack of ghost residue due to invariant parts and efficiency of coding of around an order of magnitude better. Experiments on using the local parts decomposition for face detection using SVM and Adaboost classifiers demonstrate that the recovered features are discriminatory and highly effective for classification. 1.
From frequency to meaning : Vector space models of semantics
- Journal of Artificial Intelligence Research
, 2010
"... Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are begi ..."
Abstract
-
Cited by 34 (0 self)
- Add to MetaCart
Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term–document, word–context, and pair–pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field. 1.
Symmetric tensors and symmetric tensor rank
- Scientific Computing and Computational Mathematics (SCCM
, 2006
"... Abstract. A symmetric tensor is a higher order generalization of a symmetric matrix. In this paper, we study various properties of symmetric tensors in relation to a decomposition into a symmetric sum of outer product of vectors. A rank-1 order-k tensor is the outer product of k non-zero vectors. An ..."
Abstract
-
Cited by 33 (18 self)
- Add to MetaCart
Abstract. A symmetric tensor is a higher order generalization of a symmetric matrix. In this paper, we study various properties of symmetric tensors in relation to a decomposition into a symmetric sum of outer product of vectors. A rank-1 order-k tensor is the outer product of k non-zero vectors. Any symmetric tensor can be decomposed into a linear combination of rank-1 tensors, each of them being symmetric or not. The rank of a symmetric tensor is the minimal number of rank-1 tensors that is necessary to reconstruct it. The symmetric rank is obtained when the constituting rank-1 tensors are imposed to be themselves symmetric. It is shown that rank and symmetric rank are equal in a number of cases, and that they always exist in an algebraically closed field. We will discuss the notion of the generic symmetric rank, which, due to the work of Alexander and Hirschowitz, is now known for any values of dimension and order. We will also show that the set of symmetric tensors of symmetric rank at most r is not closed, unless r = 1. Key words. Tensors, multiway arrays, outer product decomposition, symmetric outer product decomposition, candecomp, parafac, tensor rank, symmetric rank, symmetric tensor rank, generic symmetric rank, maximal symmetric rank, quantics AMS subject classifications. 15A03, 15A21, 15A72, 15A69, 15A18 1. Introduction. We

