Probabilistic models for incomplete multi-dimensional arrays (2009)

by W. Chu, Z. Ghahramani
Results 1 - 10 of 30

Personalized Recommendation on Dynamic Content Using Predictive Bilinear Models

by Wei Chu, Seung-Taek Park - WWW 2009 MADRID! TRACK: SOCIAL NETWORKS AND WEB 2.0 / SESSION: RECOMMENDER SYSTEMS , 2009
Abstract - Cited by 54 (3 self)
In Web-based services of dynamic content (such as news articles), recommender systems face the difficulty of identifying high-quality new items in a timely fashion and providing recommendations for new users. We propose a feature-based machine learning approach to personalized recommendation that handles the cold-start issue effectively. We maintain profiles of content of interest, in which temporal characteristics of the content, e.g. popularity and freshness, are updated in real time. We also maintain profiles of users, including demographic information and a summary of user activities within Yahoo! properties. Based on all features in the user and content profiles, we develop predictive bilinear regression models to provide accurate personalized recommendations of new items for both existing and new users. This approach yields an offline model with light computational overhead compared with other recommender systems that require online re-training. The proposed framework is general and flexible enough for other personalization tasks. The superior performance of our approach is verified on a large-scale data set collected from the Today Module on the Yahoo! Front Page, in comparison against six competitive approaches.
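The bilinear regression at the core of this approach can be sketched in a few lines. This is a minimal illustration, not the paper's production system: the names, dimensions, and the ridge-regression fit are all assumptions. It shows why a bilinear score s(u, i) = x_u^T W x_i is just a linear model in vec(W), so W can be estimated offline in closed form and new users or items scored from their features alone:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, du, di = 50, 40, 5, 4

Xu = rng.normal(size=(n_users, du))   # user feature profiles
Xi = rng.normal(size=(n_items, di))   # item (content) feature profiles
W_true = rng.normal(size=(du, di))    # unknown bilinear weights

# Observed responses for random (user, item) pairs, with noise.
pairs = rng.integers(0, [n_users, n_items], size=(500, 2))
y = np.array([Xu[u] @ W_true @ Xi[i] for u, i in pairs])
y += 0.1 * rng.normal(size=y.shape)

# s(u, i) = x_u^T W x_i is linear in vec(W): each observation's design row
# is vec(x_u x_i^T), so ridge regression recovers W in closed form.
Z = np.stack([np.outer(Xu[u], Xi[i]).ravel() for u, i in pairs])
lam = 1e-2
w = np.linalg.solve(Z.T @ Z + lam * np.eye(du * di), Z.T @ y)
W_hat = w.reshape(du, di)

# New users and items are scored via their features alone (cold start).
print(np.abs(W_hat - W_true).max())   # recovery error
```

Because the fitted W depends only on feature vectors, no per-user or per-item retraining is needed when new entities arrive.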

Citation Context

...een widely applied in machine learning applications. For example, Tenenbaum and Freeman [39] developed a bilinear model for separating “style” and “content” in images, and recently Chu and Ghahramani [11] derived a probabilistic framework of the Tucker family for modeling structural dependency from partially observed high-dimensional array data. We define an indicator CX as a bilinear function of x...

Modelling Relational Data using Bayesian Clustered Tensor Factorization

by Ilya Sutskever, Ruslan Salakhutdinov, Joshua B. Tenenbaum
Abstract - Cited by 42 (2 self)
We consider the problem of learning probabilistic models for complex relational structures between various types of objects. A model can help us “understand” a dataset of relational facts in at least two ways: by finding interpretable structure in the data, and by supporting predictions, or inferences, about whether particular unobserved relations are likely to be true. Often there is a tradeoff between these two aims: cluster-based models yield more easily interpretable representations, while factorization-based approaches have given better predictive performance on large data sets. We introduce the Bayesian Clustered Tensor Factorization (BCTF) model, which embeds a factorized representation of relations in a nonparametric Bayesian clustering framework. Inference is fully Bayesian but scales well to large data sets. The model simultaneously discovers interpretable clusters and yields predictive performance that matches or beats previous probabilistic models for relational data.
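The factorization half of such a relational model can be illustrated with a plain CP (CANDECOMP/PARAFAC) decomposition fit by alternating least squares; this is an assumed toy setup with neither the clustering prior nor the Bayesian inference of BCTF:

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, K, R = 8, 7, 3, 2
A0, B0, C0 = (rng.normal(size=(n, R)) for n in (I, J, K))
T = np.einsum('ir,jr,kr->ijk', A0, B0, C0)   # ground-truth rank-R tensor

A, B, C = (rng.normal(size=(n, R)) for n in (I, J, K))
for _ in range(500):
    # Each factor update is a least-squares solve against an unfolding of T.
    KR = np.einsum('jr,kr->jkr', B, C).reshape(J * K, R)
    A = T.reshape(I, J * K) @ KR @ np.linalg.pinv(KR.T @ KR)
    KR = np.einsum('ir,kr->ikr', A, C).reshape(I * K, R)
    B = T.transpose(1, 0, 2).reshape(J, I * K) @ KR @ np.linalg.pinv(KR.T @ KR)
    KR = np.einsum('ir,jr->ijr', A, B).reshape(I * J, R)
    C = T.transpose(2, 0, 1).reshape(K, I * J) @ KR @ np.linalg.pinv(KR.T @ KR)

T_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
rel = np.linalg.norm(T_hat - T) / np.linalg.norm(T)
print(rel)   # relative reconstruction error
```

Each object's factor row is the distributed representation the abstract refers to; a clustered model would additionally group those rows.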

Citation Context

...ith noise models or multiple clusterings), which are currently less practical due to the computational difficulty of their inference problems [7, 6, 9]. Models based on matrix or tensor factorization [18, 19, 3] have the potential of making better predictions than interpretable models of similar complexity, as we demonstrate in our experimental results section. Factorization models learn a distributed repres...

A latent factor model for highly multi-relational data

by Rodolphe Jenatton, Antoine Bordes, Nicolas Le Roux, Guillaume Obozinski
Abstract - Cited by 31 (4 self)
Many data, such as social networks, movie preferences, or knowledge bases, are multi-relational, in that they describe multiple relations between entities. While there is a large body of work focused on modeling these data, modeling these multiple types of relations jointly remains challenging. Further, existing approaches tend to break down when the number of relation types grows. In this paper, we propose a method for modeling large multi-relational datasets with possibly thousands of relations. Our model is based on a bilinear structure, which captures various orders of interaction in the data and shares sparse latent factors across different relations. We illustrate the performance of our approach on standard tensor-factorization datasets, where we attain or outperform state-of-the-art results. Finally, an NLP application demonstrates our scalability and the ability of our model to learn efficient and semantically meaningful verb representations.
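A minimal sketch of this kind of bilinear relational scoring (an assumed RESCAL-style form, not necessarily the paper's exact parameterization): each entity gets an embedding, each relation a matrix, and a triple is scored bilinearly, trained here with plain SGD on a squared loss:

```python
import numpy as np

rng = np.random.default_rng(2)
n_ent, n_rel, d = 20, 5, 4
E = rng.normal(scale=0.5, size=(n_ent, d))      # entity embeddings
R = rng.normal(scale=0.5, size=(n_rel, d, d))   # one matrix per relation

def score(i, r, j):
    # Bilinear score of the triple (entity i, relation r, entity j).
    return E[i] @ R[r] @ E[j]

def sgd_step(i, r, j, y, lr=0.05):
    # One SGD step on the squared loss 0.5 * (score - y)^2.
    err = score(i, r, j) - y
    gE_i = err * (R[r] @ E[j])
    gE_j = err * (R[r].T @ E[i])
    R[r] -= lr * err * np.outer(E[i], E[j])
    E[i] -= lr * gE_i
    E[j] -= lr * gE_j

before = score(0, 1, 2)
for _ in range(200):
    sgd_step(0, 1, 2, y=1.0)
print(before, score(0, 1, 2))   # the score moves toward the observed label
```

Sharing the entity embeddings E across all relations is what lets evidence from one relation inform predictions in another.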

Citation Context

... approach, which inherently induces some sharing of parameters between both different terms and different relations, has been applied successfully [8] and has inspired some probabilistic formulations [4]. Another natural extension to learning several relations simultaneously can be to share the common embedding of the entities across relations via collective matrix factorization, as proposed in RESCAL...

Infinite Tucker Decomposition: Nonparametric Bayesian Models for Multiway Data Analysis

by Zenglin Xu, Feng Yan, Yuan Qi - In Proceedings of the International Conference on Machine Learning (ICML , 2012
Abstract - Cited by 14 (2 self)
Tensor decomposition is a powerful computational tool for multiway data analysis. Many popular tensor decomposition approaches, such as the Tucker decomposition and CANDECOMP/PARAFAC (CP), amount to multi-linear factorization. They are insufficient to model (i) complex interactions between data entities, (ii) various data types (e.g. missing data and binary data), and (iii) noisy observations and outliers. To address these issues, we propose tensor-variate latent nonparametric Bayesian models, coupled with efficient inference methods, for multiway data analysis. We name these models InfTucker. Using these InfTucker models, we conduct Tucker decomposition in an infinite feature space. Unlike classical tensor decomposition models, our new approaches handle both continuous and binary data in a probabilistic framework. Unlike previous Bayesian models on matrices and tensors, our models are based on latent Gaussian or t processes with nonlinear covariance functions. To efficiently learn the InfTucker models from data, we develop a variational inference technique on tensors. Compared with a classical implementation, the new technique reduces both time and space complexity by several orders of magnitude. Our experimental results on chemometrics and social network datasets demonstrate that our new models achieve significantly higher prediction accuracy than state-of-the-art tensor decomposition approaches.
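For contrast with the infinite-feature-space version, the classical finite Tucker decomposition can be computed by higher-order SVD (HOSVD). This sketch assumes a tensor with exactly low multilinear rank, so the decomposition is exact; all names and sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
I, J, K = 6, 5, 4
r1, r2, r3 = 3, 3, 2

# Build a tensor with exact multilinear rank (r1, r2, r3).
G0 = rng.normal(size=(r1, r2, r3))
U_true = [np.linalg.qr(rng.normal(size=(n, r)))[0]
          for n, r in ((I, r1), (J, r2), (K, r3))]
T = np.einsum('abc,ia,jb,kc->ijk', G0, *U_true)

def unfold(X, mode):
    # Mode-n unfolding: move the chosen axis first, flatten the rest.
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

# HOSVD: factor matrices from the unfoldings' leading singular vectors,
# core tensor by projecting T onto them.
Us = [np.linalg.svd(unfold(T, m), full_matrices=False)[0][:, :r]
      for m, r in ((0, r1), (1, r2), (2, r3))]
G = np.einsum('ijk,ia,jb,kc->abc', T, *Us)
T_hat = np.einsum('abc,ia,jb,kc->ijk', G, *Us)

print(np.linalg.norm(T_hat - T))   # ~0 for an exactly low-multilinear-rank tensor
```

InfTucker, as described above, replaces the finite factor matrices with latent (Gaussian or t) processes, which is what moves the decomposition into an infinite feature space.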

Theoretical analysis of Bayesian matrix factorization

by Shinichi Nakajima, Masashi Sugiyama, Inderjit Dhillon - Journal of Machine Learning Research
Abstract - Cited by 9 (6 self)
Recently, variational Bayesian (VB) techniques have been applied to probabilistic matrix factorization and shown to perform very well in experiments. In this paper, we theoretically elucidate properties of the VB matrix factorization (VBMF) method. Through finite-sample analysis of the VBMF estimator, we show that two types of shrinkage factors exist in the VBMF estimator: positive-part James-Stein (PJS) shrinkage and trace-norm shrinkage, both acting on each singular component separately to produce low-rank solutions. The trace-norm shrinkage is simply induced by non-flat prior information, similarly to the maximum a posteriori (MAP) approach. Thus, no trace-norm shrinkage remains when priors are non-informative. On the other hand, we show the counter-intuitive fact that the PJS shrinkage factor remains active even with flat priors. This is shown to be induced by the non-identifiability of the matrix factorization model, that is, the mapping between the target matrix and the factorized matrices is not one-to-one. We call this model-induced regularization. We further extend our analysis to empirical Bayes scenarios where hyperparameters are also learned based on the VB free energy. Throughout the paper, we assume no missing entries in the observed matrix, and therefore collaborative filtering is out of scope.
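The two shrinkage mechanisms can be illustrated schematically on the singular values of an observed matrix. The formulas below are illustrative stand-ins, not the paper's exact VBMF estimator; `tau` and `c` are arbitrary parameters chosen for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(4)
V = rng.normal(size=(30, 20))                 # observed matrix
U, s, Vt = np.linalg.svd(V, full_matrices=False)

tau, c = 2.0, 4.0                             # arbitrary illustrative parameters

# Trace-norm-style shrinkage: soft-threshold the singular values.
s_trace = np.maximum(s - tau, 0.0)
V_trace = U @ np.diag(s_trace) @ Vt

# Positive-part James-Stein-style shrinkage: rescale each component by a
# data-dependent factor clipped at zero.
s_pjs = np.maximum(1.0 - c / s**2, 0.0) * s
V_pjs = U @ np.diag(s_pjs) @ Vt

# Both act on each singular component separately and zero out the small ones,
# which is what produces low-rank solutions.
print(np.linalg.matrix_rank(V_trace), np.linalg.matrix_rank(V_pjs))
```

The point of the paper's analysis is that the PJS-type factor survives even with flat priors, so some rank reduction happens purely through model-induced regularization.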

Citation Context

...l data analysis tool (Cichocki et al., 2009). Among various methods, Bayesian methods of tensor factorization have been shown to be promising (Tao et al., 2008; Yu et al., 2008; Hayashi et al., 2009; Chu and Ghahramani, 2009). In our future work, we will elucidate the behavior of tensor factorization methods based on a similar line of discussion to the current work. Acknowledgments We would like to thank anonymous review...

A case study of behavior-driven conjoint analysis on Yahoo! Front Page Today Module

by Wei Chu, Seung-taek Park, Todd Beaupre, Nitin Motgi, Amit Phadke, Seinjuti Chakraborty, Joe Zachariah - In Proc. of KDD , 2009
Abstract - Cited by 9 (1 self)
Conjoint analysis is one of the most popular market research methodologies for assessing how customers with heterogeneous preferences appraise various objective characteristics in products or services, which provides critical inputs for many marketing decisions, e.g. optimal design of new products and target market selection. Nowadays it is practical in e-commerce applications to collect millions of samples quickly. However, large-scale data sets make traditional conjoint analysis, coupled with sophisticated Monte Carlo simulation for parameter estimation, computationally prohibitive. In this paper, we report a successful large-scale case study of conjoint analysis on a click-through stream in a real-world application at Yahoo!. We consider identifying users' heterogeneous preferences from millions of click/view events and building predictive models to classify new users into segments of distinct behavior patterns. A scalable conjoint analysis technique, known as tensor segmentation, is developed by utilizing logistic tensor regression in the standard partworth framework. In offline analysis on samples collected from a random bucket of the Yahoo! Front Page Today Module, we compare tensor segmentation against other segmentation schemes using demographic information, and study user preferences on article content within tensor segments. The knowledge acquired from the segmentation results also assists editors in content management and user targeting. The usefulness of our approach is further verified by observations in a bucket test launched ...

Citation Context

... extensively studied in literature and applications. For example, Tenenbaum and Freeman [18] developed a bilinear model for separating “style” and “content” in images, and recently Chu and Ghahramani [3] derived a probabilistic framework of the Tucker family for modeling structural dependency from partially observed high-dimensional array data. The tensor indicator is closely related to the traditiona...

Tensor Factorization Using Auxiliary Information

by Atsuhiro Narita, Kohei Hayashi, Ryota Tomioka, Hisashi Kashima
Abstract - Cited by 6 (0 self)
Abstract. Most existing analysis methods for tensors (or multi-way arrays) assume only that the tensors to be completed are of low rank. However, when they are applied to tensor completion problems, their prediction accuracy tends to be significantly worse when only limited entries are observed. In this paper, we propose to use relationships among data as auxiliary information, in addition to the low-rank assumption, to improve the quality of tensor decomposition. We introduce two regularization approaches using graph Laplacians induced from the relationships, and design iterative algorithms for approximate solutions. Numerical experiments on tensor completion using synthetic and benchmark datasets show that the use of auxiliary information improves completion accuracy over existing methods based only on the low-rank assumption, especially when observations are sparse.
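The graph-Laplacian regularization idea can be sketched in the simpler matrix case (an assumed setup, not the paper's exact algorithm): the penalty tr(U^T L U) pulls the factors of related entities together, on top of the usual low-rank reconstruction loss:

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, r = 12, 10, 2
A = (rng.random((n, n)) < 0.3).astype(float)
A = np.triu(A, 1); A = A + A.T                # symmetric adjacency, no self-loops
L = np.diag(A.sum(1)) - A                     # combinatorial graph Laplacian

X = rng.normal(size=(n, m))                   # matrix to factorize as U V^T
U = rng.normal(size=(n, r))
V = rng.normal(size=(m, r))
err0 = np.linalg.norm(U @ V.T - X)

lam, lr = 0.5, 0.01
for _ in range(300):
    E = U @ V.T - X
    # Gradients of 0.5*||E||^2 + 0.5*lam*tr(U^T L U); note that tr(U^T L U)
    # = 0.5 * sum_ij A_ij ||u_i - u_j||^2, so related rows get similar factors.
    gU = E @ V + lam * L @ U
    gV = E.T @ U
    U -= lr * gU
    V -= lr * gV

print(err0, np.linalg.norm(U @ V.T - X))      # reconstruction error drops
```

With sparse observations the data term alone underdetermines the factors, and the Laplacian term fills the gap using the auxiliary relationships.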

Citation Context

...ere also exist probabilistic extensions of tensor factorization methods. Shashua and Hazan [17] studied the PARAFAC model under the non-negativity constraint with latent variables. Chu and Ghahramani [4] proposed a probabilistic extension of the Tucker method, known as pTucker. Although we focus on the squared loss function (3) in this paper, changing the loss function corresponds to non-Gaussian pro...

Scalable nonparametric multiway data analysis

by Shandian Zhe, Zenglin Xu, Xinqi Chu, Yuan Qi, Youngja Park - In International Conference on Artificial Intelligence and Statistics, 2015
Abstract - Cited by 5 (1 self)
Abstract. Multiway data analysis deals with multiway arrays, i.e., tensors, and the goal is twofold: predicting missing entries by modeling the interactions between array elements, and discovering hidden patterns, such as clusters or communities, in each mode. Despite the success of existing tensor factorization approaches, they are either unable to capture nonlinear interactions or computationally too expensive to handle massive data. In addition, most existing methods lack a principled way to discover latent clusters, which is important for a better understanding of the data. To address these issues, we propose a scalable nonparametric tensor decomposition model. It employs a Dirichlet process mixture (DPM) prior to model the latent clusters, and it uses local Gaussian processes (GPs) to capture nonlinear relationships and to improve scalability. An efficient online variational Bayes expectation-maximization algorithm is proposed to learn the model. Experiments on both synthetic and real-world data show that the proposed model is able to discover latent clusters with higher prediction accuracy than competitive methods. Furthermore, the proposed model obtains significantly better predictive performance than the state-of-the-art large-scale tensor decomposition algorithm, GigaTensor, on two large datasets with billions of entries.
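The GP ingredient alone can be sketched as follows (illustrative only; the full model adds the DPM clustering and online variational inference). An RBF-kernel GP posterior mean captures a nonlinear response that multilinear factorization cannot express:

```python
import numpy as np

rng = np.random.default_rng(6)

def rbf(X, Z, ls=1.0):
    # Squared-exponential (RBF) kernel between two sets of inputs.
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

X = rng.uniform(-2, 2, size=(40, 1))          # latent inputs (e.g. one cluster)
y = np.sin(2 * X[:, 0])                       # a nonlinear target
Xs = np.array([[0.5]])                        # query point

K = rbf(X, X) + 1e-6 * np.eye(len(X))         # jitter for numerical stability
mean = rbf(Xs, X) @ np.linalg.solve(K, y)     # GP posterior mean at Xs
print(mean[0])                                # close to sin(1.0)
```

Restricting each GP to a local cluster of points, as the model above does, is what keeps the cubic-in-data-size GP cost manageable.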

Citation Context

...pture complex interactions between the array elements and predict the missing entries (e.g., unknown drug response); furthermore, we want to discover the hidden patterns embedded in the data, such as clusters or communities of the nodes or objects in each mode (e.g., groups of abnormal users who may threaten system security, and sets of people with identical characteristics for personalized medicine development). A number of approaches have been proposed for multiway data analysis, such as CANDECOMP/PARAFAC (CP) (Harshman, 1970), Tucker decomposition (Tucker, 1966) and its generalization (Chu and Ghahramani, 2009), and infinite Tucker decomposition (InfTucker) (Xu et al., 2012). Although very useful, these approaches have their own limitations. For example, the popular multilinear factorization methods, such as PARAFAC and Tucker decomposition, cannot capture the nonlinear relationships between array elements; and although nonparametric models, such as InfTucker, can model nonlinear relationships with latent Gaussian processes (GPs), they suffer a prohibitively high training cost and cannot handle massive data in real applications. Besides, most of them lack a principled way to discover the latent clusters...

Tensor Analyzers

by Yichuan Tang, Ruslan Salakhutdinov, Geoffrey Hinton , 2013
Abstract - Cited by 5 (0 self)
Factor Analysis is a statistical method that seeks to explain linear variations in data by using unobserved latent variables. Due to its additive nature, it is not suitable for modeling data that is generated by multiple groups of latent factors which interact multiplicatively. In this paper, we introduce Tensor Analyzers, which are a multilinear generalization of Factor Analyzers. We describe an efficient way of sampling from the posterior distribution over factor values, and we demonstrate that these samples can be used in the EM algorithm for learning interesting mixture models of natural image patches. Tensor Analyzers can also accurately recognize a face under significant pose and illumination variations when given only one previous image of that face. We also show that Tensor Analyzers can be trained in unsupervised, semi-supervised, or fully supervised settings.
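The multiplicative interaction that motivates the model can be sketched with an assumed toy parameterization (not the paper's exact generative process): two groups of factors interact through a three-way tensor, and fixing one group makes the model linear in the other, i.e. it reduces to ordinary factor analysis:

```python
import numpy as np

rng = np.random.default_rng(7)
D, p, q = 16, 3, 2                      # observed dim; sizes of the two factor groups
T = rng.normal(size=(D, p, q))          # three-way interaction tensor

def sample(n, sigma=0.1):
    f = rng.normal(size=(n, p))         # first factor group (e.g. "style")
    g = rng.normal(size=(n, q))         # second factor group (e.g. "content")
    X = np.einsum('dpq,np,nq->nd', T, f, g)
    return X + sigma * rng.normal(size=X.shape), f, g

X, f, g = sample(500)

# With f known, x = W g + noise where W is T contracted against f: the model
# is linear in the remaining factor group, which additive FA cannot mimic
# for both groups at once.
W = np.einsum('dpq,p->dq', T, f[0])
print(X.shape, W.shape, np.abs(W @ g[0] - X[0]).max())
```

This conditional linearity is also what makes alternating or sampling-based posterior inference over the factor groups tractable.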

Citation Context

...osition methods, including CP: CANDECOMP/PARAFAC (Carroll & Chang, 1970); NTF: Nonnegative Tensor Factorization (Shashua & Hazan, 2005); Tucker (Tucker, 1963); and Probabilistic Tucker decomposition (Chu & Ghahramani, 2009). For CP, Tucker, and NTF, we used the N-way toolbox (Bro, 1998). For the S&C bilinear method, we implemented the exact algorithm as stated in Sec. 3.2 of (Tenenbaum & Freeman, 2000). The style and c...

Leveraging Features and Networks for Probabilistic Tensor Decomposition

by Piyush Rai, Yingjian Wang, Lawrence Carin
Abstract - Cited by 4 (1 self)
We present a probabilistic model for tensor decomposition where one or more tensor modes may have side information about the mode entities in the form of their features and/or their adjacency network. We consider a Bayesian approach based on the Canonical PARAFAC (CP) decomposition and enrich this single-layer decomposition approach with a two-layer decomposition. The second layer fits a factor model for each layer-one factor matrix and models the factor matrix via the mode entities' features and/or the network between the mode entities. The second-layer decomposition of each factor matrix also learns a binary latent representation for the entities of that mode, which can be useful in its own right. Our model can handle both continuous and binary tensor observations. Another appealing aspect of our model is the simplicity of the model inference, with easy-to-sample Gibbs updates. We demonstrate the results of our model on several benchmark datasets, consisting of both real-valued and binary tensors.

Citation Context

...er 2009) provide an effective way to extract latent factors from such data, and are now routinely used for tensor completion of sparse, incomplete tensors. Probabilistic tensor decomposition methods (Chu and Ghahramani 2009; Xiong et al. 2010; Xu, Yan, and Qi 2013; Rai et al. 2014) are especially appealing because they can deal with diverse data types and missing data in a principled way via a proper generative mod...


Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University