Results 11–20 of 58
Multi-Relational Matrix Factorization using Bayesian Personalized Ranking for Social Network Data
Cited by 10 (1 self)
Abstract: A key element of social networks on the internet such as Facebook and Flickr is that they encourage users to create connections between themselves, other users, and objects. One important task approached in the literature dealing with such data is to use social graphs to predict user behavior (e.g. joining a group of interest). More specifically, we study the cold-start problem, where users participate only in some relations, which we will call social relations, but not in the relation on which the predictions are made, which we will refer to as the target relation. We propose a formalization of the problem and a principled approach to it based on multi-relational factorization techniques. Furthermore, we derive a principled feature extraction scheme to extract predictors from the social data for a classifier on the target relation. Experiments conducted on real-world datasets show that our approach outperforms current methods.
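The paper's multi-relational model is not reproduced here, but the Bayesian Personalized Ranking (BPR) update it builds on can be sketched for a single relation. A minimal version in Python; the function name, hyperparameters, and training setup are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def bpr_mf(triples, n_users, n_items, rank=8, lr=0.05, reg=0.01, epochs=400, seed=0):
    """Single-relation BPR matrix factorization: each (u, i, j) triple says
    user u prefers item i over item j; SGD ascends the log-sigmoid ranking loss."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_users, rank))
    V = 0.1 * rng.standard_normal((n_items, rank))
    for _ in range(epochs):
        for u, i, j in triples:
            x = U[u] @ (V[i] - V[j])      # score margin between i and j
            g = 1.0 / (1.0 + np.exp(x))   # sigma(-x): gradient of ln sigma(x)
            # simultaneous update (RHS evaluated before assignment)
            U[u], V[i], V[j] = (U[u] + lr * (g * (V[i] - V[j]) - reg * U[u]),
                                V[i] + lr * (g * U[u] - reg * V[i]),
                                V[j] + lr * (-g * U[u] - reg * V[j]))
    return U, V
```

After training, preferred items should score above the items they were ranked against for that user.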
Guaranteed matrix completion via non-convex factorization
2014
Cited by 10 (0 self)
Abstract: Matrix factorization is a popular approach for large-scale matrix completion and constitutes a basic component of many solutions to the Netflix Prize competition. In this approach, the unknown low-rank matrix is expressed as the product of two much smaller matrices, so that the low-rank property is automatically fulfilled. The resulting optimization problem, even at huge sizes, can be solved (to stationary points) very efficiently through standard optimization algorithms such as alternating minimization and stochastic gradient descent (SGD). However, due to the non-convexity caused by the factorization model, there is limited theoretical understanding of whether these algorithms generate a good solution. In this paper, we establish a theoretical guarantee that the factorization-based formulation correctly recovers the underlying low-rank matrix. In particular, we show that under conditions similar to those in previous works, many standard optimization algorithms converge to the global optima of the factorization-based formulation, thus recovering the true low-rank matrix. To the best of our knowledge, our result is the first to provide a recovery guarantee for many standard algorithms such as gradient descent, SGD, and block coordinate gradient descent. Our result also applies to alternating minimization; a notable difference from previous studies on alternating minimization is that we do not need the resampling scheme (i.e. using independent samples in each iteration).
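As a reference point (this is not the paper's analysis, only the formulation it studies), the factorized matrix-completion objective can be optimized with plain SGD over observed entries. A minimal sketch in Python; names, step sizes, and the toy data are assumptions chosen for illustration:

```python
import numpy as np

def complete_sgd(observed, shape, rank=2, lr=0.05, reg=0.001, epochs=500, seed=0):
    """Fit M ~= U @ V.T to (i, j, value) triples by SGD on the squared error;
    the low-rank constraint is enforced by the factorization itself."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((shape[0], rank))
    V = 0.1 * rng.standard_normal((shape[1], rank))
    for _ in range(epochs):
        for i, j, v in observed:
            err = v - U[i] @ V[j]
            # simultaneous regularized gradient step on both factors
            U[i], V[j] = (U[i] + lr * (err * V[j] - reg * U[i]),
                          V[j] + lr * (err * U[i] - reg * V[j]))
    return U, V
```

The objective is non-convex in (U, V) jointly, which is exactly why the recovery guarantee discussed in the abstract is nontrivial.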
Dynamic Cognitive Tracing: Towards Unified Discovery of Student and Cognitive Models
In Proceedings of the Fifth International Conference on Educational Data Mining, Chania, 2012 (in press)
Cited by 9 (2 self)
Abstract: This work describes a unified approach to two problems previously addressed separately in Intelligent Tutoring Systems: (i) Cognitive Modeling, which factorizes problem-solving steps into the latent set of skills required to perform them [7]; and (ii) Student Modeling, which infers students' learning by observing student performance [9]. The practical importance of improving our understanding of how students learn is to build better intelligent tutors [8]. The expected advantages of our integrated approach include (i) more accurate prediction of a student's future performance, and (ii) clustering items into skills automatically, without expensive manual expert knowledge annotation. We introduce a unified model, Dynamic Cognitive Tracing, to explain student learning in terms of skill mastery over time, by learning the Cognitive Model and the Student Model jointly. We formulate our approach as a graphical model, and we validate it using sixty different synthetic datasets. Dynamic Cognitive Tracing significantly outperforms single-skill Knowledge Tracing at predicting future student performance.
Nonnegative matrix factorizations as probabilistic inference in composite models
In Proc. 17th European Signal Processing Conference (EUSIPCO'09), 2009
Cited by 7 (0 self)
Abstract: We develop an interpretation of nonnegative matrix factorization (NMF) methods based on Euclidean distance, Kullback-Leibler and Itakura-Saito divergences in a probabilistic framework. We describe how these factorizations are implicit in a well-defined statistical model of superimposed components, either Gaussian or Poisson distributed, and are equivalent to maximum likelihood estimation of either mean, variance or intensity parameters. By treating the components as hidden variables, NMF algorithms can be derived in a typical data-augmentation setting. This setting can in particular accommodate regularization constraints on the matrix factors through Bayesian priors. We describe multiplicative, Expectation-Maximization, Markov chain Monte Carlo and Variational Bayes algorithms for the NMF problem. This paper describes both new and known algorithms in a unified framework and aims at providing statistical insights into NMF.
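For concreteness, the Euclidean-distance case mentioned in the abstract (maximum likelihood estimation of a Gaussian mean, in the paper's taxonomy) is commonly solved with the classical multiplicative updates. A minimal sketch; the function name and iteration budget are illustrative assumptions:

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=1000, eps=1e-9, seed=0):
    """Multiplicative updates for NMF under Euclidean distance:
    minimize ||V - W H||_F^2 with W, H >= 0 elementwise."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + 0.1   # strictly positive init
    H = rng.random((rank, n)) + 0.1
    for _ in range(n_iter):
        # each ratio update keeps the factors nonnegative and never
        # increases the Frobenius cost
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

The multiplicative form arises because the gradient splits into a positive and a negative part; the update scales each entry by their ratio.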
Stability of matrix factorization for collaborative filtering
In ICML, 2012
Cited by 6 (0 self)
Abstract: We study the stability, vis-à-vis adversarial noise, of the matrix factorization algorithm for matrix completion. In particular, our results include: (I) we bound the gap between the solution matrix of the factorization method and the ground truth in terms of root mean square error; (II) we treat matrix factorization as a subspace-fitting problem and analyze the difference between the solution subspace and the ground truth; (III) we analyze the prediction error for individual users based on the subspace stability. We apply these results to the problem of collaborative filtering under manipulator attack, which leads to useful insights and guidelines for collaborative filtering system design.
Sparkler: Supporting large-scale matrix factorization
In EDBT, 2013
Cited by 5 (2 self)
Abstract: Low-rank matrix factorization has recently been applied with great success to matrix completion problems in applications like recommendation systems, link prediction for social networks, and click prediction for web search. However, as this approach is applied to increasingly larger datasets, such as those encountered in web-scale recommender systems like Netflix and Pandora, the data management aspects quickly become challenging and form a roadblock. In this paper, we introduce a system called Sparkler to solve such large instances of low-rank matrix factorization. Sparkler extends Spark, an existing platform for running parallel iterative algorithms on datasets that fit in the aggregate main memory of a cluster. Sparkler supports distributed stochastic gradient descent as an approach to solving the factorization problem, an iterative technique that has been shown to perform very well in practice. We identify the shortfalls of Spark in solving large matrix factorization problems, especially when running on the cloud, and address them by introducing a novel abstraction called "Carousel Maps" (CMs). CMs are well suited to storing large matrices in the aggregate memory of a cluster and can efficiently support the operations performed on them during distributed stochastic gradient descent. We describe the design, implementation, and use of CMs in Sparkler programs. Through a variety of experiments, we demonstrate that Sparkler is faster than Spark by 4x to 21x, with bigger advantages for larger problems. Equally importantly, we show that this can be done without imposing any changes to the ease of programming. We argue that Sparkler provides a convenient and efficient extension to Spark for solving matrix factorization problems on very large datasets.
Fast Bregman Divergence NMF using Taylor Expansion and Coordinate Descent
Cited by 4 (0 self)
Abstract: Nonnegative matrix factorization (NMF) provides a lower-rank approximation of a matrix. Due to the nonnegativity imposed on the factors, it gives a latent structure that is often more physically meaningful than other lower-rank approximations such as the singular value decomposition (SVD). Most of the algorithms proposed in the literature for NMF have been based on minimizing the Frobenius norm. This is partly because the minimization problem based on the Frobenius norm allows much more flexibility in algebraic manipulation than other divergences. In this paper, we propose a fast NMF algorithm that is applicable to general Bregman divergences. Through Taylor series expansion of the Bregman divergences, we reveal a relationship between Bregman divergences and Euclidean distance. This key relationship provides a new direction for NMF algorithms with general Bregman divergences when combined with the scalar block coordinate descent method. The proposed algorithm generalizes several recently proposed methods for computing NMF with Bregman divergences and is computationally faster than existing alternatives. We demonstrate the effectiveness of our approach with experiments conducted on artificial as well as real-world data.
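The Bregman-general algorithm is not reproduced here; in the Euclidean special case, block coordinate descent on one factor component at a time reduces to the well-known HALS scheme, a close relative of the scalar updates the abstract refers to. A sketch under that assumption, with all names and parameters illustrative:

```python
import numpy as np

def hals_nmf(V, rank, n_iter=300, eps=1e-9, seed=0):
    """HALS-style block coordinate descent for min ||V - W H||_F^2, W, H >= 0:
    each row of H (column of W) has a closed-form nonnegative update."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + 0.1
    H = rng.random((rank, n)) + 0.1
    for _ in range(n_iter):
        WtV, WtW = W.T @ V, W.T @ W        # W fixed during the H sweep
        for k in range(rank):
            # exact minimizer over row k of H, projected onto the nonnegative orthant
            H[k] = np.maximum(eps, H[k] + (WtV[k] - WtW[k] @ H) / (WtW[k, k] + eps))
        VHt, HHt = V @ H.T, H @ H.T        # H fixed during the W sweep
        for k in range(rank):
            W[:, k] = np.maximum(eps, W[:, k] + (VHt[:, k] - W @ HHt[:, k]) / (HHt[k, k] + eps))
    return W, H
```

Because each block update is an exact projected minimization rather than a multiplicative rescaling, HALS typically converges in far fewer sweeps than the classical multiplicative rules.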
Classification of sparse time series via supervised matrix factorization
2012
Cited by 4 (4 self)
Abstract: Data sparsity is an emerging real-world problem observed in various domains ranging from sensor networks to medical diagnosis. Consequently, numerous machine learning methods have been designed to treat missing values. Nevertheless, sparsity in the form of missing segments has not been thoroughly investigated in the context of time-series classification. We propose a novel principle for classifying time series which, in contrast to existing approaches, avoids reconstructing the missing segments and operates solely on the observed ones. Based on the proposed principle, we develop a method that avoids introducing the noise incurred by reconstructing the original time series. Our method adapts supervised matrix factorization by projecting time series into a latent space through stochastic learning. Furthermore, the projection is built in a supervised fashion via logistic regression. Abundant experiments on a large collection of 37 datasets demonstrate the superiority of our method, which in the majority of cases outperforms a set of baselines that do not follow our proposed principle.
CoBaFi: Collaborative Bayesian Filtering
In World Wide Web Conference, 2014
Cited by 4 (4 self)
Abstract: Given a large dataset of users' ratings of movies, what is the best model to accurately predict which movies a person will like? And how can we prevent spammers from tricking our algorithms into suggesting a bad movie? Is it possible to infer structure between movies simultaneously? In this paper we describe a unified Bayesian approach to collaborative filtering that accomplishes all of these goals. It models the discrete structure of ratings and is flexible enough to capture the often non-Gaussian shape of their distribution. Additionally, our method finds a co-clustering of the users and items, which improves the model's accuracy and makes the model robust to fraud. We offer three main contributions: (1) we provide a novel model and Gibbs sampling algorithm that accurately models the quirks of real-world ratings, such as convex rating distributions; (2) we provide proof of our model's robustness to spam and anomalous behavior; (3) we use several real-world datasets to demonstrate the model's effectiveness in accurately predicting users' ratings, avoiding prediction skew in the face of injected spam, and finding interesting patterns in real-world ratings data.
Estimation Error Guarantees for Poisson Denoising with Sparse and Structured Dictionary Models
Cited by 3 (2 self)
Abstract: Poisson processes are commonly used models for describing discrete arrival phenomena arising, for example, in photon-limited scenarios in low-light and infrared imaging, astronomy, and nuclear medicine applications. In this context, several recent efforts have evaluated Poisson denoising methods that utilize contemporary sparse modeling and dictionary learning techniques designed to exploit and leverage (local) shared structure in the images being estimated. This paper establishes a theoretical foundation for such procedures. Specifically, we formulate sparse and structured dictionary-based Poisson denoising methods as constrained maximum likelihood estimation strategies, and establish performance bounds on their mean-square estimation error using the framework of complexity-penalized maximum likelihood analyses.