Thurstonian Boltzmann Machines: Learning from Multiple Inequalities
Abstract

Cited by 2 (0 self)
We introduce Thurstonian Boltzmann Machines (TBM), a unified architecture that can naturally incorporate a wide range of data inputs at the same time. Our motivation rests in the Thurstonian view that many discrete data types can be considered as being generated from a subset of underlying latent continuous variables, and in the observation that each realisation of a discrete type imposes certain inequalities on those variables. Thus learning and inference in TBM reduce to making sense of a set of inequalities. Our proposed TBM naturally supports the following types: Gaussian, interval, censored, binary, categorical, multicategorical, ordinal, and (in)complete rank with and without ties. We demonstrate the versatility and capacity of the proposed model on three applications of very different natures, namely handwritten digit recognition, collaborative filtering and complex social survey analysis.
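The idea that each discrete observation imposes inequalities on a latent continuous variable can be illustrated with a small sketch. This is not the TBM implementation: the function name `latent_bounds`, the threshold convention and the encoding of each data type are assumptions made here for illustration, following the usual Thurstonian reading of binary, ordinal, censored and interval data.

```python
import math

def latent_bounds(kind, value, thresholds=(0.0,)):
    """Return (low, high) bounds that an observation places on one latent variable u.

    Hypothetical encoding for illustration only; the abstract does not specify one.
    """
    if kind == "binary":
        # value 1 means u lies above the threshold, value 0 means below it
        t = thresholds[0]
        return (t, math.inf) if value == 1 else (-math.inf, t)
    if kind == "ordinal":
        # value is a 0-based level; thresholds are sorted cut points
        cuts = (-math.inf, *thresholds, math.inf)
        return (cuts[value], cuts[value + 1])
    if kind == "censored":
        # only a lower bound on u is observed
        return (value, math.inf)
    if kind == "interval":
        low, high = value
        return (low, high)
    raise ValueError(f"unsupported data type: {kind}")

# A three-level ordinal answer at the middle level, with assumed cut points -1 and 1
print(latent_bounds("ordinal", 1, thresholds=(-1.0, 1.0)))
```

Under this reading, an observed data point never fixes the latent value itself, only the region it must fall in.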
Learning From Ordered Sets and Applications in Collaborative Ranking
Abstract

Cited by 1 (0 self)
Rankings over sets arise when users choose between groups of items. For example, a group may consist of the movies a user deems worth 5 stars, or of a customized tour package. It turns out that, to model this data type properly, we need to investigate the general combinatorics problem of partitioning a set and ordering the subsets. Here we construct a probabilistic log-linear model over a set of ordered subsets. Inference in this combinatorial space is highly challenging: the space size approaches (N!/2) 6.93145^{N+1} as N approaches infinity. We propose a split-and-merge Metropolis-Hastings procedure that can explore the state space efficiently. For discovering hidden aspects in the data, we enrich the model with latent binary variables so that the posteriors can be efficiently evaluated. Finally, we evaluate the proposed model on large-scale collaborative filtering tasks and demonstrate that it is competitive against state-of-the-art methods.
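To get a feel for how quickly this state space grows, the number of ways to partition N items into subsets and order those subsets can be computed exactly for small N. The sketch below uses the standard recurrence for counting ordered set partitions (choose the first subset, then order the rest); the function name `ordered_partitions` is an illustrative choice and is not taken from the paper.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def ordered_partitions(n):
    """Count the ways to split n distinct items into an ordered list of non-empty subsets."""
    if n == 0:
        return 1
    # Pick the k items that form the first subset, then recurse on the remaining n - k
    return sum(comb(n, k) * ordered_partitions(n - k) for k in range(1, n + 1))

for n in range(1, 11):
    print(n, ordered_partitions(n))
```

Already for 10 items the count exceeds one hundred million, which is why exhaustive enumeration is not an option and a sampling procedure is needed.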
Modelling Human Preferences for Ranking and Collaborative Filtering: A Probabilistic Ordered Partition Approach
Under consideration for publication in Knowledge and Information Systems
Abstract
Learning preference models from human-generated data is an important task in modern information processing systems. Its popular setting consists of simple input ratings, assigned numerical values to indicate their relevancy with respect to a specific query. Since ratings are often specified within a small range, several objects may have the same rating, thus creating ties among objects for a given query. Dealing with this phenomenon presents a general problem of modelling preferences that handle ties and are query-specific. To this end, we present in this paper a novel approach that constructs probabilistic models directly on the collection of objects, exploiting the combinatorial structure induced by the ties among them. The proposed probabilistic setting allows exploration of a super-exponential combinatorial state space with unknown numbers of partitions and unknown order among them. Learning and inference in such a large state space are challenging, and yet we present in this paper efficient algorithms to perform these tasks. Our approach exploits discrete choice theory, imposing a generative process in which the finite set of objects is partitioned into subsets in a stagewise procedure, thus reducing the state space at each stage significantly. Efficient Markov Chain Monte Carlo (MCMC) algorithms are then presented for the proposed models. We demonstrate that the model can potentially be trained in a large-scale setting of hundreds of thousands of objects using an ordinary computer. In fact, in some special cases with appropriate model specification, our models can be learned in linear time. We evaluate the models on two application areas: (i) document ranking with the data from the Yahoo! challenge, and (ii) collaborative filtering with movie data. We demonstrate that the models are competitive against state-of-the-art methods.
Learning Rank Functionals: An Empirical Study
2011
Abstract
Ranking is a key aspect of many applications, such as information retrieval, question answering, ad placement and recommender systems. Learning to rank has the goal of estimating a ranking model automatically from training data. In practical settings, the task often reduces to estimating a rank functional of an object with respect to a query. In this paper, we investigate key issues in designing an effective learning-to-rank algorithm. These include data representation, the choice of rank functionals, and the design of a loss function that correlates with the rank metrics used in evaluation. For the loss function, we study three techniques: approximating the rank metric by a smooth function, decomposing the loss into a weighted sum of elementwise losses, and decomposing it into a weighted sum of pairwise losses. We then present derivations of piecewise losses using the theory of high-order Markov chains and Markov random fields. In experiments, we evaluate these design aspects on two tasks: answer ranking in a Social Question Answering site, and Web Information Retrieval.
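The two decompositions mentioned above can be sketched for a single query as follows. This is only an illustration under stated assumptions: the squared-error elementwise term, the logistic pairwise term and the label-gap pair weights are common choices picked here, not necessarily the ones studied in the paper, and the function names are hypothetical.

```python
import math

def elementwise_loss(scores, labels, weights=None):
    """Weighted sum of per-object losses (squared error is an assumed choice)."""
    if weights is None:
        weights = [1.0] * len(scores)
    return sum(w * (s - y) ** 2 for w, s, y in zip(weights, scores, labels))

def pairwise_loss(scores, labels):
    """Weighted sum of per-pair logistic losses over pairs where i should outrank j."""
    total = 0.0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                # Assumed weighting: pairs with a larger relevance gap count more
                weight = labels[i] - labels[j]
                total += weight * math.log1p(math.exp(-(scores[i] - scores[j])))
    return total

labels = [2, 1, 0]
print(pairwise_loss([3.0, 2.0, 1.0], labels))  # scores agree with the labels
print(pairwise_loss([1.0, 2.0, 3.0], labels))  # scores reverse the labels
```

The pairwise form depends only on score differences, so it penalises a reversed ordering more heavily than a correct one even when the absolute scores are far from the labels.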