Results 1–10 of 24
From frequency to meaning: Vector space models of semantics
Journal of Artificial Intelligence Research, 2010
Cited by 347 (3 self)
Abstract: Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term–document, word–context, and pair–pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.
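As an illustration of the term–document class of VSMs the survey describes, the following sketch builds a toy term-document count matrix and compares documents by cosine similarity (the three-sentence corpus and whitespace tokenization are hypothetical, not taken from the survey):

```python
import math
from collections import Counter

docs = [
    "computers process text",
    "computers understand language",
    "vectors represent meaning",
]

# Build the vocabulary and the term-document count matrix
# (rows = terms, columns = documents).
vocab = sorted({w for d in docs for w in d.split()})
matrix = [[Counter(d.split())[t] for d in docs] for t in vocab]

def doc_vector(j):
    return [matrix[i][j] for i in range(len(vocab))]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Documents sharing the term "computers" come out more similar
# than documents sharing no terms.
sim_01 = cosine(doc_vector(0), doc_vector(1))
sim_02 = cosine(doc_vector(0), doc_vector(2))
```

In the word–context and pair–pattern classes only the matrix changes (rows/columns become words and contexts, or word pairs and patterns); the same vector comparison machinery applies.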
Large-Scale Manifold Learning
Cited by 51 (7 self)
Abstract: This paper examines the problem of extracting low-dimensional manifold structure given millions of high-dimensional face images. Specifically, we address the computational challenges of nonlinear dimensionality reduction via Isomap and Laplacian Eigenmaps, using a graph containing about 18 million nodes and 65 million edges. Since most manifold learning techniques rely on spectral decomposition, we first analyze two approximate spectral decomposition techniques for large dense matrices (Nyström and Column-sampling), providing the first direct theoretical and empirical comparison between these techniques. We next show extensive experiments on learning low-dimensional embeddings for two large face datasets: CMU-PIE (35 thousand faces) and a web dataset (18 million faces). Our comparisons show that the Nyström approximation is superior to the Column-sampling method. Furthermore, approximate Isomap tends to perform better than Laplacian Eigenmaps on both clustering and classification with the labeled CMU-PIE dataset.
Sampling Methods for the Nyström Method
Journal of Machine Learning Research
Cited by 26 (2 self)
Abstract: The Nyström method is an efficient technique to generate low-rank matrix approximations and is used in several large-scale learning applications. A key aspect of this method is the procedure according to which columns are sampled from the original matrix. In this work, we explore the efficacy of a variety of fixed and adaptive sampling schemes. We also propose a family of ensemble-based sampling algorithms for the Nyström method. We report results of extensive experiments that provide a detailed comparison of various fixed and adaptive sampling techniques, and demonstrate the performance improvement associated with the ensemble Nyström method when used in conjunction with either fixed or adaptive sampling schemes. Corroborating these empirical findings, we present a theoretical analysis of the Nyström method, providing novel error bounds guaranteeing a better convergence rate of the ensemble Nyström method in comparison to the standard Nyström method.
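The standard Nyström approximation with uniform column sampling (one of the fixed schemes the paper compares) can be sketched as follows; the synthetic low-rank PSD matrix stands in for a kernel matrix and is not data from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic rank-5 PSD matrix standing in for a kernel/Gram matrix.
A = rng.standard_normal((60, 5))
K = A @ A.T

# Nyström: sample l columns uniformly, then approximate K ~ C W^+ C^T,
# where C holds the sampled columns and W their intersection block.
l = 20
idx = rng.choice(60, size=l, replace=False)
C = K[:, idx]
W = K[np.ix_(idx, idx)]
K_approx = C @ np.linalg.pinv(W) @ C.T

# Because rank(W) equals rank(K) here, the reconstruction is
# (numerically) exact; on full-rank kernels it is only approximate.
rel_err = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
```

The paper's adaptive and ensemble variants change how `idx` is chosen and how several such approximations are combined, not the core `C W^+ C^T` reconstruction.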
Attack resistant collaborative filtering
SIGIR ’08 Proc. Thirty-First Annual Internat. ACM SIGIR Conf. Res. Development Inform. Retrieval, ACM, 2008
Cited by 17 (0 self)
Abstract: The widespread deployment of recommender systems has led to user feedback of varying quality. While some users faithfully express their true opinion, many provide noisy ratings which can be detrimental to the quality of the generated recommendations. The presence of noise can violate modeling assumptions and may thus lead to instabilities in estimation and prediction. Even worse, malicious users can deliberately insert attack profiles in an attempt to bias the recommender system to their benefit. While previous research has attempted to study the robustness of various existing Collaborative Filtering (CF) approaches, this remains an unsolved problem. Approaches such as Neighbor Selection algorithms, Association Rules and Robust Matrix Factorization have produced unsatisfactory results. This work describes a new collaborative algorithm based on SVD which is accurate as well as highly stable to shilling. This algorithm exploits previously established SVD-based shilling detection algorithms, and combines them with SVD-based CF. Experimental results show a much diminished effect of all kinds of shilling attacks. This work also offers significant improvement over previous Robust Collaborative Filtering frameworks.
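As background for the SVD-based CF half of the paper's combination, plain SVD-based rating prediction can be sketched as follows (the toy ratings and the mean-imputation step are illustrative, not the paper's algorithm, which additionally folds in shilling detection):

```python
import numpy as np

# Toy user-item rating matrix; 0 marks an unobserved rating
# (values are hypothetical).
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)
mask = R > 0

# Impute unobserved entries with per-item observed means, then take a
# rank-k truncated SVD; the smoothed reconstruction predicts the
# missing ratings.
R_filled = np.where(mask, R, R.sum(0) / mask.sum(0))
U, s, Vt = np.linalg.svd(R_filled, full_matrices=False)
k = 2
R_hat = U[:, :k] * s[:k] @ Vt[:k, :]
```

In this toy matrix users 0 and 1 favor items 0 and 1 while users 2 and 3 favor items 2 and 3, so the rank-2 reconstruction predicts a high rating for user 0 on item 0 and a low one on item 2.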
Clustering More than Two Million Biomedical Publications: Comparing the Accuracies of Nine Text-Based Similarity Approaches
Cited by 13 (1 self)
Abstract: Background: We investigate the accuracy of different similarity approaches for clustering over two million biomedical documents. Clustering large sets of text documents is important for a variety of information needs and applications such as collection management and navigation, summary and analysis. The few comparisons of clustering results from different similarity approaches have focused on small literature sets and have given conflicting results. Our study was designed to seek a robust answer to the question of which similarity approach would generate the most coherent clusters of a biomedical literature set of over two million documents. Methodology: We used a corpus of 2.15 million recent (2004–2008) records from MEDLINE, and generated nine different document-document similarity matrices from information extracted from their bibliographic records, including titles, abstracts and subject headings. The nine approaches comprised five different analytical techniques with two data sources. The five analytical techniques are cosine similarity using term frequency–inverse document frequency vectors (tf-idf cosine), latent semantic analysis (LSA), topic modeling, and two Poisson-based language models, BM25 and PMRA (PubMed Related Articles). The two data sources were a) MeSH subject headings, and b) words from titles and abstracts. Each similarity matrix was filtered to keep the top-n highest similarities per document and then clustered using a combination of graph layout and average-link clustering. Cluster results from the nine similarity approaches were compared.
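Two pieces of the pipeline described above, tf-idf cosine similarity and top-n filtering of the similarity matrix, can be sketched at toy size (the four "documents" and n = 1 are illustrative; the study runs this over 2.15 million MEDLINE records):

```python
import math
from collections import Counter

# Hypothetical mini-corpus; titles are invented, not from MEDLINE.
docs = [
    "gene expression in cancer cells",
    "cancer gene mutation analysis",
    "protein folding simulation methods",
    "simulation of protein structure",
]

tokenized = [d.split() for d in docs]
N = len(docs)
df = Counter(t for toks in tokenized for t in set(toks))  # document frequency

def tfidf(toks):
    tf = Counter(toks)
    return {t: tf[t] * math.log(N / df[t]) for t in tf}

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

vecs = [tfidf(toks) for toks in tokenized]

# Keep only each document's single strongest similarity (top-n with n=1),
# mimicking the similarity-matrix filtering step before clustering.
top1 = [max((j for j in range(N) if j != i),
            key=lambda j: cosine(vecs[i], vecs[j]))
        for i in range(N)]
```

The filtered, sparse similarity graph produced this way is what the study feeds into graph layout and average-link clustering.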
A Recommender System for an IPTV Service Provider: a Real Large-Scale Production Environment
In Recommender Systems Handbook, 2011
Cited by 9 (4 self)
Abstract: In this chapter we describe the integration of a recommender system into the production environment of Fastweb, one of the largest European IP Television (IPTV) providers. The recommender system implements both collaborative and content-based techniques, suitably tailored to the specific requirements of an IPTV architecture, such as the limited screen definition, the reduced navigation capabilities, and the strict time constraints. The algorithms are extensively analyzed by means of offline and online tests, showing the effectiveness of the recommender system: up to 30% of the recommendations are followed by a purchase, with an estimated lift factor (increase in sales) of 15%.
Statistical significance of the Netflix challenge
http://arxiv.org/abs/1207.5649
Collaborative filtering with interlaced generalized linear models
2008
Cited by 7 (0 self)
Abstract: Collaborative filtering (CF) is a data analysis task appearing in many challenging applications, in particular data mining on the Internet and in e-commerce. CF can often be formulated as identifying patterns in a large and mostly empty rating matrix. In this paper, we focus on predicting unobserved ratings. This task is often a part of a recommendation procedure. We propose a new CF approach called interlaced generalized linear models (GLM); it is based on a factorization of the rating matrix and uses probabilistic modeling to represent uncertainty in the ratings. The advantage of this approach is that different configurations, encoding different intuitions about the rating process, can easily be tested while keeping the same learning procedure. The GLM formulation is the keystone to derive an efficient learning procedure, applicable to large datasets. We illustrate the technique on three public domain datasets.
Laplacian Embedded Regression for Scalable Manifold Regularization
Cited by 5 (0 self)
Abstract: Semi-supervised Learning (SSL), as a powerful tool to learn from a limited number of labeled data and a large number of unlabeled data, has been attracting increasing attention in the machine learning community. In particular, the manifold regularization framework has laid solid theoretical foundations for a large family of SSL algorithms such as Laplacian SVM (LapSVM) and Laplacian Regularized Least Squares (LapRLS). However, most of these algorithms are limited to small-scale problems due to the high computational cost of the matrix inversion operation involved in the optimization problem. In this paper, we propose a novel framework called Laplacian Embedded Regression by introducing an intermediate decision variable into the manifold regularization framework. By using ϵ-insensitive loss, we obtain the Laplacian Embedded SVR (LapESVR) algorithm, which inherits the sparse solution from SVR. Also, we derive Laplacian Embedded RLS (LapERLS) corresponding to RLS under the proposed framework. Both LapESVR and LapERLS possess a simpler form of a transformed kernel, which is the summation of the original kernel and a graph kernel that captures the manifold structure. The benefits of the transformed kernel are twofold: 1) we can deal with the original kernel matrix and the graph Laplacian matrix in the graph kernel separately; 2) if the graph Laplacian matrix is sparse, we only need to perform the inverse operation for a sparse matrix, which is much more efficient when compared with that for a dense one. Inspired by kernel PCA, we further propose to project the introduced decision variable into a subspace spanned by a few eigenvectors of the graph Laplacian matrix in order to better reflect the data manifold, as well as accelerate the calculation of the graph kernel, allowing our methods to efficiently and effectively cope with large-scale semi-supervised learning problems. Extensive experiments on both toy and real-world data sets show the effectiveness and scalability of the proposed framework.
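The sparsity the abstract exploits comes from the kNN graph Laplacian; a minimal construction showing its key properties (data, neighborhood size, and heat-kernel weights are illustrative, not the paper's experimental setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy unlabeled data; in SSL the Laplacian is built from labeled and
# unlabeled points together.
X = rng.standard_normal((30, 2))
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)

# Symmetric kNN graph with heat-kernel weights, then L = D - W.
knn = 4
W = np.zeros_like(d2)
for i in range(len(X)):
    nbrs = np.argsort(d2[i])[1:knn + 1]   # skip self at position 0
    W[i, nbrs] = np.exp(-d2[i, nbrs])
W = np.maximum(W, W.T)
L = np.diag(W.sum(1)) - W

# L is positive semi-definite with zero row sums, and stays sparse
# (O(n * knn) nonzeros); inverting a sparse shifted Laplacian is what
# keeps the framework's graph-kernel computation cheap.
num_nonzero = int((L != 0).sum())
```

The subspace-projection step the abstract mentions corresponds to keeping only a few eigenvectors of this `L`, which both respects the manifold structure and further reduces the cost of forming the graph kernel.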