Web path recommendations based on page ranking and markov models
- In WIDM ’05: Proceedings of the 7th annual ACM international workshop on Web information and data management, 2–9, 2005
Cited by 6 (0 self)
Markov models have been widely used for modelling users' navigational behaviour in the Web graph, using the transitional probabilities between web pages, as recorded in the web logs. The recorded users' navigation is used to extract popular web paths and predict current users' next steps. Such purely usage-based probabilistic models, however, present certain shortcomings. Since the prediction of users' navigational behaviour is based solely on the usage data, structural properties of the Web graph are ignored. Thus paths that are important in terms of PageRank authority score may be underrated. In this paper we present a hybrid probabilistic predictive model extending the properties of Markov models by incorporating link analysis methods. More specifically, we propose the use of a PageRank-style algorithm for assigning prior probabilities to the web pages based on their importance in the web site's graph. We prove, through experimentation, that this approach results in more objective and representative predictions than the ones produced by the pure usage-based approaches.
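The hybrid idea in the abstract above can be illustrated with a minimal sketch. It assumes a first-order Markov model estimated from sessions and a plain power-iteration PageRank used as the prior of a path's first page. The function names, the damping factor, and the way prior and transitions are combined are assumptions for illustration, not the paper's exact formulation.

```python
from collections import defaultdict


def transition_probs(sessions):
    """First-order Markov transition probabilities estimated from sessions."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1
    probs = {}
    for src, nexts in counts.items():
        total = sum(nexts.values())
        probs[src] = {dst: c / total for dst, c in nexts.items()}
    return probs


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping page -> list of out-links."""
    pages = set(links) | {p for outs in links.values() for p in outs}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:
                # Dangling page: spread its rank uniformly over all pages.
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank


def path_score(path, prior, probs):
    """Prior of the first page times the chained transition probabilities.

    Assumption: the structural prior replaces the usage-based start probability.
    """
    score = prior.get(path[0], 0.0)
    for src, dst in zip(path, path[1:]):
        score *= probs.get(src, {}).get(dst, 0.0)
    return score
```

With this split, the usage data only determines the transitions, while the site's link structure determines how much weight a path gets from its starting page.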
Personalized Social Recommendations - Accurate or Private?
Cited by 5 (0 self)
With the recent surge of social networks like Facebook, new forms of recommendations have become possible – personalized recommendations of ads, content, and even new friend and product connections based on one’s social interactions. Since recommendations may use sensitive social information, it is speculated that these recommendations are associated with privacy risks. The main contribution of this work is in formalizing these expected trade-offs between the accuracy and privacy of personalized social recommendations. In this paper, we study whether “social recommendations”, or recommendations that are solely based on a user’s social network, can be made without disclosing sensitive links in the social graph. More precisely, we quantify the loss in utility when existing recommendation algorithms are modified to satisfy a strong notion of privacy, called differential privacy. We prove lower bounds on the minimum loss in utility for any recommendation algorithm that is differentially private. We adapt two privacy-preserving algorithms from the differential privacy literature to the problem of social recommendations, and analyze their performance in comparison to the lower bounds, both analytically and experimentally. We show that good private social recommendations are feasible only for a small subset of the users in the social network or for a lenient setting of privacy parameters.
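To make the accuracy/privacy trade-off concrete, here is a minimal sketch of one standard tool from the differential privacy literature, the exponential mechanism, applied to friend recommendation. The abstract does not name the two algorithms it adapts, so the choice of mechanism, the common-neighbours utility, and the sensitivity value of 1 are all assumptions made for illustration.

```python
import math
import random


def candidate_utilities(graph, user):
    """Utility of each non-neighbour = number of common neighbours (assumed)."""
    return {
        v: len(graph[user] & graph[v])
        for v in graph
        if v != user and v not in graph[user]
    }


def exponential_mechanism_probs(utilities, epsilon, sensitivity=1.0):
    """Sampling probabilities proportional to exp(epsilon * u / (2 * sensitivity))."""
    max_u = max(utilities.values())
    # Subtracting max_u leaves the distribution unchanged and avoids overflow.
    weights = {
        c: math.exp(epsilon * (u - max_u) / (2.0 * sensitivity))
        for c, u in utilities.items()
    }
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


def recommend(graph, user, epsilon, sensitivity=1.0, rng=random):
    """Sample one recommendation; returns None when there is no candidate."""
    utilities = candidate_utilities(graph, user)
    if not utilities:
        return None
    probs = exponential_mechanism_probs(utilities, epsilon, sensitivity)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]
```

A small `epsilon` pushes the distribution toward uniform (strong privacy, little utility), while a large `epsilon` concentrates it on the highest-utility candidate, which is the tension the abstract quantifies.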
Link Prediction on Evolving Data using Matrix and Tensor Factorizations
- IEEE International Conference on Data Mining Workshops, 2009
Cited by 5 (1 self)
The data in many disciplines such as social networks, web analysis, etc. is link-based, and the link structure can be exploited for many different data mining tasks. In this paper, we consider the problem of temporal link prediction: Given link data for time periods 1 through T, can we predict the links in time period T + 1? Specifically, we look at bipartite graphs changing over time and consider matrix- and tensor-based methods for predicting links. We present a weight-based method for collapsing multi-year data into a single matrix. We show how the well-known Katz method for link prediction can be extended to bipartite graphs and, moreover, approximated in a scalable way using a truncated singular value decomposition. Using a CANDECOMP/PARAFAC tensor decomposition of the data, we illustrate the usefulness of exploiting the natural three-dimensional structure of temporal link data. Through several numerical experiments, we demonstrate that both matrix- and tensor-based techniques are effective for temporal link prediction despite the inherent difficulty of the problem.
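The two matrix-based steps in the abstract above can be sketched as follows. The geometric decay weight `(1 - theta) ** (T - t)` is an assumed form of the "weight-based" collapse, and the Katz score is computed here by truncating the path-counting series after a few terms, which is a simpler stand-in for the truncated-SVD approximation the abstract mentions. All names and parameter values are hypothetical.

```python
def collapse(slices, theta=0.2):
    """Weighted sum of T bipartite adjacency slices; newer slices weigh more."""
    T = len(slices)
    rows, cols = len(slices[0]), len(slices[0][0])
    Z = [[0.0] * cols for _ in range(rows)]
    for t, X in enumerate(slices, start=1):
        weight = (1.0 - theta) ** (T - t)  # assumed decay, weight 1 for t = T
        for i in range(rows):
            for j in range(cols):
                Z[i][j] += weight * X[i][j]
    return Z


def matmul(A, B):
    """Dense matrix product of two lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def bipartite_katz(Z, beta=0.1, max_terms=3):
    """Sum over k of beta^(2k-1) * (Z Z^T)^(k-1) Z.

    In a bipartite graph, paths between the two node sets have odd length,
    so only odd powers of beta appear.
    """
    ZZt = matmul(Z, transpose(Z))
    term = [row[:] for row in Z]
    scores = [[0.0] * len(Z[0]) for _ in Z]
    for k in range(1, max_terms + 1):
        coeff = beta ** (2 * k - 1)
        for i, row in enumerate(term):
            for j, value in enumerate(row):
                scores[i][j] += coeff * value
        term = matmul(ZZt, term)
    return scores
```

Entries of `bipartite_katz(collapse(slices))` that correspond to currently absent links can then be ranked as candidate future links.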
Temporality in Link Prediction: Understanding Social Complexity
Cited by 1 (0 self)
This article summarises experimental results that bring together two views in contemporary science: Bayesian analysis and link prediction, to enhance the current understanding of social network analysis (SNA), particularly in value creation through social connectedness – an important, and growing, discipline within management science. Traditional link prediction methods use the values of metrics in a graph to determine where new links are likely to arise, and little work has been done on analysing long-term graph trends. We have found that existing graph generation models are unrealistic in their prediction, and can be complemented through the use of temporal metrics, in the study of some networks. To date, no temporal information has been used in link prediction research, thereby excluding valuable temporal trends that emerge in sociogram sequences and also lowering the accuracy of the link prediction. We extracted information from the Pussokram online dating network dataset, and 9,939 cases of each class were formed. Logistic regression in the Weka data mining system was used to perform link prediction. Our results show that temporal metrics are an extremely valuable new contribution to link prediction, and should be used in future applications. In addition to using metrics to measure the local behaviours of participants in social networks, we used Bayesian networks to model the interrelationships between the metrics as local behaviours and links forming between individuals as emergent behaviours (social complexity). We also explored how the metrics evolve over time using Dynamic Bayesian
Automatic Metadata Generation using Associative Networks
Cited by 1 (1 self)
In spite of its tremendous value, metadata is generally sparse and incomplete, thereby hampering the effectiveness of digital information services. Many of the existing mechanisms for the automated creation of metadata rely primarily on content analysis which can be costly and inefficient. The automatic metadata generation system proposed in this paper leverages resource relationships generated from existing metadata as a medium for propagation from metadata-rich to metadata-poor resources. Because of its independence from content analysis, it can be applied to a wide variety of resource media types and has been shown to be computationally inexpensive. The proposed method operates through two distinct phases. Occurrence and co-occurrence algorithms first generate an associative network of repository resources leveraging existing repository metadata. Second, using the associative network as a substrate, metadata associated with metadata-rich resources is propagated to metadata-poor resources by means of a discrete-form spreading activation algorithm. This paper discusses the general framework for building associative networks, an algorithm for disseminating metadata through such networks, and a validation of the proposed system using a bibliographic dataset.
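The two phases described in the abstract above can be sketched in a few lines. This is an illustrative reading only: the edge weight (number of shared values in one metadata field), the way energy is split across edges, the decay factor, and the hop limit are all assumptions, and the record layout is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_network(records, field):
    """Phase 1: co-occurrence network; edge weight = shared values of `field`."""
    by_value = defaultdict(set)
    for rid, meta in records.items():
        for value in meta.get(field, []):
            by_value[value].add(rid)
    weights = defaultdict(lambda: defaultdict(int))
    for rids in by_value.values():
        for a, b in combinations(sorted(rids), 2):
            weights[a][b] += 1
            weights[b][a] += 1
    return weights


def propagate(weights, records, field, decay=0.5, max_hops=2):
    """Phase 2: discrete spreading activation of `field` values.

    Each resource that has values for `field` emits one unit of energy, which
    is split across edges by weight and multiplied by `decay` at every hop.
    """
    received = defaultdict(lambda: defaultdict(float))
    for source, meta in records.items():
        values = meta.get(field, [])
        if not values:
            continue
        frontier = {source: 1.0}
        for _ in range(max_hops):
            nxt = defaultdict(float)
            for node, energy in frontier.items():
                neighbours = weights.get(node, {})
                total = sum(neighbours.values())
                if total == 0:
                    continue
                for nb, w in neighbours.items():
                    nxt[nb] += decay * energy * w / total
            for node, energy in nxt.items():
                if node != source:
                    for value in values:
                        received[node][value] += energy
            frontier = nxt
    return received
```

A metadata-poor resource would then adopt the values with the highest accumulated energy, for example those above a chosen threshold.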
General
Cited by 1 (0 self)
The data in many disciplines such as social networks, Web analysis, etc. is link-based, and the link structure can be exploited for many different data mining tasks. In this article, we consider the problem of temporal link prediction: Given link data for times 1 through T, can we predict the links at time T + 1? If our data has underlying periodic structure, can we predict out even further in time, i.e., links at time T + 2, T + 3, etc.? In this article, we consider bipartite graphs that evolve over time and consider matrix- and tensor-based methods for predicting future links. We present a weight-based method for collapsing multiyear data into a single matrix. We show how the well-known Katz method for link prediction can be extended to bipartite graphs and, moreover, approximated in a scalable way using a truncated singular value decomposition. Using a CANDECOMP/PARAFAC tensor decomposition of the data, we illustrate the usefulness of exploiting the natural three-dimensional structure of temporal link data. Through several numerical experiments, we demonstrate that both matrix- and tensor-based techniques are effective for temporal link prediction despite the inherent difficulty of the problem. Additionally, we show that tensor-based
Link Prediction via Matrix Factorization
Cited by 1 (0 self)
We propose to solve the link prediction problem in graphs using a supervised matrix factorization approach. The model learns latent features from the topological structure of a (possibly directed) graph, and is shown to make better predictions than popular unsupervised scores. We show how these latent features may be combined with optional explicit features for nodes or edges, which yields better performance than using either type of feature exclusively. Finally, we propose a novel approach to address the class imbalance problem which is common in link prediction by directly optimizing for a ranking loss. Our model is optimized with stochastic gradient descent and scales to large graphs. Results on several datasets show the efficacy of our approach.
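As a rough illustration of latent-feature link prediction trained with a ranking loss, the sketch below factorizes a directed graph into source and target vectors and runs stochastic gradient descent on a pairwise logistic loss over (node, linked node, unlinked node) triplets. The loss form, the hyperparameters, and the omission of explicit node or edge features are simplifying assumptions; this is not the paper's model.

```python
import math
import random


def score(U, V, i, j):
    """Predicted strength of the directed link i -> j."""
    return sum(a * b for a, b in zip(U[i], V[j]))


def make_triplets(n_nodes, edges):
    """All (source, linked target, unlinked target) triplets."""
    edge_set = set(edges)
    return [
        (i, p, n)
        for (i, p) in edges
        for n in range(n_nodes)
        if n != i and (i, n) not in edge_set
    ]


def ranking_loss(U, V, triplets):
    """Mean pairwise logistic loss: log(1 + exp(-(s_pos - s_neg)))."""
    total = 0.0
    for i, p, n in triplets:
        margin = score(U, V, i, p) - score(U, V, i, n)
        total += math.log1p(math.exp(-margin))
    return total / len(triplets)


def train(n_nodes, edges, dim=2, lr=0.1, epochs=300, reg=0.001, seed=0):
    """SGD on the ranking loss with L2 regularisation (assumed settings)."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(n_nodes)]
    V = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(n_nodes)]
    triplets = make_triplets(n_nodes, edges)
    for _ in range(epochs):
        rng.shuffle(triplets)
        for i, p, n in triplets:
            margin = score(U, V, i, p) - score(U, V, i, n)
            g = -1.0 / (1.0 + math.exp(margin))  # d loss / d margin
            for k in range(dim):
                u, vp, vn = U[i][k], V[p][k], V[n][k]
                U[i][k] -= lr * (g * (vp - vn) + reg * u)
                V[p][k] -= lr * (g * u + reg * vp)
                V[n][k] -= lr * (-g * u + reg * vn)
    return U, V
```

Because the loss only compares a linked pair against an unlinked pair for the same source node, it is not dominated by the large number of absent links, which is the class imbalance concern the abstract raises.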
A Two-Phase Spectral Bigraph Co-clustering Approach for the “Who Rated What” Task in KDD Cup 2007
This paper describes our approach for the “Who Rated What” task in the KDD Cup 2007 competition. Given the Netflix data set that consists of more than 100 million ratings between 1998 and 2005, this task is to predict the probability that each user-movie pair was rated in 2006. In total, 100,000 user-movie pairs are drawn from the Netflix data set as the test set. In our approach, the Netflix data set is modeled as a bipartite graph (or bigraph) with users and movies on either side. In the bigraph, there are only directed edges from user nodes to movie nodes, and each directed edge corresponds to a rating event in which the user rated the movie at some time. The given task can then be formulated as a link existence prediction problem, i.e., whether a directed link exists between a user node and a movie node. Considering the huge size and the sparsity of ratings in the data set, it is important to reveal the hidden class-based correlation between users and movies from the bigraph while keeping computational complexity relatively low. To this end, we use a two-phase spectral bigraph co-clustering approach. The key idea is to simultaneously obtain user and movie neighborhoods via co-clustering and then generate predictions based on the results of co-clustering. Roughly speaking, our approach includes three steps. First, users and movies are each coarsely clustered using the K-means algorithm. Then the user and movie clusters are further co-clustered using a multipartite spectral graph partition algorithm. Finally, based on the results of co-clustering, a probabilistic model is derived to predict the probability of a link existing between a user node and a movie node. Experimental results show that our approach works well in the task.
Sandia National Laboratories and
, 1005
The data in many disciplines such as social networks, web analysis, etc. is link-based, and the link structure can be exploited for many different data mining tasks. In this paper, we consider the problem of temporal link prediction: Given link data for times 1 through T, can we predict the links at time T + 1? If our data has underlying periodic structure, can we predict out even further in time, i.e., links at time T + 2, T + 3, etc.? In this paper, we consider bipartite graphs that evolve over time and consider matrix- and tensor-based methods for predicting future links. We present a weight-based method for collapsing multi-year data into a single matrix. We show how the well-known Katz method for link prediction can be extended to bipartite graphs and, moreover, approximated in a scalable way using a truncated singular value decomposition. Using a CANDECOMP/PARAFAC tensor decomposition of the data, we illustrate the usefulness of exploiting the natural three-dimensional structure of temporal link data. Through several numerical experiments, we demonstrate that both matrix- and tensor-based techniques are effective for temporal link prediction despite the inherent difficulty of the problem. Additionally, we show that tensor-based techniques are particularly effective for temporal data with varying periodic patterns.
Link Prediction in Social Networks using Computationally Efficient Topological Features
Online social networking sites have become increasingly popular over the last few years. As a result, new interdisciplinary research directions have emerged in which social network analysis methods are applied to networks containing hundreds of millions of users. Unfortunately, links between individuals may be missing either due to imperfect acquisition processes or because they are not yet reflected in the online network (i.e., friends in the real world did not form a virtual connection). Existing link prediction techniques lack the scalability required for full application on a continuously growing social network. The primary bottleneck in link prediction techniques is extracting the structural features required for classifying links. In this paper we propose a set of simple, easy-to-compute structural features that can be analyzed to identify missing links. We show that by using simple structural features, a machine learning classifier can successfully identify missing links, even when applied to the hard problem of classifying links between individuals with at least one common friend. A new friends measure that we developed is shown to be a good predictor for missing links. An evaluation experiment was performed on five large social network datasets: Facebook, Flickr, YouTube, Academia and TheMarker. Our methods can provide social network site operators with the capability of helping users to find known, offline contacts and to discover new friends online. They may also be used for exposing hidden links in an online social network.
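The kind of cheap structural feature the abstract above refers to can be sketched as follows for an undirected graph stored as a dict of neighbour sets. Common friends and the Jaccard coefficient are standard; the `friends_measure` shown here (pairs of neighbours that coincide or are themselves linked) is an assumed definition, since the abstract does not spell it out.

```python
def link_features(graph, u, v):
    """Simple topological features for the candidate link (u, v).

    `graph` maps each node to the set of its neighbours (undirected).
    """
    nu, nv = graph[u], graph[v]
    common = nu & nv
    union = nu | nv
    # Assumed definition: count neighbour pairs (x, y) that are the same
    # node or are directly connected.
    friends_measure = sum(
        1 for x in nu for y in nv if x == y or y in graph[x]
    )
    return {
        "common_friends": len(common),
        "total_friends": len(union),
        "jaccard": len(common) / len(union) if union else 0.0,
        "friends_measure": friends_measure,
    }
```

Feature dictionaries like this one, computed for linked and unlinked node pairs, could then be fed to any off-the-shelf classifier.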

