Results 1 - 10 of 38
Learning to Discover Social Circles in Ego Networks
Cited by 84 (5 self)
Our personal social networks are big and cluttered, and currently there is no good way to organize them. Social networking sites allow users to manually categorize their friends into social circles (e.g. ‘circles’ on Google+, and ‘lists’ on Facebook and Twitter), however they are laborious to construct and must be updated whenever a user’s network grows. We define a novel machine learning task of identifying users’ social circles. We pose the problem as a node clustering problem on a user’s ego-network, a network of connections between her friends. We develop a model for detecting circles that combines network structure as well as user profile information. For each circle we learn its members and the circle-specific user profile similarity metric. Modeling node membership to multiple circles allows us to detect overlapping as well as hierarchically nested circles. Experiments show that our model accurately identifies circles on a diverse set of data from Facebook, Google+, and Twitter for all of which we obtain hand-labeled ground-truth.
Exploiting Homophily Effect for Trust Prediction
Cited by 18 (11 self)
Trust plays a crucial role for online users who seek reliable information. However, in reality, user-specified trust relations are very sparse, i.e., a tiny number of pairs of users with trust relations are buried in a disproportionately large number of pairs without trust relations, making trust prediction a daunting task. As an important social concept, however, trust has received growing attention and interest. Social theories are developed for understanding trust. Homophily is one of the most important theories that explain why trust relations are established. Exploiting the homophily effect for trust prediction provides challenges and opportunities. In this paper, we embark on the challenges to investigate the trust prediction problem with the homophily effect. First, we delineate how it differs from existing approaches to trust prediction in an unsupervised setting. Next, we formulate the new trust prediction problem into an optimization problem integrated with homophily, empirically evaluate our approach on two datasets from real-world product review sites, and compare with representative algorithms to gain a deep understanding of the role of homophily in trust prediction.
Jointly Predicting Links and Inferring Attributes using a Social-Attribute Network (SAN)
, 1112
Cited by 15 (7 self)
The effects of social influence and homophily suggest that both network structure and node attribute information should inform the tasks of link prediction and node attribute inference. Recently, Yin et al. [28, 29] proposed Social-Attribute Network (SAN), an attribute-augmented social network, to integrate network structure and node attributes to perform both link prediction and attribute inference. They focused on generalizing the random walk with restart algorithm to the SAN framework and showed improved performance. In this paper, we extend the SAN framework with several leading supervised and unsupervised link prediction algorithms and demonstrate performance improvement for each algorithm on both link prediction and attribute inference. Moreover, we make the novel observation that attribute inference can help inform link prediction, i.e., link prediction accuracy is further improved by first inferring missing attributes. We comprehensively evaluate these algorithms and compare them with other existing algorithms using a novel, large-scale Google+ dataset, which we make publicly available.
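To make the idea in this abstract concrete, here is a minimal sketch of random walk with restart on a toy attribute-augmented graph. It is not the algorithm of Yin et al. or of this paper: the node names, the `attr:` prefix, the unweighted edges, and the restart probability of 0.15 are all assumptions chosen for illustration. It only shows why adding attribute nodes can make two users reachable from each other when they share no friends.

```python
def build_adjacency(nodes, edges):
    """Undirected adjacency sets; isolated nodes are kept with an empty set."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def random_walk_with_restart(adjacency, source, restart=0.15, iterations=200):
    """Approximate visit probabilities of a walk that restarts at `source`."""
    prob = {node: 0.0 for node in adjacency}
    prob[source] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in adjacency}
        for node, neighbors in adjacency.items():
            walk_mass = (1.0 - restart) * prob[node]
            if neighbors:
                # Spread the non-restart mass evenly over the neighbors.
                share = walk_mass / len(neighbors)
                for neighbor in neighbors:
                    nxt[neighbor] += share
            else:
                # A node with no edges sends its mass back to the source.
                nxt[source] += walk_mass
        nxt[source] += restart  # restart mass (total probability stays 1)
        prob = nxt
    return prob


# Hypothetical toy data: four users, two attribute nodes.
people = ["alice", "bob", "carol", "dave"]
friendships = [("alice", "bob"), ("bob", "dave")]
attribute_links = [
    ("alice", "attr:school_x"),
    ("carol", "attr:school_x"),
    ("dave", "attr:employer_y"),
]

social_only = build_adjacency(people, friendships)
san = build_adjacency(
    people + ["attr:school_x", "attr:employer_y"],
    friendships + attribute_links,
)

social_scores = random_walk_with_restart(social_only, "alice")
san_scores = random_walk_with_restart(san, "alice")
```

In the social-only graph, carol is disconnected from alice and receives no score. In the augmented graph she becomes reachable through the shared attribute node, so she can be ranked as a candidate link.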
Transforming Graph Data for Statistical Relational Learning
, 2012
Cited by 9 (4 self)
Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of Statistical Relational Learning (SRL) algorithms to these domains. In this article, we examine and categorize techniques for transforming graph-based relational data to improve SRL algorithms. In particular, appropriate transformations of the nodes, links, and/or features of the data can dramatically affect the capabilities and results of SRL algorithms. We introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. More specifically, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed.
Opinion Fraud Detection in Online Reviews by Network Effects
Cited by 8 (1 self)
User-generated online reviews can play a significant role in the success of retail products, hotels, restaurants, etc. However, review systems are often targeted by opinion spammers who seek to distort the perceived quality of a product by creating fraudulent reviews. We propose a fast and effective framework, FRAUDEAGLE, for spotting fraudsters and fake reviews in online review datasets. Our method has several advantages: (1) it exploits the network effect among reviewers and products, unlike the vast majority of existing methods that focus on review text or behavioral analysis, (2) it consists of two complementary steps: scoring users and reviews for fraud detection, and grouping for visualization and sensemaking, (3) it operates in a completely unsupervised fashion requiring no labeled data, while still incorporating side information if available, and (4) it is scalable to large datasets as its run time grows linearly with network size. We demonstrate the effectiveness of our framework on synthetic and real datasets, where FRAUDEAGLE successfully reveals fraud-bots in a large online app review database.
Feature-enhanced probabilistic models for diffusion network inference
- In European conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD’12
, 2012
Cited by 7 (0 self)
Cascading processes, such as disease contagion, viral marketing, and information diffusion, are a pervasive phenomenon in many types of networks. The problem of devising intervention strategies to facilitate or inhibit such processes has recently received considerable attention. However, a major challenge is that the underlying network is often unknown. In this paper, we revisit the problem of inferring latent network structure given observations from a diffusion process, such as the spread of trending topics in social media. We define a family of novel probabilistic models that can explain recurrent cascading behavior, and take into account not only the time differences between events but also a richer set of additional features. We show that MAP inference is tractable and can therefore scale to very large real-world networks. Further, we demonstrate the effectiveness of our approach by inferring the underlying network structure of a subset of the popular Twitter following network by analyzing the topics of a large number of messages posted by users over a 10-month period. Experimental results show that our models accurately recover the links of the Twitter network, and significantly improve the performance over previous models based entirely on time.
A deep architecture for matching short texts
- In: Advances in Neural Information Processing Systems
, 2013
Cited by 4 (0 self)
Many machine learning problems can be interpreted as learning for matching two types of objects (e.g., images and captions, users and products, queries and documents, etc.). The matching level of two objects is usually measured as the inner product in a certain feature space, while the modeling effort focuses on mapping of objects from the original space to the feature space. This schema, although proven successful on a range of matching tasks, is insufficient for capturing the rich structure in the matching process of more complicated objects. In this paper, we propose a new deep architecture to more effectively model the complicated matching relations between two objects from heterogeneous domains. More specifically, we apply this model to matching tasks in natural language, e.g., finding sensible responses for a tweet, or relevant answers to a given question. This new architecture naturally combines the localness and hierarchy intrinsic to the natural language problems, and therefore greatly improves upon the state-of-the-art models.
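The baseline schema this abstract refers to, scoring a pair by an inner product after mapping each object into a shared feature space, can be sketched as follows. This illustrates only that baseline, not the proposed deep architecture. The linear maps, the function names, and the toy vectors are assumptions made for the example.

```python
def project(matrix, vector):
    """Apply a linear map (a list of rows) to a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def matching_score(left_map, right_map, x, y):
    """Inner product of the two projected objects in the shared feature space."""
    u = project(left_map, x)
    v = project(right_map, y)
    return sum(a * b for a, b in zip(u, v))


# Hypothetical toy setup: identity maps, so the score is the plain dot product.
identity = [[1.0, 0.0], [0.0, 1.0]]
score = matching_score(identity, identity, [1.0, 2.0], [3.0, 4.0])
```

In a learned model the two maps would be fitted from data, and they may differ for the two object types (for example, one for queries and one for documents).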
Joint Link Prediction and Attribute Inference using a Social-Attribute Network
Cited by 3 (1 self)
The effects of social influence and homophily suggest that both network structure and node attribute information should inform the tasks of link prediction and node attribute inference. Recently, Yin et al. [Yin et al. 2010a; 2010b] proposed an attribute-augmented social network model, which we call the Social-Attribute Network (SAN), to integrate network structure and node attributes to perform both link prediction and attribute inference. They focused on generalizing the random walk with restart algorithm to the SAN framework and showed improved performance. In this paper, we extend the SAN framework with several leading supervised and unsupervised link prediction algorithms and demonstrate performance improvement for each algorithm on both link prediction and attribute inference. Moreover, we make the novel observation that attribute inference can help inform link prediction, i.e., link prediction accuracy is further improved by first inferring missing attributes. We comprehensively evaluate these algorithms and compare them with
Learning Structured Models with the AUC Loss and Its Generalizations
Cited by 2 (0 self)
Many problems involve the prediction of multiple, possibly dependent labels. The structured output prediction framework builds predictors that take these dependencies into account and use them to improve accuracy. In many such tasks, performance is evaluated by the Area Under the ROC Curve (AUC). While a framework for optimizing the AUC loss for unstructured models exists, it does not naturally extend to structured models. In this work, we propose a representation and learning formulation for optimizing structured models over the AUC loss, show how our approach generalizes the unstructured case, and provide algorithms for solving the resulting inference and learning problems. We also explore several new variants of the AUC measure which naturally arise from our formulation. Finally, we empirically show the utility of our approach in several domains.
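For readers unfamiliar with the measure, the AUC of a scoring function equals the fraction of (positive, negative) example pairs in which the positive example receives the higher score, with ties counted as half. The sketch below computes it directly from that pairwise definition. It shows the evaluation measure only, not the structured learning formulation proposed in the paper, and the function name and sample data are assumptions.

```python
def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0  # positive ranked above negative
            elif p == n:
                correct += 0.5  # ties count as half
    return correct / (len(positives) * len(negatives))


# Hypothetical scores: one positive (0.3) is ranked below one negative (0.8).
auc = pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])  # 3 of 4 pairs correct
```

Because the measure is defined over pairs of examples rather than individual predictions, the corresponding loss does not decompose over single labels, which is part of what makes it harder to optimize than per-label losses.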
Organizational Overlap on Social Networks and its Applications
Cited by 1 (1 self)
Online social networks have become important for networking, communication, sharing, and discovery. A considerable challenge these networks face is the fact that an online social network is partially observed because two individuals might know each other, but may not have established a connection on the site. Therefore, link prediction and recommendations are important tasks for any online social network. In this paper, we address the problem of computing edge affinity between two users on a social network, based on the users belonging to organizations such as companies, schools, and online groups. We present experimental insights from social network data on organizational overlap, a novel mathematical model to compute the probability of connection between two people based on organizational overlap, and experimental validation of this model based on real social network data. We also present novel ways in which the organization overlap model can be applied to link prediction and community detection, which in itself could be useful for recommending entities to follow and generating personalized news feed.