Results 1 - 10
of
233
Measuring the mixing time of social graphs
, 2010
"... Social networks provide interesting algorithmic properties that can be used to bootstrap the security of distributed systems. For example, it is widely believed that social networks are fast mixing, and many recently proposed designs of such systems make crucial use of this property. However, whethe ..."
Abstract
-
Cited by 59 (11 self)
- Add to MetaCart
(Show Context)
Social networks provide interesting algorithmic properties that can be used to bootstrap the security of distributed systems. For example, it is widely believed that social networks are fast mixing, and many recently proposed designs of such systems make crucial use of this property. However, whether real-world social networks are really fast mixing is not verified before, and this could potentially affect the performance of such systems based on the fast mixing property. To address this problem, we measure the mixing time of several social graphs, the time that it takes a random walk on the graph to approach the stationary distribution of that graph, using two techniques. First, we use the second largest eigenvalue modulus which bounds the mixing time. Second, we sample initial distributions and compute the random walk length required to achieve probability distributions close to the stationary distribution. Our findings show that the mixing time of social graphs is much larger than anticipated, and being used in literature, and this implies that either the current security systems based on fast mixing have weaker utility guarantees or have to be less efficient, with less security guarantees, in order to compensate for the slower mixing.
Inferring social ties across heterogeneous networks
- In WSDM’12
, 2012
"... It is well known that different types of social ties have essentially different influence between people. However, users in online social networks rarely categorize their contacts into “family”, “colleagues”, or “classmates”. While a bulk of research has focused on inferring particular types of rela ..."
Abstract
-
Cited by 46 (19 self)
- Add to MetaCart
(Show Context)
It is well known that different types of social ties have essentially different influence between people. However, users in online social networks rarely categorize their contacts into “family”, “colleagues”, or “classmates”. While a bulk of research has focused on inferring particular types of relationships in a specific social network, few publications systematically study the generalization of the problem of inferring social ties over multiple heterogeneous networks. In this work, we develop a framework for classifying the type of social relationships by learning across heterogeneous networks. The framework incorporates social theories into a machine learning model, which effectively improves the accuracy of inferring the type of social relationships in a target network, by borrowing knowledge from a different source network. Our empirical study on five different genres of networks validates the effectiveness of the proposed framework. For example, by leveraging information from a coauthor network with labeled advisor-advisee relationships, the proposed framework is able to obtain an F1-score of 90 % (8-28 % improvements over alternative methods) for inferring manager-subordinate relationships in an enterprise email network.
Who will follow you back? reciprocal relationship prediction
- In CIKM’11
, 2011
"... We study the extent to which the formation of a two-way relation-ship can be predicted in a dynamic social network. A two-way (called reciprocal) relationship, usually developed from a one-way (parasocial) relationship, represents a more trustful relationship be-tween people. Understanding the forma ..."
Abstract
-
Cited by 46 (14 self)
- Add to MetaCart
(Show Context)
We study the extent to which the formation of a two-way relation-ship can be predicted in a dynamic social network. A two-way (called reciprocal) relationship, usually developed from a one-way (parasocial) relationship, represents a more trustful relationship be-tween people. Understanding the formation of two-way relation-ships can provide us insights into the micro-level dynamics of the social network, such as what is the underlying community structure and how users influence each other. Employing Twitter as a source for our experimental data, we propose a learning framework to formulate the problem of recipro-cal relationship prediction into a graphical model. The framework incorporates social theories into a machine learning model. We demonstrate that it is possible to accurately infer 90 % of reciprocal relationships in a dynamic network. Our study provides strong ev-idence of the existence of the structural balance among reciprocal relationships. In addition, we have some interesting findings, e.g., the likelihood of two “elite ” users creating a reciprocal relation-ships is nearly 8 times higher than the likelihood of two ordinary users. More importantly, our findings have potential implications such as how social structures can be inferred from individuals ’ be-haviors.
Discovering Value from Community Activity on Focused Question Answering Sites: A Case Study of Stack Overflow
- Proc. KDD, 2012
"... Question answering (Q&A) websites are now large repositories of valuable knowledge. While most Q&A sites were initially aimed at providing useful answers to the question asker, there has been a marked shift towards question answering as a community-driven knowledge creation process whose end ..."
Abstract
-
Cited by 39 (0 self)
- Add to MetaCart
(Show Context)
Question answering (Q&A) websites are now large repositories of valuable knowledge. While most Q&A sites were initially aimed at providing useful answers to the question asker, there has been a marked shift towards question answering as a community-driven knowledge creation process whose end product can be of enduring value to a broad audience. As part of this shift, specific expertise and deep knowledge of the subject at hand have become increasingly important, and many Q&A sites employ voting and reputation mechanisms as centerpieces of their design to help users identify the trustworthiness and accuracy of the content. To better understand this shift in focus from one-off answers to a group knowledge-creation process, we consider a question together with its entire set of corresponding answers as our fundamental unit of analysis, in contrast with the focus on individual questionanswer pairs that characterized previous work. Our investigation considers the dynamics of the community activity that shapes the set of answers, both how answers and voters arrive over time and how this influences the eventual outcome. For example, we observe significant assortativity in the reputations of co-answerers, relationships between reputation and answer speed, and that the probability of an answer being chosen as the best one strongly depends on temporal characteristics of answer arrivals. We then show that our understanding of such properties is naturally applicable to predicting several important quantities, including the long-term value of the question and its answers, as well as whether a question requires a better answer. Finally, we discuss the implications of these results for the design of Q&A sites.
Learning to infer social ties in large networks
- In PKDD
, 2011
"... Abstract. In online social networks, most relationships are lack of meaning labels (e.g., “colleague ” and “intimate friends”), simply because users do not take the time to label them. An interesting question is: can we automatically infer the type of social relationships in a large network? what ar ..."
Abstract
-
Cited by 36 (16 self)
- Add to MetaCart
(Show Context)
Abstract. In online social networks, most relationships are lack of meaning labels (e.g., “colleague ” and “intimate friends”), simply because users do not take the time to label them. An interesting question is: can we automatically infer the type of social relationships in a large network? what are the fundamental factors that imply the type of social relation-ships? In this work, we formalize the problem of social relationship learn-ing into a semi-supervised framework, and propose a Partially-labeled Pairwise Factor Graph Model (PLP-FGM) for learning to infer the type of social ties. We tested the model on three different genres of data sets: Publication, Email and Mobile. Experimental results demonstrate that the proposed PLP-FGM model can accurately infer 92.7 % of advisor-advisee relationships from the coauthor network (Publication), 88.0 % of manager-subordinate relationships from the email network (Email), and 83.1 % of the friendships from the mobile network (Mobile). Finally, we develop a distributed learning algorithm to scale up the model to real large networks. 1
Echoes of power: language effects and power differences in social interaction.
- In Proceedings of the 21st international conference on World Wide Web, WWW ’12,
, 2012
"... ABSTRACT Understanding social interaction within groups is key to analyzing online communities. Most current work focuses on structural properties: who talks to whom, and how such interactions form larger network structures. The interactions themselves, however, generally take place in the form of ..."
Abstract
-
Cited by 36 (2 self)
- Add to MetaCart
(Show Context)
ABSTRACT Understanding social interaction within groups is key to analyzing online communities. Most current work focuses on structural properties: who talks to whom, and how such interactions form larger network structures. The interactions themselves, however, generally take place in the form of natural language -either spoken or written -and one could reasonably suppose that signals manifested in language might also provide information about roles, status, and other aspects of the group's dynamics. To date, however, finding domain-independent language-based signals has been a challenge. Here, we show that in group discussions, power differentials between participants are subtly revealed by how much one individual immediately echoes the linguistic style of the person they are responding to. Starting from this observation, we propose an analysis framework based on linguistic coordination that can be used to shed light on power relationships and that works consistently across multiple types of power -including a more "static" form of power based on status differences, and a more "situational" form of power in which one individual experiences a type of dependence on another. Using this framework, we study how conversational behavior can reveal power relationships in two very different settings: discussions among Wikipedians and arguments before the U. S. Supreme Court.
mTrust: Discerning Multi-Faceted Trust in a Connected World
"... Traditionally, research about trust assumes a single type of trust between users. However, trust, as a social concept, inherently has many facets indicating multiple and heterogeneous trust relationships between users. Due to the presence of a large trust network for an online user, it is necessary ..."
Abstract
-
Cited by 29 (16 self)
- Add to MetaCart
(Show Context)
Traditionally, research about trust assumes a single type of trust between users. However, trust, as a social concept, inherently has many facets indicating multiple and heterogeneous trust relationships between users. Due to the presence of a large trust network for an online user, it is necessary to discern multi-faceted trust as there are naturally experts of different types. Our study in product review sites reveals that people place trust differently to different people. Since the widely used adjacency matrix cannot capture multi-faceted trust relationships between users, we propose a novel approach by incorporating these relationships into traditional rating prediction algorithms to reliably estimate their strengths. Our work results in interesting findings such as heterogeneous pairs of reciprocal links. Experimental results on real-world data from Epinions and Ciao show that our work of discerning multi-faceted trust can be applied to improve the performance of tasks such as rating prediction, facet-sensitive ranking, and status theory.
Fast and accurate estimation of shortest paths in large graphs
- In Proceedings of Conference on Information and Knowledge Management (CIKM
, 2010
"... Computing shortest paths between two given nodes is a fundamental operation over graphs, but known to be nontrivial over large disk-resident instances of graph data. While a numberoftechniquesexistfor answeringreachabilityqueries and approximating node distances efficiently, determining actual short ..."
Abstract
-
Cited by 28 (1 self)
- Add to MetaCart
(Show Context)
Computing shortest paths between two given nodes is a fundamental operation over graphs, but known to be nontrivial over large disk-resident instances of graph data. While a numberoftechniquesexistfor answeringreachabilityqueries and approximating node distances efficiently, determining actual shortest paths (i.e. the sequence of nodes involved) is often neglected. However, in applications arising in massive online social networks, biological networks, and knowledge graphs it is often essential to find out many, if not all, shortest paths between two given nodes. In this paper, we address this problem and present a scalable sketch-based index structure that not only supports estimation of node distances, but also computes corresponding shortest paths themselves. Generating the actual path information allows for further improvements to the estimation accuracy of distances (and paths), leading to near-exact shortest-path approximations in real world graphs. We evaluate our techniques – implemented within a fully functional RDF graph database system – over large realworld social and biological networks of sizes ranging from tens of thousand to millions of nodes and edges. Experiments on several datasets show that we can achieve query response times providing several orders of magnitude speedup over traditional path computations while keeping the estimation errors between 0 % and 1 % on average.
Keep your friends close: Incorporating trust into social network-based sybil defenses
- in Proc. of INFOCOM, 2011
"... Abstract—Social network-based Sybil defenses exploit the algorithmic properties of social graphs to infer the extent to which an arbitrary node in such a graph should be trusted. However, these systems do not consider the different amounts of trust represented by different graphs, and different leve ..."
Abstract
-
Cited by 26 (6 self)
- Add to MetaCart
Abstract—Social network-based Sybil defenses exploit the algorithmic properties of social graphs to infer the extent to which an arbitrary node in such a graph should be trusted. However, these systems do not consider the different amounts of trust represented by different graphs, and different levels of trust between nodes, though trust is being a crucial requirement in these systems. For instance, co-authors in an academic collaboration graph are trusted in a different manner than social friends. Furthermore, some social friends are more trusted than others. However, previous designs for social network-based Sybil defenses have not considered the inherent trust properties of the graphs they use. In this paper we introduce several designs to tune the performance of Sybil defenses by accounting for differential trust in social graphs and modeling these trust values by biasing random walks performed on these graphs. Surprisingly, we find that the cost function, the required length of random walks to accept all honest nodes with overwhelming probability, is much greater in graphs with high trust values, such as co-author graphs, than in graphs with low trust values such as online social networks. We show that this behavior is due to the community structure in high-trust graphs, requiring longer walk to traverse multiple communities. Furthermore, we show that our proposed designs to account for trust, while increase the cost function of graphs with low trust value, decrease the advantage of attacker. I.
Towards unbiased BFS sampling
- SELECTED AREAS IN COMMUNICATIONS, IEEE JOURNAL ON
, 2011
"... Breadth First Search (BFS) is a widely used approach for sampling large graphs. However, it has been empirically observed that BFS sampling is biased toward high-degree nodes, which may strongly affect the measurement results. In this paper, we quantify and correct the degree bias of BFS. First, we ..."
Abstract
-
Cited by 26 (4 self)
- Add to MetaCart
(Show Context)
Breadth First Search (BFS) is a widely used approach for sampling large graphs. However, it has been empirically observed that BFS sampling is biased toward high-degree nodes, which may strongly affect the measurement results. In this paper, we quantify and correct the degree bias of BFS. First, we consider a random graph RG(pk) with an arbitrary degree distribution pk. For this model, we calculate the node degree distribution expected to be observed by BFS as a function of the fraction f of covered nodes. We also show that, for RG(pk), all commonly used graph traversal techniques (BFS, DFS, Forest Fire, Snowball Sampling, RDS) have exactly the same bias. Next, we propose a practical BFS-bias correction procedure that takes as input a collected BFS sample together with the fraction f. Our correction technique is exact (i.e., leads to unbiased estimation) for RG(pk). Furthermore, it performs well when applied to a broad range of Internet topologies and to two large BFS samples of Facebook and Orkut networks.