## Analysis of Topological Characteristics of Huge Online Social Networking Services. (2007)

Venue: | In Proc. of ACM WWW, |

Citations: | 260 - 6 self |

### Citations

3936 | Emergence of scaling in random networks
- Barabási, Albert
- 1999
Citation Context ...t, the focus of most sociological studies has been interactions in small groups, not structures of large and extensive networks. Difficulty in obtaining large data sets was one reason behind the lack of structural study. However, as reported in [2] recently, missing data may distort the statistics severely and it is imperative to use large data sets in network structure analysis. It is only very recently that we have seen research results from large networks. Novel network structures from human societies and communication systems have been unveiled; just to name a few are the Internet and WWW [3] and the patents, Autonomous Systems (AS), and affiliation networks [4]. Even in the short history of the Internet, SNSs are a fairly new phenomenon and their network structures are not yet studied carefully. The social networks of SNSs are believed to reflect the real-life social relationships of people more accurately than any other online networks. Moreover, because of their size, they offer an unprecedented opportunity to study human social networks. In this paper, we pose and answer the following questions: What are the main characteristics of online social networks? Ever since the scale-... |

1288 |
The Small World Problem
- Milgram
- 1967
Citation Context ...d as a unique characteristic of social networks and its origin was suggested as rich community structures of human relationships [19]. As introduced in Stanley Milgram’s experiment of mail forwarding [20], the degree of separation (ℓ) is the mean distance between any two nodes of the network. Accurate calculation of the degree of separation or the average path length, as we call it in this paper, requ... |

1012 | Error and attack tolerance of complex networks. Nature
- Albert, Jeong, et al.
- 2000
Citation Context ...l, but it is very difficult to get a power-law degree distribution from a network without the power-law decaying degree distribution. 4.2 Metrics of interest We begin the analysis of online social network topologies by looking at their degree distributions. Networks of a power-law degree distribution, $P(k) \sim k^{-\gamma}$, where k is the node degree and γ ≤ 3, attest to the existence of a relatively small number of nodes with a very large number of links. These networks also have distinguishing properties, such as vanishing epidemic threshold, ultra-small worldness, and robustness under random errors [11, 12, 13, 14]. The degree distribution is often plotted as a complementary cumulative probability function (CCDF), $\wp(k) \equiv \int_k^\infty P(k')\,dk' \sim k^{-\alpha} \sim k^{-(\gamma-1)}$. As a power-law distribution shows up as a straight line in a log-log plot, the exponent of a power-law distribution is a representative characteristic, distinguishing one from others. [WWW 2007 / Track: Semantic Web, Session: Semantic Web and Web 2.0, p. 837] Next, we examine the clustering coefficient. The clustering coefficient of a node is the ratio of the number of existing links over the number of possible links between its neighbors. Given a network G = (... |
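The node-level clustering coefficient described in this context (existing links among a node's neighbors divided by the possible links among them) can be sketched as follows. This is a minimal illustration, not the paper's code: the function name `local_clustering` and the adjacency-dictionary representation are assumptions, and the graph is assumed to be undirected and simple.

```python
from itertools import combinations


def local_clustering(adjacency, node):
    """Clustering coefficient of `node` in an undirected simple graph.

    `adjacency` is a hypothetical representation: a dict mapping each
    node to the set of its neighbors.
    """
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        # Fewer than two neighbors means no neighbor pair exists;
        # returning 0.0 here is a convention chosen for this sketch.
        return 0.0
    # Count neighbor pairs that are themselves linked.
    existing = sum(1 for u, v in combinations(neighbors, 2) if v in adjacency[u])
    possible = k * (k - 1) / 2
    return existing / possible


# A triangle (0, 1, 2) with one pendant node (3) attached to node 0.
example_graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
```

In the example graph, node 0 has three neighbors of which only one pair (1, 2) is linked, so one of three possible links exists.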

634 |
Random Graphs with Arbitrary Degree Distributions and their Applications
- Newman, Strogatz, et al.
- 2001
Citation Context ...eadth-first search. By extrapolating this number sequence, we predict how many steps are needed to cover the entire network, and obtain an estimate of the average path length by the following formula [21]: $\frac{\log(N/n_1)}{\log(n_2/n_1)} + 1$ (4), where N is the total number of nodes and $n_1$ and $n_2$ are the average numbers of first and second neighbors respectively. Palmer et al. propose an approximation for the ef... |
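As a worked illustration of formula (4) in this context, the sketch below evaluates log(N/n1)/log(n2/n1) + 1 for given neighbor counts. The function and parameter names are assumptions introduced here, and the input check is an added safeguard rather than part of the cited formula.

```python
import math


def estimate_average_path_length(total_nodes, avg_first_neighbors, avg_second_neighbors):
    """Evaluate log(N/n1) / log(n2/n1) + 1 from formula (4).

    All parameter names are hypothetical; they stand for N, n1 and n2.
    """
    if avg_first_neighbors <= 0 or avg_second_neighbors <= avg_first_neighbors:
        # The ratio n2/n1 must exceed 1 for the logarithm to be positive.
        raise ValueError("expected 0 < n1 < n2")
    return (
        math.log(total_nodes / avg_first_neighbors)
        / math.log(avg_second_neighbors / avg_first_neighbors)
        + 1.0
    )
```

For instance, with N = 1000, n1 = 10 and n2 = 100 the neighborhood grows tenfold per step, so the estimate comes out at about 3 steps.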

607 | Power-law distributions in empirical data
- Clauset, Shalizi, et al.
Citation Context ...ibution is a representative characteristic, distinguishing one from others. Recently, the method of maximum likelihood was suggested as an unbiased and accurate estimator of power-law exponent [15], [16]. The approximate expression for the power-law exponent in the discrete case is given by the following expression: $\gamma \simeq 1 + n \left[ \sum_{i=1}^{n} \ln \frac{k_i}{k_{\min} - \frac{1}{2}} \right]^{-1}$ (1), where $k_{\min}$ is the point where the power... |
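The discrete maximum-likelihood approximation in expression (1) can be sketched in a few lines. This is an illustrative sketch only: the function name is an assumption, and restricting the sum to degrees at or above k_min is an assumption about how the n samples are chosen, since the excerpt is truncated before it says so.

```python
import math


def estimate_power_law_exponent(degrees, k_min):
    """Approximate gamma as 1 + n / sum(ln(k_i / (k_min - 1/2))).

    Only degrees >= k_min enter the sum (an assumption of this sketch);
    n is the number of such degrees.
    """
    if k_min < 1:
        raise ValueError("k_min must be at least 1")
    tail = [k for k in degrees if k >= k_min]
    if not tail:
        raise ValueError("no degrees at or above k_min")
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum
```

The function returns a single point estimate; choosing k_min itself is a separate step that this sketch leaves to the caller.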

575 | Epidemic spreading in scale-free networks.
- Pastor-Satorras, Vespignani
- 2001
Citation Context ...l, but it is very difficult to get a power-law degree distribution from a network without the power-law decaying degree distribution. 4.2 Metrics of interest We begin the analysis of online social network topologies by looking at their degree distributions. Networks of a power-law degree distribution, $P(k) \sim k^{-\gamma}$, where k is the node degree and γ ≤ 3, attest to the existence of a relatively small number of nodes with a very large number of links. These networks also have distinguishing properties, such as vanishing epidemic threshold, ultra-small worldness, and robustness under random errors [11, 12, 13, 14]. The degree distribution is often plotted as a complementary cumulative probability function (CCDF), $\wp(k) \equiv \int_k^\infty P(k')\,dk' \sim k^{-\alpha} \sim k^{-(\gamma-1)}$. As a power-law distribution shows up as a straight line in a log-log plot, the exponent of a power-law distribution is a representative characteristic, distinguishing one from others. Next, we examine the clustering coefficient. The clustering coefficient of a node is the ratio of the number of existing links over the number of possible links between its neighbors. Given a network G = (... |

540 | Graphs Over Time: Densification Laws, Shrinking Diameters and Possible Explanations
- Leskovec, Kleinberg, et al.
- 2005
Citation Context ...etwork structures from human societies and communication systems have been unveiled; just to name a few are the Internet and WWW [3] and the patents, Autonomous Systems (AS), and affiliation networks [4]. Even in the short history of the Internet, SNSs are a fairly new phenomenon and their network structures are not yet studied carefully. The social networks of SNSs are believed to reflect the real-l... |

475 |
Assortative Mixing in Networks
- Newman
- 2002
Citation Context ...odes of degree k. Its distribution is often characterized by the assortativity (r), which is defined as the Pearson correlation coefficient of the degrees of either nodes which is connected by a link [17]. It is expressed as follows: $r = \frac{\langle k_i k_j \rangle - \langle k_i \rangle \langle k_j \rangle}{\sqrt{(\langle k_i^2 \rangle - \langle k_i \rangle^2)(\langle k_j^2 \rangle - \langle k_j \rangle^2)}}$ (3), where $k_i$ and $k_j$ are degrees of the nodes located at either end of a link and the $\langle \cdot \rangle$ notation repre... |
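The assortativity in expression (3), the Pearson correlation of the degrees at the two ends of each link, can be sketched as below. The function name and the edge-list input are assumptions. So is the choice to count every undirected link in both orientations, which keeps the two end-of-link averages identical; the excerpt does not say how the paper handles orientation.

```python
import math
from collections import Counter


def assortativity(edges):
    """Pearson correlation of end-point degrees over all links.

    `edges` is a hypothetical input: a list of (u, v) pairs describing an
    undirected simple graph without self-loops.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    # Each link contributes both (k_u, k_v) and (k_v, k_u).
    pairs = []
    for u, v in edges:
        pairs.append((degree[u], degree[v]))
        pairs.append((degree[v], degree[u]))
    if not pairs:
        raise ValueError("graph has no links")

    m = len(pairs)
    mean_i = sum(ki for ki, _ in pairs) / m
    mean_j = sum(kj for _, kj in pairs) / m
    mean_ij = sum(ki * kj for ki, kj in pairs) / m
    var_i = sum(ki * ki for ki, _ in pairs) / m - mean_i ** 2
    var_j = sum(kj * kj for _, kj in pairs) / m - mean_j ** 2

    denominator = math.sqrt(var_i * var_j)
    if denominator == 0:
        # All end-point degrees are equal, so the correlation is undefined.
        raise ValueError("degree variance is zero")
    return (mean_ij - mean_i * mean_j) / denominator


star_edges = [(0, 1), (0, 2), (0, 3)]   # one hub linked to three leaves
path_edges = [(0, 1), (1, 2), (2, 3)]   # a four-node chain
```

A star is the extreme disassortative case, since every link joins the hub to a degree-one node, so its coefficient is negative.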

434 |
The spread of epidemic disease on networks,
- Newman
- 2002
Citation Context ...r of nodes with a very large number of links. These networks also have distinguishing properties, such as vanishing epidemic threshold, ultra-small worldness, and robustness under random errors [11], [12], [13], [14]. The degree distribution is often plotted as a complementary cumulative probability function (CCDF), $\wp(k) \equiv \int_k^\infty P(k')\,dk' \sim k^{-\alpha} \sim k^{-(\gamma-1)}$. As a power-law distribution shows up as a s... |

413 | Power laws, Pareto distributions and Zipf's law.
- Newman
- 2005
Citation Context ... distribution is a representative characteristic, distinguishing one from others. Recently, the method of maximum likelihood was suggested as an unbiased and accurate estimator of power-law exponent [15], [16]. The approximate expression for the power-law exponent in the discrete case is given by the following expression: $\gamma \simeq 1 + n \left[ \sum_{i=1}^{n} \ln \frac{k_i}{k_{\min} - \frac{1}{2}} \right]^{-1}$ (1), where $k_{\min}$ is the point where the... |

276 |
Mixing patterns in networks
- Newman
- 2003
Citation Context ...ave an assortative mixing pattern, and when r < 0, disassortative mixing. Most social networks exhibit an assortative mixing pattern, whereas other networks show a disassortative mixing pattern [17], [18]. The assortative mixing pattern is considered as a unique characteristic of social networks and its origin was suggested as rich community structures of human relationships [19]. As introduced in Sta... |

260 | Scientific collaboration networks: I. Network construction and fundamental results.
- Newman
- 2001
Citation Context ... of large-scale social networks have received much attention since the uncovering of the network of movie actors [3]. It is followed by the analysis on the network of scientific collaboration network [5] and the web of human sexual contacts [6]. However, a link in these networks is different from a normal friend relationship and the large-scale analysis on the such networks has remained uncharted. Re... |

232 | Geographic routing in social networks.
- Liben-Nowell, Novak, et al.
- 2005
Citation Context ...layers: messages, guest book, and flirts. An interesting feature is super-heavy tails which go beyond the trend of small degree region. Another work investigates an online blog community, LiveJournal [9]. The number of users examined is 1,312,454, about half of whom publicize their snail mail addresses. By examining this partial list of real addresses of bloggers, the work uncovers the connection b... |

223 |
Coevolution of neocortical size, group size and language in humans.
- Dunbar
- 1993
Citation Context ...it tells us that there is a certain limit in the number of friends. In sociology and anthropology, the theoretical limit in the number of social relationship is known as Dunbar’s number and it is 150 [25]. Intriguingly, the characteristic number of friends, 46, is one third of Dunbar’s number and the ratio coincides with the ratio between the number of Cyworld users and the South Korean population. 10... |

218 |
Why social networks are different from other types of networks.
- Newman, Park
- 2003
Citation Context ...cated at either end of a link and the $\langle \cdot \rangle$ notation represents the average over all links. If a network’s assortativity is negative, a hub tends to be connected to non-hubs, and vice versa. When r > 0, we call the network to have an assortative mixing pattern, and when r < 0, disassortative mixing. Most social networks exhibit an assortative mixing pattern, whereas other networks show a disassortative mixing pattern [15, 16]. The assortative mixing pattern is considered as a unique characteristic of social networks and its origin was suggested as rich community structures of human relationships [17]. As introduced in Stanley Milgram’s experiment of mail forwarding [18], the degree of separation (ℓ) is the mean distance between any two nodes of the network. Accurate calculation of the degree of separation or the average path length, as we call it in this paper, requires the knowledge of the entire topology and the time complexity of O(NL), where L is the number of links and N is the number of nodes. In huge networks like Cyworld, MySpace, and orkut, the calculation is infeasible. Only approximation is possible. From a snowball sample network, we measure the number of nodes at each round o... |

132 |
The web of human sexual contacts.
- Liljeros, Edling, et al.
- 2001
Citation Context ...ived much attention since the uncovering of the network of movie actors [3]. It is followed by the analysis on the network of scientific collaboration network [5] and the web of human sexual contacts [6]. However, a link in these networks is different from a normal friend relationship and the large-scale analysis on the such networks has remained uncharted. Recently, the rapid growth of online social... |

121 | ANF: A fast and scalable tool for data mining in massive graphs.
- Palmer, Gibbons, et al.
- 2002
Citation Context ...e N is the total number of nodes and n1 and n2 are the average numbers of first and second neighbors respectively. Palmer et al. propose an approximation for the effective diameter of a massive graph [22]. The effective diameter is the 90th-percentile of the path length distribution, and is a better metric than the maximum diameter in estimating the network size, as the maximum diameter can be an outl... |

116 | Competition and multiscaling in evolving networks.
- Bianconi, Barabasi
- 2001
Citation Context ...example. Another noticeable viewpoint is fitness-based approaches. In any fitness-based approach, each node has its own fitness value and they are linked by the function of their fitness values [27], [28], [29]. In the case of the online social networks, both the preferential attachment and the fitness-based approach may contribute. More attractive and active persons are likely to have many online frie... |

115 |
Universal behavior of load distribution in scale-free networks.
- Goh, Kahng, et al.
- 2001
Citation Context ... such example. Another noticeable viewpoint is fitness-based approaches. In any fitness-based approach, each node has its own fitness value and they are linked by the function of their fitness values [27], [28], [29]. In the case of the online social networks, both the preferential attachment and the fitness-based approach may contribute. More attractive and active persons are likely to have many onlin... |

95 |
Statistical properties of sampled networks.
- Lee, Kim, et al.
- 2006
Citation Context ...gree region. The huge size of online communities makes the sampling an inevitable process in analyzing the networks. Recently, extensive simulations are performed for several network sampling methods [10] and the effect of missing data in social network analysis is studied [2]. III. ONLINE SNSS Social networking services (SNSs) provide users with an online presence that contains shareable personal inf... |

95 |
Error and attack tolerance of complex networks
- Albert, Jeong, Barabási
- 2000
Citation Context ...ith a very large number of links. These networks also have distinguishing properties, such as vanishing epidemic threshold, ultra-small worldness, and robustness under random errors [11], [12], [13], [14]. The degree distribution is often plotted as a complementary cumulative probability function (CCDF), $\wp(k) \equiv \int_k^\infty P(k')\,dk' \sim k^{-\alpha} \sim k^{-(\gamma-1)}$. As a power-law distribution shows up as a straight line... |

84 |
Scale-free networks are ultrasmall.
- Cohen, Havlin
- 2003
Citation Context ...odes with a very large number of links. These networks also have distinguishing properties, such as vanishing epidemic threshold, ultra-small worldness, and robustness under random errors [11], [12], [13], [14]. The degree distribution is often plotted as a complementary cumulative probability function (CCDF), $\wp(k) \equiv \int_k^\infty P(k')\,dk' \sim k^{-\alpha} \sim k^{-(\gamma-1)}$. As a power-law distribution shows up as a straigh... |

80 | Effects of missing data in social networks.
- Kossinets
- 2003
Citation Context ... been interactions in small groups, not structures of large and extensive networks. Difficulty in obtaining large data sets was one reason behind the lack of structural study. However, as reported in [2] recently, missing data may distort the statistics severely and it is imperative to use large data sets in network structure analysis. It is only very recently that we have seen research results from ... |

71 | Scale-free networks from varying vertex intrinsic fitness.
- Caldarelli, Capocci, et al.
- 2002
Citation Context ...e. Another noticeable viewpoint is fitness-based approaches. In any fitness-based approach, each node has its own fitness value and they are linked by the function of their fitness values [27], [28], [29]. In the case of the online social networks, both the preferential attachment and the fitness-based approach may contribute. More attractive and active persons are likely to have many online friends. M... |

61 | Structure and time-evolution of an internet dating community.
- Holme, Edling, et al.
- 2004
Citation Context ...er, a link in these networks is different from a normal friend relationship and the large-scale analysis on the such networks has remained uncharted. Recently, the rapid growth of online social networking services made it possible to investigate the huge online social network directly. Since the rise of Cyworld, many SNSs including MySpace and orkut have grown. However, the analyses on these huge networks have been limited to cultural and business viewpoint [7]. Here, we introduce two relevant works on online social networks. First work is on an Internet dating community, called pussokram.com [8]. The dataset consists of about 30,000 users and time series of all interactions. By network analysis, fat tails are found in all degree distributions from the networks made by several interaction layers: messages, guest book, and flirts. An interesting feature is super-heavy tails which go beyond the trend of small degree region. Another work investigates an online blog community, LiveJournal [9]. The number of users examined is 1,312,454, about half of whom publicize their snail mail addresses. By examining this partial list of real addresses of bloggers, the work uncovers the connection ... |

54 | Emergence of a small world from local interactions: modeling acquaintance networks.
- Davidsen, Ebel, et al.
- 2002
Citation Context ... power-law distributions is preferential attachment. Not only the well-known Barabási-Albert model, but also many other mechanisms implicitly use preferential attachment. The transitive linking model [26], which is based on continuously completing triangles with only an edge missing, is one such example. Another noticeable viewpoint is fitness-based approaches. In any fitness-based approach, each node... |

14 |
Why social networks are different from other types of networks
- Newman, Park
Citation Context ... mixing pattern [17], [18]. The assortative mixing pattern is considered as a unique characteristic of social networks and its origin was suggested as rich community structures of human relationships [19]. As introduced in Stanley Milgram’s experiment of mail forwarding [20], the degree of separation (ℓ) is the mean distance between any two nodes of the network. Accurate calculation of the degree of s... |

10 | Impact of snowball sampling ratios on network characteristics estimation: A case study of Cyworld.
- Kwak, Han, et al.
- 2006
Citation Context ...al clustering coefficient of orkut and MySpace is larger than those from sample networks. We plot the degree correlation distributions of the two sample and complete networks of Cyworld in Figure 6(c). The sample networks exhibit a more definite disassortative mixing pattern in their degree correlation distribution. The distributions from the two sample networks exhibit a clear decreasing pattern for k < 100 and then disperse. In our preliminary work, we have evaluated how close topological characteristics of snowball sampled networks are to the complete network as we vary the sampling ratios [22]. From our numerical analysis, we suggest a practical guideline on the sampling ratio for accurate estimation of the topological metrics, excluding the clustering coefficient, where the explicit sampling ratio for accurate estimation is charted for the other metrics; 0.25% or larger for degree distribution, 0.2% or larger for degree correlation, and 0.9% or larger for assortativity. In the case of the clustering coefficient, even with a sampling ratio of 2%, it is inconclusive if the clustering coefficient of the sample network has converged close to that of the complete network. In summary, w... |

4 |
Structure and time evolution of an internet dating community
- Holme, Edling, Liljeros
Citation Context ...networks have been limited to cultural and business viewpoint [7]. Here, we introduce two relevant works on online social networks. First work is on an Internet dating community, called pussokram.com [8]. The dataset consists of about 30,000 users and time series of all interactions. By network analysis, fat tails are found in all degree distributions from the networks made by several interaction la... |

3 | Digital cultural communication: Enabling new media and cocreation in southeast asia.
- Russo, Watkins
- 2005
Citation Context ...social network directly. Since the rise of Cyworld, many SNSs including MySpace and orkut have grown. However, the analyses on these huge networks have been limited to cultural and business viewpoint [7]. Here, we introduce two relevant works on online social networks. First work is on an Internet dating community, called pussokram.com [8]. The dataset consists of about 30,000 users and time series ... |

2 |
694 million people currently use the internet worldwide according to comScore networks,
- comscore
- 2006
Citation Context ... steadily decreases fitting to the y = 1/x line, confirming that our estimation would converge to the true average path length. 5.2 Historical analysis [Figure 3: Comparison of the national population, Internet users, and Cyworld users. Axes: population (millions) versus time; series: Population, Internet users, Weekly, Daily, Cyworld, Cyworld users w/o ilchon; marked data sets: Snapshot I, II, Set I, and Dated partial set.] As of March 2006, the population of Korea reached 48 million, and there were 24 million Internet users of ages over 15, the 6th largest in the world [21]. Internet users can be classified as “weekly” users, if they use the Internet at least once a week, or “daily” users, if they do every day. According to a market research company, Korean Click, the number of weekly users in 2000 was about 15 million and that of daily users was 10 million. When we extrapolate the numbers of daily and weekly users from 2000 to March 2006 along the increase in the overall Internet users, we obtain the graphs labeled as weekly and daily in Figure 3. ... |

1 |
Impact of snowball sampling ratios on network characteristics estimation: A case study of Cyworld
- Kwak, Han, et al.
- 2006
Citation Context ...for k < 100 and then disperse. In our preliminary work, we have evaluated how close topological characteristics of snowball sampled networks are to the complete network as we vary the sampling ratios [24]. From our numerical analysis, we suggest a practical guideline on the sampling ratio for accurate estimation of the topological metrics, excluding the clustering coefficient, where the explicit C(k) ... |