Improving Community Detection Methods for Network Data Analysis, 2014
Abstract
Empirical analysis of network data has been widely conducted for understanding and predicting the structure and function of real systems and identifying interesting patterns and anomalies. One of the most widely studied structural properties of networks is their community structure. In this thesis we investigate some of the challenges and applications of community detection for analysis of network data and propose different approaches for improving community detection methods. One of the challenges in using community detection for network data analysis is that there is no consensus on a definition of a community, despite the extensive studies that have been performed on the community structure of real networks. Therefore, evaluating the quality of the communities identified by different community detection algorithms is problematic. In this thesis, we perform an empirical comparison and evaluation of the quality of the communities identified by a variety of community detection algorithms, which use different definitions of communities, for different applications of network data analysis. Another challenge in using
Localized Algorithm of Community Detection on Large-Scale Decentralized Social Networks
Abstract
Despite the overwhelming success of the existing Social Networking Services (SNS), their centralized ownership and control have led to serious concerns about user privacy, censorship vulnerability and operational robustness of these services. To overcome these limitations, Distributed Social Networks (DSN) have recently been proposed and implemented. Under these new DSN architectures, no single party possesses full knowledge of the entire social network. While this approach solves the above problems, the lack of global knowledge at the DSN nodes makes it much more challenging to support some common but critical SNS services such as friend discovery and community detection. In this paper, we tackle the problem of community detection for a given user under the constraint of limited local topology information as imposed by common DSN architectures. By considering the Personalized PageRank (PPR) approach as an ink spilling process, we justify its applicability for decentralized community detection using limited local topology information. Our proposed PPR-based solution has a wide range of applications such as friend recommendation, targeted advertisement, automated social relationship labeling and sybil defense. Using data collected from a large-scale SNS in practice, we demonstrate that our adapted version of PPR can significantly outperform basic PageRank (PR) as well as two other commonly used heuristics. The inclusion of a few manually labeled friends in the Escape Vector (EV) can boost the performance considerably (64.97% relative improvement in terms of Area Under the ROC Curve (AUC)).
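The "ink spilling" view of Personalized PageRank can be illustrated with a small power-iteration sketch. This is generic textbook PPR on a toy graph, not the adapted version or the Escape Vector mechanism described in the abstract; the function name, the `alpha` restart probability, and the example graph are all assumptions made for illustration.

```python
def personalized_pagerank(adj, seed, alpha=0.15, iters=100):
    """Generic PPR by power iteration (illustrative, not the paper's variant).

    adj  : dict mapping node -> list of neighbours (undirected, no self-loops)
    seed : node where all the "ink" starts and where restarts return to
    alpha: restart (teleport) probability, an assumed default
    """
    scores = {node: 0.0 for node in adj}
    scores[seed] = 1.0
    for _ in range(iters):
        nxt = {node: 0.0 for node in adj}
        for node, mass in scores.items():
            neighbours = adj[node]
            if not neighbours:
                # Dangling node: send its spreading mass back to the seed.
                nxt[seed] += (1 - alpha) * mass
                continue
            share = (1 - alpha) * mass / len(neighbours)
            for nb in neighbours:
                nxt[nb] += share
        # Total mass is 1, so a fraction alpha of it restarts at the seed.
        nxt[seed] += alpha
        scores = nxt
    return scores


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
toy_graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
ppr = personalized_pagerank(toy_graph, seed=0)
top_three = set(sorted(ppr, key=ppr.get, reverse=True)[:3])
```

On this toy graph the three highest-scoring nodes are the seed's own triangle, which is the intuition behind using PPR scores to rank candidate community members around a given user.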
usma.edu
Abstract
Current approaches to community detection in social networks often ignore the spatial location of the nodes. In this paper, we look to extract spatially-near communities in a social network. We introduce a new metric to measure the quality of a community partition in a geolocated social network called “spatially-near modularity,” a value that increases based on aspects of the network structure but decreases based on the distance between nodes in the communities. We then look to find an optimal partition with respect to this measure, which should be an “ideal” community with respect to both social ties and geographic location. Though this is an NP-hard problem, we introduce two heuristic algorithms that attempt to maximize this measure and outperform non-geographic community finding by an order of magnitude. Applications to counter-terrorism are also discussed.
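For orientation, the sketch below computes the standard (non-spatial) Newman-Girvan modularity of a partition, i.e. the structural quantity that a measure like the one above builds on. The abstract does not give the distance penalty, so none is included here; the function name and the toy graph are assumptions made for illustration.

```python
def modularity(edges, community_of):
    """Standard Newman-Girvan modularity for an undirected, unweighted graph.

    edges        : list of (u, v) pairs, each undirected edge listed once
    community_of : dict mapping node -> community label
    Note: this is NOT the spatially-near variant; no distance term is applied.
    """
    m = len(edges)
    internal = {}   # number of edges with both endpoints in the community
    degree = {}     # sum of node degrees per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


# Toy graph: two triangles joined by a single bridge edge (2, 3).
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {n: "all" for n in range(6)}

q_split = modularity(toy_edges, two_groups)
q_single = modularity(toy_edges, one_group)
```

Splitting the toy graph at the bridge scores higher than lumping all nodes together, which is the behaviour a partition-quality measure of this kind is meant to reward.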
Overlapping Communities for Identifying Misbehavior in Network Communications
Abstract
In this paper, we study the problem of identifying misbehaving network communications using community detection algorithms. Recently, it was shown that identifying the communications that do not respect community boundaries is a promising approach for network intrusion detection. However, it was also shown that traditional community detection algorithms are not suitable for this purpose. In this paper, we propose a novel method for enhancing community detection algorithms, and show that, contrary to previous work, they provide a good basis for network misbehavior detection. This enhancement extends the disjoint communities identified by these algorithms with a layer of auxiliary communities, so that the boundary nodes can belong to several communities. Although non-misbehaving nodes can naturally be in more than one community, we show that the majority of misbehaving nodes belong to multiple overlapping communities; therefore, overlapping community detection algorithms can also be deployed for intrusion detection. Finally, we present a framework for anomaly detection which uses community detection as its basis. The framework allows incorporation of application-specific filters to reduce the false positives induced by community detection algorithms. Our framework is validated using large email networks and flow graphs created from real network traffic.
Non-exhaustive, Overlapping k-means
Abstract
Traditional clustering algorithms, such as k-means, output a clustering that is disjoint and exhaustive, that is, every single data point is assigned to exactly one cluster. However, in real datasets, clusters can overlap and there are often outliers that do not belong to any cluster. This is a well-recognized problem that has received much attention in the past, and several algorithms, such as fuzzy k-means, have been proposed for overlapping clustering. However, most existing algorithms address either overlap or outlier detection and do not tackle the problem in a unified way. In this paper, we propose a simple and intuitive objective function that captures the issues of overlap and non-exhaustiveness in a unified manner. Our objective function can be viewed as a reformulation of the traditional k-means objective, with easy-to-understand parameters that capture the degrees of overlap and non-exhaustiveness. By studying the objective, we are able to obtain a simple iterative algorithm which we call NEO-K-Means (Non-Exhaustive Overlapping K-Means). Furthermore, by considering an extension to weighted kernel k-means, we can tackle the case of non-exhaustive and overlapping graph clustering. This extension allows us to apply our NEO-K-Means algorithm to the community detection problem, which is an important task in network analysis. Our experimental results show that the new objective and algorithm are effective in finding ground-truth clusterings that have varied overlap and non-exhaustiveness; for the case of graphs, we show that our algorithm outperforms state-of-the-art overlapping community detection methods.
A Local Seed Selection Algorithm for Overlapping Community Detection
Abstract
One of the widely studied structural properties of social and information networks is their community structure, and a vast variety of community detection algorithms have been proposed in the literature. Expansion of a seed node into a community is one of the most successful methods for local community detection, especially when the global structure of the network is not accessible. An algorithm for local community detection requires only partial knowledge of the network, and the computations can be done in parallel starting from seed nodes. The parallel nature of local algorithms allows for fast and scalable solutions; however, the coverage of the communities heavily depends on the seed selection. The communities identified by a local algorithm might cover only a subset of the nodes in a network if the seeds are not selected carefully. In this paper, we propose a novel seeding algorithm which is parameter-free, utilizes merely the local structure of the network, and identifies good seeds which span the whole network. In order to find such seeds, our algorithm first computes similarity indices from local link prediction techniques to assign a similarity score to each node, and then a biased graph coloring algorithm is used to enhance the seed selection. Our experiments using large-scale real-world networks show that our algorithm is able to select good seeds, which are then expanded into high-quality overlapping communities covering the vast majority of the nodes in the network using a personalized PageRank-based community detection algorithm. We also show that using our local seeding algorithm can dramatically reduce the execution time of community detection.
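One common way to expand a seed into a community from PageRank-style scores is a conductance "sweep cut": rank nodes by score divided by degree and keep the prefix with the lowest conductance. The abstract does not say which expansion procedure is used, so the sketch below is only an assumed, generic illustration; the function name, the toy graph, and the hard-coded scores are hypothetical.

```python
def sweep_cut(adj, scores):
    """Return the lowest-conductance prefix of nodes ranked by score/degree.

    adj    : dict mapping node -> list of neighbours (undirected, no self-loops)
    scores : dict mapping node -> non-negative score (e.g. from a PPR run)
    """
    total_volume = sum(len(nbrs) for nbrs in adj.values())
    order = sorted(
        (n for n in adj if scores.get(n, 0) > 0 and adj[n]),
        key=lambda n: scores[n] / len(adj[n]),
        reverse=True,
    )
    inside, volume, cut = set(), 0, 0
    best_set, best_phi = [], float("inf")
    for node in order:
        inside.add(node)
        volume += len(adj[node])
        for nb in adj[node]:
            # An edge to a node already inside stops being a cut edge;
            # an edge to an outside node becomes a new cut edge.
            cut += -1 if nb in inside else 1
        denom = min(volume, total_volume - volume)
        if denom == 0:
            continue  # the prefix covers the whole graph; conductance undefined
        phi = cut / denom
        if phi < best_phi:
            best_phi, best_set = phi, sorted(inside)
    return best_set, best_phi


# Toy graph: two triangles joined by the edge 2-3, with made-up scores
# that decay with distance from a hypothetical seed at node 0.
toy_graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
toy_scores = {0: 1.0, 1: 0.66, 2: 0.82, 3: 0.40, 4: 0.20, 5: 0.20}
community, conductance = sweep_cut(toy_graph, toy_scores)
```

Because each seed's sweep only touches the nodes that received a nonzero score, runs from different seeds are independent of one another, which is consistent with the parallelism the abstract attributes to local algorithms.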
The role of space in social groups: Analysis and technological applications