Results 1-10 of 15
ASCOS: an asymmetric network structure context similarity measure
 In IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Niagara Falls, Canada, 2013. Leman Akoglu et al
Cited by 3 (0 self)
Discovering similar objects in a social network raises many interesting issues. Here, we present ASCOS, an Asymmetric Structure COntext Similarity measure that captures the similarity scores among any pairs of nodes in a network. The definition of ASCOS is similar to that of the well-known SimRank since both define score values recursively. However, we show that ASCOS outputs a more complete similarity score than SimRank because SimRank (and several of its variations, such as P-Rank and SimFusion) on average ignores half of the paths between nodes during calculation. To make ASCOS tractable in both computation time and memory usage, we propose two variations of ASCOS: a low-rank approximation based approach and an approach based on Gauss-Seidel, an iterative solver for linear equations. When the target network is sparse, the run time and the required computing space of these variations are smaller than computing SimRank and ASCOS directly. In addition, the iterative solver divides the original network into several independent subsystems so that a multicore server or a distributed computing environment, such as MapReduce, can efficiently solve the problem. We compare the performance of ASCOS with other global structure based similarity measures, including SimRank, Katz, and LHN. The experimental results based on user evaluation suggest that ASCOS gives better results than other measures. In addition, the asymmetric property has the potential to identify the hierarchical structure of a network. Finally, variations of ASCOS (including one distributed variation) can also reduce computation both in space and time.
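The abstract above names Gauss-Seidel as an iterative solver for linear equations. As a rough illustration of that solver only, the following sketch applies it to a small dense system. It is not the ASCOS formulation: the function name `gauss_seidel`, the dense list-of-lists matrix, and the stopping tolerance are assumptions made for this example, and a sparse network would normally call for sparse storage.

```python
def gauss_seidel(A, b, iterations=100, tol=1e-10):
    """Solve A x = b with Gauss-Seidel sweeps (hypothetical helper).

    A is a square list of lists with non-zero diagonal entries.
    Convergence is guaranteed for strictly diagonally dominant A,
    which is what the example below uses.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        max_change = 0.0
        for i in range(n):
            # Use the newest values of x[j] as soon as they are available.
            off_diagonal = sum(A[i][j] * x[j] for j in range(n) if j != i)
            new_xi = (b[i] - off_diagonal) / A[i][i]
            max_change = max(max_change, abs(new_xi - x[i]))
            x[i] = new_xi
        if max_change < tol:
            break
    return x


# Example system: 4x + y = 1 and 2x + 3y = 2, whose solution is x = 0.1, y = 0.6.
solution = gauss_seidel([[4.0, 1.0], [2.0, 3.0]], [1.0, 2.0])
```

Each sweep reuses the entries already updated in the same pass, which is the feature that distinguishes Gauss-Seidel from a Jacobi iteration.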
More is Simpler: Effectively and Efficiently Assessing Node-Pair Similarities Based on Hyperlinks
Cited by 3 (1 self)
Similarity assessment is one of the core tasks in hyperlink analysis. Recently, with the proliferation of applications, e.g., web search and collaborative filtering, SimRank has been a well-studied measure of similarity between two nodes in a graph. It recursively follows the philosophy that “two nodes are similar if they are referenced (have incoming edges) from similar nodes”, which can be viewed as an aggregation of similarities based on incoming paths. Despite its popularity, SimRank has an undesirable property, i.e., “zero-similarity”: It only accommodates paths with equal length from a common “center” node. Thus, a large portion of other paths are fully ignored. This paper attempts to remedy this issue. (1) We propose and rigorously justify SimRank*, a revised version of SimRank, which resolves such counter-intuitive “zero-similarity” issues while inheriting merits of the basic SimRank philosophy. (2) We show that the series form of SimRank* can be reduced to a fairly succinct and elegant closed form, which looks even simpler than SimRank, yet enriches semantics without suffering from increased computational cost. This leads to a fixed-point iterative paradigm of SimRank* in O(Knm) time on a graph of n nodes and m edges for K iterations, which is comparable to SimRank. (3) To further optimize SimRank* computation, we leverage a novel clustering strategy via edge concentration. Due to its NP-hardness, we devise an efficient and effective heuristic to speed up SimRank* computation to O(Knm̃) time, where m̃ is generally much smaller than m. (4) Using real and synthetic data, we empirically verify the rich semantics of SimRank*, and demonstrate its high computation efficiency.
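The “zero-similarity” behaviour described above can be reproduced with a minimal sketch of the basic SimRank iteration (the original Jeh and Widom measure, not SimRank*). The function name `simrank`, the decay factor `c = 0.8`, the iteration count, and the four-node example graph are all assumptions chosen for illustration.

```python
from itertools import product


def simrank(in_neighbors, c=0.8, iterations=10):
    """Naive all-pairs basic SimRank by fixed-point iteration.

    in_neighbors maps every node to the list of nodes with an edge into it.
    """
    nodes = list(in_neighbors)
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, repeat=2)}
    for _ in range(iterations):
        updated = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                updated[(a, b)] = 1.0
                continue
            in_a, in_b = in_neighbors[a], in_neighbors[b]
            if not in_a or not in_b:
                # A node without in-neighbors is similar only to itself.
                updated[(a, b)] = 0.0
                continue
            total = sum(sim[(x, y)] for x in in_a for y in in_b)
            updated[(a, b)] = c * total / (len(in_a) * len(in_b))
        sim = updated
    return sim


# Hypothetical graph with edges u -> a, u -> x, x -> b.
# a and x are both one step from u; b is two steps from u.
graph = {"u": [], "x": ["u"], "a": ["u"], "b": ["x"]}
scores = simrank(graph)
same_length = scores[("a", "x")]     # paths of equal length from the common node u
unequal_length = scores[("a", "b")]  # path lengths 1 and 2 from u
```

In this graph `same_length` is positive while `unequal_length` stays at zero, even though `a` and `b` share the ancestor `u`, because basic SimRank only pairs incoming paths of equal length.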
Parallel Graph Processing on Graphics Processors Made Easy
Cited by 2 (1 self)
This paper demonstrates Medusa, a programming framework for parallel graph processing on graphics processors (GPUs). Medusa enables developers to leverage the massive parallelism and other hardware features of GPUs by writing sequential C/C++ code for a small set of APIs. This simplifies the implementation of parallel graph processing on the GPU. The runtime system of Medusa automatically executes the user-defined APIs in parallel on the GPU, with a series of graph-centric optimizations based on the architecture features of GPUs. We will demonstrate the steps of developing GPU-based graph processing algorithms with Medusa, and the superior performance of Medusa with both real-world and synthetic datasets.
Delta-SimRank Computing on MapReduce
Cited by 2 (0 self)
Based on the intuition that “two objects are similar if they are related to similar objects”, SimRank (proposed by Jeh and Widom in 2002) has become a famous measure to compare the similarity between two nodes using network structure. Although SimRank is applicable to a wide range of areas such as social networks, citation networks, link prediction, etc., it suffers from heavy computational complexity and space requirements. Most existing efforts to accelerate SimRank computation work only for static graphs and on single machines. This paper considers the problem of computing SimRank efficiently in a distributed system while handling dynamic [...] model called Harmonic Field on Node-pair Graph. We use this model to derive SimRank and the proposed Delta-SimRank, which is demonstrated to fit the nature of distributed computing and can be efficiently implemented using Google’s MapReduce paradigm. Delta-SimRank can effectively reduce the computational cost and can also benefit the applications with non-static network structures. Our experimental results on four real-world networks show that Delta-SimRank is much more efficient than the distributed SimRank algorithm, and leads to up to 30 times speedup in the best case.
Medusa: Simplified Graph Processing on
Graphs are common data structures for many applications, and efficient graph processing is a must for application performance. Recently, the graphics processing unit (GPU) has been adopted to accelerate various graph processing algorithms such as BFS and shortest paths. However, it is difficult to write correct and efficient GPU programs and even more difficult for graph processing due to the irregularities of graph structures. To simplify graph processing on GPUs, we propose a programming framework called Medusa which enables developers to leverage the capabilities of GPUs by writing sequential C/C++ code. Medusa offers a small set of user-defined APIs, and embraces a runtime system to automatically execute those APIs in parallel on the GPU. We develop a series of graph-centric optimizations based on the architecture features of GPUs for efficiency. Additionally, Medusa is extended to execute on multiple GPUs within a machine. Our experiments show that (1) Medusa greatly simplifies implementation of GPGPU programs for graph processing, with many fewer lines of source code written by developers; (2) the optimization techniques significantly improve the performance of the runtime system, making its performance comparable with or better than manually tuned GPU graph operations.
Towards GPU-Accelerated Large-Scale Graph Processing in the Cloud
Recently, cloud providers have started to offer heterogeneous computing environments. There has been wide interest, in both clusters and clouds, in adopting graphics processors (GPUs) as accelerators for various applications. On the other hand, large-scale processing is important for many data-intensive applications in the cloud. In this paper, we propose to leverage GPUs to accelerate large-scale graph processing in the cloud. Specifically, we develop an in-memory graph processing engine G2 with three non-trivial GPU-specific optimizations. Firstly, we adopt fine-grained APIs to take advantage of the massive thread parallelism of the GPU. Secondly, G2 embraces a graph partition based approach for load balancing on heterogeneous CPU/GPU architectures. Thirdly, a runtime system is developed to perform transparent memory management on the GPU, and to perform scheduling for an improved throughput of concurrent kernel executions from graph tasks. We have conducted experiments on a local cluster of three nodes and an Amazon EC2 virtual cluster of eight nodes. Our preliminary results demonstrate that 1) the GPU is a viable accelerator for cloud-based graph processing, and 2) the proposed optimizations further improve the performance of the GPU-based graph processing engine.
Efficient Partial-Pairs SimRank Search on Large Networks
The assessment of node-to-node similarities based on graph topology arises in a myriad of applications, e.g., web search. SimRank is a notable measure of this type, with the intuition that “two nodes are similar if their in-neighbors are similar”. While most existing work retrieving SimRank only considers all-pairs SimRank s(⋆, ⋆) and single-source SimRank s(⋆, j) (scores between every node and query j), there are appealing applications for partial-pairs SimRank, e.g., similarity join. Given two node subsets A and B in a graph, partial-pairs SimRank assessment aims to retrieve only {s(a, b)} for all a ∈ A and b ∈ B. However, the best-known solution appears not self-contained since it hinges on the premise that the SimRank scores with node-pairs in an h-go cover set must be given beforehand. This paper focuses on efficient assessment of partial-pairs SimRank in a self-contained manner. (1) We devise a novel “seed germination” model that computes partial-pairs SimRank in O(k|E| min{|A|, |B|}) time and O(|E| + k|V|) memory for k iterations on a graph of |V| nodes and |E| edges. (2) We further eliminate unnecessary edge access to improve the time of partial-pairs SimRank to O(m min{|A|, |B|}), where m ≤ min{k|E|, ∆^{2k}}, and ∆ is the maximum degree. (3) We show that our partial-pairs SimRank model also can handle the computations of all-pairs and single-source SimRanks. (4) We empirically verify that our algorithms are (a) 38x faster than the best-known competitors, and (b) memory-efficient, allowing scores to be assessed accurately on graphs with tens of millions of links.
On the Efficiency of Estimating Penetrating Rank on Large Graphs
P-Rank (Penetrating Rank) has been suggested as a useful measure of structural similarity that takes account of both incoming and outgoing edges in ubiquitous networks. Existing work often utilizes memoization to compute P-Rank similarity in an iterative fashion, which requires cubic time in the worst case. Besides, previous methods mainly focus on the deterministic computation of P-Rank, but lack the probabilistic framework that scales well for large graphs. In this paper, we propose two efficient algorithms for computing P-Rank on large graphs. The first observation is that a large body of objects in a real graph usually share similar neighborhood structures. By merging such objects with an explicit low-rank factorization, we devise a deterministic algorithm to compute P-Rank in quadratic time. The second observation is that by converting the iterative form of P-Rank into a matrix power series form, we can leverage the random sampling approach to probabilistically compute P-Rank in linear time with provable accuracy guarantees. The empirical results on both real and synthetic datasets show that our approaches achieve high time efficiency with controlled error and outperform the baseline algorithms by at least one order of magnitude.
HETEROGENEOUS FEATURE FUSION FOR VISUAL RECOGNITION
In the past decade, the popularity of the Internet and digital cameras has led to a flourishing of images and videos. Surveillance videos are increasing explosively with the huge number of surveillance cameras. Compared with traditional datasets in computer vision, which host only thousands of images, these large-scale datasets in the era of the Internet have grown beyond the wildest imagination, and posed a serious challenge for visual recognition and detection. To handle the challenge of visual recognition in complicated scenarios, we argue that a single feature is not enough to distinguish web-scale visual concepts. Accordingly, this dissertation proposes to combine heterogeneous features for different visual recognition tasks. We first develop a machinery called Heterogeneous Feature Machines to effectively fuse multiple types of visual features. In addition, we realize that in specific applications such as consumer photo annotation or surveillance action detection, there are also specific cues which are helpful for visual recognition tasks. We consider three scenarios: (1) consumer photo recognition, where we explore the use of metadata such as time and GPS, (2) Web image searching and annotation, where we combine both user tags and network information for visual applications, and (3) action detection in videos, where the spatial-temporal coherence is combined with multiple visual features for detection tasks. We believe heterogeneous feature fusion is useful in a wide range of applications and merits research efforts in this promising direction.
To my wife Tanya
ACKNOWLEDGMENTS
First I want to thank my incredible adviser, Prof. Thomas S. Huang, for all the patience, advice and guidance which shaped me into a UIUC PhD and a researcher for future life. I am also very lucky to have had the chance to work with Dr. Jiebo