Results 1 - 5 of 5
SigSR: SimRank search over singular graphs.
In Proceedings of the 37th ACM SIGIR International Conference on Research & Development in Information Retrieval (SIGIR 2014), 2014
Abstract

Cited by 3 (2 self)
SimRank is an attractive structural-context measure of similarity between two objects in a graph. It recursively follows the intuition that "two objects are similar if they are referenced by similar objects". The best known matrix-based method [1] for calculating SimRank, however, implies an assumption that the graph is non-singular, i.e., its adjacency matrix is invertible. In reality, non-singular graphs are very rare; such an assumption in [1] is too restrictive in practice. In this paper, we provide a treatment of [1], by supporting similarity assessment on non-invertible adjacency matrices. Assume that a singular graph G has n nodes, with r (< n) being the rank of its adjacency matrix. (1) We show that SimRank matrix S on G has an elegant structure: S can be represented as a rank r matrix plus a scaled identity matrix. (2) By virtue of this, an efficient algorithm over singular graphs, SigSR, is proposed for calculating all-pairs SimRank in O(r(n^2 + Kr^2)) time for K iterations. In contrast, the only known matrix-based algorithm that supports singular graphs
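The "rank r matrix plus a scaled identity matrix" structure in point (1) can be motivated by a short derivation. The sketch below assumes the matrix form of SimRank commonly used by matrix-based methods; the decay factor C in (0, 1), the normalised adjacency matrix Q with rank(Q) = r, and the matrix M are notation introduced here for illustration and do not come from the abstract.

```latex
% Assumed matrix form of SimRank (C, Q, M are illustrative symbols):
S = C \, Q \, S \, Q^{\top} + (1 - C) \, I_n
% Move the identity term to the left-hand side:
S - (1 - C) \, I_n = C \, Q \, S \, Q^{\top}
% A product of matrices has rank at most that of any factor:
\operatorname{rank}\!\left( C \, Q \, S \, Q^{\top} \right) \le \operatorname{rank}(Q) = r
% Hence, with M := C \, Q \, S \, Q^{\top}:
S = M + (1 - C) \, I_n , \qquad \operatorname{rank}(M) \le r
```

Under these assumptions the scaled identity is (1 - C) I_n and the remaining term has rank at most r, which is consistent with the structure the abstract describes.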
Efficient Partial-Pairs SimRank Search on Large Networks
Abstract

Cited by 2 (1 self)
The assessment of node-to-node similarities based on graph topology arises in a myriad of applications, e.g., web search. SimRank is a notable measure of this type, with the intuition that “two nodes are similar if their in-neighbors are similar”. While most existing work retrieving SimRank only considers all-pairs SimRank s(⋆, ⋆) and single-source SimRank s(⋆, j) (scores between every node and query j), there are appealing applications for partial-pairs SimRank, e.g., similarity join. Given two node subsets A and B in a graph, partial-pairs SimRank assessment aims to retrieve only {s(a, b)} for all a ∈ A, b ∈ B. However, the best-known solution appears not self-contained since it hinges on the premise that the SimRank scores with node-pairs in an h-go cover set must be given beforehand. This paper focuses on efficient assessment of partial-pairs SimRank in a self-contained manner. (1) We devise a novel “seed germination” model that computes partial-pairs SimRank in O(k|E| min{|A|, |B|}) time and O(|E| + k|V|) memory for k iterations on a graph of |V| nodes and |E| edges. (2) We further eliminate unnecessary edge access to improve the time of partial-pairs SimRank to O(m min{|A|, |B|}), where m ≤ min{k|E|, ∆^{2k}}, and ∆ is the maximum degree. (3) We show that our partial-pairs SimRank model also can handle the computations of all-pairs and single-source SimRanks. (4) We empirically verify that our algorithms are (a) 38x faster than the best-known competitors, and (b) memory-efficient, allowing scores to be assessed accurately on graphs with tens of millions of links.
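To make the partial-pairs problem statement concrete, here is a minimal Python sketch of a naive baseline: it runs the classic iterative SimRank recursion over all node pairs and then keeps only the scores for pairs in A × B. This is not the "seed germination" model from the abstract, whose details are not given here; the function names, the decay factor `c = 0.6`, and the edge-list input format are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def simrank_all_pairs(edges, c=0.6, k=5):
    """Naive iterative SimRank over all node pairs (illustrative baseline).

    edges: iterable of (source, target) pairs without duplicates.
    c: decay factor (assumed value); k: number of iterations.
    """
    in_nbrs = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        in_nbrs[dst].append(src)
        nodes.update((src, dst))
    nodes = sorted(nodes)

    # s_0(u, v) = 1 if u == v, else 0
    sim = {(u, v): 1.0 if u == v else 0.0 for u, v in product(nodes, repeat=2)}
    for _ in range(k):
        nxt = {}
        for u, v in product(nodes, repeat=2):
            if u == v:
                nxt[(u, v)] = 1.0
            elif not in_nbrs[u] or not in_nbrs[v]:
                # A node with no in-neighbors is similar only to itself.
                nxt[(u, v)] = 0.0
            else:
                total = sum(sim[(i, j)] for i in in_nbrs[u] for j in in_nbrs[v])
                nxt[(u, v)] = c * total / (len(in_nbrs[u]) * len(in_nbrs[v]))
        sim = nxt
    return sim


def partial_pairs_simrank(edges, A, B, c=0.6, k=5):
    """Return {(a, b): s(a, b)} for a in A, b in B by filtering all-pairs scores."""
    sim = simrank_all_pairs(edges, c=c, k=k)
    return {(a, b): sim[(a, b)] for a in A for b in B}


if __name__ == "__main__":
    # Nodes "a" and "b" are both referenced by "u", so they should be similar.
    edges = [("u", "a"), ("u", "b")]
    print(partial_pairs_simrank(edges, A=["a"], B=["a", "b"]))
```

The baseline does all-pairs work even when A and B are small, which is the cost that a dedicated partial-pairs method aims to avoid.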
NED: An Inter-Graph Node Metric Based on Edit Distance
Abstract
Node similarity is fundamental in graph analytics. However, similarity between nodes in different graphs (inter-graph nodes) has not yet received enough attention. Inter-graph node similarity is important in learning a new graph based on the knowledge extracted from an existing graph (transfer learning on graphs) and has applications in biological, communication, and social networks. In this paper, we propose a novel distance function for measuring inter-graph node similarity with edit distance, called NED. In NED, two nodes are compared according to their local neighborhood topologies, which are represented as unordered k-adjacent trees, without relying on any extra information. Because computing tree edit distance on unordered trees is NP-complete, we propose a modified tree edit distance, called TED*, for comparing unordered and unlabeled k-adjacent trees. TED* is a metric distance, like the original tree edit distance, but more importantly, TED* is polynomially computable. As a metric distance, NED admits efficient indexing, provides interpretable results, and is shown to perform better than existing approaches on a number of data analysis tasks, including graph de-anonymization. Finally, the efficiency and effectiveness of NED are empirically demonstrated using real-world graphs.
Efficient Top-K SimRank-based Similarity Join
Abstract
SimRank is a popular and widely adopted similarity measure to evaluate the similarity between nodes in a graph. It is time- and space-consuming to compute the SimRank similarities for all pairs of nodes, especially for large graphs. In real-world applications, users are only interested in the most similar pairs. To address this problem, in this paper we study the top-k SimRank-based similarity join problem, which finds the k most similar pairs of nodes with the largest SimRank similarities among all possible pairs. To the best of our knowledge, this is the first attempt to address this problem. We encode each node as a vector by summarizing its neighbors and transform the calculation of the SimRank similarity between two nodes into computing the dot product between the corresponding vectors. We devise an efficient two-step framework to compute top-k similar pairs using the vectors. For large graphs, exact algorithms cannot meet the high-performance requirement, so we also devise an approximate algorithm which can efficiently identify top-k similar pairs under a user-specified accuracy requirement. Experiments on both real and synthetic datasets show our method achieves high performance and good scalability.
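The reduction described above, from pairwise similarity to dot products between per-node vectors, can be illustrated with a brute-force Python sketch. It assumes the node vectors already exist; how the paper derives them from each node's neighbors, and its two-step framework, are not given in the abstract, so the input vectors and the function name `top_k_pairs` are hypothetical.

```python
import heapq
from itertools import combinations


def top_k_pairs(vectors, k):
    """Return the k node pairs with the largest dot products (brute force).

    vectors: dict mapping node -> list of floats (hypothetical encodings).
    Result: list of (score, u, v) tuples in descending score order.
    """
    def dot(x, y):
        return sum(p * q for p, q in zip(x, y))

    # Score every unordered pair once; nlargest keeps only the best k.
    scored = (
        (dot(vectors[u], vectors[v]), u, v)
        for u, v in combinations(sorted(vectors), 2)
    )
    return heapq.nlargest(k, scored)


if __name__ == "__main__":
    # Made-up vectors purely for demonstration.
    vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
    for score, u, v in top_k_pairs(vectors, k=2):
        print(u, v, score)
```

This version scores every pair, so it only shows the join semantics; it does not reflect the efficiency techniques the abstract refers to.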