Fast and accurate estimation of shortest paths in large graphs
- In Proceedings of Conference on Information and Knowledge Management (CIKM
, 2010
Cited by 28 (1 self)
Computing shortest paths between two given nodes is a fundamental operation over graphs, but known to be nontrivial over large disk-resident instances of graph data. While a number of techniques exist for answering reachability queries and approximating node distances efficiently, determining actual shortest paths (i.e. the sequence of nodes involved) is often neglected. However, in applications arising in massive online social networks, biological networks, and knowledge graphs it is often essential to find out many, if not all, shortest paths between two given nodes. In this paper, we address this problem and present a scalable sketch-based index structure that not only supports estimation of node distances, but also computes the corresponding shortest paths themselves. Generating the actual path information allows for further improvements to the estimation accuracy of distances (and paths), leading to near-exact shortest-path approximations in real-world graphs. We evaluate our techniques, implemented within a fully functional RDF graph database system, over large real-world social and biological networks of sizes ranging from tens of thousands to millions of nodes and edges. Experiments on several datasets show that we can achieve query response times providing several orders of magnitude speedup over traditional path computations while keeping the estimation errors between 0% and 1% on average.
Y.: Fast exact shortest-path distance queries on large networks by pruned landmark labeling
- In: SIGMOD 2013
, 2013
Cited by 22 (1 self)
We propose a new exact method for shortest-path distance queries on large-scale networks. Our method precomputes distance labels for vertices by performing a breadth-first search from every vertex. Seemingly too obvious and too inefficient at first glance, the key ingredient introduced here is pruning during breadth-first searches. While we can still answer the correct distance for any pair of vertices from the labels, it surprisingly reduces the search space and sizes of labels. Moreover, we show that we can perform 32 or 64 breadth-first searches simultaneously exploiting bitwise operations. We experimentally demonstrate that the combination of these two techniques is efficient and robust on various kinds of large-scale real-world networks. In particular, our method can handle social networks and web graphs with hundreds of millions of edges, which are two orders of magnitude larger than the limits of previous exact methods, with comparable query time to those of previous methods.
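The pruning idea in this abstract can be illustrated with a minimal sketch. It assumes an undirected, unweighted graph given as an adjacency dictionary, and it orders vertices by degree, which is an assumption made here for illustration. The names `build_labels` and `query` are hypothetical, and the bit-parallel searches mentioned in the abstract are not covered.

```python
from collections import deque


def query(labels, s, t):
    """Distance via the best hub that appears in both labels (inf if none)."""
    common = labels[s].keys() & labels[t].keys()
    return min((labels[s][h] + labels[t][h] for h in common),
               default=float("inf"))


def build_labels(adj):
    """adj: dict mapping each vertex to a list of neighbours (undirected)."""
    # Assumption: process high-degree vertices first.
    order = sorted(adj, key=lambda v: -len(adj[v]))
    labels = {v: {} for v in adj}  # labels[v][hub] = dist(hub, v)
    for root in order:
        dist = {root: 0}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            # Prune: earlier labels already certify a distance <= dist[u],
            # so u gets no new label entry and is not expanded.
            if query(labels, root, u) <= dist[u]:
                continue
            labels[u][root] = dist[u]
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
    return labels
```

After `labels = build_labels(adj)`, a call such as `query(labels, s, t)` returns the exact hop distance, or infinity when the two vertices are disconnected.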
Benefits of bias: Towards better characterization of network sampling
- In SIGKDD
, 2011
Cited by 19 (0 self)
From social networks to P2P systems, network sampling arises in many settings. We present a detailed study on the nature of biases in network sampling strategies to shed light on how best to sample from networks. We investigate connections between specific biases and various measures of structural representativeness. We show that certain biases are, in fact, beneficial for many applications, as they “push” the sampling process towards inclusion of desired properties. Finally, we describe how these sampling biases can be exploited in several real-world applications, including disease outbreak detection and market research.
A Continuous Query System for Dynamic Route Planning
Cited by 16 (0 self)
In this paper, we address the problem of answering continuous route planning queries over a road network, in the presence of updates to the delay (cost) estimates of links. A simple approach to this problem would be to recompute the best path for all queries on arrival of every delay update. However, such a naive approach scales poorly when there are many users who have requested routes in the system. Instead, we propose two new classes of approximate techniques, K-paths and proximity measures, to substantially speed up processing of the set of designated routes specified by continuous route planning queries in the face of incoming traffic delay updates. Our techniques work through a combination of precomputation of likely good paths and by avoiding complete recalculations on every delay update, instead only sending the user new routes when delays change significantly. Based on an experimental evaluation with 7,000 drives from real taxi cabs, we found that the routes delivered by our techniques are within 5% of the best shortest path and have run times an order of magnitude or less compared to a naive approach.
Fast fully dynamic landmark-based estimation of shortest path distances in very large graphs
- In ACM Conference on Information and Knowledge Management (CIKM
, 2011
Cited by 13 (1 self)
Computing the shortest path between a pair of vertices in a graph is a fundamental primitive in graph algorithmics. Classical exact methods for this problem do not scale up to contemporary, rapidly evolving social networks with hundreds of millions of users and billions of connections. A number of approximate methods have been proposed, including several landmark-based methods that have been shown to scale up to very large graphs with acceptable accuracy. This paper presents two improvements to existing landmark-based shortest path estimation methods. The first improvement relates to the use of shortest-path trees (SPTs). Together with appropriate short-cutting heuristics, the use of SPTs makes it possible to achieve higher accuracy with acceptable time and memory overhead. Furthermore, SPTs can be maintained incrementally under edge insertions and deletions, which allows for a fully dynamic algorithm. The second improvement is a new landmark selection strategy that seeks to maximize the coverage of all shortest paths by the selected landmarks. The improved method is evaluated on the DBLP, Orkut, Twitter and Skype social networks.
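As background for the landmark-based methods this abstract builds on, the basic estimate can be sketched as follows. This is the plain triangle-inequality upper bound d(s, t) <= d(s, l) + d(l, t), not the SPT-based improvement or the landmark selection strategy the paper proposes. The graph is assumed to be undirected and unweighted, and the names `bfs_distances`, `precompute` and `estimate` are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def precompute(adj, landmarks):
    """One BFS per landmark; the result is the whole index."""
    return {l: bfs_distances(adj, l) for l in landmarks}


def estimate(index, s, t):
    """Upper bound on d(s, t): best detour through any landmark."""
    best = float("inf")
    for dist in index.values():
        if s in dist and t in dist:
            best = min(best, dist[s] + dist[t])
    return best
```

The estimate never undershoots the true distance, and it is exact whenever some landmark lies on a shortest path between the two query vertices.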
On k-skip Shortest Paths
Cited by 11 (0 self)
Given two vertices s, t in a graph, let P be the shortest path (SP) from s to t, and P⋆ a subset of the vertices in P. P⋆ is a k-skip shortest path from s to t if it includes at least one vertex out of every k consecutive vertices in P. In general, P⋆ succinctly describes P by sampling the vertices in P with a rate of at least 1/k. This makes P⋆ a natural substitute in scenarios where reporting every single vertex of P is unnecessary or even undesired. This paper studies k-skip SP computation in the context of spatial network databases (SNDB). Our technique has two properties crucial for real-time query processing in SNDB. First, our solution is able to answer k-skip queries significantly faster than finding the original SPs in their entirety. Second, the previous objective is achieved with a structure that occupies less space than storing the underlying road network. The proposed algorithms are the outcome of a careful theoretical analysis that reveals valuable insight into the characteristics of the k-skip SP problem. Their efficiency has been confirmed by extensive experiments with real data.
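The k-skip definition can be made concrete with a short sketch that checks the property on an already known path. It illustrates the definition only: the paper's contribution is computing such a sample without materializing the full path, which this sketch does not attempt. The function names `is_k_skip` and `every_kth` are hypothetical.

```python
def is_k_skip(path, sample, k):
    """True if sample is a k-skip version of path.

    path:   full shortest path as a list of vertices
    sample: candidate subset of the path's vertices
    """
    chosen = set(sample)
    if not chosen <= set(path):
        return False  # the sample must only use vertices of the path
    # Every window of k consecutive path vertices needs a sampled vertex.
    # A path shorter than k has no such window and passes vacuously.
    for i in range(len(path) - k + 1):
        if not chosen.intersection(path[i:i + k]):
            return False
    return True


def every_kth(path, k):
    """One simple valid sample: the vertices at positions k-1, 2k-1, ..."""
    return path[k - 1::k]
```

For a path `["s", "a", "b", "c", "d", "t"]` and k = 3, `every_kth` keeps `"b"` and `"t"`, which is enough to cover every window of three consecutive vertices.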
Relational approach for shortest path discovery over large graphs
- Proc. VLDB Endow
Cited by 9 (1 self)
With the rapid growth of large graphs, we cannot assume that graphs can still be fully loaded into memory, so disk-based graph operations are inevitable. In this paper, we take shortest path discovery as an example to investigate the technical issues that arise when leveraging the existing infrastructure of relational databases (RDB) for graph data management. Based on the observation that a variety of graph search queries can be implemented by iterative operations including selecting frontier nodes from visited nodes, making expansion from the selected frontier nodes, and merging the expanded nodes into the visited ones, we introduce a relational FEM framework with three corresponding operators to implement graph search tasks in the RDB context. We show that new features such as the window function and the merge statement introduced by recent SQL standards can not only simplify the expression but also improve the performance of the FEM framework. In addition, we propose two optimization strategies specific to shortest path discovery inside the FEM framework. First, we take a bi-directional set Dijkstra’s algorithm in the path finding. The bi-directional strategy can reduce the search space, and set Dijkstra’s algorithm finds the shortest path in a set-at-a-time fashion. Second, we introduce an index named SegTable to preserve the local shortest segments, and exploit SegTable to further improve the performance. The final extensive experimental results illustrate that our relational approach with the optimization strategies achieves high scalability and performance.
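The select-frontier, expand, merge loop described in this abstract can be sketched in a few lines. Python stands in for SQL here, so the visited table is a dictionary and the edge table is a list of rows. The sketch is single-directional and omits SegTable, and the function name and data layout are assumptions made for illustration rather than the paper's operators. Edge weights are assumed to be non-negative.

```python
def fem_shortest_distance(edges, source, target):
    """edges: list of (u, v, weight) rows, treated as a directed edge table."""
    visited = {source: [0, False]}  # node -> [distance, finalized]
    while True:
        open_rows = [(d, n) for n, (d, done) in visited.items() if not done]
        if not open_rows:
            return float("inf")  # target is unreachable
        min_d = min(d for d, _ in open_rows)
        # F: select all open nodes at the minimum distance (set-at-a-time).
        frontier = {n for d, n in open_rows if d == min_d}
        if target in frontier:
            return min_d
        for n in frontier:
            visited[n][1] = True
        # E: expand the frontier by joining it with the edge table.
        expanded = [(v, min_d + w) for (u, v, w) in edges if u in frontier]
        # M: merge expanded rows into visited, keeping the smaller distance.
        for v, d in expanded:
            if v not in visited:
                visited[v] = [d, False]
            elif d < visited[v][0]:
                visited[v][0] = d
```

Selecting every node at the current minimum distance in one step is what makes the loop set-at-a-time: each iteration corresponds to one round of relational operations rather than one node popped from a priority queue.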
Neighborhood-privacy protected shortest distance computing in cloud
- In SIGMOD Conference
, 2011
Cited by 9 (0 self)
With the advent of cloud computing, it becomes desirable to utilize cloud computing to efficiently process complex operations in large graphs without compromising their sensitive information. This paper studies shortest distance computing in the cloud, which aims at the following goals: i) preventing outsourced graphs from neighborhood attack, ii) preserving shortest distances in outsourced graphs, iii) minimizing overhead on the client side. The basic idea of this paper is to transform an original graph G into a link graph G_l kept locally and a set of outsourced graphs G_o. Each outsourced graph should meet the requirement of a new security model called 1-neighborhood-d-radius. In addition, the shortest distance query can be equivalently answered using G_l and G_o. Our objective is to minimize the space cost on the client side when both security and utility requirements are satisfied. We devise a greedy method to produce G_l and G_o, which can exactly answer the shortest distance queries. We also develop an efficient transformation method to support approximate shortest distance answering under a given additive error bound. The final experimental results illustrate the effectiveness and efficiency of our method.
Online Computation of Fastest Path in Time-Dependent Spatial Networks
, 2011
Cited by 8 (5 self)
The problem of point-to-point fastest path computation in static spatial networks is extensively studied, with many precomputation techniques proposed to speed up the computation. Most of the existing approaches make the simplifying assumption that travel-times of the network edges are constant. However, with real-world spatial networks the edge travel-times are time-dependent, where the arrival-time to an edge determines the actual travel-time on the edge. In this paper, we study the online computation of fastest path in time-dependent spatial networks and present a technique which speeds up the path computation. We show that our fastest path computation based on a bidirectional time-dependent A* search significantly improves the computation time and storage complexity. With extensive experiments using real datasets (including a variety of large spatial networks with real traffic data) we demonstrate the efficacy of our proposed techniques for online fastest path computation.
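To show what time-dependent travel-times mean for path search, here is a minimal sketch of a time-dependent Dijkstra search. It is a simpler, unidirectional stand-in for the bidirectional time-dependent A* search named in the abstract, not the paper's technique. It assumes each edge carries a function from arrival time at the edge to travel time, and that edges are FIFO (leaving later never means arriving earlier). The name `earliest_arrival` and the graph layout are hypothetical.

```python
import heapq


def earliest_arrival(graph, source, target, depart):
    """graph: dict node -> list of (neighbour, travel_time_fn) pairs,
    where travel_time_fn(t) is the travel time when entering the edge at t."""
    best = {source: depart}
    heap = [(depart, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if u == target:
            return t
        if t > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, travel_time in graph.get(u, []):
            arrival = t + travel_time(t)  # cost depends on the time we reach u
            if arrival < best.get(v, float("inf")):
                best[v] = arrival
                heapq.heappush(heap, (arrival, v))
    return float("inf")
```

The only change from the static algorithm is that the edge cost is evaluated at the time the search reaches the edge's tail, so the fastest route can differ between departure times.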