Fast and accurate estimation of shortest paths in large graphs
In Proceedings of the Conference on Information and Knowledge Management (CIKM), 2010
Cited by 28 (1 self)
Computing shortest paths between two given nodes is a fundamental operation over graphs, but is known to be nontrivial over large disk-resident instances of graph data. While a number of techniques exist for answering reachability queries and approximating node distances efficiently, determining actual shortest paths (i.e., the sequence of nodes involved) is often neglected. However, in applications arising in massive online social networks, biological networks, and knowledge graphs, it is often essential to find many, if not all, shortest paths between two given nodes. In this paper, we address this problem and present a scalable sketch-based index structure that not only supports estimation of node distances, but also computes the corresponding shortest paths themselves. Generating the actual path information allows for further improvements to the estimation accuracy of distances (and paths), leading to near-exact shortest-path approximations in real-world graphs. We evaluate our techniques, implemented within a fully functional RDF graph database system, over large real-world social and biological networks of sizes ranging from tens of thousands to millions of nodes and edges. Experiments on several datasets show that we can achieve query response times providing several orders of magnitude speedup over traditional path computations while keeping the estimation errors between 0% and 1% on average.
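The idea of estimating a path (not just a distance) from precomputed information can be illustrated with a much simpler landmark scheme. The sketch below is not the index structure from the paper; it assumes an undirected, unweighted graph given as an adjacency dictionary, and the function names (`bfs_tree`, `path_to_root`, `estimate_path`) are made up for this illustration. It stores one breadth-first tree per landmark and answers a query by joining the two tree paths at their lowest common ancestor.

```python
from collections import deque


def bfs_tree(adj, root):
    """Return {node: parent} for a BFS tree rooted at `root` (the root maps to None)."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    return parent


def path_to_root(parent, node):
    """Follow parent pointers from `node` up to the tree root."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def estimate_path(trees, u, v):
    """Shortest u-v path found by routing through any landmark tree.

    The result is a real path in the graph, so its length is an upper
    bound on the true distance; it is exact only if some landmark tree
    happens to contain a shortest u-v path.
    """
    best = None
    for parent in trees.values():
        if u not in parent or v not in parent:
            continue  # landmark cannot reach both endpoints
        up = path_to_root(parent, u)    # u ... landmark
        down = path_to_root(parent, v)  # v ... landmark
        # Drop the shared tail so both paths end at their lowest common ancestor.
        while len(up) > 1 and len(down) > 1 and up[-2] == down[-2]:
            up.pop()
            down.pop()
        candidate = up + down[-2::-1]   # u ... ancestor ... v
        if best is None or len(candidate) < len(best):
            best = candidate
    return best
```

With a single landmark the estimate can be a detour; adding landmarks tightens it, which mirrors the abstract's point that having actual path information leaves room to improve the estimate.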
Path Oracles for Spatial Networks
2009
Cited by 26 (8 self)
The advent of location-based services has led to an increased demand for performing operations on spatial networks in real time. The challenge lies in being able to cast operations on spatial networks in terms of relational operators so that they can be performed in the context of a database. A linear-sized construct termed a path oracle is introduced that compactly encodes the $n^2$ shortest paths between every pair of vertices in a spatial network having $n$ vertices, thereby reducing each of the paths to a single tuple in a relational database and enabling shortest paths to be found by repeated application of a single SQL SELECT operator. The construction of the path oracle is based on the observed coherence between the spatial positions of both source and destination vertices and the shortest paths between them, which facilitates the aggregation of source and destination vertices into groups that share common vertices or edges on the shortest paths between them. With the aid of the Well-Separated Pair (WSP) technique, which has been applied to spatial networks using the network distance measure, a path oracle is proposed that takes $O(s^d n)$ space, where $s$ is empirically estimated to be around 12 for road networks, but that can retrieve an intermediate link in a shortest path in $O(\log n)$ time using a B-tree. An additional construct termed the path-distance oracle of size $O(n \cdot \max(s^d, (1/\varepsilon)^d))$ (empirically $n \cdot \max(12^2, (2.5/\varepsilon)^2)$) is proposed that can retrieve an intermediate vertex as well as an $\varepsilon$-approximation of the network distances in $O(\log n)$ time using a B-tree. Experimental results indicate that the proposed oracles are linear in $n$, which means that they are scalable and can enable complicated query processing scenarios on massive spatial network datasets.
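The "repeated application of a single SQL SELECT" retrieval pattern can be sketched with Python's built-in `sqlite3` module. This is only an illustration of the lookup loop, not the path oracle itself: it stores one next-hop tuple per (source, destination) pair, so it needs quadratic rather than linear space, and it assumes an undirected, unweighted graph. The table name `next_hop` and both function names are invented for this example.

```python
import sqlite3
from collections import deque


def build_next_hop_table(adj):
    """Store, for every reachable (src, dst) pair, the first vertex after src
    on a shortest path to dst. Assumes an undirected, unweighted graph."""
    conn = sqlite3.connect(":memory:")
    # The primary key gives an index lookup on (src, dst).
    conn.execute("CREATE TABLE next_hop (src, dst, hop, PRIMARY KEY (src, dst))")
    for dst in adj:
        # BFS outward from dst: the vertex from which we first reach `nbr`
        # is nbr's next hop on a shortest path back to dst.
        hop = {dst: None}
        queue = deque([dst])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in hop:
                    hop[nbr] = node
                    queue.append(nbr)
        conn.executemany(
            "INSERT INTO next_hop VALUES (?, ?, ?)",
            [(src, dst, h) for src, h in hop.items() if h is not None],
        )
    return conn


def shortest_path(conn, src, dst):
    """Rebuild a shortest path by issuing the same SELECT once per hop."""
    path = [src]
    while path[-1] != dst:
        row = conn.execute(
            "SELECT hop FROM next_hop WHERE src = ? AND dst = ?",
            (path[-1], dst),
        ).fetchone()
        if row is None:
            return None  # dst is not reachable from src
        path.append(row[0])
    return path
```

The oracle described in the abstract avoids the quadratic table by letting one tuple answer for a whole group of well-separated sources and destinations; the retrieval loop keeps the same shape.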
Query processing using distance oracles for spatial networks
Best Papers of ICDE 2009 Special Issue
Cited by 10 (3 self)
The popularity of location-based services and the need to do real-time processing on them has led to an interest in performing queries on transportation networks, such as finding shortest paths and finding nearest neighbors. The challenge here is that the efficient execution of spatial operations usually involves the computation of distance along a spatial network instead of “as the crow flies,” which is not simple. Techniques are described that enable the determination of the network distance between any pair of points (i.e., vertices) with as little as $O(n)$ space rather than having to store the $n^2$ distances between all pairs. This is done by being willing to expend a bit more time to achieve this goal, such as $O(\log n)$ instead of $O(1)$, as well as by accepting an error $\varepsilon$ in the accuracy of the distance that is provided. The strategy that is adopted reduces the space requirements and is based on the ability to identify groups of source and destination vertices for which the distance is approximately the same within some $\varepsilon$. The reductions are achieved by introducing a construct termed a distance oracle that yields an estimate of the network distance (termed the $\varepsilon$-approximate distance) between any two vertices in the spatial network. The distance oracle is obtained by showing how to adapt the well-separated pair technique from computational geometry to spatial networks. Initially, an $\varepsilon$-approximate distance oracle of size $O(n/\varepsilon^d)$ is used that is capable of retrieving the approximate network distance in $O(\log n)$ time using a B-tree. The retrieval time can be theoretically reduced further to $O(1)$ time by proposing another $\varepsilon$-approximate distance oracle of size $O(n \log n \ldots)$
Efficient Evaluation of k-Range Nearest Neighbor Queries in Road Networks
Cited by 9 (5 self)
A k-Range Nearest Neighbor (or kRNN for short) query in road networks finds the k nearest neighbors of every point on the road segments within a given query region based on the network distance. The kRNN query is important for location-based applications in many realistic scenarios, for example when (1) the user’s location is uncertain, i.e., it is modeled by a spatial region, or (2) the user is not willing to reveal her exact location in order to preserve her privacy, i.e., her location is blurred into a spatial region. However, the existing solutions for kRNN queries simply apply the traditional k-nearest neighbor query processing algorithm multiple times, which incurs a large redundant search overhead. To this end, we propose an efficient kRNN query processing algorithm in this paper. Our algorithm (1) employs a shared execution approach to eliminate the redundant search overhead, and (2) provides a parameter that can be tuned to achieve a tradeoff between the query processing performance and the storage overhead, while guaranteeing that the user’s exact k nearest neighbors are included in the query answers. The experimental results show that our algorithm always outperforms the existing solution in terms of query response time, and that the introduced tuning parameter is an effective way to achieve the tradeoff between the query response time and the storage overhead.
Horton+: A Distributed System for Processing Declarative Reachability Queries over Partitioned Graphs
Cited by 3 (1 self)
Horton+ is a graph query processing system that executes declarative reachability queries on a partitioned attributed multi-graph. It employs a query language, a query optimizer, and a distributed execution engine. The query language expresses declarative reachability queries, and supports closures and predicates on node and edge attributes to match graph paths. We introduce three algebraic operators, select, traverse, and join, and a query is compiled into an execution plan containing these operators. As reachability queries access the graph elements in a random access pattern, the graph is maintained in the main memory of a cluster of servers to reduce query execution time. We develop a distributed execution engine that processes a query plan in parallel on the graph servers. Since the query language is declarative, we build a query optimizer that uses graph statistics to estimate predicate selectivity. We experimentally evaluate the system performance on a cluster of 16 graph servers using synthetic graphs as well as a real graph from an application that uses reachability queries. The evaluation shows (1) the efficiency of the optimizer in reducing query execution time, (2) system scalability with the size of the graph and with the number of servers, and (3) the convenience of using declarative queries.
Roads Belong in Databases
Cited by 3 (1 self)
The popularity of location-based services and the need to perform real-time processing on them has led to an interest in queries on road networks, such as finding shortest paths and finding nearest neighbors. The challenge here is that the efficient execution of operations usually involves the computation of distance along a spatial network instead of “as the crow flies,” which is not simple. This requires the precomputation of the shortest paths and network distance between every pair of points (i.e., vertices) with as little space as possible rather than having to store the $n^2$ shortest paths and distances between all pairs. This problem is related to a ‘holy grail’ problem in databases of how to incorporate road networks into relational databases. A data structure called a road network oracle is introduced that resides in a database and enables the processing of many operations on road networks with just the aid of relational operators. Two implementations of road network oracles are presented.
Memory-Efficient Algorithms for Spatial Network Queries
Cited by 2 (0 self)
Incrementally finding the k nearest neighbors (kNN) in a spatial network is an important problem in location-based services. One method (INE) simply applies Dijkstra’s algorithm. Another method (IER) computes the k nearest neighbors using Euclidean distance followed by computing their corresponding network distances, and then incrementally finds the next nearest neighbors in order of increasing Euclidean distance until finding one whose Euclidean distance is greater than the current k nearest neighbor in terms of network distance. The LBC method improves on INE by avoiding the visit of nodes that cannot possibly lead to the k nearest neighbors by using a Euclidean heuristic estimator, and on IER by avoiding the repeated visits to nodes in the spatial network that appear on the shortest paths to different members of the k nearest neighbors by performing multiple instances of heuristic search using a Euclidean heuristic
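The IER procedure described above can be sketched as follows. This is a minimal illustration, not code from the paper: it assumes every edge weight is at least the Euclidean distance between its endpoints (so Euclidean distance never exceeds network distance), and it runs a single Dijkstra search from the query vertex instead of one network-distance computation per candidate. The names `network_distances` and `ier_knn`, and the `adj`/`coords` input layout, are assumptions made for the example.

```python
import heapq
import math


def network_distances(adj, source):
    """Dijkstra's algorithm; adj maps a vertex to a list of (neighbor, weight)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, math.inf):
            continue  # stale heap entry
        for nbr, weight in adj[node]:
            nd = d + weight
            if nd < dist.get(nbr, math.inf):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist


def ier_knn(adj, coords, objects, query, k):
    """Return the k objects nearest to `query` by network distance,
    as a sorted list of (network_distance, object) pairs."""

    def euclid(obj):
        return math.dist(coords[query], coords[obj])

    dist = network_distances(adj, query)
    best = []  # max-heap via negated distances: (-network_distance, object)
    # Visit candidates in order of increasing Euclidean distance.
    for obj in sorted(objects, key=euclid):
        # Stop once the Euclidean lower bound exceeds the current k-th
        # network distance: no later candidate can be closer.
        if len(best) == k and euclid(obj) > -best[0][0]:
            break
        nd = dist.get(obj, math.inf)
        if len(best) < k:
            heapq.heappush(best, (-nd, obj))
        elif nd < -best[0][0]:
            heapq.heapreplace(best, (-nd, obj))
    return sorted((-neg, obj) for neg, obj in best)
```

The early exit is only valid under the lower-bound assumption stated above; with arbitrary edge weights the loop would have to examine every candidate.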
Dynamic Monitoring of Optimal Locations in Road Network Databases
2014
Optimal location (OL) queries are a type of spatial query that is particularly useful for the strategic planning of resources. Given a set of existing facilities and a set of clients, an OL query asks for a location to build a new facility that optimizes a certain cost metric (defined based on the distances between the clients and the facilities). Several techniques have been proposed to address OL queries, assuming that all clients and facilities reside in an Lp space. In practice, however, movements between spatial locations are usually confined by the underlying road network, and hence, the actual distance between two locations can differ significantly from their Lp distance. Motivated by the deficiency of the existing techniques, this paper presents a comprehensive study on OL queries in road networks. We propose a unified framework that addresses three variants of OL queries that find important applications in practice, and we instantiate the framework with several novel query processing algorithms. We further extend our framework to efficiently monitor the OLs when locations for facilities and/or clients have been updated. Our dynamic update methods lead to efficient answering of continuous optimal location queries. We demonstrate the efficiency of our solutions through extensive experiments with large real data.
Scalable Network Distance Browsing in Spatial Databases
As online map services have become popular, it is imperative for providers to deliver results to queries as fast as possible. Two common tasks performed on spatial networks, shortest path and k-nearest neighbor computation, are examined. For a static network, an existing best-first k-nearest neighbor algorithm is presented and discussed. While occupying minimal storage, the algorithm speeds up finding results by using precomputed shortest path quadtrees and estimating network distance ranges for possible nearest neighbors.
Route-Saver: Leveraging Route APIs for Accurate and Efficient Query Processing at Location-Based Services
IEEE Transactions on Knowledge and Data Engineering (TKDE)
Location-based services (LBS) enable mobile users to query points-of-interest (e.g., restaurants, cafes) on various features (e.g., price, quality, variety). In addition, users require accurate query results with up-to-date travel times. Lacking the monitoring infrastructure for road traffic, the LBS may obtain live travel times of routes from online route APIs in order to offer accurate results. Our goal is to reduce the number of requests issued by the LBS significantly while preserving accurate query results. First, we propose to exploit recent routes requested from route APIs to answer queries accurately. Then, we design effective lower/upper bounding techniques and ordering techniques to process queries efficiently. Also, we study parallel route requests to further reduce the query response time. Our experimental evaluation shows that our solution is 3 times more efficient than a competitor, and yet achieves high result accuracy (above 98%).