Results 1 - 10 of 29
Scalable Network Distance Browsing in Spatial Databases
, 2008
Cited by 80 (8 self)
An algorithm is presented for finding the k nearest neighbors in a spatial network in a best-first manner using network distance. The algorithm is based on precomputing the shortest paths between all possible vertices in the network and then making use of an encoding that takes advantage of the fact that the shortest paths from vertex u to all of the remaining vertices can be decomposed into subsets based on the first edges on the shortest paths to them from u. Thus, in the worst case, the amount of work depends on the number of objects that are examined and the number of links on the shortest paths to them from q, rather than depending on the number of vertices in the network. The amount of storage required to keep track of the subsets is reduced by taking advantage of their spatial coherence, which is captured with the aid of a shortest path quadtree. In particular, experiments on a number of large road networks as
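The decomposition by first edge described in this abstract can be illustrated with a small sketch. It is not the paper's implementation: the adjacency-list format, the function names, and the toy network `ROADS` are assumptions made for illustration, and the quadtree compression of the groups is left out.

```python
import heapq
from collections import defaultdict

# Hypothetical toy network: vertex -> list of (neighbor, edge weight).
ROADS = {
    "u": [("a", 1.0), ("b", 2.0)],
    "a": [("u", 1.0), ("b", 2.0), ("d", 1.0)],
    "b": [("u", 2.0), ("a", 2.0), ("c", 1.0)],
    "c": [("b", 1.0)],
    "d": [("a", 1.0)],
}

def first_hops(graph, u):
    """Run Dijkstra from u and record, for every other reachable vertex,
    the first vertex after u on one shortest path to it."""
    dist = {u: 0.0}
    first = {}
    heap = [(0.0, u, None)]  # (distance, vertex, first hop used to reach it)
    done = set()
    while heap:
        d, v, hop = heapq.heappop(heap)
        if v in done:
            continue
        done.add(v)
        if hop is not None:
            first[v] = hop
        for w, weight in graph[v]:
            nd = d + weight
            if w not in done and nd < dist.get(w, float("inf")):
                dist[w] = nd
                # A neighbor of u starts a new group; others inherit the hop.
                heapq.heappush(heap, (nd, w, w if v == u else hop))
    return first

def group_by_first_hop(first):
    """Partition the destinations into subsets sharing the same first hop."""
    groups = defaultdict(set)
    for v, hop in first.items():
        groups[hop].add(v)
    return dict(groups)

GROUPS = group_by_first_hop(first_hops(ROADS, "u"))
```

Following a shortest path then amounts to looking up the group that contains the destination, moving to that first hop, and repeating from there, which is why the work depends on the number of links on the path rather than on the size of the network.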
Path Oracles for Spatial Networks
, 2009
Cited by 22 (6 self)
The advent of location-based services has led to an increased demand for performing operations on spatial networks in real time. The challenge lies in being able to cast operations on spatial networks in terms of relational operators so that they can be performed in the context of a database. A linear-sized construct termed a path oracle is introduced that compactly encodes the $n^2$ shortest paths between every pair of vertices in a spatial network having n vertices, thereby reducing each of the paths to a single tuple in a relational database, and enables finding shortest paths by repeated application of a single SQL SELECT operator. The construction of the path oracle is based on the observed coherence between the spatial positions of both source and destination vertices and the shortest paths between them, which facilitates the aggregation of source and destination vertices into groups that share common vertices or edges on the shortest paths between them. With the aid of the Well-Separated Pair (WSP) technique, which has been applied to spatial networks using the network distance measure, a path oracle is proposed that takes $O(s^d n)$ space, where s is empirically estimated to be around 12 for road networks, but that can retrieve an intermediate link in a shortest path in $O(\log n)$ time using a B-tree. An additional construct termed the path-distance oracle of size $O(n \cdot \max(s^d, (1/\varepsilon)^d))$ (empirically $n \cdot \max(12^2, (2.5/\varepsilon)^2)$) is proposed that can retrieve an intermediate vertex as well as an ε-approximation of the network distances in $O(\log n)$ time using a B-tree. Experimental results indicate that the proposed oracles are linear in n, which means that they are scalable and can enable complicated query processing scenarios on massive spatial network datasets.
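To get a feel for the empirical size bound quoted above, the following sketch evaluates the per-vertex factor for a few error bounds. It assumes the garbled expression reads n · max(12^2, (2.5/ε)^2), that is, s = 12, a constant of 2.5, and d = 2; the function name and defaults are hypothetical.

```python
def path_distance_oracle_factor(eps, s=12.0, c=2.5, d=2):
    """Per-vertex size factor max(s^d, (c/eps)^d) under the assumed reading
    of the empirical bound; the oracle size is roughly n times this value."""
    return max(s ** d, (c / eps) ** d)

# Below eps = c / s (about 0.21) the error term dominates; above it the
# separation term s^d does, so loosening eps further saves no space.
for eps in (0.05, 0.1, 0.25, 0.5):
    print(eps, path_distance_oracle_factor(eps))
```

Under this reading, a 10% error bound gives a factor of about 625 tuples per vertex, while any ε above roughly 0.21 leaves the factor at 144.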
Distance Oracles for Spatial Networks
Cited by 12 (5 self)
The popularity of location-based services and the need to do real-time processing on them has led to an interest in performing queries on transportation networks, such as finding shortest paths and finding nearest neighbors. The challenge is that these operations involve the computation of distance along a spatial network rather than “as the crow flies.” In many applications an estimate of the distance is sufficient, which can be achieved by use of an oracle. An approximate distance oracle is proposed for spatial networks that exploits the coherence between the spatial position of vertices and the network distance between them. Using this observation, a distance oracle is introduced that is able to obtain the ε-approximate network distance between two vertices of the spatial network. The network distance between every pair of vertices in the spatial network is efficiently represented by adapting the well-separated pair technique to spatial networks. Initially, use is made of an ε-approximate distance oracle of size $O(n/\varepsilon^d)$ that is capable of retrieving the approximate network distance in $O(\log n)$ time using a B-tree. The retrieval time can be theoretically reduced to $O(1)$ time by proposing another ε-approximate distance oracle of size $O(n \log n/\varepsilon^d)$ that uses a hash table. Experimental results indicate that the proposed technique is scalable and can be applied to sufficiently large road networks. A 10%-approximate oracle (ε = 0.1) on a large network yielded an average error of 0.9%, with 90% of the answers making an error of 2% or less, and an average retrieval time of 68 µs. Finally, a strategy for the integration of the distance oracle into any relational database system, as well as using it to perform a variety of spatial queries such as region search, k-nearest neighbor search, and spatial joins on spatial networks, is discussed.
Shortest Path and Distance Queries on Road Networks: An Experimental Evaluation
Cited by 9 (0 self)
Computing the shortest path between two given locations in a road network is an important problem that finds applications in various map services and commercial navigation products. The state-of-the-art solutions for the problem can be divided into two categories: spatial-coherence-based methods and vertex-importance-based approaches. The two categories of techniques, however, have not been compared systematically under the same experimental framework, as they were developed from two independent lines of research that do not refer to each other. This renders it difficult for a practitioner to decide which technique should be adopted for a specific application. Furthermore, the experimental evaluation of the existing techniques, as presented in previous work, falls short in several aspects. Some methods were tested only on small road networks with up to one hundred thousand vertices; some approaches were evaluated using distance queries (instead of shortest path queries), namely, queries that ask only for the length of the shortest path; a state-of-the-art technique was examined based on a faulty implementation that led to incorrect query results. To address the above issues, this paper presents a comprehensive comparison of the most advanced spatial-coherence-based and vertex-importance-based approaches. Using a variety of real road networks with up to twenty million vertices, we evaluated each technique in terms of its preprocessing time, space consumption, and query efficiency (for both shortest path and distance queries). Our experimental results reveal the characteristics of different techniques, based on which we provide guidelines on selecting appropriate methods for various scenarios.
Query processing using distance oracles for spatial networks
Best Papers of ICDE 2009 Special Issue
Cited by 8 (2 self)
The popularity of location-based services and the need to do real-time processing on them has led to an interest in performing queries on transportation networks, such as finding shortest paths and finding nearest neighbors. The challenge here is that the efficient execution of spatial operations usually involves the computation of distance along a spatial network instead of “as the crow flies,” which is not simple. Techniques are described that enable the determination of the network distance between any pair of points (i.e., vertices) with as little as $O(n)$ space rather than having to store the $n^2$ distances between all pairs. This is done by being willing to expend a bit more time to achieve this goal, such as $O(\log n)$ instead of $O(1)$, as well as by accepting an error ε in the accuracy of the distance that is provided. The strategy that is adopted reduces the space requirements and is based on the ability to identify groups of source and destination vertices for which the distance is approximately the same within some ε. The reductions are achieved by introducing a construct termed a distance oracle that yields an estimate of the network distance (termed the ε-approximate distance) between any two vertices in the spatial network. The distance oracle is obtained by showing how to adapt the well-separated pair technique from computational geometry to spatial networks. Initially, an ε-approximate distance oracle of size $O(n/\varepsilon^d)$ is used that is capable of retrieving the approximate network distance in $O(\log n)$ time using a B-tree. The retrieval time can be theoretically reduced further to $O(1)$ time by proposing another ε-approximate distance oracle of size $O(n \log n \ldots$
HLDB: Location-based services in databases
In Proceedings of the 20th ACM SIGSPATIAL International Symposium on Advances in Geographic Information Systems (GIS’12), 339–348. ACM Press. Best Paper Award
, 2012
Cited by 5 (2 self)
This paper introduces HLDB, the first practical system that can answer exact spatial queries on continental road networks entirely within a database. HLDB is based on hub labels (HL), the fastest point-to-point algorithm for road networks, and its queries are implemented (quite naturally) in standard SQL. Within the database, HLDB answers exact distance queries and retrieves full shortest-path descriptions in real time, even on networks with tens of millions of vertices. The basic algorithm can be extended in a natural way (still in SQL) to answer much more sophisticated queries, such as finding the ten closest fast-food restaurants. We also introduce efficient new HL-based algorithms for even harder problems, such as best via point, ride sharing, and point of interest prediction. The HLDB framework makes it easy to implement these algorithms in SQL, enabling interactive applications on continental road networks.
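For readers unfamiliar with hub labels, a distance query reduces to intersecting two precomputed labels and minimizing the sum of the two stored distances over the shared hubs. The sketch below shows only that query step in Python rather than SQL; the label contents and names are hypothetical, and how HLDB builds and stores its labels is not shown in this abstract.

```python
def hl_distance(forward_label, backward_label):
    """Hub-label distance query: minimum of d(s, h) + d(h, t) over all
    hubs h that appear in both the source's and the target's label."""
    common = forward_label.keys() & backward_label.keys()
    if not common:
        return float("inf")  # no shared hub: treat t as unreachable from s
    return min(forward_label[h] + backward_label[h] for h in common)

# Hypothetical labels: hub -> distance from s (forward) or to t (backward).
LABEL_S = {"s": 0.0, "h1": 3.0, "h2": 7.0}
LABEL_T = {"t": 0.0, "h1": 4.0, "h2": 1.0}

DISTANCE = hl_distance(LABEL_S, LABEL_T)
```

Because the query is a join on the hub followed by a minimum, it maps naturally onto a relational join with an aggregate, which is consistent with the abstract's remark that the queries fit standard SQL.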
Online Document Clustering Using the GPU
, 2010
Cited by 5 (4 self)
Online document clustering takes as its input a list of document vectors, ordered by time. A document vector consists of a list of K terms and their associated weights. The generation of terms and their weights from the document text may vary, but the TF-IDF (term frequency-inverse document frequency) method is popular for clustering applications [1]. The assumption is that the resulting document vector is a good overall representation of the original document. We note that the dimensionality of the document vectors is very high (potentially infinite), since a document could potentially contain any word (term). We also note that the vectors are sparse in the sense that most term weights have a zero value. We assume that each term not explicitly present in a particular document vector has a weight of zero. Document vectors are normalized. Clusters are also represented as a list of weighted terms. At any given time, a cluster’s term vector is equal to the average of all the document vectors contained by the cluster. Cluster term vectors are truncated to the top K terms (those containing the highest term weights). Cluster term vectors are kept normalized. The objective of the algorithm is to partition the set of document vectors into a set of clusters, each cluster containing only those documents which are similar to each other with respect to some metric. For this paper, we consider the Euclidean dot product as the similarity metric, as it has been shown to provide good results with the TF-IDF metric [1]. The similarity between a cluster and a document is defined as the dot product between their term vectors. We first present a serial algorithm for online clustering. We then describe a PRAM algorithm for parallel online clustering, assuming a CRCW model. Finally, we present a practical implementation of an approximate parallel online clustering algorithm, suitable for the CUDA parallel computing architecture [2].
1. Serial Clustering
The basic serial online clustering algorithm takes as input a list of n document vectors, as well as a clustering threshold T ranging between 0 and 1. Below is a high-level overview of the algorithm.
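The overview itself is not part of this excerpt. As a rough illustration only, the sketch below follows the ingredients the text does state (normalized sparse vectors, dot-product similarity, cluster vectors that are averaged, truncated to the top K terms, and renormalized, and a threshold T). The assignment rule used here, join the most similar cluster if its similarity reaches T and otherwise start a new cluster, is an assumption, as are all function and field names.

```python
import math

def normalize(vec):
    """Scale a sparse term -> weight vector to unit Euclidean length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm > 0 else {}

def dot(a, b):
    """Sparse dot product; terms missing from a vector have weight zero."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def top_k(vec, k):
    """Keep only the k terms with the highest weights."""
    ranked = sorted(vec.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])

def cluster_online(documents, threshold, k):
    """Single pass over the documents in arrival order (assumed rule)."""
    clusters = []
    for index, doc in enumerate(documents):
        doc = normalize(doc)
        best, best_sim = None, -1.0
        for cluster in clusters:
            sim = dot(cluster["terms"], doc)
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is None or best_sim < threshold:
            best = {"sum": {}, "terms": {}, "members": []}
            clusters.append(best)
        for term, weight in doc.items():
            best["sum"][term] = best["sum"].get(term, 0.0) + weight
        best["members"].append(index)
        # The average differs from the sum only by a scale factor, so
        # truncating and normalizing the sum gives the same term vector.
        best["terms"] = normalize(top_k(best["sum"], k))
    return clusters

DOCS = [
    {"road": 1.0, "network": 1.0},
    {"road": 1.0, "network": 0.9},
    {"gpu": 1.0, "cuda": 1.0},
]
CLUSTERS = cluster_online(DOCS, threshold=0.5, k=4)
```

In this toy run the first two documents share a cluster and the third starts its own, since it has no terms in common with the first cluster.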
Roads Belong in Databases
Cited by 3 (1 self)
The popularity of location-based services and the need to perform real-time processing on them has led to an interest in queries on road networks, such as finding shortest paths and finding nearest neighbors. The challenge here is that the efficient execution of operations usually involves the computation of distance along a spatial network instead of “as the crow flies,” which is not simple. This requires the precomputation of the shortest paths and network distance between every pair of points (i.e., vertices) with as little space as possible rather than having to store the $n^2$ shortest paths and distances between all pairs. This problem is related to a ‘holy grail’ problem in databases of how to incorporate road networks into relational databases. A data structure called a road network oracle is introduced that resides in a database and enables the processing of many operations on road networks with just the aid of relational operators. Two implementations of road network oracles are presented.
Memory-Efficient Algorithms for Spatial Network Queries
Cited by 2 (0 self)
Incrementally finding the k nearest neighbors (kNN) in a spatial network is an important problem in location-based services. One method (INE) simply applies Dijkstra’s algorithm. Another method (IER) computes the k nearest neighbors using Euclidean distance followed by computing their corresponding network distances, and then incrementally finds the next nearest neighbors in order of increasing Euclidean distance until finding one whose Euclidean distance is greater than the current k nearest neighbor in terms of network distance. The LBC method improves on INE by avoiding the visit of nodes that cannot possibly lead to the k nearest neighbors by using a Euclidean heuristic estimator, and on IER by avoiding the repeated visits to nodes in the spatial network that appear on the shortest paths to different members of the k nearest neighbors by performing multiple instances of heuristic search using a Euclidean heuristic
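The INE baseline mentioned in this abstract, expanding the network from the query point in Dijkstra order and reporting objects as they are reached, can be sketched as follows. This is a simplified illustration rather than the paper's code: it assumes objects sit on vertices, and the graph format and names are hypothetical.

```python
import heapq

def ine_knn(graph, objects, q, k):
    """Expand from q in order of increasing network distance and collect
    the first k vertices that hold an object, with their distances."""
    dist = {q: 0.0}
    heap = [(0.0, q)]
    done = set()
    result = []
    while heap and len(result) < k:
        d, v = heapq.heappop(heap)
        if v in done:
            continue
        done.add(v)
        if v in objects:
            result.append((v, d))  # settled, so d is the network distance
        for w, weight in graph[v]:
            nd = d + weight
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return result

# Hypothetical toy network with objects located at vertices "c" and "d".
NETWORK = {
    "q": [("a", 1.0), ("b", 2.0)],
    "a": [("q", 1.0), ("b", 2.0), ("d", 1.0)],
    "b": [("q", 2.0), ("a", 2.0), ("c", 1.0)],
    "c": [("b", 1.0)],
    "d": [("a", 1.0)],
}
NEAREST = ine_knn(NETWORK, {"c", "d"}, "q", 2)
```

The expansion visits every vertex closer than the kth neighbor regardless of direction, which is the cost that the Euclidean-guided methods in the abstract aim to reduce.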
A Sorting Approach to Indexing Spatial Data
, 2008
Cited by 1 (1 self)
Spatial data is distinguished from conventional data by having extent. Therefore, spatial queries involve both the objects and the space that they occupy. The handling of queries that involve spatial data is facilitated by building an index on the data. The traditional role of the index is to sort the data, which means that it orders the data. However, since generally no ordering exists in dimensions greater than 1 without a transformation of the data to one dimension, the role of the sort process is one of differentiating between the data, and what is usually done is to sort the spatial objects with respect to the space that they occupy. The resulting ordering is usually implicit rather than explicit so that the data need not be re-sorted (i.e., the index need not be rebuilt) when the queries change (e.g., the query reference objects). The index is said to order the space and the characteristics of such indexes are explored further.
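As one concrete example of a transformation to one dimension, the sketch below sorts 2D points by their Morton (Z-order) code, obtained by interleaving the bits of the coordinates. The abstract does not name this particular ordering, so it is offered only as an illustration; it also treats objects as points with non-negative integer coordinates and ignores extent.

```python
def morton_code(x, y, bits=16):
    """Interleave the low `bits` bits of x and y (x in the even positions)
    to map a 2D grid cell to a position along a Z-order curve."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code

POINTS = [(1, 1), (0, 1), (2, 0), (1, 0), (0, 0)]
ORDERED = sorted(POINTS, key=lambda p: morton_code(*p))
```

The ordering depends only on the positions in space and not on any query, so it does not need to be recomputed when the query reference object changes.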