78 citations found.
S. Brin. Near neighbor search in large metric spaces. In Proceedings of the 21st International Conference on Very Large Data Bases (VLDB'95), pages 574-584, 1995.

This paper is cited in the following contexts:

Antipole Tree Indexing to Support Range Search and K-Nearest Neighbor Search

....partition objects into two balanced subsets (those as close or closer than the median and those farther than the median). The same procedure can recursively be applied to each of the two subsets. The Multivantage Point tree [6] is an intellectual descendant of the vantage point tree and the GNAT [7] structure and appears to be superior to the previous methods. The fundamental idea is that, given a point p, one can partition all objects into m partitions based on their distances from p, where the first partition consists of those points within distance $d_1$ from p, and the second consists of ....

S. Brin. Near neighbor search in large metric spaces. Proceedings of the 21st International Conference on Very Large Data Bases, pages 574-584, 1995.


Pivot Selection Techniques for Proximity Searching in.. - Bustos, Navarro, Chavez (2001)   (2 citations)

....other. For example, in [10] it is proposed to choose objects that maximize the sum of distances between pivots previously chosen (see Section 5.4 for more details); in [13] a heuristic is proposed, based on the second moment of the distance distribution, which selects objects that are far away; and in [3] a greedy heuristic is proposed to select objects that are the farthest apart (note that this last structure does not select pivots, but split points). However, these heuristics only work in specific metric spaces and behave badly in others. In $R^k$ with the Euclidean metric, it is shown ....

....rest of the objects of the metric space. The objects that satisfy these properties are called outliers. It is clear that pivots must be far away from each other, because two very close pivots give almost the same information for discarding objects. This is in accordance with previous observations [8,13,3]. Then, it can be assumed that good pivots are outliers, so a new selection technique could be as follows: use the same incremental selection method with the new criterion of selecting objects which maximize the sum of the distances between the pivots previously chosen, selecting the first pivot ....

S. Brin. Near neighbor search in large metric spaces. In Proc. 21st Conference on Very Large Databases (VLDB'95), pages 574--584, 1995.


Finding Nearest Neighbors in Growth-restricted Metrics - Karger, Ruhl (2002)   (36 citations)

....applications. The most frequent approach is by pivoting [10, 11], i.e., the space is partitioned into two halves by picking a random pivot, and putting points into either half of the partition according to their distance to the pivot element. Variations use multiple pivoting elements per split [3]. While these structures answer queries in O(log n) time for a point that is actually in the set S, they cannot be used efficiently to find nearest neighbors, or perform range queries unless the radii involved are very small. This is because in general the search ranges can split at every pivoting ....

S. Brin. Near neighbor search in large metric spaces. In Proceedings VLDB, pages 574--584, 1995.


Improved Dynamic Spatial Approximation Trees - Navarro, Reyes

.... to metric spaces as well: the typical feature in high dimensional spaces with $L_p$ distances is that the probability distribution of distances among elements has a very concentrated histogram (with larger mean as the dimension grows), making the work of any similarity search algorithm more difficult [2, 3]. In the extreme case we have a space where $d(x, x) = 0$ and $\forall y \neq x,\ d(x, y) = 1$, where it is impossible to avoid a single distance evaluation at search time. We say that a general metric space is high dimensional when its histogram of distances is concentrated. There are a number of methods to ....

S. Brin. Near neighbor search in large metric spaces. In Proc. 21st Conference on Very Large Databases (VLDB'95), pages 574-584, 1995.


Fully Dynamic Spatial Approximation Trees - Navarro, Reyes (2002)   (1 citation)

.... metric spaces as well: the typical feature in high dimensional spaces with $L_p$ distances is that the probability distribution of distances among elements has a very concentrated histogram (with larger mean as the dimension grows), making the work of any similarity search algorithm more difficult [2, 3]. We say that a general metric space is high dimensional when its histogram of distances is concentrated. For general metric spaces, there exist a number of methods to preprocess the database in order to reduce the number of distance evaluations [3]. All those structures work on the basis of ....

S. Brin. Near neighbor search in large metric spaces. In Proc. 21st Conference on Very Large Databases (VLDB'95), pages 574--584, 1995.


Probabilistic Proximity Searching Algorithms Based on Compact .. - Bustos, Navarro (2002)

.... that use the covering radius criterion are Bisector Trees (BST) [11], Monotonous BST [13], Voronoi Tree [8], M Tree [7], and List of Clusters [4]. Also, there exist algorithms that use both criteria, for example Spatial Approximation Tree (SAT) [12] and Geometric Near neighbor Access Tree [2]. Of all these algorithms, two of the most efficient are SAT and List of Clusters, so now we explain briefly how these algorithms work. 2.3 Spatial Approximation Tree. The SAT [12] is based on approaching the query spatially rather than dividing the search space, that is, start at some point in ....

S. Brin. Near neighbor search in large metric spaces. In Proc. 21st Conference on Very Large Databases (VLDB'95), pages 574--584, 1995.


Searching in Metric Spaces by Spatial Approximation - Navarro (1999)   (6 citations)

....or hardness for searching a D dimensional space: higher dimensional spaces have a probability distribution of distances among elements whose histogram is more concentrated and with larger mean. This makes the work of any similarity search algorithm more difficult (this is discussed for example in [33,6,9,14]). In the extreme case we have a space where $d(x, x) = 0$ and $\forall y \neq x,\ d(x, y) = 1$, where the query has to be exhaustively compared against every element in the set. We will extend this idea by saying that a general metric space is harder than another when its histogram of distances is more ....

....$cr(c_i)$ then there is no need to consider zone i. The techniques can be combined. Some using only hyperplanes are the gh trees and variants [31,27] and Voronoi trees [16,25]. Some using only covering radii are the M trees [15] and lists of clusters [12]. One using both criteria is the gna tree [6]. To answer 1-NN queries, we simulate a range query with a radius that is initially $r = \infty$, and reduce r as we find closer and closer elements to q. At the end, we have in r the distance to the closest elements and have seen them all. Unlike a range query, we are now interested in quickly finding ....

[Article contains additional citation context not shown here]

S. Brin. Near neighbor search in large metric spaces. In Proc. of the 21st Conference on Very Large Databases (VLDB'95), pages 574-584, 1995.
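
The 1-NN strategy quoted in the excerpt above (start with an infinite radius and shrink it as closer elements are found) can be illustrated with a minimal Python sketch. It is not taken from any of the cited papers: the single-pivot lower bound, the function name `nearest_neighbor`, and the flat list of elements are assumptions made only for illustration.

```python
import math


def nearest_neighbor(query, elements, dist, pivot):
    """1-NN search run as a range query whose radius r shrinks over time.

    Hypothetical sketch: one pivot and a flat list stand in for a real index.
    """
    d_qp = dist(query, pivot)
    # A real index would store these pivot distances at construction time.
    d_pu = [dist(pivot, u) for u in elements]

    best, r = None, math.inf   # radius starts at infinity
    evaluations = 1            # d(query, pivot) has already been computed
    for u, d in zip(elements, d_pu):
        # Triangle inequality: d(q, u) >= |d(q, p) - d(p, u)|.
        # If that lower bound is already >= r, u cannot improve the answer.
        if abs(d_qp - d) >= r:
            continue
        evaluations += 1
        d_qu = dist(query, u)
        if d_qu < r:           # closer element found: shrink the radius
            best, r = u, d_qu
    return best, r, evaluations
```

As the radius shrinks, more candidates fail the lower-bound test, which is why finding a close element early reduces the number of distance evaluations.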


Image Indexing Based on Spatial Similarity - Petrakis, Faloutsos

....(c) Methods based on trees, like the R tree [29] and its derivatives (R tree [30], etc.). Recent extensions for high dimensions include the X tree [31] and the SR trees [32]. Methods referred to as metric trees or distance trees are based on the idea of indexing using distance information [33, 34, 35]. All these methods try to exploit the triangle inequality in order to prune the search space on a range query. However, none of them tries to map images into points in a target space (also known as feature space) nor to provide a tool for visualization. Besides, most of these methods require ....

Sergey Brin. Near Neighbor Search in Large Metric Spaces. In Proceedings of the 21st VLDB Conference, pages 574--584, Zurich, Switzerland, 1995.


Efficient Shape Matching and Retrieval at Multiple Scales - Milios, Petrakis (1998)

....filling curves [40], and finally, (c) Methods based on trees (k-d trees [41]). One of the most characteristic methods is the R tree [42]. Methods referred to as metric trees are based on the idea of indexing using distance information. VantagePoint (VP) trees [43] and Geometric trees (GNAT) [44] are characteristic examples. Most of these methods require expensive preprocessing for building a tree index structure. An alternative to metric trees is FastMap [45], a fast algorithm that transforms data entities (e.g. shapes in our application) into multidimensional points. ....

Sergey Brin. Near Neighbor Search in Large Metric Spaces. In Proceedings of the 21st VLDB Conference, pages 574--584, Zurich, Switzerland, 1995.


Shock-based Indexing into Large Shape Databases - Sebastian, Klein, Kimia (2002)   (4 citations)

....space will necessarily exclude certain dimensions, variations along which will not lead to robust results. Thus, the problem is one of organizing a metric space which can at best be approximated as a very high dimensional Euclidean space. Distance based nearest neighbor search techniques [10, 31, 32, 9] have been used for searching metric spaces. These methods typically use the triangle inequality to avoid computing the distances of the query to all elements in the database. The basic idea is to select certain elements as pivots, group the remaining elements into clusters based on their ....

.... the performance of these methods deteriorates as the dimensionality of the space increases (curse of dimensionality), principally because pairwise distances between elements in a high dimensional space tend to fall in a narrow range, and the triangle inequality can only eliminate a few elements [9]. In this paper, we examine approaches to make indexing into large databases by matching shock graphs practical. In this approach the edit distance between shock graphs [25, 16, 15] is used as a metric of similarity between shapes. It is obtained by exhaustively searching for the optimal ....

S. Brin. Near neighbor search in large metric spaces. VLDB, pages 574-584, 1995.


Distance Browsing in Spatial Databases - Hjaltason, Samet (1999)   (38 citations)

.... case, it is not possible to produce new objects in the metric space, e.g. to aggregate or divide two objects (in a Euclidean space, bounding rectangles are often used for this purpose). Various methods exist for indexing objects in the metric space model as well as for computing proximity queries [11, 13, 14, 52, 53]. These methods can only make use of the properties of distance metrics (nonnegativity, symmetry, and the triangle inequality) and operate without any knowledge of how objects are represented or how the distances between objects are computed. Such a general approach is usually slower than methods ....

.... points [48]. Another approach is to abandon the goal of indexing the data points based on space occupancy and instead use properties of the distance metric employed (see the discussion of the metric space model in Section 2). If a hierarchical index method based on distance (e.g. [11, 14, 52]) is employed, our algorithm is still applicable. In fact, the k nearest neighbor algorithm presented in [14] is similar to our algorithm in that it uses a priority queue for nodes to guide the traversal of the index. If we use the Euclidean distance metric, the nearest neighbor search region ....

[Article contains additional citation context not shown here]

S. Brin. Near neighbor search in large metric spaces. In U. Dayal, P. M. D. Gray, and S. Nishio, editors, Proceedings of the 21st International Conference on Very Large Data Bases, pages 574--584, Zurich, Switzerland, September 1995.


Towards Measuring the Searching Complexity of Metric Spaces - Chavez, Navarro

....the dimension d. For general metric spaces, on the other hand, there are no known lower bounds, neither in the average nor in the worst case sense. In most cases even the analyses of particular algorithms seem so difficult that the authors validate their complexity claims just by experiments [6, 25]. A few authors attempt to formally analyze their algorithms [13, 27, 2, 23, 12] but they need to make simplifying assumptions that have to be experimentally validated anyway. Part of the problem is that general metric spaces may present widely varying features that affect the search time. For ....

....in metric spaces has provided a lower bound for the search complexity, as they have been obtained for metric spaces. Our goal is to define a measure of the intrinsic search difficulty which, albeit not necessarily related to a concept of dimension, permits us to derive those lower bounds. Many authors [6, 8, 12] have proposed to use the histogram of distances to characterize the difficulty of searching in an arbitrary metric space, but no quantitative definition has been attempted. We present now a quantitative measure in this line and study its suitability. Let us start with a well known example. Consider ....

S. Brin. Near neighbor search in large metric spaces. In Proc. 21st Conference on Very Large Databases (VLDB'95), pages 574-584, 1995.


An Effective Clustering Algorithm to Index High.. - Chávez, Navarro

.... can be translated to metric spaces as well: the typical feature in high dimensional spaces is that the probability distribution of distances among elements has a very concentrated histogram (with larger mean as the dimension grows), hampering the work of any similarity search algorithm [5, 7]. In the extreme case we have a space where $d(x, x) = 0$ and $\forall y \neq x,\ d(x, y) = 1$, where it is impossible to avoid a single distance evaluation at search time. We say that a general metric space is high dimensional when its histogram of distances is concentrated. We use in this paper a quantitative ....

....are at the same depth h, regardless of the bucket size. Vantage Point Trees (vp trees) [20, 22] are designed for continuous distance functions. The root has two equal size subtrees that divide the elements into those closer to and those farther from the root. This can be extended to m-ary trees (mvp trees) [5, 4]. Finally, algorithms like AESA [21], LAESA [16, 15] and its variants [18, 8], and Fixed Queries Arrays (fqarrays) [9] are based on a common idea: k pivots are selected and each object is mapped to k coordinates which are its distances to the pivots. Later, the query q is also mapped and if it ....

[Article contains additional citation context not shown here]

S. Brin. Near neighbor search in large metric spaces. In Proc. VLDB'95, pages 574--584, 1995.


Searching in Metric Spaces - Chávez, Navarro, Baeza-Yates, .. (1999)   (8 citations)

.... query time
BKT [19, 59]: n pointers, $O(n \log n)$, $O(n^\alpha)$
FQT [5]: $n \ldots n \log n$ pointers, $O(n \log n)$, $O(n^\alpha)$
FHQT [5, 4, 6]: $n \ldots nh$ pointers, $O(nh)$, $O(\log n)$, $O(n^\alpha)$
FQA [24]: $nhb$ bits, $O(nh)$, $O(\log n)$, $O(n^\alpha \log n)$
VPT [62, 68, 26]: n pointers, $O(n \log n)$, $O(\log n)$
MVPT [16, 15]: n pointers, $O(n \log n)$, $O(\log n)$
VPF [69]: n pointers, $O(n^{2-\alpha})$, $O(n^{1-\alpha} \log n)$
BST [44, 52]: n pointers, $O(n \log n)$, not analyzed
GHT [62, 18]: n pointers, $O(n \log n)$, not analyzed
GNAT [16]: $nm^2$ distances, $O(nm \log_m n)$, not analyzed
VT [32, 51, 63]: n ....

.... n)
VPT [62, 68, 26]: n pointers, $O(n \log n)$, $O(\log n)$
MVPT [16, 15]: n pointers, $O(n \log n)$, $O(\log n)$
VPF [69]: n pointers, $O(n^{2-\alpha})$, $O(n^{1-\alpha} \log n)$
BST [44, 52]: n pointers, $O(n \log n)$, not analyzed
GHT [62, 18]: n pointers, $O(n \log n)$, not analyzed
GNAT [16]: $nm^2$ distances, $O(nm \log_m n)$, not analyzed
VT [32, 51, 63]: n pointers, $O(n \log n)$, not analyzed
MT [27]: n pointers, $O(n (m \ldots m^2) \log_m n)$, not analyzed
SAT [48]: n pointers, $O(n \log n / \log \log n)$, $O(n^{1 - \Theta(1 / \log \log n)})$
AESA [64]: $n^2$ distances, $O(n^2)$, $O(1)$ ....

[Article contains additional citation context not shown here]

S. Brin. Near neighbor search in large metric spaces. In Proc. 21st Conference on Very Large Databases (VLDB'95), pages 574--584, 1995.


Incremental Similarity Search in Multimedia Databases - Hjaltason, Samet (2000)   (3 citations)

....(e.g. color histograms often lead to 64, or even as high as 256 dimensional vectors). Two strategies have been proposed for handling proximity queries in metric spaces. The first is to work directly within the metric space, often by building hierarchical distance based index structures [9, 11, 19, 66]. Nearest neighbor and k nearest neighbor algorithms have been proposed for many of these structures. The second strategy maps the database objects into a low to medium dimensional vector space and then makes use of efficient spatial indexing methods available there, such as the R tree [29]. The ....

....in leaf nodes, or by somehow detecting the fact that an object has been inserted earlier). 4.3 Generalized Hyperplane Partitioning Methods. 4.3.1 The gh tree. Uhlmann [66] defined a metric tree using generalized hyperplane partitioning, which has been termed a gh tree by later authors [9, 8, 27]. Instead of picking just one object for partitioning as in the vp tree, this method picks two pivots $p_1$ and $p_2$ (e.g. the objects farthest from each other) and splits the set of remaining objects based on the closest pivot (see Figure 6b): $S_1 = \{o \in S \setminus \{p_1, p_2\} \mid d(p_1, o) \le d(p_2, o)\}$, and $S$ ....

[Article contains additional citation context not shown here]

S. Brin. Near neighbor search in large metric spaces. In Proceedings of the 21st International Conference on Very Large Data Bases (VLDB), U. Dayal, P. M. D. Gray, and S. Nishio, eds., pages 574--584, Zurich, Switzerland, September 1995.


Database Techniques for Archival of Solid Models - McWherter, Peabody.. (2001)   (1 citation)

....element or the other. This technique has the problem that it requires two or more potentially expensive distance calculations at each level in the tree. One of the most sophisticated approaches is the GNAT tree, which introduces a number of heuristics to ensure that the tree is relatively balanced [5], although this results in a lot of computational overhead. The VP Tree and the GNAT tree, however, fail to perform as well when data is being inserted into the tree in a dynamic way, without costly balancing operations being performed. A structure known as the Metric Tree (M Tree) has been ....

S. Brin. Near neighbor search in large metric spaces. In Proceedings of VLDB 1995, pages 574--584, 1995.


Searching in Metric Spaces - Chavez, Navarro, Baeza-Yates.. (1999)   (8 citations)

....Finally, the author of [Yianilos 1993] considers the problem of pivot selection and argues that it is better to take elements far away from the set. 5.1.2.2 MVPT. The VPT can be extended to m-ary trees by using the $m - 1$ uniform percentiles instead of just the median. This is suggested in [Brin 1995; Bozkaya and Ozsoyoglu 1997]. In [Bozkaya and Ozsoyoglu 1997] the Multi Vantage Point Tree (MVPT) is presented. They propose the use of many elements in a single node, much as in [Shapiro 1977]. It can be seen that the space is O(n) since each internal node needs to store the m percentiles ....

....into both subtrees. In [Uhlmann 1991b] it is argued that GHTs could work better than VPTs in high dimensions. The same idea of reusing the parent node is proposed in [Bugnion et al. 1993], this time to avoid performing two distance evaluations at each node. 5.1.2.6 GNAT. The GHT is extended in [Brin 1995] to an m-ary tree, called GNAT (Geometric Near neighbor Access Tree), keeping the same essential idea. We select, for the first level, m centers $c_1 \ldots c_m$, and define $U_i = \{u \in U,\ d(c_i, u) < d(c_j, u),\ \forall j \neq i\}$. That is, $U_i$ are the elements closer to $c_i$ than to any other $c_j$. From the ....

[Article contains additional citation context not shown here]

Brin, S. 1995. Near neighbor search in large metric spaces. In Proc. 21st Conference on Very Large Databases (VLDB'95) (1995), pp. 574--584.
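
The first-level GNAT partition described in the excerpt above (each element goes to the zone of its closest center) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `partition_by_closest_center` is hypothetical, the centers are taken as given rather than selected by any heuristic from the paper, and ties are broken in favour of the lowest-numbered center.

```python
def partition_by_closest_center(elements, centers, dist):
    """Split elements into one zone per center, by closest center.

    Hypothetical sketch: zone i collects the elements that are closer to
    centers[i] than to any other center. A tie (equal distance to two
    centers) is resolved toward the lower index, which is an assumption.
    """
    zones = [[] for _ in centers]
    for u in elements:
        # Index of the center at minimum distance from u.
        closest = min(range(len(centers)), key=lambda j: dist(centers[j], u))
        zones[closest].append(u)
    return zones
```

Applying the same function recursively inside each zone, with new centers chosen there, would give the m-ary tree shape the excerpt refers to.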


Fixed Queries Array: A Fast and Economical Data.. - Chávez.. (2001)

....to either the Voronoi graph or its dual, the Delaunay triangulation. In this line we can find generalized hyperplanes (Kalantari and McDonald, 1983; Dehne and Noltemeier, 1987; Uhlmann, 1991b), the GNATs (Geometric Neighbor Access Trees) (Brin, 1995), and more recently the M trees (Ciaccia et al., 1997), the SB algorithm (Clarkson, 1999), and the SAT (Spatial Approximation Tree) (Navarro, 1999). The key idea in all these algorithms is to cluster the space so as to search by approaching spatially to the query, as opposed to the pivot based ....

Brin, S.: 1995, `Near neighbor search in large metric spaces'. In: Proc. 21st Conference on Very Large Databases (VLDB'95). pp. 574--584.


Dynamic Spatial Approximation Trees - Navarro, Reyes (2001)

.... to metric spaces as well: the typical feature in high dimensional spaces with $L_p$ distances is that the probability distribution of distances among elements has a very concentrated histogram (with larger mean as the dimension grows), making the work of any similarity search algorithm more difficult [5, 10]. In the extreme case we have a space where $d(x, x) = 0$ and $\forall y \neq x,\ d(x, y) = 1$, where it is impossible to avoid a single distance evaluation at search time. We say that a general metric space is high dimensional when its histogram of distances is concentrated. There are a number of methods to ....

....maximum distance between $c_i$ and an element in its zone. If $d(q, c_i) - r > cr(c_i)$ then there is no need to consider zone i. The techniques can be combined. Some techniques using only hyperplanes are [22, 19, 12]. Some techniques using only covering radii are [11, 9]. One using both criteria is [5]. Nearest neighbor queries. To answer 1-NN queries, we simulate a range query with a radius that is initially $r = \infty$, and reduce r as we find closer and closer elements to q. At the end, we have in r the distance to the closest elements and have seen them all. Unlike a range query, we ....

S. Brin. Near neighbor search in large metric spaces. In Proc. of the 21st Conference on Very Large Databases (VLDB'95), pages 574-584, 1995.
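
The covering-radius criterion in the excerpt above (a zone whose center is farther from the query than the search radius plus the zone's covering radius cannot contain an answer) can be sketched in Python. The names `covering_radius` and `range_query`, and the flat list of `(center, covering radius, members)` triples, are assumptions chosen for illustration; they are not an interface from any of the cited papers.

```python
def covering_radius(center, members, dist):
    """Maximum distance between a center and any element of its zone."""
    return max((dist(center, u) for u in members), default=0.0)


def range_query(query, radius, zones, dist):
    """Return elements within `radius` of `query`, skipping prunable zones.

    zones: list of (center, covering_radius, members) triples, where
    members is assumed to include the center itself.
    """
    results, visited = [], 0
    for center, cr, members in zones:
        # If d(q, c_i) - r > cr(c_i), no member of zone i can be within r.
        if dist(query, center) - radius > cr:
            continue
        visited += 1
        results.extend(u for u in members if dist(query, u) <= radius)
    return results, visited
```

The count of visited zones is returned only to make the effect of the pruning visible; a real index would apply the same test at every level of a tree.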


Pivot Selection Techniques for Proximity Searching in.. - Bustos, Navarro.. (2001)   (2 citations)

....all proximity search algorithms based on pivots choose them randomly among the elements of the database. However, it is well known that the way pivots are selected dramatically affects the search performance [10, 8, 9]. Some heuristics to choose the pivots better than at random have been presented [12, 4], but in general these heuristics only work in specific metric spaces and behave badly in others. In $R^k$ with the Euclidean metric, it is shown in [9] that it is possible to find an optimal set of $k + 1$ pivots selecting them as the vertices of a sufficiently large regular k dimensional ....

....of the elements of the metric space. The elements that satisfy these properties are called outliers. It is clear that pivots must be far away from each other, because two very close pivots give almost the same information for discarding elements. This is in accordance with previous observations [9, 12, 4]. [Figure 5. Number of pivots needed to answer range queries using random and incremental selection with the same total complexity; axis: distance evaluations.] Then, it can be assumed ....

S. Brin. Near neighbor search in large metric spaces. In Proc. 21st Conference on Very Large Databases (VLDB'95), pages 574--584, 1995.


Towards Measuring the Searching Complexity of Metric Spaces - Chávez, Navarro

....Project R 28923A. For general metric spaces, on the other hand, there are no known lower bounds, neither in the average nor in the worst case sense. In most cases even the analyses of particular algorithms seem so difficult that the authors validate their complexity claims just by experiments [6, 7]. A few authors attempt to formally analyze their algorithms [8-11, 1] but they need to make simplifying assumptions that have to be experimentally validated anyway. In [12] we presented a framework able to unify the existing approaches under a unique theoretical model. This paper is aimed at ....

....in metric spaces has provided a lower bound for the search complexity, as they have been obtained for metric spaces. Our goal is to define a measure of the intrinsic search difficulty which, albeit not necessarily related to a concept of dimension, permits us to derive those lower bounds. Many authors [6, 17, 1] have proposed to use the histogram of distances to characterize the difficulty of searching in an arbitrary metric space, but no quantitative definition has been attempted. We present now a quantitative measure in this line and study its suitability. Let us start with a well known example. Consider ....

Brin, S.: Near neighbor search in large metric spaces. In: Proc. 21st Conference on Very Large Databases (VLDB'95). (1995) 574-584


Processing Complex Similarity Queries with Distance-based .. - Ciaccia, Patella, Zezula

....$d_{max}(v_i, Reg(N))$ bounds, since they depend on the kind of data regions managed by the index. For instance, in M tree the above bounds are computed as $\max\{d(v_i, v_r) - r(v_r), 0\}$ and $d(v_i, v_r) + r(v_r)$, respectively [CPZ97]. Simple calculations are similarly required for other metric trees [Chi94, Bri95, BO97], as well as for spatial access methods, such as R tree (see [RKV95]). 4.1 False Drops at the Index Level. The absence of any specific assumption about the similarity environment and the access method in Theorem 4 makes it impossible to guarantee the absence of false drops at the level of index ....

S. Brin. Near neighbor search in large metric spaces. In Proceedings of the 21st VLDB International Conference, pages 574--584, Zurich, Switzerland, September 1995.


Similarity Search without Tears: the OMNI-Family of All-Purpose .. - Filho, al. (2001)   (1 citation)

....assigning the remaining to the closest representative. Bozkaya and Ozsoyoglu [7, 6] proposed an extension of the vp tree called multi vantage point tree (mvp tree) which chooses in a clever way m vantage points for a node which has a fanout of $m^2$. The Geometric Near Access Tree (GNAT) of Brin [8] can be viewed as a refinement of the second technique presented in [9]. It stores the distances between pairs of representatives in addition to the representative and the maximum distance. These distances can be used to prune the search space using triangle inequality. An excellent survey of ....

S. Brin, "Near neighbor search in large metric spaces," Proc. Intl. Conf. on Very Large Databases (VLDB), Zurich, Switzerland, 1995, pp. 574-584.


Approximate String Joins in a Database (Almost) for Free - Gravano, Ipeirotis.. (2001)

....guide the search for approximate string matches [4, 11]. In [1], Baeza-Yates and Gonnet solve the problem of exact substring joins, using suffix arrays and outside the context of a relational database. In the context of databases, several indexing techniques proposed for arbitrary metric spaces [3, 2] could be applied for the problem of approximately retrieving strings. However, such structures have to be supported by the database management system. Cohen [5] presented a framework for the integration of heterogeneous databases based on textual similarity and proposed WHIRL, a logic that ....

S. Brin. Near neighbor search in large metric spaces. In Proceedings of the 21st International Conference on Very Large Databases (VLDB'95), pages 574--584, 1995.


An Indexing and Retrieval Mechanism for Complex Similarity.. - Guang-Ho Cha

....for nearest neighbor queries. Thus, it is difficult to process range queries. In fact, most of the index structures designed only for nearest neighbor queries have these common problems. For example, the optimistic VP (vantage point) tree [16] and the GNAT (Geometric Near neighbor Access Tree) [17] are index structures of this kind. They precalculate some nearest neighbors of points, store the distances in a tree or graph, and use the precalculated information for a more efficient nearest neighbor search. Therefore, they have benefit in the nearest neighbor search time, but have ....

S. Brin, "Near Neighbor Search in Large Metric Spaces," Proceedings of the 21st VLDB International Conference, 1995, 574-584.


Techniques for Supporting Efficient Content-based.. - Kurniawati, Jin..

....used in building these structures. Almost all of the structures assume that we operate in a vector space (a subset of the metric space with the additional requirement that each vector has a fixed dimension). Structures that operate within a metric space directly (Fukunaga and Narendra, 1975; Brin, 1995; Burkhard and Keller, 1973) will not be as efficient as ones operating with the additional restriction of a vector space. All the structures using bounding boxes assume that we do not have any crosstalk between dimensions in the distance function. If there are some crosstalks between dimensions ....

....Similarity search algorithms are actually nearest neighbour search algorithms with an option to ignore nodes that are not close enough to the query vector. Hence, we can utilize the algorithms developed for nearest neighbour searches (Burkhard and Keller (1973); Fukunaga and Narendra (1975); Brin (1995); Kamgar-Parsi and Kanal (1985); Hjaltason and Samet (1995); Roussopoulos et al. (1995)) to do the search. Region queries can be answered by searching all the nodes whose envelopes intersect the given region. If we only care about the final result, then there will not be much difference between the ....

[Article contains additional citation context not shown here]

BRIN, S. (1995). Near neighbor search in large metric spaces. In VLDB 1995.


Spaghettis: An Array Based Algorithm for.. - Chávez.. (1999)   (3 citations)

....reference to such a construction. Instead of building the Voronoi diagram for every database element, a hierarchy of divisions is defined using two elements per level. This is generalized for using more than one element of the Voronoi diagram in the Geometric Near Neighbor Access Tree or GNAT [4] and in the SB algorithm [8]. The GNATs are useful for range queries, while the SB can be used for nearest neighbor queries as well. A different approach [12] is based on an adaptation of the dual of the Voronoi graph, the Delaunay triangulation. In [12] the construction of the Spatial ....

S. Brin. Near neighbor search in large metric spaces. In Proc. VLDB'95, pages 574-584, 1995.


A Unified Model for Similarity Searching - Chávez, Navarro..

....so as to predict its probable future behavior), etc. Since the problem has appeared in unrelated areas, the corresponding algorithms and data structures seem to emerge from a great diversity, and different approaches have been proposed and analyzed separately, often under different assumptions [5, 20, 22, 19, 21, 23, 13, 15, 1, 4, 14, 18, 3, 11, 17, 7, 8, 24]. Due to space limitations we refer the reader to a recent survey where all the known approaches for similarity searching are discussed [9]. Currently, the only realistic way to compare two different algorithms is to apply them to the same data set. We present a unified complexity model for the ....

....hierarchy we could proceed downwards from a very coarse level building a candidate list of equivalence classes of the next level, using for example D_j; this candidate list will be refined using the D_{j-1} distance function and so on until we reach the bottom level. This is done, e.g. in [4]. The concept of discriminative power serves as an indicator of the performance or fitness of the equivalence relation (or equivalently, of the distance function D). In general, it will be more costly to have more discriminative power. A related concept is that of fragmentation, which is ....

[Article contains additional citation context not shown here]

S. Brin. Near neighbor search in large metric spaces. In Proc. 21st Conference on Very Large Databases (VLDB'95), pages 574--584, 1995.


Unbalancing: the Key to Index High Dimensional Metric Spaces - Chávez, Navarro

.... can be translated to metric spaces as well: the typical feature in high dimensional spaces is that the probability distribution of distances among elements has a very concentrated histogram (with larger mean as the dimension grows), difficulting the work of any similarity search algorithm [5, 7]. In the extreme case we have a space where d(x, x) = 0 and ∀y ≠ x, d(x, y) = 1, where it is impossible to avoid a single distance evaluation at search time. We say that a general metric space is high dimensional when its histogram of distances is concentrated. There are a number of methods to ....

....are at the same depth h, regardless of the bucket size. Vantage Point Trees (vp-trees) [17, 19] are designed for continuous distance functions. The root has two equal size subtrees that divide the elements in closer to and farther from the root. This can be extended to m-ary trees (mvp-trees) [5, 4]. Generalized hyperplane trees (gh-trees) [17] use two pivots for each tree node and divide the space according to which of the two pivots is closer to each object. If this is generalized to an m-ary partition then a Geometric Near neighbor Access Tree (gna-tree) is obtained [5] which makes a ....
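
As a rough illustration of the vp-tree split described in the excerpt above (elements divided into those closer to and those farther from the root), the following is a minimal sketch, not code from any of the cited papers. The function name vp_split, the parameter names, and the choice of the median distance as the cut point are assumptions made for illustration; with tied distances the two halves may not be exactly equal in size.

```python
from statistics import median


def vp_split(root, elements, dist):
    """Split `elements` around the median distance to `root`.

    Hypothetical helper: `dist` is any metric supplied by the caller.
    Returns (median_distance, closer, farther).
    """
    # One distance evaluation per element.
    distances = [dist(root, e) for e in elements]
    m = median(distances)
    # Elements at or below the median go to the "closer" subtree.
    closer = [e for e, d in zip(elements, distances) if d <= m]
    farther = [e for e, d in zip(elements, distances) if d > m]
    return m, closer, farther
```

Applying the same function recursively to each half, with a new root chosen in each, would give the binary tree the excerpt refers to.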

[Article contains additional citation context not shown here]

S. Brin. Near neighbor search in large metric spaces. In Proc. VLDB'95, pages 574-584, 1995.


Searching in Metric Spaces - Chávez, Navarro, Baeza-Yates, .. (1999)   (8 citations)

....and allows overlaps in the areas covered (i.e. a point may belong to more than one partition). This idea is also present in R-trees [36] for vector spaces. MVPT: The VPT can be extended to m-ary trees by using the m − 1 uniform percentiles instead of just the median. This is suggested in [16, 15]. In [15] the Multi Vantage Point Tree (MVPT) is presented. They propose the use of many elements in a single node, much as in [51]. It can be seen that the space is O(n) since each internal node needs to store the m percentiles but the leaves do not. The construction time is O(n log n) if we ....

....construction. No analysis is given in [54] but we obtain it by specializing the more general GNATs. [Figure 5: Example of the first level of a GHT.] GNAT: The GHT is extended in [16] to an m-ary tree, called GNAT (Geometric Near neighbor Access Tree), keeping the same essential idea. We select, for the first level, m pivots p_1 ... p_m, and define U_i = {u ∈ U, d(p_i, u) < d(p_j, u) ∀j ≠ i}. That is, U_i are the elements closer to p_i than to any other p_j. From the root, ....
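
The first-level GNAT partition in the excerpt above, where each element goes to the pivot it is closest to, can be sketched as follows. This is an illustrative sketch only: the name voronoi_partition and the tie-breaking rule (ties go to the lowest-numbered pivot) are assumptions, and the sketch covers neither the recursive construction nor the search procedure of the cited work.

```python
def voronoi_partition(pivots, elements, dist):
    """Group `elements` by nearest pivot.

    Hypothetical helper: returns a list `parts` where parts[i] holds the
    elements whose closest pivot is pivots[i]. Ties are resolved in favour
    of the lowest pivot index (an assumption made for this sketch).
    """
    parts = [[] for _ in pivots]
    for u in elements:
        # min() returns the first index that attains the smallest distance.
        i = min(range(len(pivots)), key=lambda j: dist(pivots[j], u))
        parts[i].append(u)
    return parts
```

For example, with pivots 0, 10 and 20 on the real line under absolute difference, the elements 1, 4, 6, 9, 14, 16, 21 fall into the groups [1, 4], [6, 9, 14] and [16, 21].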

[Article contains additional citation context not shown here]

S. Brin. Near neighbor search in large metric spaces. In Proc. 21st Conference on Very Large Databases (VLDB'95), pages 574--584, 1995.


Searching in Metric Spaces by Spatial Approximation - Navarro (1999)   (6 citations)

.... can be translated to metric spaces as well: the typical feature in high dimensional spaces is that the probability distribution of distances among elements has a very concentrated histogram (with larger mean as the dimension grows), difficulting the work of any similarity search algorithm [5, 7]. In the extreme case we have a space where d(x, x) = 0 and ∀y ≠ x, d(x, y) = 1, where it is impossible to avoid a single distance evaluation at search time. We say that a general metric space is high dimensional when its histogram of distances is concentrated. There are a number of methods to ....

....are at the same depth h, regardless of the bucket size. Vantage Point Trees (vp-trees) [13, 15] are designed for continuous distance functions. The root has two equal size subtrees that divide the elements in closer to and farther from the root. This can be extended to m-ary trees (mvp-trees) [5, 4]. Generalized hyperplane trees (gh-trees) [13] use two pivots for each tree node and divide the space according to which of the two pivots is closer to each object. If this is generalized to an m-ary partition then a Geometric Near neighbor Access Tree (gna-tree) is obtained [5] which makes a ....

[Article contains additional citation context not shown here]

S. Brin. Near neighbor search in large metric spaces. In Proc. VLDB'95, pages 574--584, 1995.


An Effective Clustering Algorithm to Index High Dimensional.. - Chávez, Navarro

.... can be translated to metric spaces from vector spaces: the typical feature in high dimensional spaces is that the probability distribution of distances among elements has a very concentrated histogram (with larger mean as the dimension grows), hampering the work of any similarity search algorithm [4, 6]. In the extreme case we have a space where d(x, x) = 0 and ∀y ≠ x, d(x, y) = 1, where it is impossible to avoid a single distance evaluation at search time. We say that a general metric space is high dimensional when its histogram of distances is concentrated. We use in this paper a ....

....are at the same depth h, regardless of the bucket size. Vantage Point Trees (vp-trees) [18, 20] are designed for continuous distance functions. The root has two equal size subtrees that divide the elements in closer to and farther from the root. This can be extended to m-ary trees (mvp-trees) [4, 3]. Finally, algorithms like AESA [19], LAESA [14, 13] and its variants [16, 7] and Fixed Queries Arrays (fq-arrays) [8] are based in a common idea: k pivots are selected and each object is mapped to k coordinates which are its distances to the pivots. Later, the query q is also mapped and if it ....
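
The pivot mapping mentioned in the excerpt above (each object stored as its k distances to the pivots, with the query mapped the same way) can be illustrated with a small sketch. The excerpt is cut off before it states the discard rule, so the rule used here is an assumption: the standard triangle-inequality bound |d(q, p) - d(u, p)| <= d(q, u), under which an object can be skipped when that bound exceeds the search radius r for some pivot p. The names build_table and range_query are hypothetical, and this is not the code of AESA, LAESA, or fq-arrays.

```python
def build_table(pivots, objects, dist):
    """Map each object to its k distances to the pivots (one row per object)."""
    return [[dist(p, u) for p in pivots] for u in objects]


def range_query(q, r, pivots, objects, table, dist):
    """Return the objects within distance r of q, using pivots to skip work.

    Assumes at least one pivot and that `dist` satisfies the triangle
    inequality; both are assumptions of this sketch.
    """
    # Map the query to the same k coordinates.
    dq = [dist(p, q) for p in pivots]
    result = []
    for u, row in zip(objects, table):
        # Lower bound on d(q, u) from the triangle inequality.
        lower_bound = max(abs(a - b) for a, b in zip(dq, row))
        if lower_bound > r:
            continue  # u cannot be within r of q; no distance evaluation
        if dist(q, u) <= r:
            result.append(u)
    return result
```

Only the objects that survive the bound cost a real distance evaluation at query time, which is the saving these pivot-based methods aim for.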

[Article contains additional citation context not shown here]

S. Brin. Near neighbor search in large metric spaces. In Proc. VLDB'95, pages 574--584, 1995.


Measuring the Dimensionality of General Metric Spaces - Chávez, Navarro

....the leaves are at the same depth h, regardless of the bucket size. Vantage Point Trees (VPTs) [36, 39] are designed for continuous distance functions. The root has two equal size subtrees that divide the elements in closer to and farther from the root. This can be extended to m-ary trees (MVPTs) [10, 9]. Finally, algorithms like AESA [37], LAESA [31, 30] and its variants [33, 13] and Fixed Queries Arrays (FQAs) [14] are based in a common idea: k pivots are selected and each object is mapped to k coordinates which are its distances to the pivots. Later, the query q is also mapped and if it ....

....to contain all the points in the zone, and the elements are inserted in the subtrees trying to minimize covering radii. Voronoi Trees (VTs) [19] are a modification that tries reduce the covering radii. GHTs are generalized to an m-ary partition in the Geometric Near neighbor Access Tree (GNATs) [10], which makes a Voronoi-like partition of the space [1] among the m pivots at each node of the tree. However, the GNAT uses also the covering radius criterion to prune the search even more. The M-tree (MT) [16] also takes m elements and divides the space among its zones of influence, but it uses ....

[Article contains additional citation context not shown here]

S. Brin. Near neighbor search in large metric spaces. In Proc. 21st Conference on Very Large Databases (VLDB'95), pages 574--584, 1995.


Nearest Neighbour Search in Hausdorff Distance Pattern Spaces - Braß, Knauer (2001)

No context found.

S. Brin. Near neighbor search in large metric spaces. In The VLDB Journal, pages 574-584, 1995.


Approximate String Joins in a Database (Almost) for Free - Gravano, Ipeirotis.. (2001)

No context found.

S. Brin. Near neighbor search in large metric spaces. In Proceedings of the 21st International Conference on Very Large Databases (VLDB'95), pages 574--584, 1995.


High-Dimensional Access Methods for Efficient Similarity.. - Moënne-Loccoz (2005)

No context found.

S. Brin. Near neighbor search in large metric spaces. In VLDB '95: Proceedings of the 21th International Conference on Very Large Data Bases, pages 574--584, San Francisco, CA, USA, 1995. Morgan Kaufmann Publishers Inc.


Giving suggestions to Misspelled Words: - An Application Of

No context found.

Sergey Brin. Near neighbor search in large metric spaces. In The VLDB Journal, pages 574--584, 1995.


Fast and Accurate Handwritten Character Recognition.. - Perez-Cortes.. (2000)

No context found.

S. Brin. Near neighbor search in large metric spaces. In Proc. 21st Inter. Conf. on Very Large Data Bases, pages 574-584, 1995.


The ND-Tree: A Dynamic Indexing Technique for.. - Qian, Zhu, Xue, Pramanik (2003)

No context found.

S. Brin. Near neighbor search in large metric spaces. In Proc. of VLDB, pp. 574--584, 1995.


Navigating nets: Simple algorithms for proximity search.. - Krauthgamer, Lee (2004)   (13 citations)

No context found.

S. Brin. Near neighbor search in large metric spaces. In 21st International Conference on Very Large Data Bases, pages 574--584, 1995.


Analysis of Distance Based Indexing Methods for.. - Mahdi Mirzazadeh..

No context found.

S. Brin, Near neighbor search in large metric spaces, in Proc. of 21st Int. Conf. on Very Large Data Bases (VLDB), Zurich, Switzerland, 1995, pp. 574-584.


A Pivot-Based Routine for Improved Parent-Finding in Hybrid MDS - Morrison, Chalmers (2004)

No context found.

S. Brin. Near neighbor search in large metric spaces. In Proceedings of the 21st Conference on Very Large Databases, pages 574--584, 1995.


NNH: Improving Performance of Nearest-Neighbor Searches Using .. - Jin, Koudas, Li (2004)   (1 citation)

No context found.

Brin, S.: Near neighbor search in large metric spaces. In: The VLDB Journal. (1995) 574--584


Analysis of Search Algorithms and Tree Structures for.. - Neha Singh Undergraduate

No context found.

S. Brin. Near neighbor search in large metric spaces. In Proc. 21st Conference on Very Large Databases (VLDB '95), pages 574-584, 1995.


NNH: Improving Performance of Nearest-Neighbor Searches Using .. - Jin, Koudas, Li (2003)   (1 citation)

No context found.

Brin, S.: Near neighbor search in large metric spaces. In: The VLDB Journal. (1995) 574--584


Probabilistic Proximity Searching Algorithms Based on Compact .. - Bustos, Navarro (2002)

No context found.

S. Brin. Near neighbor search in large metric spaces. In Proc. 21st Conference on Very Large Databases (VLDB'95), pages 574-584, 1995.


Metric-based Shape Retrieval in Large Databases - Thomas Sebastian Benjamin

No context found.

S. Brin. Near neighbor search in large metric spaces. VLDB, pages 574--584, 1995.


Searching in Metric Spaces by Spatial Approximation - Navarro (1999)   (6 citations)

No context found.

S. Brin. Near neighbor search in large metric spaces. In Proc. of the 21st Conference on Very Large Databases (VLDB'95), pages 574--584, 1995.

CiteSeer.IST - Copyright Penn State and NEC