| W. A. Burkhard and R. M. Keller. Some approaches to best-match file searching. Commun. ACM, 16(4):230--236, 1973. |
....on the tolerance range of the query, r. In practice, however, k # is so large that one cannot store the k # n distances, and the index simply uses as many pivots as space permits. There are many proximity search algorithms in metric spaces that are based on pivots, such as Burkhard-Keller Tree [4], Fixed Queries Tree (FQT) [1], Fixed-Height FQT (FHQT) [1], Fixed Queries Array (FQA) [5], Vantage Point Tree [13], Multi Vantage Point Tree [2], Excluded Middle Vantage Point Forest [14], AESA [12], Linear AESA (LAESA) [10], and Spaghettis [6]. 3 Efficiency criterion Depending on how pivots are ....
W. Burkhard and R. Keller. Some approaches to best-match file searching. Comm. of the ACM, 16(4):230--236, 1973.
....criteria can be applied easily regardless of how the distance function is defined. Some researchers even extend TIEC to situations where the dissimilarity between vectors is measured by a distortion measure which does not necessarily obey the triangle inequality [51, 50]. Burkhard and Keller [16] apply TIEC methods to nearest neighbour queries for databases. TIEC 1 is analogous to their clique criterion, while TIEC 2 is very similar to their joint cut-off criterion. TIEC 2 is also the basis for the AESA proposed by Vidal [144, 170, 172] and later improved by Micó et al. [116] ....
....Although we may not be able to improve the r further unless the distribution of the target vectors into clusters is known, we can still improve the bounds by choosing a y curr which is as close as possible to the target vector. Many of the proposed techniques choose the initial y curr arbitrarily [144, 170, 172, 116, 150, 16]. Some choose the codeword whose projection value, such as the Euclidean norm [71, 103] or the mean value of components [30, 134] is closest to that of the target vector. Since close scalar projections do not always mean that the corresponding vectors are also close to each other, there is no ....
[Article contains additional citation context not shown here]
W. A. Burkhard and R. M. Keller. Some approaches to best match file searching. Communications ACM, 16(4):230--236, April 1973.
....subsequences that are a distance less than 1 unit away, there is no point in determining the exact value of D(Q,Cb) which we now know to be at least 5 units away. The first formalization of this idea for fast searching of nearest neighbors in matrices is generally credited to Burkhard and Keller [5]. More efficient implementations are possible, for example Shasha and Wang [33] introduced the Approximation Distance Map (ADM) algorithm that takes advantage of an arbitrary set of pre computed distances instead of using just one randomly chosen reference point. For the problem at hand, ....
....i and fl. ADM and MIN are initialized from line 4 to line 8. From line 10 to line 22, we construct the matrix ADM and MIN as described in [33]. For each stage k, 1 ≤ k ≤ n, ADM[i,j] is the greatest lower bound of any path from i to j that does not pass through an object numbered greater than k [5]. Similarly, MIN[i, j] is the smallest upper bound of the distance between i and j. Note that we further optimized the algorithm by storing or computing only half of the matrices, due to distance symmetry. However, for simplicity and consistency, we show the algorithm of constructing ADM as was ....
Burkhard, W. A. & Keller, R. M. (1973). Some approaches to best-match file searching. Commun. ACM, April. Vol. 16(4), pp 230-236.
....that are a distance less than 1 unit away, there is no point in determining the exact value of D(Q,Cb) which we now know to be at least 5 units away. The first formalization of this idea for fast searching of nearest neighbors in matrices is generally credited to Burkhard and Keller [5]. More efficient implementations are possible; for example, Shasha and Wang [33] introduced the Approximation Distance Map (ADM) algorithm that precomputes an arbitrary set of distances instead of using just one randomly chosen reference point. For the problem at hand, however, the techniques ....
Burkhard, W. A. & Keller, R. M. (1973). Some approaches to best-match file searching. Commun. ACM, April. Vol. 16(4), pp 230-236.
....factor in determining retrieval speed is the time consumed in calculating the distances (e.g. Euclidean) between the query vector and each of the image vectors stored in a large database. Berman and Shapiro [8,9] utilized the triangle inequality technique, first introduced by Burkhard and Keller [10], to significantly reduce the number of direct distance calculations needed for an efficient search algorithm. The basis of this technique is that the distance between the query and the image index vectors cannot be less than the absolute value of the difference between (a) the distance between ....
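The bound described in this snippet can be sketched as follows. This is a minimal illustration, not the code of any cited paper: the function names, the single reference object, and the Euclidean metric are all assumptions made for the example. It relies only on the triangle inequality, which gives d(q, x) >= |d(q, ref) - d(x, ref)|.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_with_reference(query, items, reference, dist=euclidean):
    """Nearest-neighbour search that skips items using a triangle-inequality bound.

    Returns (best_item, best_distance, number_of_distance_computations_at_query_time).
    """
    # Index time: distances from every item to the reference object (hypothetical setup).
    ref_dists = [dist(x, reference) for x in items]

    # Query time: one distance to the reference, then bounds for every item.
    d_q_ref = dist(query, reference)
    computed = 1
    best, best_d = None, math.inf
    for x, d_x_ref in zip(items, ref_dists):
        # Lower bound on d(query, x); if it cannot beat the current best, skip x.
        if abs(d_q_ref - d_x_ref) >= best_d:
            continue
        d = dist(query, x)
        computed += 1
        if d < best_d:
            best, best_d = x, d
    return best, best_d, computed
```

How many items are skipped depends heavily on the choice of reference object and on the order in which items are visited; this sketch makes no attempt to optimize either.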
W. A. Burkhard and R. M. Keller, "Some approaches to best-match file searching," Comm. ACM, Vol. 16, No.4, pp. 230-236, 1973.
.... case, it is not possible to produce new objects in the metric space, e.g. to aggregate or divide two objects (in a Euclidean space, bounding rectangles are often used for this purpose). Various methods exist for indexing objects in the metric space model as well as for computing proximity queries [11, 13, 14, 52, 53]. These methods can only make use of the properties of distance metrics (nonnegativity, symmetry, and the triangle inequality) and operate without any knowledge of how objects are represented or how the distances between objects are computed. Such a general approach is usually slower than methods ....
W. A. Burkhard and R. Keller. Some approaches to best-match file searching. Communications of the ACM, 16(4):230--236, April
....schemes. 2. Related Work Different data structures have been proposed to filter out elements based on the triangular inequality (see [10] for a complete survey). We divide the exposition according to the two classes of techniques. 2.1. Pivot-based Algorithms Burkhard-Keller Trees (bk-trees) [6] are designed for discrete distance functions: they select a pivot element p as the root of the tree, and put at child i the elements which are at distance i to the pivot. Each subtree is recursively built with the same technique until there are b elements or less, in which case the elements are ....
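The tree described in this snippet can be sketched roughly as below. This is an illustrative simplification under stated assumptions: elements are inserted one at a time rather than built recursively, there are no leaf buckets of size b, and the class name `BKTree`, its methods, and the Hamming distance used in the usage note are hypothetical choices for the example.

```python
class BKTree:
    """Minimal Burkhard-Keller tree for an integer-valued metric `dist`."""

    def __init__(self, dist):
        self.dist = dist
        self.root = None  # each node is a pair: (element, {distance: child node})

    def add(self, item):
        """Insert `item`, descending into the child indexed by its distance to each pivot."""
        if self.root is None:
            self.root = (item, {})
            return
        node = self.root
        while True:
            pivot, children = node
            d = self.dist(item, pivot)
            if d in children:
                node = children[d]
            else:
                children[d] = (item, {})
                return

    def range_query(self, query, r):
        """Return all stored elements within distance r of `query`."""
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            pivot, children = stack.pop()
            d = self.dist(query, pivot)
            if d <= r:
                results.append(pivot)
            # Triangle inequality: a child at distance i can only hold answers
            # if |d - i| <= r, so every other child is pruned.
            for i, child in children.items():
                if abs(d - i) <= r:
                    stack.append(child)
        return results


def hamming(a, b):
    """Hamming distance between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))
```

For example, after inserting the words "book", "cook", "boon", "cake", "cape" and "cart" into `BKTree(hamming)`, calling `range_query("bock", 1)` visits only the subtrees whose edge label is within 1 of the query's distance to each pivot.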
W. Burkhard and R. Keller. Some approaches to best-match file searching. CACM, 16(4):230--236, 1973.
....some ideas are better or worse. We add a final subsection devoted to more advanced issues such as dynamic capabilities, I/O considerations and approximate and probabilistic algorithms.

| Data Structure | Space Complexity | Construction Complexity | Claimed Query Complexity | Extra CPU query time |
|---|---|---|---|---|
| BKT [19, 59] | n pointers | O(n log n) | O(n^α) | |
| FQT [5] | n .. n log n pointers | O(n log n) | O(n^α) | |
| FHQT [5, 4, 6] | n .. nh pointers | O(nh) | O(log n) | O(n^α) |
| FQA [24] | nhb bits | O(nh) | O(log n) | O(n^α log n) |
| VPT [62, 68, 26] | n pointers | O(n log n) | O(log n) | |
| MVPT [16, 15] | n pointers | O(n .... | | |
....Distance Functions We start by describing tree data structures that apply to distance functions that return a small set of different values. At the end we show how to cope with the general case with these trees. BKT Probably the first general solution to search in metric spaces was presented in [19]. They propose a tree (thereafter called Burkhard-Keller Tree, or BKT) which is suitable for discrete-valued distance functions. It is defined as follows: an arbitrary element $p \in U$ is selected as the root of the tree. For each distance $i > 0$, we define $U_i = \{u \in U, d(u, p) = i\}$ as the set of all ....
[Article contains additional citation context not shown here]
W. Burkhard and R. Keller. Some approaches to best-match file searching. Comm. of the ACM, 16(4):230--236, 1973.
....using Euclidean distance as the similarity measurement. Therefore, by considering the methods using the Euclidean distance as the metric, Kamgar's ....

| Algorithm | Data | Metric | Result |
|---|---|---|---|
| Burkhard and Keller (1973) [12]: Some approaches to best match file searching | 1000 randomly generated registers of a file using 30-bit keys | Hamming distance | 700 average distance computations (70%) |
| Fukunaga and Narendra (1975) [23]: A branch and bound algorithm for computing K nearest neighbors | based on a .... | | |
W. A. Burkhard and R. M. Keller. "Some approaches to best-match file searching". Communications of the ACM, 16(4):230--236, 1973.
....(e.g. color histograms often lead to 64, or even as high as 256 dimensional vectors). Two strategies have been proposed for handling proximity queries in metric spaces. The first is to work directly within the metric space, often by building hierarchical distance-based index structures [9, 11, 19, 66]. Nearest neighbor and k-nearest neighbor algorithms have been proposed for many of these structures. The second strategy maps the database objects into a low- to medium-dimensional vector space and then makes use of efficient spatial indexing methods available there, such as the R-tree [29]. The ....
....vector space in order to take advantage of spatial indexing structures. An alternative is to construct index structures that are based solely on distances between objects. A number of such methods have been proposed over the past few decades, some of the earliest being due to Burkhard and Keller [11]. These methods generally assume that $(S, d)$ forms a finite metric space (see Section 4.1). Typical of distance-based indexing structures are metric trees [65, 66] which are binary trees that result in recursively partitioning a data set into two subsets at each node. Uhlmann [66] identified two ....
[Article contains additional citation context not shown here]
W. A. Burkhard and R. Keller. Some approaches to best-match file searching. Communications of the ACM, 16(4):230--236, April 1973.
....Functions. We start by describing tree data structures that apply to distance functions that return a small set of different values. At the end we show how to cope with the general case with these trees. 5.1.1.1 BKT. Probably the first general solution to search in metric spaces was presented in [Burkhard and Keller 1973]. They propose a tree (thereafter called Burkhard-Keller Tree, or BKT) which is suitable for discrete-valued distance functions. It is defined as follows: an arbitrary element $p \in U$ is selected as the root of the tree. For each distance $i > 0$, we define $U_i = \{u \in U, d(u, p) = i\}$ as the set of all ....
....Clustering approaches. Clustering is a very wide area with lots of applications [Jain and Dubes 1988]. The general goal is to divide a set in subsets of elements close to each other in the same subset. A few approaches to index metric spaces based on clustering exist. A technique proposed in [Burkhard and Keller 1973] is to recursively divide the set $U$ in compact subsets $U_i$ and choose a representative $c_i$ for each. They compute covering radii $r_i$. To search for the closest neighbor, the query $q$ is compared against all the $c_i$ and the sets are considered from smallest to largest distance. The $r_i$ are used to ....
[Article contains additional citation context not shown here]
Burkhard, W. and Keller, R. 1973. Some approaches to best-match file searching. Comm. of the ACM 16, 4, 230--236.
....aware of in producing consistently good results in a wide variety of cases and in being based on a formal theory. 2. Basic proximity search algorithm using pivots There are many proximity search algorithms in metric spaces that are based on the use of pivots, such as Burkhard-Keller Tree (BKT) [5], Fixed Queries Tree (FQT) [2], Fixed Height FQT (FHQT) [2], Fixed Queries Array (FQA) [7], Vantage Point Tree (VPT) [12], Multi Vantage Point Tree (MVPT) [3], Excluded Middle Vantage Point Forest (VPF) [13], Approximating Eliminating Search Algorithm (AESA) [11], Linear AESA (LAESA) [10], and ....
W. Burkhard and R. Keller. Some approaches to best-match file searching. Comm. of the ACM, 16(4):230--236, 1973.
....and an excellent survey is given in [13]. However, most of these methods only work for vector data. Regarding image datasets, some works have used selected objects as reference points to prune distance calculations [4] and to organize index structures [17]. The seminal work of Burkhard and Keller [9] provides different interesting techniques for partitioning a metric data set where the recursive process is materialized as a tree. The first technique partitions a dataset by choosing a representative from the set and grouping the elements with respect to their distance from it. The second ....
....The representative and the maximum distance from the representative to a point of the corresponding subset are also maintained to support nearest neighbor queries. The metric tree of Uhlmann [20] and the vantage point tree (vp-tree) of Yianilos [23] are somewhat similar to the first technique of [9] as they partition the elements into two groups according to a representative, called a vantage point. In [23] the vp-tree has been generalized to a multi-way tree. In order to reduce the number of distance calculations, Baeza-Yates et al. [2] suggested using the same vantage point in all nodes ....
[Article contains additional citation context not shown here]
W. A. Burkhard and R. M. Keller, "Some Approaches to Best-Match File Searching," Communications of the ACM, Vol. 16, No. 4, 1973, pp. 230-236.
....building these structures. Almost all of the structures assume that we operate in a vector space (a subset of the metric space with the additional requirement that each vector has a fixed dimension). Structures that operate within a metric space directly (Fukunaga and Narendra, 1975; Brin, 1995; Burkhard and Keller, 1973) will not be as efficient as ones operating with the additional restriction of a vector space. All the structures using bounding boxes assume that we do not have any crosstalk between dimensions in the distance function. If there is some crosstalk between dimensions (e.g. in QBIC's distance ....
....occurring content-based queries in multimedia databases. Similarity search algorithms are actually nearest neighbour search algorithms with an option to ignore nodes that are not close enough to the query vector. Hence, we can utilize the algorithms developed for nearest neighbour searches (Burkhard and Keller (1973); Fukunaga and Narendra (1975); Brin (1995); Kamgar-Parsi and Kanal (1985); Hjaltason and Samet (1995); Roussopoulos et al. (1995)) to do the search. Region queries can be answered by searching all the nodes whose envelopes intersect the given region. If we only care about the final result, then ....
[Article contains additional citation context not shown here]
BURKHARD, W. and KELLER, R. (1973). Some approaches to best-match file searching.
....be a distance function: $d(x, y) \ge 0$, $d(x, y) = d(y, x)$, $d(x, y) \le d(x, w) + d(w, y)$. Methods: • Branch and bound, searching a cluster hierarchy [FN75]. Can be applied with R-trees. [Figure: clusters C1 and C2 with query point Q] • Pre-compute distances from some points: single point ("star") [BK73], multiple points [Sha77], arbitrary topologies [SW90]. Typically 20-80% of the file is searched. [Figure: anchor point and query point Q] 4.5 Conclusions Among the SAMs, Z-ordering (linear quadtrees) and R-trees seem the most promising methods. 5 ACCESS METHODS FOR TEXT Applications: • ....
W.A. Burkhard and R.M. Keller. Some approaches to best-match file searching. Comm. of the ACM (CACM), 16(4):230--236, April 1973.
....so as to predict its probable future behavior) etc. Since the problem has appeared in unrelated areas, the corresponding algorithms and data structures seem to emerge from a great diversity, and different approaches have been proposed and analyzed separately, often under different assumptions [5, 20, 22, 19, 21, 23, 13,15, 1, 4, 14, 18, 3, 11, 17, 7, 8, 24]. Due to space limitations we refer the reader to a recent survey where all the known approaches for similarity searching are discussed [9] Currently, the only realistic way to compare two different algorithms is to apply them to the same data set. We present a unified complexity model for the ....
....Fig. 3. With two rings we define an equivalence based on being at the same distance to both points. However, the resulting class is partitioned. 6 Pivot-Based and Clustering Algorithms A large class of methods to index metric spaces are just variants of what we call pivot-based algorithms [5, 20, 22, 21, 23, 13, 15, 1, 14, 18, 3, 11, 7, 8, 24]. The idea is an extension of Example 3, using more pivots in order to decrease the external complexity. Instead of just one pivot, one selects $h$ pivots $p_1, \ldots, p_h \in U$, and stores all the distances $d(u, p_i)$ for all $u \in U$. This set of distances is the index. Now, given a query ....
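The pivot index described in this snippet can be sketched as below. The snippet is cut off before the query step, so the filtering rule shown here (discard u whenever |d(q, p_i) - d(u, p_i)| > r for some pivot) is the standard triangle-inequality filter and is an assumption about how the text continues; the function names and the returned evaluation count are hypothetical.

```python
def build_pivot_index(items, pivots, dist):
    """Store d(u, p_i) for every item u and every pivot p_i (the index)."""
    return [[dist(u, p) for p in pivots] for u in items]

def range_search(query, r, items, pivots, index, dist):
    """Return (matches within radius r, number of distance evaluations at query time)."""
    # Distances from the query to the h pivots.
    q_dists = [dist(query, p) for p in pivots]
    evaluations = len(pivots)

    results = []
    for u, u_dists in zip(items, index):
        # By the triangle inequality, d(q, u) >= |d(q, p_i) - d(u, p_i)| for every pivot,
        # so u can be discarded without computing d(q, u) if any bound exceeds r.
        if any(abs(dq - du) > r for dq, du in zip(q_dists, u_dists)):
            continue
        # Surviving candidates are checked directly.
        evaluations += 1
        if dist(query, u) <= r:
            results.append(u)
    return results, evaluations
```

Adding pivots tends to discard more candidates but costs one extra distance evaluation per pivot at query time and one stored distance per item per pivot.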
[Article contains additional citation context not shown here]
W. Burkhard and R. Keller. Some approaches to best-match file searching. Comm. of the ACM, 16(4):230--236, April 1973.
....problem. 4.1 Discrete Distance Functions We start by describing data structures that apply to distance functions that return a small set of different values. At the end we show how to cope with the general case. BKT Probably the first general solution to search in metric spaces was presented in [19]. They propose a tree (thereafter called Burkhard-Keller Tree, or BKT) which is suitable for discrete-valued distance functions. It is defined as follows: an arbitrary element $p \in U$ is selected as the root of the tree. For each distance $i > 0$, we define $U_i = \{u \in U, d(u, p) = i\}$ as the set of all ....
....at reducing the extra CPU time. Clustering Clustering is a very wide area with lots of applications [39]. The general goal is to divide a set in subsets of elements close to each other in the same subset. A few approaches to index metric spaces based on clustering exist. A technique proposed in [19] is to recursively divide the set $U$ in compact subsets $U_i$ and choose a representative $p_i$ for each. They compute numbers $r_i = \max\{d(p_i, u) : u \in U_i\}$ (which upper bound the radii of the subsets). To search for the closest neighbor, the query $q$ is compared against all the $p_i$ and the sets are ....
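A one-level version of this clustering idea might look like the sketch below. Several things here are assumptions rather than details from the snippet: the division is not recursive, each element is simply assigned to its closest representative, the representatives are supplied by the caller, and the pruning rule (skip a subset when d(q, p_i) - r_i cannot beat the best distance found so far) is one plausible use of the radii, since the snippet is truncated before it explains how they are used.

```python
import math

def build_clusters(items, representatives, dist):
    """Assign each item to its closest representative and record covering radii r_i."""
    clusters = [{"rep": p, "members": [], "radius": 0.0} for p in representatives]
    for u in items:
        closest = min(clusters, key=lambda c: dist(u, c["rep"]))
        closest["members"].append(u)
        # r_i = max distance from the representative to any member of its subset.
        closest["radius"] = max(closest["radius"], dist(u, closest["rep"]))
    return clusters

def nearest(query, clusters, dist):
    """Nearest-neighbour search visiting subsets from closest to farthest representative."""
    ranked = sorted((dist(query, c["rep"]), i) for i, c in enumerate(clusters))
    best, best_d = None, math.inf
    for d_rep, i in ranked:
        cluster = clusters[i]
        # No member can be closer than d(q, p_i) - r_i, so skip hopeless subsets.
        if d_rep - cluster["radius"] >= best_d:
            continue
        for u in cluster["members"]:
            d = dist(query, u)
            if d < best_d:
                best, best_d = u, d
    return best, best_d
```

In this sketch the representatives are assumed to be elements of the data set, so each one lands in its own subset at distance zero.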
[Article contains additional citation context not shown here]
W. Burkhard and R. Keller. Some approaches to best-match file searching. Comm. of the ACM, 16(4):230--236, April 1973.
....our data structure against previous work, showing that it outperforms all the other schemes for high dimensions or queries with large radii. 2. Previous Work Different tree structures have been proposed to filter out elements based on the triangular inequality. Burkhard Keller Trees (bk trees) [6] are designed for discrete distance functions: they select a pivot element p as the root of the tree, and put at child i the elements which are at distance i to the pivot. Each subtree is recursively built with the same technique until there are b elements or less, in which case the elements are ....
W. Burkhard and R. Keller. Some approaches to best-match file searching. CACM, 16(4):230--236, 1973.
....schemes. 2 Related Work Different data structures have been proposed to filter out elements based on the triangular inequality (see [9] for a complete survey). We divide the exposition according to the two classes of techniques. 2.1 Pivot-based Algorithms Burkhard-Keller Trees (bk-trees) [5] are designed for discrete distance functions: they select a pivot element p as the root of the tree, and put at child i the elements which are at distance i to the pivot. Each subtree is recursively built with the same technique until there are b elements or less, in which case the elements are ....
W. Burkhard and R. Keller. Some approaches to best-match file searching. CACM, 16(4):230--236, 1973.
....select some elements from U (called pivots) and identify all the other elements with their distances to (some of) the pivots. The methods differ in how they select the pivots, how much information they store about the distances among elements and pivots, etc. Burkhard-Keller Trees (BKTs) [11] are designed for discrete distance functions: they select a pivot element p as the root of the tree, and put at child i the elements which are at distance i to the pivot. Each subtree is recursively built with the same technique until there are b elements or less, in which case the elements are ....
W. Burkhard and R. Keller. Some approaches to best-match file searching. Comm. of the ACM, 16(4):230--236, 1973.
No context found.
W. A. Burkhard and R. M. Keller. Some approaches to best-match file searching. Commun. ACM, 16(4):230--236, 1973.
No context found.
W. A. Burkhard and R. M. Keller. Some approaches to best-match file searching. Communications of the ACM, 16(4):230--236, 1973.
No context found.
W. A. Burkhard and R. M. Keller, "Some approaches to best-match file searching", Comm. ACM Vol. 16 No. 4, pp. 230-236, 1973.
No context found.
W. A. Burkhard and R. M. Keller, "Some approaches to best-match file searching", in Communications of the ACM, volume 16, pages 230--236, 1973.
No context found.
W. Burkhard and R. Keller. Some approaches to best-match file searching. Communications of the ACM, 16(4):230--236, 1973.