25 citations found.
Arya, S., Mount, D. M., and Narayan, O., Accounting for Boundary Effects in Nearest Neighbor Searching, 11th Annual Symposium on Computational Geometry, 1995, pp. 336-344.


This paper is cited in the following contexts:
Adaptive and Incremental Processing for Distance Join Queries - Shin, Moon, Lee   (Correct)

....k must be known a priori. In other words, the heap algorithm cannot be used for incremental distance join queries. Several closely related studies for nearest neighbor queries have been reported in the literature. Among those are nearest neighbor search algorithms based on Voronoi cells [2, 5] and branch and bound techniques [26, 27], a nearest neighbor search algorithm for ranking requirement [15], and multi-step k nearest neighbor search algorithms [17, 28]. Another closely related issue is estimating spatial join selectivity. Some estimation techniques proposed to use supplementary ....

Sunil Arya, David M. Mount, and Onuttom Narayan. Accounting for boundary effects in nearest neighbor searching. In Proc. 11th Annual Symp. on Computational Geometry, pages 336--344, Vancouver, Canada, 1995.


Adaptive and Incremental Processing for Distance Join Queries - Shin, Moon, Lee (2002)   (Correct)

....k must be known a priori. In other words, the heap algorithm cannot be used for incremental distance join queries. Several closely related studies for nearest neighbor queries have been reported in the literature. Among those are nearest neighbor search algorithms based on Voronoi cells [2, 5] and branch and bound techniques [26, 27], a nearest neighbor search algorithm for ranking requirement [15], and multi-step k nearest neighbor search algorithms [17, 28]. Another closely related issue is estimating spatial join selectivity. Some estimation techniques proposed to use supplementary ....

.... 3: AM KDJ: Adaptive Multi Stage K Distance Join Algorithm (Compensation Stage) 1: insert all elements in QC into 2: while |AnswerSet| k and QM ≠ do 5: else CompensatePlaneSweep(c) procedure CompensatePlaneSweep(⟨l, r⟩) 6: L { entries of l sorted in Stage One }; { L[1] L[2] L[|L|] } 7: R { entries of r sorted in Stage One }; { R[1] R[2] R[|R|] } 9: n a node with the min axis value ∈ L ∪ R; n becomes an anchor. 11: L L {n}; R { node list in R not paired with n in the Stage One }; { R[n.compensate] R[n.compensate 1] ....

[Article contains additional citation context not shown here]

Sunil Arya, David M. Mount, and Onuttom Narayan. Accounting for boundary effects in nearest neighbor searching. In Proc. 11th Annual Symp. on Computational Geometry, pages 336--344, Vancouver, Canada, 1995.


Distributed Processing of Similarity Queries - Papadopoulos, Manolopoulos (2001)   (Correct)

....structure (the X tree) is presented that is specifically designed for high dimensional data. Experimental results show that the structure outperforms R* trees [4] and TV trees [20] by factors. In [21] we present techniques to answer nearest neighbor queries in declustered R trees. Arya et al. [3] present the impact of taking into consideration boundary effects in the analysis of nearest neighbor queries. In [22] we provide expected upper and lower bounds in the performance of nearest neighbor queries in R trees, by taking into consideration the fractal dimension of the dataset. Also, in ....

S. Arya, D.M. Mount, and O. Narayan, "Accounting for boundary effects in nearest neighbor searching," in Proceedings of the 11-th Annual Symposium on Computational Geometry, Vancouver, British Columbia, Canada, 1995, pp. 336--344.


Modeling High-Dimensional Index Structures using Sampling - Lang, Singh (2001)   (3 citations)  (Correct)

....The first known analysis of R trees was given by Faloutsos et al. [13] but was restricted to one dimensional data. In a later paper, Kamel and Faloutsos [19] present a cost model using a concept similar to Minkowski sums to predict the number of disk accesses for two dimensional data. Arya et al. [2] give a detailed analysis of NN queries for bucketing and k-d tree index structures including boundary effects. However, they assume that the number of data points grows exponentially with the dimensionality which may not hold for real datasets. Berchtold et al. [4] present a cost model for 1 NN ....

S. Arya, D. M. Mount, and O. Narayan. Accounting for boundary effects in nearest neighbor searching. In Symposium on Computational Geometry, pages 336--344, 1995.


Advances in Computational Geometry for Document Analysis - Toussaint   (Correct)

....of knowing the distribution of query points in advance have been found by Clarkson [12]. One approach to practical applications is of course to sacrifice finding the exact nearest neighbor. If we are satisfied with finding approximate nearest neighbors then more efficient algorithms are available [3], [34], [37]. For application results of some new efficient nearest neighbor searching algorithms in handwritten character recognition see [43]. For additional references to the latest results concerning nearest neighbor search in arbitrary dimensions and for pointers to many other key recent ....

S. Arya, D. M. Mount, and O. Narayan. Accounting for boundary effects in nearest-neighbor searching. Discrete and Computational Geometry, 16:155--176, 1996.


When Is "Nearest Neighbor" Meaningful? - Beyer, Goldstein, Ramakrishnan.. (1999)   (3 citations)  (Correct)

....than 10 of the data pages. Fetching a large number of data pages through a multi dimensional index usually results in unordered retrieval. In the area of the nearest neighbors problem it is used for indicating that a query processing technique performs worse as the dimensionality increases. In [11, 5] it was observed that in some high dimensional cases, the estimate of NN query cost (using some index structure) can be very poor if boundary effects are not taken into account. The boundary effect is that the query region (i.e. a sphere whose center is the query point) is mainly outside the ....

Arya, S., Mount, D.M., Narayan, O.: Accounting for Boundary Effects in Nearest Neighbors Searching. In Proc. 11th ACM Symposium on Computational Geometry (1995) 336--344


Geometric Range Searching and Its Relatives - Agarwal, Erickson (1997)   (98 citations)  (Correct)

.... provided the data structure satisfies certain mild assumptions [7]. Note that the query time of the above approach is exponential in d, so it is impractical even for moderate values of d (say d 10). This has led to the development of algorithms for finding approximate nearest neighbors [88, 26, 29, 28, 169, 166] or for special cases, such as when the distribution of query points is known in advance [84, 262]. See [132, 169, 167, 119, 148, 222, 236] for a few heuristics for answering nearest neighbor queries. 7.3 Linear programming queries Let S be a set of n halfspaces in $\mathbb{R}^d$. We wish to preprocess ....

S. Arya, D. M. Mount, and O. Narayan, Accounting for boundary effects in nearest-neighbor searching, Discrete Comput. Geom., 16 (1996), 155--176.


Performance Prediction of High-Dimensional Index Structures.. - Lang, Singh (2000)   (Correct)

....first known analysis of R trees was given by Faloutsos et al. [FSR87] but was restricted to one dimensional data. In a later paper, Kamel and Faloutsos [KF93] present a cost model using a concept similar to Minkowski sums to predict the number of disk accesses for two dimensional data. Arya et al. [AMN95] give a detailed analysis of NN queries for bucketing and k-d tree index structures including boundary effects. However, they assume that the number of data points grows exponentially with the dimensionality which may not hold for real datasets. Berchtold et al. [BBKK97] present a cost model for ....

Sunil Arya, David M. Mount, and Onuttom Narayan. Accounting for boundary effects in nearest neighbor searching. In Symposium on Computational Geometry, pages 336--344, 1995.


A Cost Model for Query Processing in High-Dimensional Data Spaces - Böhm (2000)   (Correct)

....Faloutsos and Gaede 1996] These approaches are of minor importance for point databases. The second direction, where the basic model of Friedman, Bentley and Finkel needs extension, are the boundary effects occurring when indexing data spaces of higher dimensionality. Arya, Mount and Narayan [Arya et al. 1995, Arya 1995] presented a new cost model for processing nearest neighbor queries in the context of the application domain of vector quantization. Arya, Mount and Narayan restricted their model to the maximum metric and neglected correlation effects. Unfortunately, they still assume that the number ....

....Gaede 1996] These approaches are of minor importance for point databases. The second direction, where the basic model of Friedman, Bentley and Finkel needs extension, are the boundary effects occurring when indexing data spaces of higher dimensionality. Arya, Mount and Narayan [Arya et al. 1995, Arya 1995] presented a new cost model for processing nearest neighbor queries in the context of the application domain of vector quantization. Arya, Mount and Narayan restricted their model to the maximum metric and neglected correlation effects. Unfortunately, they still assume that the number of points is ....

ARYA S., MOUNT D.M., NARAYAN O. Accounting for boundary effects in nearest neighbor searching. Proc. 11th Symp. on Computational Geometry, Vancouver, Canada, 1995, 336-344.


Randomized Algorithms for Geometric Optimization Problems - Agarwal, Sen (2000)   (5 citations)  (Correct)

....R trees, and Hilbert R trees; see e.g. [87, 104, 129, 127, 78, 99, 170, 184]. Even these algorithms suffer from the curse of dimensionality. This has led to the development of algorithms for finding approximate nearest neighbors [25, 26, 27, 49, 122, 129] or for special cases, such as when the distribution of query points is known in advance [52, 188]. For a given parameter $\varepsilon > 0$ and a query point $q$, an approximate nearest neighbor query ($\varepsilon$-NN query) asks for returning a point $p \in S$ so that $d(p, q) \le (1+\varepsilon)\, d(p', q)$ for all $p' \in S$. This ....

S. Arya, D. M. Mount, and O. Narayan, Accounting for boundary effects in nearest-neighbor searching, Discrete Comput. Geom., 16 (1996), 155--176.


Optimizing Search Strategies in k-d Trees - Sample, Haines, Arnold, Purcell (2001)   (2 citations)  (Correct)

....classification problems, and clustering problems. Various methods have been proposed to solve search problems, including hashing and indexing, various types of trees, and many hybrid and novel approaches. Proposed tree solutions alone include k d, B , R , BBD, VAMSplit k d, and other variants [2, 3, 5, 6, 7, 8, 9]. Tree-based search strategies are popular for many reasons, including, for n cases, O(log n) search and insertion time, O(n log n) construction time, and reasonable space requirements. Tree structures also allow for dynamic insertion of additional elements and simple formulation of range queries. ....

....the number of points in a space increases rapidly with dimensionality. The increase in the population density of high dimensional spaces diminishes the effects of dimensionality and brings nodes searched closer to the ideal of O(log N) with a large constant that is exponential in dimension, d [7]. Regardless of the ability to reduce the dimensionality of a data set, it is crucial to be able to effectively search the highest possible dimension before resorting to reduction techniques. The second comparison we make is between the optimized process and the process without the tracking nodes. ....

[Article contains additional citation context not shown here]

Arya, S., Mount, D. M., and Narayan, O. 1995. Accounting for Boundary Effects in Nearest Neighbor Searching. In 11th Annual Symposium on Computational Geometry, pages 336-344.


Approximate Nearest Neighbors: Towards Removing the Curse of.. - Indyk, Motwani (1998)   (138 citations)  (Correct)

.... of k d trees, R trees, and structures based on space filling curves; more recent results are surveyed in [60]. While some perform well in 2-3 dimensions, in high dimensional spaces they all exhibit poor behavior in the worst case and in typical cases as well (e.g. see Arya, Mount, and Narayan [4]). Dobkin and Lipton [23] were the first to provide an algorithm for nearest neighbors in $\mathbb{R}^d$, with query time $O(2^d \log n)$ and preprocessing cost $O(n^{2^{d+1}})$. Clarkson [16] reduced the preprocessing to $O(n^{\lceil d/2 \rceil (1+\delta)})$ while increasing the query time to $O(2^{O(d \log d)} \log n)$ ....

S. Arya, D.M. Mount, and O. Narayan, Accounting for boundary effects in nearest-neighbor searching. Discrete and Computational Geometry, 16(1996):155--176.


When Is "Nearest Neighbor" Meaningful? - Beyer, Goldstein, Ramakrishnan.. (1999)   (3 citations)  (Correct)

....functions. An example from statistics: in [26] it is used to note that multivariate density estimation is very problematic in high dimensions. In the area of the nearest neighbors problem it is used for indicating that a query processing technique performs worse as the dimensionality increases. In [11, 5] it was observed that in some high dimensional cases, the estimate of NN query cost (using some index structure) can be very poor if boundary effects are not taken into account. The boundary effect is that the query region (i.e. a sphere whose center is the query point) is mainly outside the ....

....and not how to process such a query. Therefore, the term dimensionality curse (as used by the NN research community) is only relevant to Section 6, and not to the main results in this paper. 7.2 Computational Geometry The nearest neighbor problem has been studied in computational geometry (e.g. [4, 5, 6, 9, 12]). However, the usual approach is to take the number of dimensions as a constant and find algorithms that behave well when the number of points is large enough. They observe that the problem is hard and define the approximate nearest neighbor problem as a weaker problem. In [6] there is an ....

S. Arya, D. M. Mount, and O. Narayan. Accounting for boundary effects in nearest neighbors searching. In Proc. 11th ACM Symposium on Computational Geometry, pages 336--344, 1995.


Geometric Range Searching and Its Relatives - Agarwal, Erickson (1999)   (98 citations)  (Correct)

.... provided the data structure satisfies certain mild assumptions [7]. Note that the query time of the above approach is exponential in d, so it is impractical even for moderate values of d (say d 10). This has led to the development of algorithms for finding approximate nearest neighbors [26, 28, 29, 91, 185, 188] or for special cases, such as when the distribution of query points is known in advance [87, 296]. Because of wide applications of nearest neighbor searching, many heuristics have been developed, especially in higher dimensions. These algorithms ....

S. Arya, D. M. Mount, and O. Narayan, Accounting for boundary effects in nearest-neighbor searching, Discrete Comput. Geom., 16 (1996), 155--176.


Geometric Range Searching and Its Relatives - Agarwal, Erickson (1999)   (98 citations)  (Correct)

....a parallel algorithm for nearest neighbor searching. For large input sets, one desires an algorithm that minimizes the number of disk accesses. Many of the heuristics mentioned above try to optimize the I/O efficiency, though none of them gives any performance guarantee. A few recent papers [24, 46, 236, 93] analyze the efficiency of some of the heuristics, under certain assumptions on the input. 7.3 Linear programming queries Let S be a set of n halfspaces in $\mathbb{R}^d$. We wish to preprocess S into a data structure so that for a direction vector v, we can determine the first point of $\bigcap_{h \in S} h$ in ....

S. Arya, D. Mount, and O. Narayan, Accounting for boundary effects in nearest neighbor searching, Proc. 11th Annu. ACM Sympos. Comput. Geom., 1995, pp. 336--344.


Similarity Indexing: Algorithms and Performance - White, Jain (1996)   (54 citations)  (Correct)

....case (T is large) ε may be thought of as the maximum error allowed relative to the exact result. For example, if ε = 0.5, the distance to the kth approximate nearest neighbor might be as much as 50% greater than the distance to the true kth nearest neighbor. However, we and other researchers [3] have found that in practice, the average error is much less than the maximum allowed error, and for small values of ε, the probability that a non-exact result is actually returned is often very small or negligible. The performance results in this paper use Euclidean distance as a similarity ....

....k d tree that in some applications can allow constant time searching (with respect to the dataset size) of a k d tree in lower dimension. Sproull [34] provided refinements to the k d tree and observed that in practice the k d tree performance degrades rapidly with dimension. Arya and Mount [3] analyzed the k d tree (and the bucketing algorithm) taking boundary effects into account, showing that dependence on dimension is much better than Cleary's bound when the number of data points is not large with respect to dimension ($N \not\gg 2^d$). In his thesis, Arya [1] provides further ....

[Article contains additional citation context not shown here]

S. Arya, D. M. Mount, and O. Narayan. Accounting for Boundary Effects in Nearest Neighbor Searching. In 11th Annual Symposium on Computational Geometry, pages 336--344, Vancouver, British Columbia, Canada, June 1995. ACM Press.


Approximate Nearest Neighbor Queries Revisited - Chan (1998)   (25 citations)  (Correct)

....efficient in low dimensions, is impractical in high dimensions, because constant factors grow exponentially when d varies. This exponential dependence on d is also inherent in Arya et al.'s method and in traditional methods based on grids (bucketing), quadtrees, and k d trees; see Arya et al. [7] for an analysis of a grid method. In some applications such as vector quantization, the dimension d may actually be a function of n. To circumvent the exponential growth problem, one can demand less and settle for a rough approximation to the post office problem. In applications where any ....

S. Arya, D. M. Mount, and O. Narayan. Accounting for boundary effects in nearest neighbor searching. Discrete Comput. Geom., 16:155--176, 1996.


When Is "Nearest Neighbor" Meaningful? - Beyer, Goldstein, Ramakrishnan.. (1999)   (3 citations)  (Correct)

....neighbor is likely to be well separated from most of the other data points, the answer size estimate can be used to choose between a linear scan or some indexing structure. 8 Related Work 8.1 Computational Geometry The nearest neighbor problem has been studied in computational geometry (e.g. [4, 5, 6, 8, 11]). However, the usual approach is to take the number of dimensions as a constant and find algorithms that behave well when the number of points is large enough. They observe that the problem is hard and define the approximate nearest neighbor problem as a weaker problem. In [6] there is an ....

....are exponential in dimensionality. In [6] they recommend not to use the algorithm in more than 12 dimensions. It is impractical to use the algorithm in [8] when the number of points is much lower than exponential in the number of dimensions. 8.2 Boundary Effects and Index Structure Utility In [10, 5] it was observed that in some high dimensional cases, the estimate of NN query cost (using some index structure) can be very poor if boundary effects are not taken into account. The boundary effect is that the query region (i.e. a sphere whose center is the query point) is mainly outside the ....

S. Arya, D. M. Mount, and O. Narayan. Accounting for boundary effects in nearest neighbors searching. In Proc. 11th ACM Symposium on Computational Geometry, pages 336--344, 1995.


Optimal Multi-Step k-Nearest Neighbor Search - Seidl, Kriegel (1998)   (62 citations)  (Correct)

....In order to efficiently process k nearest neighbor queries by directly using multidimensional index structures, several approaches are available from the literature. The proposals include cell based approaches for nearest neighbor search which are conceptually based on Voronoi cells [PS 93], [AMN 95], [Ber 98], branch and bound algorithms for k nearest neighbor search [FBF 77], [RP 92], [RKV 95], and incremental algorithms for similarity ranking [Hen 94], [HS 95]. Recently, a fast parallel method has been suggested [Ber 97]. Also theoretical results have been published concerning the ....

Arya S., Mount D. M., Narayan O.: `Accounting for Boundary Effects in Nearest Neighbor Searching', Proc. 11th Annual Symposium on Computational Geometry, Vancouver, Canada, 1995, pp. 336-344.


A Local Search Approximation Algorithm for k-Means.. - Kanungo, Mount.. (2003)   (2 citations)  Self-citation (Mount)   (Correct)

....time of the filtering algorithm grows superlinearly with dimension. The curse of dimensionality would suggest that the growth rate should be exponential in dimension, but these experiments indicate a more modest growth. This is likely due to boundary effects. This phenomenon was described in [4] in the context of nearest neighbor searching. The hybrid heuristic and iterated Lloyd's performed comparably with respect to average distortion, while the swap heuristics performed considerably worse. This suggests that the importance of moving to a local minimum grows in significance as ....

S. Arya, D. M. Mount, and O. Narayan. Accounting for boundary effects in nearest-neighbor searching. Discrete Comput. Geom., 16:155--176, 1996.


An Optimal Algorithm for Approximate Nearest.. - Arya, Mount.. (1994)   (227 citations)  Self-citation (Arya Mount)   (Correct)

....methods suffer as dimension increases. The constant factors hidden in the asymptotic running time grow at least as fast as $2^d$ (depending on the metric). Sproull [Spr91] observed that the empirically measured running time of kd trees does increase quite rapidly with dimension. Arya, et al. [AMN95] showed that if n is not significantly larger than $2^d$, as arises in some applications, then boundary effects mildly decrease this exponential dimensional dependence. From the perspective of worst case performance, an ideal solution would be to preprocess the points in O(n log n) time, into a ....

....of the number of cells and the logarithm of (2 ε) 1 ε). This relationship is evidenced in Figure 14(b). Note that both axes are on a logarithmic scale. Boundary effects probably play a role since the empirically observed values are somewhat smaller than predicted by the formula [AMN95]. 6.5 Summary of Experiments A number of conclusions can be drawn from these experiments. First, in moderate dimensions, significant savings in running time can be achieved by computing approximate nearest neighbors. For the ε = 3 cases, improvements in running time on the order of factors of ....

S. Arya, D. M. Mount, and O. Narayan. Accounting for boundary effects in nearest neighbor searching. In Proc. 11th Annu. ACM Sympos. Comput. Geom., pages 336-- 344, 1995.


Optimizing Search Strategies in k-d Trees - Sample, Haines, Arnold, Purcell (2001)   (2 citations)  (Correct)

No context found.

Arya, S., Mount, D. M., and Narayan, O., Accounting for Boundary Effects in Nearest Neighbor Searching, 11th Annual Symposium on Computational Geometry, 1995, pp. 336-344.


Limitations of Non-Uniform Computational Models - Chakrabarti (2002)   (Correct)

No context found.

S. Arya, D. M. Mount, and O. Narayan. Accounting for boundary effects in nearest-neighbor searching. Disc. Comput. Geom., 16(2):155--176, 1996.


Dynamically Optimizing High-Dimensional Index Structures - Böhm, Kriegel (2000)   (Correct)

No context found.

Arya S., Mount D.M., Narayan O.: `Accounting for Boundary Effects in Nearest Neighbor Searching', Proc. 11th Symp. on Computational Geometry, Vancouver, Canada, pp. 336-344, 1995.


Similarity Search in High Dimensions via Hashing - Gionis, Indyk, Motwani (1997)   (68 citations)  (Correct)

No context found.

S. Arya, D.M. Mount, and O. Narayan, Accounting for boundary effects in nearest-neighbor searching. Discrete and Computational Geometry, 16 (1996), pp. 155--176.
