## A Framework for Accelerating High-dimensional NN-queries (2001)

Citations: 2 (0 self)

### Citations

1262 | The R*-tree: an efficient and robust access method for points and rectangles
- Beckmann
- 1990
Citation Context ...the fractal dimensionality estimation have only minor impact on the overall cost. We illustrate the wide applicability of our framework by using a number of different index structures such as R*-tree [2], X-tree [7], SS/SR-tree [20, 12], and VA-file [19]. One key advantage is that index structures without NN query support, such as the Pyramid tree [5], can be extended by our framework to support this...

590 | The X-tree: An index structure for high-dimensional data
- Berchtold
- 1996
Citation Context ...dimensionality estimation have only minor impact on the overall cost. We illustrate the wide applicability of our framework by using a number of different index structures such as R*-tree [2], X-tree [7], SS/SR-tree [20, 12], and VA-file [19]. One key advantage is that index structures without NN query support, such as the Pyramid tree [5], can be extended by our framework to support this query type....

438 | The SR-tree: An index structure for high-dimensional nearest neighbor queries
- Katayama, Satoh
- 1997
Citation Context ...stimation have only minor impact on the overall cost. We illustrate the wide applicability of our framework by using a number of different index structures such as R*-tree [2], X-tree [7], SS/SR-tree [20, 12], and VA-file [19]. One key advantage is that index structures without NN query support, such as the Pyramid tree [5], can be extended by our framework to support this query type. As a by-product, we ...

345 | Similarity indexing with the SS-tree
- White, Jain
- 1996
Citation Context ...stimation have only minor impact on the overall cost. We illustrate the wide applicability of our framework by using a number of different index structures such as R*-tree [2], X-tree [7], SS/SR-tree [20, 12], and VA-file [19]. One key advantage is that index structures without NN query support, such as the Pyramid tree [5], can be extended by our framework to support this query type. As a by-product, we ...

210 | Ranking in spatial databases.
- Hjaltason, Samet
- 1995
Citation Context ...he number of disk pages to read, or by reducing the number of disk seek operations during page reads. An algorithm that minimizes the number of index pages to read was proposed by Hjaltason and Samet [10]. Their algorithm was proven to be optimal in the sense that only pages intersecting the NN-query sphere are accessed. However, since the pages are read according to their distance to the query point ...

208 | The pyramid-technique: towards breaking the curse of dimensionality
- Berchtold
- 1998
Citation Context ...ber of different index structures such as R*-tree [2], X-tree [7], SS/SR-tree [20, 12], and VA-file [19]. One key advantage is that index structures without NN query support, such as the Pyramid tree [5], can be extended by our framework to support this query type. As a by-product, we demonstrate how our framework can be modified to support approximate NN-queries. Our experimental results show that o...

168 | A model for the prediction of R-tree performance.
- Theodoridis, Sellis
- 1996
Citation Context ... the M-tree nodes by statistics (the distance distribution) in order to predict the CPU and I/O cost of range and NN queries. Most work is based on histograms over the dataset. Theodoridis and Sellis [18] give a model for predicting the performance of range queries on R*-trees. They generate what they call a density surface, which is basically a two-dimensional histogram representing the local densitie...

125 | Estimating the selectivity of spatial queries using the correlation fractal dimension
- Faloutsos
- 1995
Citation Context ...$r_{sample}$: We will discuss the quality of this estimate in Section 3.1. The fractal dimensionalities $D_2$ and $D'_2$ can be computed using a box-counting algorithm as suggested by Belussi and Faloutsos [3]. We found in our experiments, however, that this algorithm is not applicable to high-dimensional data since the point sets become too sparse. We therefore used a modified version of the box-counting ...
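As a rough illustration of the box-counting idea mentioned in this excerpt, the sketch below estimates the correlation fractal dimension $D_2$ as the slope of $\log \sum_i p_i^2$ against $\log r$, where $p_i$ is the fraction of points falling into grid cell $i$ of side length $r$. This is the plain textbook estimator, not the modified high-dimensional version the citing paper says it used; the function name `correlation_dimension` and the choice of cell sizes are assumptions made for this example.

```python
import math
from collections import Counter


def correlation_dimension(points, cell_sizes):
    """Estimate D_2 by box counting (illustrative sketch, not the paper's variant).

    points     -- sequence of equal-length coordinate tuples
    cell_sizes -- at least two distinct grid cell side lengths r
    """
    if len(set(cell_sizes)) < 2:
        raise ValueError("need at least two distinct cell sizes")
    n = len(points)
    log_r, log_s = [], []
    for r in cell_sizes:
        # Count how many points land in each occupied grid cell of side r.
        occupancy = Counter(tuple(math.floor(c / r) for c in p) for p in points)
        # S(r) = sum over occupied cells of (fraction of points in the cell)^2.
        s = sum((count / n) ** 2 for count in occupancy.values())
        log_r.append(math.log(r))
        log_s.append(math.log(s))
    # Least-squares slope of log S(r) versus log r.
    mean_x = sum(log_r) / len(log_r)
    mean_y = sum(log_s) / len(log_s)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_r, log_s))
    den = sum((x - mean_x) ** 2 for x in log_r)
    return num / den


# A regular 64 x 64 grid fills the unit square, so the estimate should be close to 2;
# points on the diagonal form a line, so the estimate should be close to 1.
square = [((i + 0.5) / 64, (j + 0.5) / 64) for i in range(64) for j in range(64)]
line = [((i + 0.5) / 64, (i + 0.5) / 64) for i in range(64)]
print(correlation_dimension(square, [0.5, 0.25, 0.125]))
print(correlation_dimension(line, [0.5, 0.25, 0.125]))
```

With sparse high-dimensional data most cells hold at most one point, which is consistent with the excerpt's remark that the unmodified algorithm breaks down there.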

91 | Independent quantization: An index compression technique for high-dimensional data spaces.
- Berchtold, Böhm, et al.
- 2000
Citation Context ...que presented in this paper aims at accelerating single queries, we will concentrate on the latter body of work. The work that comes closest to our technique was presented by Berchtold et al. [4]. They propose a new index structure, called IQ-tree, and give a probability-based method to optimize page reads during NN queries. The authors estimate the probability that a page will be read during...

81 | Selectivity estimation in spatial databases.
- Acharya, Poosala, et al.
- 1999
Citation Context ...performance of range queries on R*-trees. They generate what they call a density surface, which is basically a two-dimensional histogram representing the local densities of the dataset. Acharya et al. [1] show how the prediction error of histograms can be reduced by reducing the variance of densities within the histogram regions. Korn et al. [13] demonstrate how discontinuities at the histogram region...

51 | Improving the Query Performance of High-Dimensional Index Structures Using Bulk-Load Operations
- Berchtold, Böhm, et al.
- 1998
Citation Context ...nge queries in the X-tree [7] work the same way as in the R*-tree. The only difference is the way the bounding boxes are constructed. In our experiments, we used the bulkloaded X-tree as presented in [6]. This bulkloading algorithm ensures that no overlaps occur between the bounding boxes and that the page utilization is maximized. VA-file: For the VA-file [19], we used the following range query algor...

48 | Deflating the Dimensionality Curse Using Multiple Fractal
- Pagel, Korn, et al.
- 2000
Citation Context ...ty of the data is the same everywhere. Real data, however, is clustered and the density varies over the data space. One way of modeling real data is by considering its fractal properties. Korn et al. [14] give formulas for the k-NN-radius based on the fractal dimensionality of data. More precisely, they calculate the radius as follows: $d_{nn}(k) = \frac{1}{2} \cdot \left( \frac{k}{N - 1} \right)^{\frac{1}{D_2}}$ where $D_2$ is the Corr...
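The radius formula quoted in this excerpt, $d_{nn}(k) = \frac{1}{2} \cdot (k / (N - 1))^{1 / D_2}$, can be evaluated directly. The sketch below is only a worked illustration of that expression; the function name `knn_radius`, its parameter names, and the input checks are assumptions added here and do not come from the cited papers.

```python
def knn_radius(k: int, n: int, d2: float) -> float:
    """Evaluate d_nn(k) = 0.5 * (k / (n - 1)) ** (1 / d2).

    k  -- number of neighbors requested
    n  -- number of points N in the dataset
    d2 -- fractal dimensionality D_2 of the data
    The argument checks are an assumption of this sketch.
    """
    if n < 2:
        raise ValueError("need at least two points")
    if not 0 < k <= n - 1:
        raise ValueError("k must lie between 1 and n - 1")
    if d2 <= 0:
        raise ValueError("d2 must be positive")
    return 0.5 * (k / (n - 1)) ** (1.0 / d2)


# With N = 101 points, k = 25 and D_2 = 2: 0.5 * sqrt(25 / 100) = 0.25.
print(knn_radius(25, 101, 2.0))
# For the same k and N, a higher D_2 pushes the estimated radius toward 0.5.
print(knn_radius(25, 101, 8.0))
```

Because $k / (N - 1) \le 1$, the exponent $1 / D_2$ shrinks as $D_2$ grows, so the estimated radius increases with the fractal dimensionality.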

37 | When is "nearest neighbor" meaningful
- Beyer, Goldstein, et al.
- 1998
Citation Context ...ed to be $\frac{1}{100}$. For lower dimensionalities, the error is small because the estimate $r_{sample}$ is close to the correct radius value. For higher dimensionalities, all points tend to be equidistant (see [8]). Therefore, the error in the k-NN radius estimate has to drop as well. This leaves us with a maximum at around 10 dimensions. Depending on k, this error ranges between 10% and 14%. Subfigure 4(b) pl...

35 | Bulk loading the M-tree
- Ciaccia, Patella
- 1998
Citation Context ...s to be accessed to get an initial radius estimate. Other authors extend index structures by statistical information, which can be used to estimate the query radius and the query cost. Ciaccia et al. [9] extend the M-tree nodes by statistics (the distance distribution) in order to predict the CPU and I/O cost of range and NN queries. Most work is based on histograms over the dataset. Theodoridis and ...

25 | Analyzing range queries on spatial data
- Jin, An, et al.
Citation Context ...by reducing the variance of densities within the histogram regions. Korn et al. [13] demonstrate how discontinuities at the histogram region edges can be avoided by using splines. Finally, Jin et al. [11] extend the histogram approach for spatial non-point data. The advantage of these techniques is their high accuracy in modeling data by considering local effects. A disadvantage of the histogram appro...

24 | Reading a Set of Disk Pages
- Seeger, Larson, et al.
- 1993
Citation Context ...follows. Since the k closest points from the pages $P$ are already known, this range query needs to read only the pages from $P' - P$. In general, both range queries use the heuristic suggested by [17] to minimize the I/O cost for reading a set of disk pages. 2.4 Approximate k-NN Query Algorithm. The following modification of our accelerated k-NN query algorithm yields an approximate k-NN query al...

22 | Range selectivity estimation for continuous attributes.
- Korn, Johnson, et al.
- 1999
Citation Context ...nting the local densities of the dataset. Acharya et al. [1] show how the prediction error of histograms can be reduced by reducing the variance of densities within the histogram regions. Korn et al. [13] demonstrate how discontinuities at the histogram region edges can be avoided by using splines. Finally, Jin et al. [11] extend the histogram approach for spatial non-point data. The advantage of thes...

16 | An approximation based data structure for similarity search
- Weber, Blott
- 1997
Citation Context ...nor impact on the overall cost. We illustrate the wide applicability of our framework by using a number of different index structures such as R*-tree [2], X-tree [7], SS/SR-tree [20, 12], and VA-file [19]. One key advantage is that index structures without NN query support, such as the Pyramid tree [5], can be extended by our framework to support this query type. As a by-product, we demonstrate how ou...

15 | Modeling high-dimensional index structures using sampling
- Lang, Singh
- 2001
Citation Context ...am approaches is that they are not applicable in high dimensions since either the number of histogram regions becomes too large or these regions contain too much empty space and become inaccurate. In [16], sampling is used to overcome this problem. In contrast to this paper, the sample is used to predict the overall query cost of a given index structure. More specifically, the sample is used to predic...

1 | Hakan Ferhatosmanoglu, Divyakant Agrawal, Amr El Abbadi, and Ambuj
- Lang
Citation Context ...ments, however, that this algorithm is not applicable to high-dimensional data since the point sets become too sparse. We therefore used a modified version of the box-counting algorithm as defined in [15]. Similar to the original box-counting algorithm, this algorithm has a complexity of O(N log N). Amortized over all queries, this cost is negligible. Note that the estimator we just presented is onl...