Similarity search in high dimensions via hashing (1999)

by Aristides Gionis , Piotr Indyk , Rajeev Motwani
Citations: 641 (10 self)

BibTeX

@INPROCEEDINGS{Gionis99similaritysearch,
    author = {Aristides Gionis and Piotr Indyk and Rajeev Motwani},
    title = {Similarity search in high dimensions via hashing},
    booktitle = {},
    year = {1999},
    pages = {518--529}
}

Abstract

The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building search/index structures for performing similarity search over high-dimensional data, e.g., image databases, document collections, time-series databases, and genome databases. Unfortunately, all known techniques for solving this problem fall prey to the "curse of dimensionality." That is, the data structures scale poorly with data dimensionality; in fact, if the number of dimensions exceeds 10 to 20, searching in k-d trees and related structures involves the inspection of a large fraction of the database, thereby doing no better than brute-force linear search. It has been suggested that since the selection of features and the choice of a distance metric in typical applications is rather heuristic, determining an approximate nearest neighbor should suffice for most practical purposes. In this paper, we examine a novel scheme for approximate similarity search based on hashing. The basic idea is to hash the points
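
The abstract above is cut off before it describes the hashing scheme, so the following is only an illustrative sketch of the general idea, not the paper's algorithm as specified. It assumes binary vectors under Hamming distance and builds each hash key by sampling a few bit positions; the class name `BitSamplingIndex` and the parameters `num_tables` and `bits_per_key` are hypothetical choices made for this example. A brute-force `linear_search` is included as the baseline that the abstract compares against.

```python
import random
from collections import defaultdict


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def linear_search(points, q):
    """Brute-force baseline: scan every point, return index of the closest."""
    return min(range(len(points)), key=lambda i: hamming(points[i], q))


class BitSamplingIndex:
    """Hypothetical sketch: several hash tables, each keyed on a random
    subset of bit positions, so that nearby points tend to share a bucket."""

    def __init__(self, dim, num_tables=8, bits_per_key=6, seed=0):
        rng = random.Random(seed)
        # One random set of bit positions per table (assumed parameters).
        self.projections = [
            rng.sample(range(dim), bits_per_key) for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.points = []

    @staticmethod
    def _key(point, positions):
        return tuple(point[i] for i in positions)

    def add(self, point):
        """Store a point and register its index in every table."""
        idx = len(self.points)
        self.points.append(point)
        for positions, table in zip(self.projections, self.tables):
            table[self._key(point, positions)].append(idx)
        return idx

    def query(self, q):
        """Approximate search: inspect only points that share a bucket
        with q in at least one table. Returns None if no bucket matches."""
        candidates = set()
        for positions, table in zip(self.projections, self.tables):
            candidates.update(table.get(self._key(q, positions), ()))
        if not candidates:
            return None
        return min(candidates, key=lambda i: hamming(self.points[i], q))


if __name__ == "__main__":
    rng = random.Random(1)
    data = [tuple(rng.randint(0, 1) for _ in range(32)) for _ in range(200)]
    index = BitSamplingIndex(dim=32)
    for p in data:
        index.add(p)
    query = data[17]
    print("approximate:", index.query(query))
    print("exact:", linear_search(data, query))
```

In this sketch, more tables raise the chance that a true near neighbor shares a bucket with the query, while more bits per key shrink the buckets and so the number of candidates inspected. The result is approximate: a query may return a point that is not the closest, or `None` when no bucket matches.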

Keyphrases

similarity search, high dimension, document collection, approximate similarity search, problem fall prey, data dimensionality, high-dimensional data, dimension exceeds, genome database, image database, typical application, large fraction, brute-force linear search, practical purpose, large variety, data structure, novel scheme, near-neighbor query problem, search index structure, basic idea, k-d tree, database application, related structure, time-series database
