LSH Forest: Self-Tuning Indexes for Similarity Search (2005)
Mayank Bawa Tyson Condie U. C. Berkeley Berkeley, CA...



View or download:
arsky.com/WWW2005/CD/docs/p651.pdf

Abstract: We consider the problem of indexing high-dimensional data for answering (approximate) similarity-search queries. Similarity indexes prove to be important in a wide variety of settings: Web search engines desire fast, parallel, main-memory-based indexes for similarity search on text data; database systems desire disk-based similarity indexes for high-dimensional data, including text and images; peer-to-peer systems desire distributed similarity indexes with low communication cost. We propose an...

BibTeX entry:

@misc{ bawa-lsh,
  author = "Mayank Bawa and Tyson Condie and Prasanna Ganesan",
  title = "LSH Forest: Self-Tuning Indexes for Similarity Search",
  url = "citeseer.ist.psu.edu/bawa05lsh.html" }