• Documents
  • Authors
  • Tables
  • Log in
  • Sign up
  • MetaCart
  • DMCA
  • Donate

CiteSeerX logo

Advanced Search Include Citations

Tools

Sorted by:
Try your query at:
Semantic Scholar Scholar Academic
Google Bing DBLP
Results 1 - 10 of 11,182
Next 10 →

Similarity search in high dimensions via hashing

by Aristides Gionis, Piotr Indyk, Rajeev Motwani , 1999
"... The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building search/index structures for performing similarity search over high-dimensional data, e.g., image dat ..."
Abstract - Cited by 641 (10 self) - Add to MetaCart
The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building search/index structures for performing similarity search over high-dimensional data, e.g., image

Locality-sensitive hashing scheme based on p-stable distributions

by Mayur Datar, Piotr Indyk - In SCG ’04: Proceedings of the twentieth annual symposium on Computational geometry , 2004
"... inÇÐÓ�Ò We present a novel Locality-Sensitive Hashing scheme for the Approximate Nearest Neighbor Problem underÐÔnorm, based onÔstable distributions. Our scheme improves the running time of the earlier algorithm for the case of theÐnorm. It also yields the first known provably efficient approximate ..."
Abstract - Cited by 521 (8 self) - Add to MetaCart
inÇÐÓ�Ò We present a novel Locality-Sensitive Hashing scheme for the Approximate Nearest Neighbor Problem underÐÔnorm, based onÔstable distributions. Our scheme improves the running time of the earlier algorithm for the case of theÐnorm. It also yields the first known provably efficient approximate

Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web

by David Karger, Eric Lehman, Tom Leighton, Matthew Levine, Daniel Lewin, Rina Panigrahy - IN PROC. 29TH ACM SYMPOSIUM ON THEORY OF COMPUTING (STOC , 1997
"... We describe a family of caching protocols for distrib-uted networks that can be used to decrease or eliminate the occurrence of hot spots in the network. Our protocols are particularly designed for use with very large networks such as the Internet, where delays caused by hot spots can be severe, and ..."
Abstract - Cited by 699 (10 self) - Add to MetaCart
We describe a family of caching protocols for distrib-uted networks that can be used to decrease or eliminate the occurrence of hot spots in the network. Our protocols are particularly designed for use with very large networks such as the Internet, where delays caused by hot spots can be severe

A Scalable Content-Addressable Network

by Sylvia Ratnasamy , Paul Francis, Mark Handley, Richard Karp, Scott Shenker - IN PROC. ACM SIGCOMM 2001 , 2001
"... Hash tables – which map “keys ” onto “values” – are an essential building block in modern software systems. We believe a similar functionality would be equally valuable to large distributed systems. In this paper, we introduce the concept of a Content-Addressable Network (CAN) as a distributed infra ..."
Abstract - Cited by 3371 (32 self) - Add to MetaCart
Hash tables – which map “keys ” onto “values” – are an essential building block in modern software systems. We believe a similar functionality would be equally valuable to large distributed systems. In this paper, we introduce the concept of a Content-Addressable Network (CAN) as a distributed

Query evaluation techniques for large databases

by Goetz Graefe - ACM COMPUTING SURVEYS , 1993
"... Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem. On ..."
Abstract - Cited by 767 (11 self) - Add to MetaCart
Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem

Extracting Relations from Large Plain-Text Collections

by Eugene Agichtein, Luis Gravano , 2000
"... Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use for answering precise queries or for running data mining tasks. We explore a technique for extracting such tables fr ..."
Abstract - Cited by 494 (25 self) - Add to MetaCart
Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use for answering precise queries or for running data mining tasks. We explore a technique for extracting such tables

Mining Quantitative Association Rules in Large Relational Tables

by Ramakrishnan Srikant, Rakesh Agrawal , 1996
"... We introduce the problem of mining association rules in large relational tables containing both quantitative and categorical attributes. An example of such an association might be "10% of married people between age 50 and 60 have at least 2 cars". We deal with quantitative attributes by fi ..."
Abstract - Cited by 444 (3 self) - Add to MetaCart
We introduce the problem of mining association rules in large relational tables containing both quantitative and categorical attributes. An example of such an association might be "10% of married people between age 50 and 60 have at least 2 cars". We deal with quantitative attributes

Wide-area cooperative storage with CFS

by Frank Dabek, M. Frans Kaashoek, David Karger, Robert Morris, Ion Stoica , 2001
"... The Cooperative File System (CFS) is a new peer-to-peer readonly storage system that provides provable guarantees for the efficiency, robustness, and load-balance of file storage and retrieval. CFS does this with a completely decentralized architecture that can scale to large systems. CFS servers pr ..."
Abstract - Cited by 999 (53 self) - Add to MetaCart
provide a distributed hash table (DHash) for block storage. CFS clients interpret DHash blocks as a file system. DHash distributes and caches blocks at a fine granularity to achieve load balance, uses replication for robustness, and decreases latency with server selection. DHash finds blocks using

Efficient implementation of a BDD package

by Karl S. Brace, Richard L. Rudell, Randal E. Bryant - In Proceedings of the 27th ACM/IEEE conference on Design autamation , 1991
"... Efficient manipulation of Boolean functions is an important component of many computer-aided design tasks. This paper describes a package for manipulating Boolean functions based on the reduced, ordered, binary decision diagram (ROBDD) representation. The package is based on an efficient implementat ..."
Abstract - Cited by 504 (9 self) - Add to MetaCart
implementation of the if-then-else (ITE) operator. A hash table is used to maintain a strong carwnical form in the ROBDD, and memory use is improved by merging the hash table and the ROBDD into a hybrid data structure. A memory funcfion for the recursive ITE algorithm is implemented using a hash-based cache

GHT: A Geographic Hash Table for Data-Centric Storage

by Sylvia Ratnasamy, Brad Karp, Li Yin, Fang Yu, Deborah Estrin, Ramesh Govindan, Scott Shenker , 2002
"... Making effective use of the vast amounts of data gathered by largescale sensor networks will require scalable, self-organizing, and energy-efficient data dissemination algorithms. Previous work has identified data-centric routing as one such method. In an associated position paper [23], we argue tha ..."
Abstract - Cited by 392 (30 self) - Add to MetaCart
Making effective use of the vast amounts of data gathered by largescale sensor networks will require scalable, self-organizing, and energy-efficient data dissemination algorithms. Previous work has identified data-centric routing as one such method. In an associated position paper [23], we argue that a companion method, data-centric storage (DCS), is also a useful approach. Under DCS, sensed data are stored at a node determined by the name associated with the sensed data. In this paper,
Next 10 →
Results 1 - 10 of 11,182
Powered by: Apache Solr
  • About CiteSeerX
  • Submit and Index Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University