Results 1 - 10 of 11,182
Similarity search in high dimensions via hashing
, 1999
"... The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building search/index structures for performing similarity search over high-dimensional data, e.g., image dat ..."
Cited by 641 (10 self)
The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building search/index structures for performing similarity search over high-dimensional data, e.g., image
Locality-sensitive hashing scheme based on p-stable distributions
- In SCG ’04: Proceedings of the twentieth annual symposium on Computational geometry
, 2004
"... We present a novel Locality-Sensitive Hashing scheme for the Approximate Nearest Neighbor Problem under l_p norm, based on p-stable distributions. Our scheme improves the running time of the earlier algorithm for the case of the l_2 norm. It also yields the first known provably efficient approximate ..."
Cited by 521 (8 self)
We present a novel Locality-Sensitive Hashing scheme for the Approximate Nearest Neighbor Problem under l_p norm, based on p-stable distributions. Our scheme improves the running time of the earlier algorithm for the case of the l_2 norm. It also yields the first known provably efficient approximate
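As a rough illustration of the idea named in this abstract, the sketch below builds a hash function for the l_2 case from Gaussian (2-stable) random projections, using the commonly cited form h(v) = floor((a·v + b) / w). It is a minimal sketch under stated assumptions, not the authors' code: the names `make_hash` and `signature`, the bucket width `w = 4.0`, and the choice of 8 hash functions are all hypothetical.

```python
import math
import random


def make_hash(dim, w, rng):
    """Return one hash function h(v) = floor((a . v + b) / w).

    a has i.i.d. standard Gaussian entries (Gaussians are 2-stable),
    b is uniform in [0, w). Both parameter choices are assumptions here.
    """
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, w)

    def h(v):
        projection = sum(ai * vi for ai, vi in zip(a, v))
        return math.floor((projection + b) / w)

    return h


# Hypothetical setup: 3-dimensional points, bucket width 4.0, 8 hash functions.
rng = random.Random(0)
hashes = [make_hash(3, 4.0, rng) for _ in range(8)]


def signature(v):
    """Concatenate the individual hash values into one bucket key."""
    return tuple(h(v) for h in hashes)


point = [1.0, 2.0, 3.0]
far_point = [1.0e6, 2.0, 3.0]
same_bucket = signature(point) == signature(list(point))
far_bucket_differs = signature(point) != signature(far_point)
```

Points that are close in l_2 distance tend to share more hash values than points that are far apart; a full index would store each point under its signature in several independent tables.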
Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web
- IN PROC. 29TH ACM SYMPOSIUM ON THEORY OF COMPUTING (STOC)
, 1997
"... We describe a family of caching protocols for distributed networks that can be used to decrease or eliminate the occurrence of hot spots in the network. Our protocols are particularly designed for use with very large networks such as the Internet, where delays caused by hot spots can be severe, and ..."
Cited by 699 (10 self)
We describe a family of caching protocols for distributed networks that can be used to decrease or eliminate the occurrence of hot spots in the network. Our protocols are particularly designed for use with very large networks such as the Internet, where delays caused by hot spots can be severe
A Scalable Content-Addressable Network
- IN PROC. ACM SIGCOMM 2001
, 2001
"... Hash tables – which map “keys” onto “values” – are an essential building block in modern software systems. We believe a similar functionality would be equally valuable to large distributed systems. In this paper, we introduce the concept of a Content-Addressable Network (CAN) as a distributed infra ..."
Cited by 3371 (32 self)
Hash tables – which map “keys” onto “values” – are an essential building block in modern software systems. We believe a similar functionality would be equally valuable to large distributed systems. In this paper, we introduce the concept of a Content-Addressable Network (CAN) as a distributed
Query evaluation techniques for large databases
- ACM COMPUTING SURVEYS
, 1993
"... Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem. On ..."
Cited by 767 (11 self)
Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem
Extracting Relations from Large Plain-Text Collections
, 2000
"... Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use for answering precise queries or for running data mining tasks. We explore a technique for extracting such tables fr ..."
Cited by 494 (25 self)
Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use for answering precise queries or for running data mining tasks. We explore a technique for extracting such tables
Mining Quantitative Association Rules in Large Relational Tables
, 1996
"... We introduce the problem of mining association rules in large relational tables containing both quantitative and categorical attributes. An example of such an association might be "10% of married people between age 50 and 60 have at least 2 cars". We deal with quantitative attributes by fi ..."
Cited by 444 (3 self)
We introduce the problem of mining association rules in large relational tables containing both quantitative and categorical attributes. An example of such an association might be "10% of married people between age 50 and 60 have at least 2 cars". We deal with quantitative attributes
Wide-area cooperative storage with CFS
, 2001
"... The Cooperative File System (CFS) is a new peer-to-peer read-only storage system that provides provable guarantees for the efficiency, robustness, and load-balance of file storage and retrieval. CFS does this with a completely decentralized architecture that can scale to large systems. CFS servers pr ..."
Cited by 999 (53 self)
provide a distributed hash table (DHash) for block storage. CFS clients interpret DHash blocks as a file system. DHash distributes and caches blocks at a fine granularity to achieve load balance, uses replication for robustness, and decreases latency with server selection. DHash finds blocks using
Efficient implementation of a BDD package
- In Proceedings of the 27th ACM/IEEE conference on Design automation
, 1991
"... Efficient manipulation of Boolean functions is an important component of many computer-aided design tasks. This paper describes a package for manipulating Boolean functions based on the reduced, ordered, binary decision diagram (ROBDD) representation. The package is based on an efficient implementat ..."
Cited by 504 (9 self)
implementation of the if-then-else (ITE) operator. A hash table is used to maintain a strong canonical form in the ROBDD, and memory use is improved by merging the hash table and the ROBDD into a hybrid data structure. A memory function for the recursive ITE algorithm is implemented using a hash-based cache
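The two hash-based structures mentioned in this abstract can be sketched roughly as follows: a unique table that keeps one node per (variable, low, high) triple, which gives the canonical form, and a cache that memoizes results of the recursive ITE call. This is an illustrative sketch only, not the package's implementation; in particular it keeps the table and the node list separate rather than merging them into a hybrid structure, and the class and method names (`BDD`, `mk`, `ite`) are hypothetical.

```python
class BDD:
    """Toy ROBDD manager; node ids 0 and 1 are the False/True terminals."""

    def __init__(self):
        self.nodes = [None, None]  # id -> (var, low, high); terminals unused
        self.unique = {}           # (var, low, high) -> id, enforces canonicity
        self.cache = {}            # (f, g, h) -> result, memoizes ite()

    def mk(self, var, low, high):
        if low == high:            # redundant test: skip the node entirely
            return low
        key = (var, low, high)
        if key not in self.unique:
            self.nodes.append(key)
            self.unique[key] = len(self.nodes) - 1
        return self.unique[key]

    def var(self, index):
        return self.mk(index, 0, 1)

    def _top(self, f):
        # Terminals sort after every real variable.
        return self.nodes[f][0] if f > 1 else float("inf")

    def _cofactor(self, f, var, value):
        if f <= 1 or self.nodes[f][0] != var:
            return f               # f does not depend on var at its root
        return self.nodes[f][2] if value else self.nodes[f][1]

    def ite(self, f, g, h):
        """Return the node for 'if f then g else h'."""
        if f == 1:
            return g
        if f == 0:
            return h
        if g == h:
            return g
        if g == 1 and h == 0:
            return f
        key = (f, g, h)
        if key in self.cache:
            return self.cache[key]
        v = min(self._top(f), self._top(g), self._top(h))
        low = self.ite(*(self._cofactor(x, v, False) for x in (f, g, h)))
        high = self.ite(*(self._cofactor(x, v, True) for x in (f, g, h)))
        result = self.mk(v, low, high)
        self.cache[key] = result
        return result


bdd = BDD()
a, b = bdd.var(0), bdd.var(1)
a_and_b = bdd.ite(a, b, 0)
b_and_a = bdd.ite(b, a, 0)
not_a = bdd.ite(a, 0, 1)
not_b = bdd.ite(b, 0, 1)
not_a_or_not_b = bdd.ite(not_a, 1, not_b)
```

Because every node goes through the unique table, equivalent functions end up with the same node id, so an equivalence check reduces to comparing two integers.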
GHT: A Geographic Hash Table for Data-Centric Storage
, 2002
"... Making effective use of the vast amounts of data gathered by large-scale sensor networks will require scalable, self-organizing, and energy-efficient data dissemination algorithms. Previous work has identified data-centric routing as one such method. In an associated position paper [23], we argue tha ..."
Cited by 392 (30 self)
Making effective use of the vast amounts of data gathered by large-scale sensor networks will require scalable, self-organizing, and energy-efficient data dissemination algorithms. Previous work has identified data-centric routing as one such method. In an associated position paper [23], we argue that a companion method, data-centric storage (DCS), is also a useful approach. Under DCS, sensed data are stored at a node determined by the name associated with the sensed data. In this paper,