Results 1 - 10
of
13
More Robust Hashing: Cuckoo Hashing with a Stash
- IN PROCEEDINGS OF THE 16TH ANNUAL EUROPEAN SYMPOSIUM ON ALGORITHMS (ESA
, 2008
"... Cuckoo hashing holds great potential as a high-performance hashing scheme for real applications. Up to this point, the greatest drawback of cuckoo hashing appears to be that there is a polynomially small but practically significant probability that a failure occurs during the insertion of an item, r ..."
Abstract
-
Cited by 12 (4 self)
- Add to MetaCart
Cuckoo hashing holds great potential as a high-performance hashing scheme for real applications. Up to this point, the greatest drawback of cuckoo hashing appears to be that there is a polynomially small but practically significant probability that a failure occurs during the insertion of an item, requiring an expensive rehashing of all items in the table. In this paper, we show that this failure probability can be dramatically reduced by the addition of a very small constant-sized stash. We demonstrate both analytically and through simulations that stashes of size equivalent to only three or four items yield tremendous improvements, enhancing cuckoo hashing’s practical viability in both hardware and software. Our analysis naturally extends previous analyses of multiple cuckoo hashing variants, and the approach may prove useful in further related schemes.
Cheap and Large CAMs for High Performance Data-Intensive Networked Systems
"... We show how to build cheap and large CAMs, or CLAMs, using a combination of DRAM and flash memory. These are targeted at emerging data-intensive networked systems that require massive hash tables running into a hundred GB or more, with items being inserted, updated and looked up at a rapid rate. For ..."
Abstract
-
Cited by 11 (0 self)
- Add to MetaCart
We show how to build cheap and large CAMs, or CLAMs, using a combination of DRAM and flash memory. These are targeted at emerging data-intensive networked systems that require massive hash tables running into a hundred GB or more, with items being inserted, updated and looked up at a rapid rate. For such systems, using DRAM to maintain hash tables is quite expensive, while on-disk approaches are too slow. In contrast, CLAMs cost nearly the same as using existing on-disk approaches but offer orders of magnitude better performance. Our design leverages an efficient flash-oriented data-structure called BufferHash that significantly lowers the amortized cost of random hash insertions and updates on flash. BufferHash also supports flexible CLAM eviction policies. We prototype CLAMs using SSDs from two different vendors. We find that they can offer average insert and lookup latencies of 0.006ms and 0.06ms (for a 40 % lookup success rate), respectively. We show that using our CLAM prototype significantly improves the speed and effectiveness of WAN optimizers. 1
Hash-Based Techniques for High-Speed Packet Processing
"... Abstract Hashing is an extremely useful technique for a variety of high-speed packet-processing applications in routers. In this chapter, we survey much of the recent work in this area, paying particular attention to the interaction between theoretical and applied research. We assume very little bac ..."
Abstract
-
Cited by 7 (1 self)
- Add to MetaCart
Abstract Hashing is an extremely useful technique for a variety of high-speed packet-processing applications in routers. In this chapter, we survey much of the recent work in this area, paying particular attention to the interaction between theoretical and applied research. We assume very little background in either the theory or applications of hashing, reviewing the fundamentals as necessary. 1
Optimal fast hashing
- In 28th IEEE International Conference on Computer Communications (INFOCOM
, 2009
"... Abstract—This paper is about designing optimal highthroughput hashing schemes that minimize the total number of memory accesses needed to build and access an hash table. Recent schemes often promote the use of multiple-choice hashing. However, such a choice also implies a significant increase in the ..."
Abstract
-
Cited by 5 (4 self)
- Add to MetaCart
Abstract—This paper is about designing optimal highthroughput hashing schemes that minimize the total number of memory accesses needed to build and access an hash table. Recent schemes often promote the use of multiple-choice hashing. However, such a choice also implies a significant increase in the number of memory accesses to the hash table, which translates into higher power consumption and lower throughput. In this paper, we propose to only use choice when needed. Given some target hash table overflow rate, we provide a lower bound on the total number of needed memory accesses. Then, we design and analyze schemes that provably achieve this lower bound over a large range of target overflow values. Further, for the multilevel hash table scheme, we prove that the optimum occurs when its subtable sizes decrease in a geometric way, thus formally confirming a heuristic rule-of-thumb. A. Background I.
Maximum matchings in random bipartite graphs and the space utilization of cuckoo hashtables
, 2009
"... We study the the following question in Random Graphs. We are given two disjoint sets L, R with |L | = n = αm and |R | = m. We construct a random graph G by allowing each x ∈ L to choose d random neighbours in R. The question discussed is as to the size µ(G) of the largest matching in G. When consi ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
We study the the following question in Random Graphs. We are given two disjoint sets L, R with |L | = n = αm and |R | = m. We construct a random graph G by allowing each x ∈ L to choose d random neighbours in R. The question discussed is as to the size µ(G) of the largest matching in G. When considered in the context of Cuckoo Hashing, one key question is as to when is µ(G) = n whp? We answer this question exactly when d is at least three. We also establish a precise threshold for when Phase 1 of the Karp-Sipser Greedy matching algorithm suffices to compute a maximum matching whp.
An Efficient Hardware-based Multi-hash Scheme for High Speed IP Lookup
"... The increasingly more stringent performance and power requirements of Internet routers call for scalable IP lookup strategies that go beyond the currently viable TCAM- and trie-based solutions. This paper describes a new hash-based IP lookup scheme that is both storage efficient and highperformance. ..."
Abstract
-
Cited by 2 (2 self)
- Add to MetaCart
The increasingly more stringent performance and power requirements of Internet routers call for scalable IP lookup strategies that go beyond the currently viable TCAM- and trie-based solutions. This paper describes a new hash-based IP lookup scheme that is both storage efficient and highperformance. In order to achieve high storage efficiency, we take a multi-hashing approach and employ an advanced hashing technique that effectively resolves hashing collisions by dynamically migrating IP prefixes that are already in the lookup table as new prefixes are inserted. To obtain high lookup throughput, the multiple hash tables are accessed in parallel (using different hash functions) or in a pipelined manner. We evaluate the proposed scheme using up-to-date core routing tables and discuss how its key design parameters can be determined. When compared with state-of-the-art TCAM designs, our scheme reduces area and power requirements by 60 % and 80 % respectively, while achieving competitive lookup rates. We expect that the proposed scheme scales well with the anticipated routing table sizes and technologies in the future. 1
An Analysis of Random-Walk Cuckoo Hashing
"... In this paper, we provide a polylogarithmic bound that holds with high probability on the insertion time for cuckoo hashing under the random-walk insertion method. Cuckoo hashing provides a useful methodology for building practical, high-performance hash tables. The essential idea of cuckoo hashing ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
In this paper, we provide a polylogarithmic bound that holds with high probability on the insertion time for cuckoo hashing under the random-walk insertion method. Cuckoo hashing provides a useful methodology for building practical, high-performance hash tables. The essential idea of cuckoo hashing is to combine the power of schemes that allow multiple hash locations for an item with the power to dynamically change the location of an item among its possible locations. Previous work on the case where the number of choices is larger than two has required a breadth-first search analysis, which is both inefficient in practice and currently has only a polynomial high probability upper bound on the insertion time. Here we significantly advance the state of the art by proving a polylogarithmic bound on the more efficient randomwalk method, where items repeatedly kick out random blocking items until a free location for an item is found. 1
Hash Tables With Finite Buckets Are Less Resistant To Deletions
"... Abstract — We show that when memory is bounded, i.e. buckets are finite, dynamic hash tables that allow insertions and deletions behave significantly worse than their static counterparts that only allow insertions. This behavior differs from previous results in which, when memory is unbounded, the t ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
Abstract — We show that when memory is bounded, i.e. buckets are finite, dynamic hash tables that allow insertions and deletions behave significantly worse than their static counterparts that only allow insertions. This behavior differs from previous results in which, when memory is unbounded, the two models behave similarly. We show the decrease in performance in dynamic hash tables using several hash-table schemes. We also provide tight upper and lower bounds on the achievable overflow fractions in these schemes. Finally, we propose an architecture with contentaddressable memory (CAM), which mitigates this decrease in performance. A. Background I.
Access-Efficient Balanced Bloom Filters
"... Bloom Filters should particularly suit network devices, because of their low theoretical memory-access rates. However, in practice, since memory is often divided into blocks and Bloom Filters hash elements into several arbitrary memory blocks, Bloom Filters actually need high memory-access rates. O ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
Bloom Filters should particularly suit network devices, because of their low theoretical memory-access rates. However, in practice, since memory is often divided into blocks and Bloom Filters hash elements into several arbitrary memory blocks, Bloom Filters actually need high memory-access rates. On the other hand, hashing all Bloom Filter elements into a single memory block to solve this problem also yields high false positive rates. In this paper, we propose to implement load-balancing schemes for the choice of the memory block, along with an optional overflow list, resulting in improved false positive rates while keeping a high memory-access efficiency. To study this problem, we define, analyze and solve a fundamental access-constrained balancing problem, where incoming elements need to be optimally balanced across resources while satisfying average and instantaneous constraints on the number of memory accesses associated with checking the current load of the resources. We then build on this problem to suggest a new access-efficient Bloom Filter scheme, called the Balanced Bloom Filter. Finally, we show that this scheme can reduce the false positive rate by up to two orders of magnitude, with a worst-case cost of up to 3 memory accesses for each element and an overflow list size of 0.5 % of the elements.
Some Open Questions Related to Cuckoo Hashing
"... Abstract. The purpose of this brief note is to describe recent work in the area of cuckoo hashing, including a clear description of several open problems, with the hope of spurring further research. 1 ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
Abstract. The purpose of this brief note is to describe recent work in the area of cuckoo hashing, including a clear description of several open problems, with the hope of spurring further research. 1

