


On universal classes of extremely random constant-time hash functions (2004)

by A Siegel
Venue:SIAM J. Comput

Results 1 - 10 of 41

Why simple hash functions work: Exploiting the entropy in a data stream

by Michael Mitzenmacher, Salil Vadhan - In Proceedings of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms, 2008
Cited by 49 (8 self)
Hashing is fundamental to many algorithms and data structures widely used in practice. For theoretical analysis of hashing, there have been two main approaches. First, one can assume that the hash function is truly random, mapping each data item independently and uniformly to the range. This idealized model is unrealistic because a truly random hash function requires an exponential number of bits to describe. Alternatively, one can provide rigorous bounds on performance when explicit families of hash functions are used, such as 2-universal or O(1)-wise independent families. For such families, performance guarantees are often noticeably weaker than for ideal hashing. In practice, however, it is commonly observed that weak hash functions, including 2-universal hash functions, perform as predicted by the idealized analysis for truly random hash functions. In this paper, we try to explain this phenomenon. We demonstrate that the strong performance of universal hash functions in practice can arise naturally from a combination of the randomness of the hash function and the data. Specifically, following the large body of literature on random sources and randomness extraction, we model the data as coming from a “block source,” whereby
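The 2-universal families in question can be as simple as the classic Carter-Wegman construction. A minimal sketch (the prime P and the range passed to make_hash are illustrative choices, not parameters from the paper):

```python
import random

# Carter-Wegman style 2-universal family: h(x) = ((a*x + b) mod P) mod m,
# where P is prime and a, b are drawn at random for each function.
# P and the range m below are illustrative choices, not from the paper.
P = (1 << 61) - 1  # a Mersenne prime, larger than any key we hash

def make_hash(m):
    """Draw one function from the family, mapping integer keys to [0, m)."""
    a = random.randrange(1, P)  # a != 0
    b = random.randrange(0, P)
    return lambda x: ((a * x + b) % P) % m

h = make_hash(1024)  # one random function with range {0, ..., 1023}
```

For distinct keys x ≠ y, the probability over the choice of a and b that h(x) = h(y) is close to 1/m, which is the only property the 2-universal analyses rely on.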

Balanced allocation and dictionaries with tightly packed constant size bins

by Martin Dietzfelbinger, et al., 2007
Cited by 37 (2 self)
Abstract not found

Almost Random Graphs with Simple Hash Functions

by Martin Dietzfelbinger, Philipp Woelfel - STOC'03, 2003
Cited by 30 (1 self)
Abstract not found

Citation Context

...se of multigraph, i.e., there may be multiple edges. Accordingly, when we talk about sets of edges, we mean multisets. Siegel’s high-performance hash classes. Siegel (in the technical report version [23] of [22]) gave a construction to the following effect. Fact 1. Let 0 < µ < 1 and k ≥ 1 with µk < 1 be given. Then if ζ < 1 and d ≥ 1 satisfy ζ ≥ 2k(1 + log d + µ log n)/d + k/(ζ log n) (for n large enough), ...

Linear probing with constant independence

by Anna Pagh, Rasmus Pagh - In STOC ’07: Proceedings of the thirty-ninth annual ACM symposium on Theory of computing, 2007
Cited by 23 (2 self)
Hashing with linear probing dates back to the 1950s, and is among the most studied algorithms. In recent years it has become one of the most important hash table organizations since it uses the cache of modern computers very well. Unfortunately, previous analyses rely either on complicated and space consuming hash functions, or on the unrealistic assumption of free access to a truly random hash function. Already Carter and Wegman, in their seminal paper on universal hashing, raised the question of extending their analysis to linear probing. However, we show in this paper that linear probing using a pairwise independent family may have expected logarithmic cost per operation. On the positive side, we show that 5-wise independence is enough to ensure constant expected time per operation. This resolves the question of finding a space and time efficient hash function that provably ensures good performance for linear probing.
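The positive result pairs linear probing with a 5-wise independent family; one standard such family is a random degree-4 polynomial over a prime field. A sketch under those assumptions (fixed table size, no resizing, and the caller keeps the table from filling up — this is an illustration, not the paper's exact construction):

```python
import random

P = (1 << 31) - 1  # Mersenne prime defining the field for the polynomial

class LinearProbingTable:
    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        # A random degree-4 polynomial gives a 5-wise independent family.
        self.coeffs = [random.randrange(P) for _ in range(5)]

    def _hash(self, key):
        h = 0
        for c in self.coeffs:        # Horner's rule: evaluate the polynomial
            h = (h * key + c) % P
        return h % self.size

    def insert(self, key):
        i = self._hash(key)
        while self.slots[i] is not None and self.slots[i] != key:
            i = (i + 1) % self.size  # linear probe: scan to the next slot
        self.slots[i] = key

    def contains(self, key):
        i = self._hash(key)
        while self.slots[i] is not None:
            if self.slots[i] == key:
                return True
            i = (i + 1) % self.size  # keep scanning until an empty slot
        return False
```

The paper's lower bound says that with only pairwise independence in place of the degree-4 polynomial, the expected probe sequence length can be logarithmic rather than constant.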

Citation Context

...-wise independence is sufficient to achieve essentially the same performance as in the fully random case. (We use n to denote the number of keys inserted into the hash table.) Another paper by Siegel [12] shows that evaluation of a hash function from an O(log n)-wise independent family requires time Ω(log n) unless the space used to describe the function is n^Ω(1). A family of functions is given that ...

On Dynamic Range Reporting in One Dimension

by Christian Worm Mortensen, Rasmus Pagh, et al., 2008
Cited by 20 (5 self)
We consider the problem of maintaining a dynamic set of integers and answering queries of the form: report a point (equivalently, all points) in a given interval. Range searching is a natural and fundamental variant of integer search, and can be solved using predecessor search. However, for a RAM with w-bit words, we show how to perform updates in O(lg w) time and answer queries in O(lg lg w) time. The update time is identical to the van Emde Boas structure, but the query time is exponentially faster. Existing lower bounds show that achieving our query time for predecessor search requires doubly-exponentially slower updates. We present some arguments supporting the conjecture that our solution is optimal. Our solution is based on a new and interesting recursion idea which is “more extreme” than the van Emde Boas recursion. Whereas van Emde Boas uses a simple recursion (repeated halving) on each path in a trie, we use a nontrivial, van Emde Boas-like recursion on every such path. Despite this, our algorithm is quite clean when seen from the right angle. To achieve linear space for our data structure, we solve a problem which is of independent interest. We develop the first scheme for dynamic perfect hashing requiring sublinear space. This gives a dynamic Bloomier filter (an approximate storage scheme for sparse vectors) which uses low space. We strengthen previous lower bounds to show that these results are optimal.

Strongly history-independent hashing with applications

by Guy E. Blelloch - In Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science, 2007
Cited by 19 (5 self)
We present a strongly history independent (SHI) hash table that supports search in O(1) worst-case time, and insert and delete in O(1) expected time using O(n) data space. This matches the bounds for dynamic perfect hashing, and improves on the best previous results by Naor and Teague on history independent hashing, which were either weakly history independent, or only supported insertion and search (no delete) each in O(1) expected time. The results can be used to construct many other SHI data structures. We show straightforward constructions for SHI ordered dictionaries: for n keys from {1, ..., n^k} searches take O(log log n) worst-case time and updates (insertions and deletions) O(log log n) expected time, and for keys in the comparison model searches take O(log n) worst-case time and updates O(log n) expected time. We also describe a SHI data structure for the order-maintenance problem. It supports comparisons in O(1) worst-case time, and updates in O(1) expected time. All structures use O(n) data space.

Citation Context

...h : U → [p] which can be evaluated in O(1) time. For a discussion of efficient O(1)-universal hash functions, see [16]. Where Θ(log n)-universal hash functions are needed, the constructions of Siegel [19, 20] and Östlin and Pagh [15] are suitable, assuming the keys are integers. The latter are also suitable where full randomness is called for. History independence is defined below. The definition of weak ...

Hash-Based Techniques for High-Speed Packet Processing

by Adam Kirsch, Michael Mitzenmacher, George Varghese
Cited by 15 (2 self)
Hashing is an extremely useful technique for a variety of high-speed packet-processing applications in routers. In this chapter, we survey much of the recent work in this area, paying particular attention to the interaction between theoretical and applied research. We assume very little background in either the theory or applications of hashing, reviewing the fundamentals as necessary.

Citation Context

...gman’s original work [9], there has been a substantial amount of research on efficient constructions of hash functions that are theoretically suitable for use in data structures and algorithms (e.g., [48, 55] and references therein). Unfortunately, while there are many impressive theoretical results in that literature, the constructed hash families are usually impractical. Thus, at least at present, these...

Backyard Cuckoo Hashing: Constant Worst-Case Operations with a Succinct Representation

by Yuriy Arbitman, Moni Naor, Gil Segev, 2010
Cited by 12 (4 self)
The performance of a dynamic dictionary is measured mainly by its update time, lookup time, and space consumption. In terms of update time and lookup time there are known constructions that guarantee constant-time operations in the worst case with high probability, and in terms of space consumption there are known constructions that use essentially optimal space. In this paper we settle two fundamental open problems: • We construct the first dynamic dictionary that enjoys the best of both worlds: we present a two-level variant of cuckoo hashing that stores n elements using (1+ϵ)n memory words, and guarantees constant-time operations in the worst case with high probability. Specifically, for any ϵ = Ω((log log n / log n)^(1/2)) and for any sequence of polynomially many operations, with high probability over the randomness of the initialization phase, all operations are performed in constant time which is independent of ϵ. The construction is based on augmenting cuckoo hashing with a “backyard” that handles a large fraction of the elements, together with a de-amortized perfect hashing scheme for eliminating the dependency on ϵ.
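Plain cuckoo hashing, the scheme the paper builds on, keeps two tables and gives every key one candidate slot in each; inserts evict and relocate until everything settles. A sketch with simplified hash mixing and a naive rebuild on failure (both are placeholders, not the paper's two-level construction):

```python
import random

class CuckooTable:
    def __init__(self, size):
        self.size = size
        self.tables = [[None] * size, [None] * size]
        self._draw_hashes()

    def _draw_hashes(self):
        # Placeholder mixing: multiply-XOR with fresh random seeds.
        self.seeds = [random.randrange(1 << 30) for _ in range(2)]

    def _hash(self, t, key):
        return (key * 2654435761 ^ self.seeds[t]) % self.size

    def lookup(self, key):
        # A key can only ever sit in one of its two nests.
        return any(self.tables[t][self._hash(t, key)] == key for t in (0, 1))

    def insert(self, key, max_kicks=32):
        if self.lookup(key):
            return
        for _ in range(max_kicks):
            for t in (0, 1):
                i = self._hash(t, key)
                if self.tables[t][i] is None:
                    self.tables[t][i] = key
                    return
            # Both nests occupied: evict from table 0 and relocate the victim.
            i = self._hash(0, key)
            key, self.tables[0][i] = self.tables[0][i], key
        # Too many kicks (likely a cycle): redraw hashes and rebuild.
        old = [k for tab in self.tables for k in tab if k is not None] + [key]
        self.tables = [[None] * self.size, [None] * self.size]
        self._draw_hashes()
        for k in old:
            self.insert(k)
```

Lookups inspect exactly two slots, which is what gives cuckoo hashing its worst-case constant lookup time; the paper's contribution is taming the update time and the space overhead of this base scheme.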

Citation Context

...rested in functions that have a short representation and can be evaluated in constant time in the unit cost RAM model. Although there are no such constructions of k-wise independent functions, Siegel [Sie04] constructed a pretty good approximation that is sufficient for our applications (see also the recent improvement of Dietzfelbinger and Rink [DR09] to Siegel’s construction). For any two sets U and V ...

Tabulation Based 5-Universal Hashing and Linear Probing

by Mikkel Thorup, Yin Zhang
Cited by 7 (4 self)
Previously [SODA’04] we devised the fastest known algorithm for 4-universal hashing. The hashing was based on small pre-computed 4-universal tables. This led to a five-fold improvement in speed over direct methods based on degree 3 polynomials. In this paper, we show that if the pre-computed tables are made 5-universal, then the hash value becomes 5-universal without any other change to the computation. Relatively, this leads to even bigger gains since the direct methods for 5-universal hashing use degree 4 polynomials. Experimentally, we find that our method can gain up to an order of magnitude in speed over direct 5-universal hashing. Some of the most popular randomized algorithms have been proved to have the desired expected running time using
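The core computation is a handful of table lookups. A sketch for 32-bit keys; here the tables are filled with plain uniform random words (which yields only 3-wise independence), whereas the paper's result is that filling them from a 5-universal generator upgrades the whole hash to 5-universal with no change to the lookup code:

```python
import random

# Tabulation hashing for 32-bit keys: split the key into four 8-bit
# characters and XOR together one pre-computed table entry per character.
# Each table here holds plain random 32-bit words; the paper's point is
# that making the table entries 5-universal makes tab_hash 5-universal
# without changing the computation below.
TABLES = [[random.getrandbits(32) for _ in range(256)] for _ in range(4)]

def tab_hash(key):
    """Hash a 32-bit key with four table lookups and XORs."""
    h = 0
    for i in range(4):
        char = (key >> (8 * i)) & 0xFF  # i-th 8-bit character of the key
        h ^= TABLES[i][char]
    return h
```

The speed comes from replacing the multiplications of a degree-4 polynomial with cache-resident table lookups, which is exactly the trade-off the paper measures.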

De Dictionariis Dynamicis Pauco Spatio Utentibus (lat. On Dynamic Dictionaries Using Little Space)

by Erik D. Demaine, Friedhelm Meyer Auf Der Heide, Rasmus Pagh, Mihai Patrascu, 2005
Cited by 6 (0 self)
We develop dynamic dictionaries on the word RAM that use asymptotically optimal space, up to constant factors, subject to insertions and deletions, and subject to supporting perfect-hashing queries and/or membership queries, each operation in constant time with high probability. When supporting only membership queries, we attain the optimal space bound of Θ(n lg(u/n)) bits, where n and u are the sizes of the dictionary and the universe, respectively. Previous dictionaries either did not achieve this space bound or had time bounds that were only expected and amortized. When supporting perfect-hashing queries, the optimal space bound depends on the range {1, 2, ..., n + t} of hashcodes allowed as output. We prove that the optimal space bound is Θ(n lg lg(u/n) + n lg(n/(t+1))) bits when supporting only perfect-hashing queries, and it is Θ(n lg(u/n) + n lg(n/(t+1))) bits when also supporting membership queries. All upper bounds are new, as is the Ω(n lg(n/(t+1))) lower bound.