Low redundancy in static dictionaries with O(1) worst case lookup time (1999)

by R Pagh
Venue: In Proceedings of ICALP’99

Are bitvectors optimal?

by H. Buhrman, P. B. Miltersen, J. Radhakrishnan, S. Venkatesh
Cited by 49 (7 self)
... We show lower bounds that come close to our upper bounds (for a large range of n and ε): Schemes that answer queries with just one bitprobe and error probability ε must use Ω((n / (ε log(1/ε))) log m) bits of storage; if the error is restricted to queries not in S, then the scheme must use Ω((n² / (ε² log(n/ε))) log m) bits of storage. We also

Succinct Representations of lcp Information and Improvements in the Compressed Suffix Arrays

by Kunihiko Sadakane, 2002
Cited by 46 (5 self)
We introduce two succinct data structures to solve various string problems. One is for storing the information of lcp, the longest common prefix, between suffixes in the suffix array, and the other is an improvement in the compressed suffix array which supports linear time counting queries for any pattern. The former occupies only 2n + o(n) bits for a text of length n for computing lcp between adjacent suffixes in lexicographic order in constant time, and 6n + o(n) bits between any two suffixes. No data structure in the literature attained linear size. The latter has size proportional to the text size and it is applicable to texts on any alphabet Σ such that |Σ| = log^O(1) n. These space-economical data structures are useful in processing huge amounts of text data.
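
To make the lcp information in the abstract above concrete, here is a minimal Python sketch that builds a suffix array naively and computes the lcp between suffixes that are adjacent in lexicographic order. It is only an illustration of the quantities being stored, not the succinct 2n + o(n)-bit representation from the paper; the function names `suffix_array` and `adjacent_lcp` are hypothetical.

```python
def suffix_array(text: str) -> list[int]:
    """Start positions of all suffixes of text, in lexicographic order.

    Naive construction by sorting suffix strings; fine for short inputs only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def adjacent_lcp(text: str, sa: list[int]) -> list[int]:
    """Longest common prefix lengths between suffixes adjacent in sa."""

    def lcp(i: int, j: int) -> int:
        # Count matching characters of text[i:] and text[j:] from the front.
        k = 0
        while i + k < len(text) and j + k < len(text) and text[i + k] == text[j + k]:
            k += 1
        return k

    return [lcp(sa[r - 1], sa[r]) for r in range(1, len(sa))]


sa = suffix_array("banana")
print(sa, adjacent_lcp("banana", sa))
```

Stored as plain integers, the lcp array takes about n words; the abstract's contribution is answering the same adjacent-lcp query in constant time from only 2n + o(n) bits.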

Low Redundancy in Static Dictionaries with Constant Query Time

by Rasmus Pagh - SIAM J. Comput., 2001
Cited by 40 (6 self)
A static dictionary is a data structure storing subsets of a finite universe U, answering membership queries. We show that on a unit cost RAM with word size Θ(log |U|), a static dictionary for n-element sets with constant worst case query time can be obtained using B + O(log log |U|) + o(n) bits of storage, where B = ⌈log₂ (|U| choose n)⌉ is the minimum number of bits needed to represent all n-element subsets of U.
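
The information-theoretic minimum B in the abstract above, the ceiling of log₂ of the number of n-element subsets of U, can be computed exactly with integer arithmetic. The sketch below does that and compares B with two simple encodings (a bitvector over U, and a list of n fixed-width keys). The function names `min_bits` and `naive_costs` are hypothetical, and the comparison is only meant to illustrate what "redundancy" is measured against, not the paper's construction.

```python
from math import comb


def min_bits(universe_size: int, n: int) -> int:
    """Information-theoretic minimum B = ceil(log2(C(|U|, n))) in bits."""
    subsets = comb(universe_size, n)  # number of n-element subsets of U
    # For an integer x >= 1, ceil(log2(x)) equals (x - 1).bit_length().
    return (subsets - 1).bit_length()


def naive_costs(universe_size: int, n: int) -> dict[str, int]:
    """Space in bits of B and of two simple, non-succinct encodings."""
    key_bits = (universe_size - 1).bit_length()  # bits needed per key
    return {
        "B": min_bits(universe_size, n),
        "bitvector": universe_size,    # one bit per universe element
        "sorted_keys": n * key_bits,   # n keys stored at full width
    }


print(naive_costs(2**32, 1000))
```

The redundancy of a dictionary is its space minus B; the result quoted above bounds it by O(log log |U|) + o(n) bits.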

Cell Probe Complexity - a Survey

by Peter Bro Miltersen - In 19th Conference on the Foundations of Software Technology and Theoretical Computer Science (FSTTCS), 1999. Advances in Data Structures Workshop, 1999
Cited by 27 (0 self)
The cell probe model is a general, combinatorial model of data structures. We give a survey of known results about the cell probe complexity of static and dynamic data structure problems, with an emphasis on techniques for proving lower bounds.
1 Introduction
1.1 The 'Were-you-last?' game
A Dream Team, consisting of m players, is held captive in the dungeon of their adversary, Hannibal. He now makes them play his favourite game, Were-you-last?. Before the game starts the players of the Team are allowed to meet to discuss a strategy (obviously, Hannibal has the room bugged and is listening in). After the discussion they are led to separate waiting rooms. Then Hannibal leads each of the players of the team, one by one, to the playing field. The players do not know the order in which they are led to the field and they spend their time there alone. The playing field is a room, containing an infinite number of boxes, labelled 0, 1, 2, 3, . . . . Inside each box is a switch that can be ...

The Cell Probe Complexity of Succinct Data Structures

by Anna Gál, Peter Bro Miltersen - In Automata, Languages and Programming, 30th International Colloquium (ICALP 2003), 2003
Cited by 27 (0 self)
We show lower bounds in the cell probe model for the redundancy/query time tradeoff of solutions to static data structure problems.

Compressed data structures: dictionaries and data-aware measures

by Ankur Gupta, Wing-kai Hon, Rahul Shah, Jeffrey Scott Vitter - In Proc. 5th International Workshop on Experimental Algorithms (WEA), 2006
Cited by 19 (1 self)
We propose measures for compressed data structures, in which space usage is measured in a data-aware manner. In particular, we consider the fundamental dictionary problem on set data, where the task is to construct a data structure to represent a set S of n items out of a universe U = {0, ..., u − 1} and support various queries on S. We use a well-known data-aware measure for set data called gap to bound the space of our data structures. We describe a novel dictionary structure taking gap + O(n log(u/n) / log n) + O(n log log(u/n)) bits. Under the RAM model, our dictionary supports membership, rank, select, and predecessor queries in nearly optimal time, matching the time bound of Andersson and Thorup’s predecessor structure [AT00], while simultaneously improving upon their space usage. Our dictionary structure uses exactly gap bits in the leading term (i.e., the constant factor is 1) and answers queries in near-optimal time. When seen from the worst case perspective, we present the first O(n log(u/n))-bit dictionary structure which supports these queries in near-optimal time under the RAM model. We also build a dictionary which requires the same space and supports membership, select, and partial rank queries even more quickly in O(log log n) time. To the best of our knowledge, this is the first result of its kind that achieves data-aware space usage and retains near-optimal time.

A compressed self-index using a Ziv-Lempel dictionary

by Luís M. S. Russo, Arlindo L. Oliveira - In: SPIRE. Volume 4209 of LNCS. (2006) 163–180
Cited by 17 (5 self)
A compressed full-text self-index for a text T, of size u, is a data structure used to search patterns P, of size m, in T that requires reduced space, i.e., space that depends on the empirical entropy (Hk, H0) of T, and is, furthermore, able to reproduce any substring of T. In this paper we present a new compressed self-index able to locate the occurrences of P in O((m + occ) log n) time, where occ is the number of occurrences and σ the size of the alphabet of T. The fundamental improvement over previous LZ78-based indexes is the reduction of the search time dependency on m from O(m²) to O(m). To achieve this result we point out the main obstacle to linear time algorithms based on LZ78 data compression and expose and explore the nature of a recurrent structure in LZ-indexes, the T78 suffix tree. We show that our method is very competitive in practice by comparing it against the LZ-Index, the FM-index and a compressed suffix array.

On the Probe Complexities of Membership and Perfect Hashing

by Rasmus Pagh
Cited by 12 (5 self)
This paper considers the following static data structure problems.

Computational depth: Concept and applications

by Luis Antunes, Lance Fortnow, Dieter van Melkebeek, N. V. Vinodchandran - Theor. Comput. Sci., 2006
Cited by 7 (4 self)
Abstract not found

Lossy Dictionaries

by Rasmus Pagh, Flemming Friche Rodler - In ESA ’01: Proceedings of the 9th Annual European Symposium on Algorithms, 2001
Cited by 4 (1 self)
Bloom filtering is an important technique for space efficient storage of a conservative approximation of a set S. The set stored may have up to some specified number of false positive members, but all elements of S are included. In this paper we consider lossy dictionaries that are also allowed to have false negatives, i.e., leave out elements of S. The aim is to maximize the weight of included keys within a given space constraint. This relaxation allows a very fast and simple data structure making almost optimal use of memory. Being more time efficient than Bloom filters, we believe our data structure to be well suited for replacing Bloom filters in some applications. Also, the fact that our data structure supports information associated to keys paves the way for new uses, as illustrated by an application in lossy image compression.
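
For readers unfamiliar with the baseline that the abstract above compares against, here is a minimal sketch of a standard Bloom filter in Python. It shows the one-sided error the abstract describes: every inserted key is reported present, while other keys may occasionally be reported present as well. This is the classical technique, not the lossy dictionary proposed in the paper; the class name, the parameters `num_bits` and `num_hashes`, and the use of SHA-256 as the hash family are assumptions made for illustration.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Conservative set approximation: false positives, no false negatives."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # all bits start at 0

    def _positions(self, key: str) -> Iterator[int]:
        # Derive num_hashes bit positions by salting one hash function
        # (an illustrative choice, not a tuned hash family).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, key: str) -> bool:
        # True only if every position for this key is set.
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


bf = BloomFilter(num_bits=1024, num_hashes=3)
for word in ["alpha", "beta", "gamma"]:
    bf.add(word)
print(all(w in bf for w in ["alpha", "beta", "gamma"]))
```

A lossy dictionary in the sense of the abstract relaxes this guarantee by also allowing false negatives, which a Bloom filter never produces.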