  Using difficulty of prediction to decrease computation: Fast sort, priority queue and convex hull on entropy bounded inputs (1993) [15 citations — 3 self]

by Shenfeng Chen, John H. Reif
in Proceedings of the 34th Symposium on Foundations of Computer Science, Los Alamitos
http://www.cs.duke.edu/~chen/papers/sort.ps

Abstract:

Interest in the Markov model, and in more general stationary ergodic stochastic distributions, has recently surged in the theoretical computer science community (e.g., see [Vitter,Krishnan91], [Karlin,Philips,Raghavan92], [Raghavan92] for the use of Markov models in on-line algorithms such as caching and prefetching). Those results used the fact that compressible sources are predictable (and vice versa), and showed that on-line algorithms can improve their performance by prediction. Actual page access sequences are in fact somewhat compressible, so these predictive methods can be of benefit. This paper investigates the idea of decreasing computation by using learning in the opposite way, namely to determine the difficulty of prediction. That is, we approximately learn the input distribution, and then improve the performance of the computation when the input is not too predictable, rather than the reverse. To our knowledge, this is the first case of a computational problem where we do not assume any particular fixed input distribution and yet computation is decreased when the input is less predictable. We concentrate our investigation on a basic computational problem, sorting, and a basic data structure problem, maintaining a priority queue. We present the first known sorting and priority queue algorithms whose complexity depends on the binary entropy H of the input keys, where we assume that the input keys are generated from an unknown but arbitrary stationary ergodic source. That is, we assume that each of the input keys can be arbitrarily long but has entropy H. Note that H can be estimated in practice, since the compression ratio ρ using optimal Ziv-Lempel compression tends to 1/H for large inputs.
Although sets of keys found in practice cannot be expected to satisfy any particular fixed distribution such as the uniform distribution, there is a large, well-documented body of empirical evidence showing that this compression ratio ρ, and thus 1/H, is a constant for realistic inputs encountered in practice [1, 31], typically around 3 to at most 20. Our algorithm runs in O(n log( log n
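
The abstract's remark that H can be estimated from Ziv-Lempel compression can be illustrated with a small sketch. This is not the authors' algorithm. It assumes the LZ78 incremental parsing and the standard estimate c log2(c) / n bits per symbol, where c is the number of parsed phrases in a sequence of length n. The function names `lz78_phrase_count` and `estimate_entropy` are hypothetical, and the estimate converges slowly, so short inputs give only a rough value.

```python
import math
import random


def lz78_phrase_count(symbols):
    """Count the phrases in the LZ78 incremental parsing of `symbols`.

    Each phrase is the shortest prefix of the remaining input that has not
    been seen as a phrase before. A trie keyed by (parent id, symbol)
    keeps the work per symbol constant on average.
    """
    trie = {}   # (parent node id, symbol) -> child node id
    node = 0    # 0 is the root, i.e. the empty phrase
    count = 0
    for s in symbols:
        key = (node, s)
        if key in trie:
            node = trie[key]            # extend the current phrase
        else:
            trie[key] = len(trie) + 1   # new phrase ends here
            count += 1
            node = 0
    if node != 0:
        count += 1  # trailing phrase that repeats an earlier one
    return count


def estimate_entropy(symbols):
    """Rough entropy estimate in bits per symbol: c * log2(c) / n."""
    n = len(symbols)
    if n == 0:
        return 0.0
    c = lz78_phrase_count(symbols)
    return c * math.log2(c) / n


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, for repeatability only
    unpredictable = [rng.randint(0, 1) for _ in range(20000)]
    predictable = [0] * 20000
    print("random bits  :", estimate_entropy(unpredictable))
    print("constant bits:", estimate_entropy(predictable))
```

For a sequence of bits, the reciprocal of this estimate plays the role of the compression ratio ρ mentioned above: a highly predictable input gives a small H estimate and a large ratio, while fair random bits give an estimate near 1.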

Citations

5825 Introduction to Algorithms – Cormen, Leiserson, et al. - 2001
759 A universal algorithm for sequential data compression – Ziv, Lempel - 1977
578 A method for the construction of minimum redundancy codes – Huffman - 1952
557 An Introduction to Parallel Algorithms – JaJa - 1992
492 Art of Computer Programming, Volume 3: Sorting and Searching (2nd Edition) – Knuth - 1998
295 Parallel Merge Sort – Cole - 1988
274 Computational Geometry. An introduction through randomized algorithms – Mulmuley - 1994
224 Practical Prefetching via Data Compression – Curewitz, Krishnan, et al. - 1993
166 A guided tour of Chernoff bounds – Hagerup, Rüb - 1990
143 Data Compression: Methods and Theory – Storer - 1988
129 A logarithmic time sort for linear size networks – Reif, Valiant - 1987
66 Finding the maximum, merging and sorting in a parallel computation model – Shiloach, Vishkin - 1981
65 Preserving order in a forest in less than logarithmic time – van Emde Boas - 1975
62 Optimal and sublogarithmic time randomized parallel sorting algorithms – Rajasekaran, Reif - 1989
42 Parallel Sorting and Data Partitioning by Sampling – Huang, Chow - 1983
37 LZW Data Compression – Nelson - 1989
29 Implementations of randomized sorting on large parallel machines – Hightower, Prins, et al. - 1992
25 A statistical adversary for on-line algorithms – Raghavan - 1992
25 Coding theorems for individual sequences – Ziv - 1978
24 An optimal parallel algorithm for integer sorting – Reif - 1985
23 On parallel hashing and integer sorting – Matias, Vishkin - 1991
21 Towards optimal parallel bucket sorting – Hagerup - 1987
20 Finite State Markov Decision Processes – Derman - 1970
20 Fast hashing on a PRAM: designing by expectation – Gil, Matias - 1991
18 Measures of presortedness and optimal sorting algorithms – Mannila - 1985
17 Optimal randomized parallel algorithms for computational geometry – Reif, Sen - 1987
11 On parallel integer sorting – Rajasekaran, Sen - 1992
10 The complexity of searching an ordered random table – Yao, Yao - 1976
7 Asymptotical growth of a class of random trees. The Annals of Probability – Pittel - 1985
6 A Fast Probabilistic Sorting Algorithm – Reischuk - 1981
6 (Un)Expected behavior of typical suffix trees – Szpankowski - 1992
3 A Typical Behavior of Some Data Compression Schemes – Szpankowski - 1991
2 Optimal bounds for decision problems – Beame, Håstad - 1987
2 on a theme by Ziv and Lempel, Combinatorial algorithms on words, edited by A.Apostolico and Z.Galil – Miller, Wegman
1 Text Compression – Bell, Cleary, Witten - 1990
1 A comparison of sorting algorithms for the Connection – Blelloch, Maggs, Plaxton, Smith, Zagha
1 Bitonic sorting with O(N log – Bilardi, Nicolau
1 Optimistic Sorting and Information Theoretic Complexity – McIlroy - 1993