Results 1–10 of 32
Sketching and Streaming Entropy via Approximation Theory
Cited by 33 (0 self)
Abstract:
We conclude a sequence of work by giving near-optimal sketching and streaming algorithms for estimating Shannon entropy in the most general streaming model, with arbitrary insertions and deletions. This improves on prior results that obtain suboptimal space bounds in the general model, and near-optimal bounds in the insertion-only model without sketching. Our high-level approach is simple: we give algorithms to estimate Rényi and Tsallis entropy, and use them to extrapolate an estimate of Shannon entropy. The accuracy of our estimates is proven using approximation-theory arguments and extremal properties of Chebyshev polynomials, a technique which may be useful for other problems. Our work also yields the best-known and near-optimal additive approximations for entropy, and hence also for conditional entropy and mutual information.
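The extrapolation idea can be illustrated outside the streaming setting: evaluate the Rényi entropy H_α = log(Σᵢ pᵢ^α)/(1 − α) at points just above α = 1 and extrapolate back to α = 1, where it converges to the Shannon entropy. The sketch below is a toy illustration on an explicit distribution, not the paper's algorithm; the paper's Chebyshev-based analysis governs how the evaluation points are actually chosen.

```python
import math

def renyi_entropy(p, alpha):
    # H_alpha = log(sum_i p_i^alpha) / (1 - alpha), defined for alpha != 1
    return math.log(sum(pi ** alpha for pi in p)) / (1.0 - alpha)

def shannon_entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

p = [0.5, 0.25, 0.125, 0.125]

# evaluate the Renyi entropy just above alpha = 1 and extrapolate linearly
# back to alpha = 1 (a crude stand-in for the Chebyshev-based machinery)
a1, a2 = 1.05, 1.10
h1, h2 = renyi_entropy(p, a1), renyi_entropy(p, a2)
estimate = h1 + (h1 - h2) / (a2 - a1) * (a1 - 1.0)

print(estimate, shannon_entropy(p))
```

Even this two-point linear extrapolation lands within a few thousandths of the true Shannon entropy on this example; the role of approximation theory is to control this error uniformly.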
Fast moment estimation in data streams in optimal space
 In Proceedings of the 43rd ACM Symposium on Theory of Computing (STOC)
, 2011
Cited by 24 (8 self)
Abstract:
We give a space-optimal algorithm with update time O(log²(1/ε) · log log(1/ε)) for (1 ± ε)-approximating the p-th frequency moment, 0 < p < 2, of a length-n vector updated in a data stream. This provides a nearly exponential improvement in the update time complexity over the previous space-optimal algorithm of [Kane-Nelson-Woodruff, SODA 2010], which had update time Ω(1/ε²).
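For context, the classic space-efficient baseline for this problem (shown here for p = 1) is the p-stable sketch: maintain k random projections of the vector with Cauchy (1-stable) entries and return the median of their absolute values. The sketch below is a plain illustration of that baseline under turnstile updates, not the fast-update algorithm of this paper; the stream and parameters are made up.

```python
import math
import random

random.seed(0)

def cauchy():
    # standard Cauchy (1-stable) variate via inverse CDF
    return math.tan(math.pi * (random.random() - 0.5))

n, k = 50, 400                 # vector length, number of sketch counters
A = [[cauchy() for _ in range(n)] for _ in range(k)]
sketch = [0.0] * k
x = [0] * n                    # kept only to check the estimate

def update(i, delta):
    x[i] += delta
    for j in range(k):
        sketch[j] += A[j][i] * delta

# turnstile stream: arbitrary insertions and deletions
for i, d in [(3, 5), (7, -2), (3, 1), (20, 4), (7, -1)]:
    update(i, d)

# each sketch entry is distributed as F_1 * Cauchy, and median(|Cauchy|) = 1,
# so the median of |sketch| estimates F_1 = sum_i |x_i|
estimate = sorted(abs(v) for v in sketch)[k // 2]
exact = sum(abs(v) for v in x)
print(estimate, exact)
```

Note that each update touches all k counters, which is exactly the per-update cost this paper's algorithm improves.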
Compressed counting
 CoRR
Cited by 21 (13 self)
Abstract:
We propose Compressed Counting (CC) for approximating the α-th frequency moments (0 < α ≤ 2) of data streams under a relaxed strict-Turnstile model, using maximally-skewed stable random projections. Estimators based on the geometric mean and the harmonic mean are developed. When α = 1, a simple counter suffices for counting the first moment (i.e., the sum). The geometric mean estimator of CC has asymptotic variance ∝ Δ = |α − 1|, capturing the intuition that the complexity should decrease as Δ = |α − 1| → 0. However, the previous classical algorithms based on symmetric stable random projections [12, 15] required O(1/ε²) space in order to approximate the α-th moments within a 1 + ε factor, for any 0 < α ≤ 2 including α = 1. We show that, using the geometric mean estimator, CC requires O(1/log(1+ε) + 2√Δ · log^{3/2}(√Δ)/log(1+ε) + o(√Δ)) space, as Δ → 0. Therefore, in the neighborhood of α = 1, the complexity of CC is essentially O(1/ε) instead of O(1/ε²). CC may be useful for estimating Shannon entropy, which can be approximated by certain functions of the α-th moments with α → 1. [10, 9] suggested using α = 1 + Δ with (e.g.) Δ < 0.0001 and ε < 10⁻⁷ to rigorously ensure reasonable approximations. Thus, unfortunately, CC is “theoretically impractical” for estimating Shannon entropy, despite its empirical success reported in [16].
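The α = 1 special case mentioned above is easy to see concretely: in the strict-Turnstile model every coordinate stays nonnegative, so the first moment F₁ = Σᵢ fᵢ equals the running sum of all updates, and a single counter suffices. A minimal sketch (stream values made up):

```python
# F_1 under the strict-Turnstile model: one counter summing all deltas,
# since every frequency f_i stays nonnegative and F_1 = sum_i f_i
stream = [(3, 5), (7, 2), (3, -1), (20, 4)]   # (item, delta) pairs
counter = 0
for item, delta in stream:
    counter += delta
print(counter)  # F_1 of the final frequency vector
```

CC's contribution is, intuitively, making the α ≈ 1 regime behave almost as cheaply as this degenerate α = 1 case.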
Optimal Sampling from Sliding Windows
 ACM PODS 2009
, 2009
Cited by 19 (3 self)
Abstract:
The sliding windows model is an important case of the streaming model, where only the most “recent” elements of a stream remain active and the rest are discarded. The model is important for many applications (see, e.g., Babcock, Babu, Datar, Motwani and Widom (PODS ’02); and Datar, Gionis, Indyk and Motwani (SODA ’02)). There are two equally important types of sliding windows: windows of fixed size (e.g., where items arrive one at a time and only the most recent n items remain active, for some fixed parameter n), and bursty windows (e.g., where many items can arrive in “bursts” at a single step and only items from the last t steps remain active, again for some fixed parameter t). Random sampling is a fundamental tool for data streams, as numerous algorithms operate on the sampled data instead of on the entire stream. Effective sampling from sliding windows is a nontrivial problem, as elements eventually expire. In fact, the deletions are implicit; i.e., it is not possible to identify deleted elements without storing the entire window. The implicit nature of deletions on sliding windows does not allow the existing methods (even those that support explicit deletions, e.g., Cormode, Muthukrishnan and Rozenbaum (VLDB ’05); Frahling, Indyk and Sohler (SoCG ’05)) to be directly “translated” to the sliding windows model. One trivial approach to overcoming the problem of implicit deletions is oversampling: when k samples are required, the oversampling method maintains k′ > k samples in the hope that at least k of them have not expired. The obvious disadvantages of this method are twofold: (a) it introduces additional costs and thus decreases performance; and (b) the memory bounds are not deterministic, which is atypical for …
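To make the problem concrete, here is a well-known priority-sampling technique for fixed-size windows from the sliding-window literature, not this paper's algorithm: each item receives a random priority, only items whose priority beats every later arrival are stored, and the stored active item of highest priority is a uniform sample of the window. The stream below is made up.

```python
import random

def window_sampler(stream, n, rng):
    """Yield one uniform sample of the last n items after each arrival."""
    chain = []  # (index, value, priority), priorities strictly decreasing
    for t, v in enumerate(stream):
        p = rng.random()
        # discard items dominated by the newcomer (older AND lower priority);
        # a discarded item can never again be the window maximum
        while chain and chain[-1][2] < p:
            chain.pop()
        chain.append((t, v, p))
        # expire the head if it fell out of the window of the last n arrivals
        if chain[0][0] <= t - n:
            chain.pop(0)
        yield chain[0][1]  # highest-priority active item

samples = list(window_sampler(range(100), n=10, rng=random.Random(7)))
print(samples[-1])  # some element of the final window 90..99
```

The stored chain has expected size O(log n), which sidesteps the nondeterministic-memory drawback of plain oversampling for the single-sample case.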
Design and Performance Analysis of a DRAM-based Statistics Counter Array Architecture
 ANCS 2009
, 2009
Cited by 8 (5 self)
Abstract:
The problem of efficiently maintaining a large number (say, millions) of statistics counters that need to be updated at very high speeds (e.g., 40 Gb/s) has received considerable research attention in recent years. This problem arises in a variety of router management and data streaming applications where large arrays of counters are used to track various network statistics and to implement various counting sketches. It proves too costly to store such large counter arrays entirely in SRAM, while DRAM is viewed as too slow for providing wire-speed updates at such high speeds. In this paper, we propose a DRAM-based counter architecture that can effectively maintain wire-speed updates to large counter arrays. The proposed approach is based on the observation that modern commodity DRAM architectures, driven by aggressive performance roadmaps for consumer applications (e.g., video games), have advanced architectural features that can be exploited to make a DRAM-based solution practical. In particular, we propose a randomized DRAM architecture that can harness the performance of modern commodity DRAM offerings by interleaving counter updates across multiple memory banks. The proposed architecture makes use of a simple randomization scheme, a small cache, and small request queues to statistically guarantee near-perfect load-balancing of counter updates to the DRAM banks. The statistical guarantee of the proposed scheme is proven using a novel combination of convex ordering and large-deviation theory. Our proposed counter scheme can support arbitrary increments and decrements at wire-speed, and it can support different number representations, including both integer and floating-point representations.
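The load-balancing claim can be sanity-checked with a toy queueing simulation in the spirit of the abstract (an illustration under made-up parameters, not the paper's model or proof): counters are placed in DRAM banks pseudorandomly, one update arrives per line-rate cycle, and each bank retires one queued update every K cycles; with B banks the aggregate service rate B/K exceeds the arrival rate, so the per-bank request queues should stay short.

```python
import random
from collections import deque

random.seed(0)

B, K = 16, 8              # banks; each bank serves one update every K cycles
CYCLES, COUNTERS = 20000, 100000

queues = [deque() for _ in range(B)]
bank_of = {}              # pseudorandom counter-to-bank placement
max_backlog = 0

for cycle in range(CYCLES):
    c = random.randrange(COUNTERS)            # counter updated this cycle
    b = bank_of.setdefault(c, random.randrange(B))
    queues[b].append(c)
    for bank in range(B):                     # staggered DRAM service slots
        if (cycle - bank) % K == 0 and queues[bank]:
            queues[bank].popleft()
    max_backlog = max(max_backlog, max(len(q) for q in queues))

print("worst per-bank queue length:", max_backlog)
```

With per-bank load 1/2 the worst backlog stays small over the whole run, which is the qualitative behavior the paper's convex-ordering analysis makes rigorous.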
Computationally efficient estimators for dimension reductions using stable random projections
 In ICDM
, 2008
Cited by 7 (7 self)
Abstract:
The method of stable random projections is an efficient tool for computing the lα distances using low memory, where …
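One computationally convenient estimator in this line of work is the geometric mean estimator. For α = 1 (Cauchy projections), each projected coordinate of x − y is distributed as ‖x − y‖₁ times a standard Cauchy C, and since E|C|^t = 1/cos(πt/2) for |t| < 1, the geometric mean of the |coordinates| times cos(π/(2k))^k is an unbiased estimate of the distance. A small self-contained sketch (vectors and parameters made up):

```python
import math
import random

random.seed(3)

def cauchy():
    # standard Cauchy (1-stable) variate via inverse CDF
    return math.tan(math.pi * (random.random() - 0.5))

n, k = 30, 500
R = [[cauchy() for _ in range(n)] for _ in range(k)]  # shared projection matrix

def project(v):
    return [sum(r[i] * v[i] for i in range(n)) for r in R]

x = [random.uniform(-1, 1) for _ in range(n)]
y = [random.uniform(-1, 1) for _ in range(n)]

# geometric mean estimator of the l1 distance from the projected difference
d = [a - b for a, b in zip(project(x), project(y))]
log_gm = sum(math.log(abs(v)) for v in d) / k
estimate = math.exp(log_gm) * math.cos(math.pi / (2 * k)) ** k
exact = sum(abs(a - b) for a, b in zip(x, y))
print(estimate, exact)
```

Unlike the median estimator, the geometric mean admits closed-form moments, which is what makes its analysis (and debiasing constant) tractable.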
Streaming Algorithms for Estimating Entropy
Cited by 5 (1 self)
Abstract:
We give a method for estimating the empirical Shannon entropy of a distribution in the streaming model of computation. Our approach reduces this problem to the well-studied problem of estimating frequency moments. The analysis of our approach is based on new results which establish quantitative bounds on the rate of convergence of Rényi entropy towards Shannon entropy.
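The convergence being quantified can be seen numerically: for a fixed distribution, the gap H_α − H shrinks roughly in proportion to α − 1 (to first order, the coefficient involves the variance of −log pᵢ). A quick check on a made-up distribution:

```python
import math

def renyi(p, a):
    # H_a = log(sum_i p_i^a) / (1 - a), for a != 1
    return math.log(sum(x ** a for x in p)) / (1 - a)

def shannon(p):
    return -sum(x * math.log(x) for x in p)

p = [0.4, 0.3, 0.2, 0.1]
H = shannon(p)
errors = [abs(renyi(p, a) - H) for a in (1.5, 1.25, 1.125, 1.0625)]
print(errors)  # the gap roughly halves each time alpha - 1 halves
```

Quantitative versions of exactly this behavior are what let a frequency-moment estimate at α near 1 stand in for the Shannon entropy.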
DRAM is plenty fast for wirespeed statistics counting
 In ACM HotMetrics
, 2008
Cited by 5 (2 self)
Abstract:
Per-flow network measurement at Internet backbone links requires the efficient maintenance of large arrays of statistics counters at very high speeds (e.g., 40 Gb/s). The prevailing view is that SRAM is too expensive for implementing large counter arrays, but DRAM is too slow for providing wire-speed updates. This view is the main premise of a number of hybrid SRAM/DRAM architectural proposals [2, 3, 4, 5] that still require substantial amounts of SRAM for large arrays. In this paper, we present the contrarian view that modern commodity DRAM architectures, driven by aggressive performance roadmaps for consumer applications (e.g., video games), have advanced architectural features that can be exploited to make DRAM solutions practical. We describe two such schemes that can harness the performance of these DRAM offerings by enabling the interleaving of counter updates to multiple memory banks. These counter schemes are the first to support arbitrary increments and decrements, for either integer or floating-point representations, at wire-speed. We believe our preliminary success with the use of DRAM schemes for wire-speed statistics counting opens the possibility of broader research opportunities to generalize the proposed ideas to other network measurement functions.
A New Algorithm for Compressed Counting with Applications in Shannon Entropy Estimation in Dynamic Data
Cited by 4 (4 self)
Abstract:
Efficient estimation of the moments and Shannon entropy of data streams is an important task in modern machine learning and data mining. To estimate the Shannon entropy, it suffices to accurately estimate the α-th moment with Δ = |1 − α| ≈ 0. To guarantee that the error of the estimated Shannon entropy is within a ν-additive factor, the method of symmetric stable random projections requires O(1/(ν²Δ²)) samples, which is extremely expensive. The first paper (Li, 2009a) in Compressed Counting (CC), based on skewed-stable random projections, supplies a substantial improvement by reducing the sample complexity to O(1/ν²), which is still expensive. The follow-up work (Li, 2009b) provides a practical algorithm which, however, is difficult to analyze theoretically. In this paper, we propose a new accurate algorithm for Compressed Counting whose sample complexity is only O(1/ν²) for ν-additive Shannon entropy estimation, with a constant factor of merely about 6. In addition, we prove that our algorithm attains an upper bound on the Fisher information and is in fact close to 100% statistically optimal. An empirical study is conducted to verify the accuracy of our algorithm.
A very efficient scheme for estimating entropy of data streams using compressed counting
, 2008
Cited by 4 (3 self)
Abstract:
Compressed Counting (CC) was recently proposed for approximating the α-th frequency moments of data streams, for 0 < α ≤ 2. Under the relaxed strict-Turnstile model, CC dramatically improves the standard algorithm based on symmetric stable random projections, especially as α → 1. A direct application of CC is to estimate the entropy, which is an important summary statistic in Web/network measurement and often serves as a crucial “feature” for data mining. The Rényi entropy and the Tsallis entropy are functions of the α-th frequency moments, and both approach the Shannon entropy as α → 1. A recent theoretical work suggested using the α-th frequency moment to approximate the Shannon entropy, with α = 1 + δ and very small |δ| (e.g., < 10⁻⁴). In this study, we experiment with using CC to estimate frequency moments, Rényi entropy, Tsallis entropy, and Shannon entropy on real Web crawl data. We demonstrate the variance-bias trade-off in estimating Shannon entropy and provide practical recommendations. In particular, our experiments enable us to draw some important conclusions:
• As α → 1, CC dramatically improves symmetric stable random projections in estimating frequency moments, Rényi entropy, Tsallis entropy, and Shannon entropy. The improvements appear to approach “infinity.”
• CC is a highly practical algorithm for estimating Shannon entropy (from either Rényi or Tsallis entropy) with α ≈ 1. Only a very small sample (e.g., 20) is needed to achieve high accuracy (e.g., < 1% relative error).
• Using symmetric stable random projections with α = 1 + δ and very small |δ| does not provide a practical algorithm, because the required sample size is enormous.
• If we do need to use symmetric stable random projections for estimating Shannon entropy, we should exploit the variance-bias trade-off by letting α be away from 1, for much better performance.
• Even in terms of the best achievable performance in estimating Shannon entropy, CC still improves on symmetric stable random projections by one to two orders of magnitude, both in estimation accuracy and in the required sample size (storage space).
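The moment-to-entropy connection used throughout can be written down directly: with item frequencies fᵢ, the α-th moment F_α = Σᵢ fᵢ^α, and pᵢ = fᵢ/F₁, the Tsallis entropy T_α = (1 − F_α/F₁^α)/(α − 1) approaches the Shannon entropy as α → 1. A small exact-arithmetic check (frequencies made up; a streaming algorithm such as CC would replace F_α with a sketch estimate):

```python
import math

f = [30, 20, 10, 5, 5]                  # item frequencies
F1 = sum(f)
p = [v / F1 for v in f]
H = -sum(v * math.log(v) for v in p)    # exact Shannon entropy

def tsallis(alpha):
    # T_alpha computed purely from the alpha-th frequency moment F_alpha
    F_alpha = sum(v ** alpha for v in f)
    return (1 - F_alpha / F1 ** alpha) / (alpha - 1)

for a in (1.1, 1.01, 1.001):
    print(a, tsallis(a), H)
```

The bias shrinks with α − 1 while (for sketched F_α) the variance grows as α leaves 1, which is precisely the variance-bias trade-off the experiments above explore.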