N. Alon, P. B. Gibbons, Y. Matias, and M. Szegedy. "Tracking Join and Self-Join Sizes in Limited Storage". ACM PODS, 1999.
....in size, but the function itself is computationally difficult. We also consider this case, and we provide a private approximation to natural #P-hard problems related to the permanent. Related results. There are several very efficient algorithms for approximating the L or Hamming distance (cf. [2, 15, 24, 21]). However, these results do not directly translate into communication-efficient private approximation protocols, as is discussed further in Section 3. Since the presentation of an early unpublished version of our work [13], Halevi et al. [19] investigate private approximations of NP-hard ....
....turn our attention to privately approximating natural #P-hard problems, where the goal is to achieve polynomial time private approximations. Note that artificial privately approximable #P-hard problems are easily constructed. For example, consider any #P-hard problem f(x) with output in the range [0, 2 ]. Then g(x) f(x) 2 is computationally equivalent to f(x) and, in particular, is computationally interesting iff f is. Although, for many values of #, 2 is a (1 #) factor private approximation to g(x), this approximation doesn't approximate any interesting quantity. Thus, even if a ....
N. Alon, P. Gibbons, Y. Matias, and M. Szegedy, Tracking Join and Self-Join Sizes in Limited Storage. In 18th PODS, 10--20, New York, 1999.
.... data stream computation has given rise to several recent (theoretical and practical) studies of on-line or one-pass algorithms with limited memory requirements for different problems; examples include quantile and order-statistics computation [16, 21], estimating frequency moments and join sizes [3, 2], data clustering and decision tree construction [10, 18], estimating correlated aggregates [13], and computing one-dimensional (i.e., single-attribute) histograms and Haar wavelet decompositions [17, 15]. Other related studies have proposed techniques for incrementally maintaining equi-depth ....
....approximate query processing tools inapplicable in a data stream setting. (Note that, even though random-sample data summaries can be easily constructed in a single pass [23], it is well known that such summaries typically give very poor result estimates for queries involving one or more joins [1, 6, 2].) Our Contributions. In this paper, we tackle the hard technical problems involved in the approximate processing of complex (possibly multi-join) aggregate decision support queries over continuous data streams with limited memory. Our approach is based on randomizing techniques that compute ....
[Article contains additional citation context not shown here]
N. Alon, P.B. Gibbons, Y. Matias, and M. Szegedy. "Tracking Join and Self-Join Sizes in Limited Storage". In Proc. of the Eighteenth ACM SIGACT-SIGMOD-SIGART Symp. on Principles of Database Systems, May 1999.
....moderate in size, but the function itself is computationally difficult. We also consider this case and provide a private approximation to the permanent and related #P-hard problems. Related results. There are several very efficient algorithms for approximating the $L_p$ or Hamming distance (cf. [28, 2, 15, 21]). However, these results do not directly translate into communication-efficient private approximation protocols, as is discussed further in Section 3. Since the presentation of an early unpublished version of our work [13], Halevi et al. [19] investigate private approximations of NP-hard ....
N. Alon, P. Gibbons, Y. Matias, and M. Szegedy, Tracking join and self-join sizes in limited storage. In 18th PODS, 10--20, New York, 1999.
.... values of $(c_i - d_i)$ where $c_i$'s and $d_i$'s appear any which way seems to be difficult (and likewise for more complex linear projections like wavelet coefficients). In the unordered cash register model, even keeping track of the highest B $a(i)$'s is difficult in general; this is the top B queries in [11, 4, 2]. We are able to formalize all these intuitions in a rigorous mathematical framework and prove that computing the highest B term approximation for a signal in any of these data streaming models is difficult, i.e., would require storing too much data, nearly equal to the size of the signal, and even ....
N. Alon, P. Gibbons, Y. Matias and M. Szegedy. Tracking join and self-join sizes in limited storage. In ACM Symposium on Principles of Database Systems (PODS), 1999.
....model of computation and complexity measures for streaming and sketching algorithms. In Section 3, we present our main technical results. Section 4 explains the relationship of our algorithm to other recent work, including that of Broder et al. [BCFM00] on sketching and that of Alon et al. [AMS99, AGMS99] on frequency moments. 2 Models of Computation Our model is closely related to that of Henzinger, Raghavan, and Rajagopalan [HRR98]. We also describe a related sketch model that has been used, e.g., in [BCFM00]. 2.1 The Streaming Model As in [HRR98], a data stream is a sequence of data items # 1 ....
....0 # PAS(log(M) log(n) log(1 #) # 2 ) 2. F #=0 0 # PAS(log(M) log(n) log(1 #) # 2 ) 3. For all # # (0, 1) all fixed # 1, # 1 4, and M 1 # 2, and, for any f = o(n) F # 0 ## PAS(f(n) 4. 3 Approximating the $L_2$ Difference and the Second Frequency Moment In [AMS99] and [AGMS99], the authors consider the following problem. The input is a sequence of elements from [n] 0, n 1 . An element $i \in [n]$ may occur many times. Each occurrence of i is marked as a positive or negative occurrence; we let $a_i$ denote the number of times i occurs positively less ....
[Article contains additional citation context not shown here]
N. Alon, P. Gibbons, Y. Matias, and M. Szegedy. Tracking Join and Self-Join Sizes in Limited Storage. In Proc. of the 18th Symp. on Principles of Database Systems, ACM Press, New York, pages 10--20, 1999.
....moderate in size, but the function itself is computationally difficult. We also consider this case and provide a private approximation to the permanent and related #P-hard problems. Related results. There are several very efficient algorithms for approximating the $L_p$ or Hamming distance (cf. [28, 2, 15, 21]). However, these results do not directly translate into communication-efficient private approximation protocols, as is discussed further in Section 3. Since the presentation of an early unpublished version of our work [13], Halevi et al. [19] investigate private approximations of NP-hard ....
N. Alon, P. Gibbons, Y. Matias, and M. Szegedy, Tracking join and self-join sizes in limited storage. In 18th PODS, 10--20, New York, 1999.
.... of $(c_i - d_i)$ where $c_i$'s and $d_i$'s appear any which way seems to be difficult (and likewise for more complex linear projections like wavelet coefficients). In the unordered cash register model, even keeping track of the highest B $a(i)$'s is difficult in general; this is the top B queries in [11, 4, 2]. We are able to formalize all these intuitions in a rigorous mathematical framework and prove that computing the highest B term approximation for a signal in any of these data streaming models is difficult, i.e., would require storing too much data, nearly equal to the size of the signal, and even ....
N. Alon, P. Gibbons, Y. Matias and M. Szegedy. Tracking join and self-join sizes in limited storage. In ACM Symposium on Principles of Database Systems (PODS), 1999.
....only moderate in size, but the function itself is computationally difficult. We also consider this case and provide a private approximation to the permanent and related #P-hard problems. Related results. There are several very efficient algorithms for approximating the $L_p$ or Hamming distance (cf. [28, 2, 15, 21]). However, these results do not directly translate into communication-efficient private approximation protocols, as is discussed further in Section 3. Since the presentation of an early unpublished version of our work [13], Halevi et al. [19] investigate private approximations of NP-hard ....
N. Alon, P. Gibbons, Y. Matias, and M. Szegedy, Tracking join and self-join sizes in limited storage. In 18th PODS, 10--20, New York, 1999.
....we give some alternate protocols for our results, including our private approximation of the $L_2$ norm in the alternate off-line communication model. In Appendix C, we give proofs of lemmas and theorems not given in the main body. Related results. Several existing approximation algorithms (cf. [AGMS99, FKSV99, FS00, KN97, Ind00]) for the $L_p$ or Hamming distance are efficient even for massive data sets. These algorithms all use correlated randomness between players to reduce the communication required (more details are given in Section 3.1). However, these results do not directly translate into communication-efficient ....
....in which the inputs are massive. In the setting of massive inputs, we will consider sublinear approximations to functions whose exact computation requires at least linear and at most polynomial resources. We consider several resources, all of which have been considered previously, for example by [AGMS99,FKSV99,HRR98,KN97]. We seek protocols in which the total communication and the number of rounds are small. The parties should be able to compute their protocol responses quickly, using little storage space. Ideally, we desire that only one round of the protocol need involve raw input and that each party can compute ....
[Article contains additional citation context not shown here]
N. Alon, P. Gibbons, Y. Matias, and M. Szegedy, Tracking Join and Self-Join Sizes in Limited Storage. In Proceedings of the 18th Symposium on Principles of Database Systems (PODS), pp. 10--20, ACM Press, New York, 1999.
....model of computation and complexity measures for streaming and sketching algorithms. In Section 3, we present our main technical results. Section 4 explains the relationship of our algorithm to other recent work, including that of Broder et al. [BCFM98] on sketching and that of Alon et al. [AMS96, AGMS99] on frequency moments. 2 Models of Computation Our model is closely related to that of Henzinger, Raghavan, and Rajagopalan [HRR98]. We also describe a related sketch model that has been used, e.g., in [BCFM98]. 2.1 The Streaming Model As in [HRR98], a data stream is a sequence of data items ....
....and there may be many items of type i of each sign. Denote by $a_i$ the number of positive occurrences of i and by $b_i$ the number of negative occurrences of i, and let $F_k$ denote $\sum_i |a_i - b_i|^k$. We have obtained the following corollary. It was obtained independently by Alon et al. [AGMS99]. Corollary 18 F 2 2 PASST( log n log m) log(1=ffi) ffl 2 ; log(n) log log(n) log(1=ffi) ffl 2 ) Here $F_2 = \sum_i (a_i - b_i)^2$, where $a_i$ is the number of positive occurrences of i in the input and $b_i$ is the number of negative occurrences of i in the input. Proof. [sketch] ....
N. Alon, P. Gibbons, Y. Matias, and M. Szegedy. Tracking Join and Self-Join Sizes in Limited Storage. In Proc. of the 18th Symp. on Principles of Database Systems, ACM Press, New York, pages 10--20, 1999.
....data set. To be of use in operations, the streams must be analyzed locally and their synopses sent to a central operations facility. The enormous scale, distributed nature, and one-pass processing requirement on the data sets of interest must be addressed with new algorithmic techniques. In [AMS96, KOR98, AGMS99, FKSV99], the authors presented a new technique: a space-efficient, one-pass algorithm for approximating the $L_1$ difference $\sum_i |a_i - b_i|$ or $L_2$ difference $\left(\sum_i |a_i - b_i|^2\right)^{1/2}$ between two functions, when the function values $a_i$ and $b_i$ are given as data streams, ....
....with Previous Work We give an approximation algorithm for, among other cases, p = 1 and p = 2. The p = 1 case was first solved in [FKSV99] using different techniques. Our algorithm is less efficient in time and space, though by no more than a power. The case p = 2 was first solved in [AMS96, AGMS99], and it is easily seen that our algorithm for the case p = 2 coincides with the algorithm of [AMS96, AGMS99]. Our algorithm is similar to [AMS96, FKSV99] at the top level, using the strategy proposed by [AMS96]. 4.2 Random Self Reducibility Our proof technique can be regarded as exploitation of ....
[Article contains additional citation context not shown here]
N. Alon, P. Gibbons, Y. Matias, and M. Szegedy. Tracking Join and Self-Join Sizes in Limited Storage. In Proc. of the 18th Symp. on Principles of Database Systems, ACM Press, New York, pages 10--20, 1999.
....and there may be many items of type i of each sign. Denote by $a_i$ the number of positive occurrences of i and by $b_i$ the number of negative occurrences of i, and let $F_k$ denote $\sum_i |a_i - b_i|^k$. We have obtained the following corollary. It was obtained independently by Alon et al. [2]. Corollary 16 F 2 2 PASST( log n log m) log(1=ffl) 2 ; log(n) log log(n) log(1=ffl) 2 ) Here $F_2 = \sum_i (a_i - b_i)^2$, where $a_i$ is the number of positive occurrences of i in the input and $b_i$ is the number of negative occurrences of i in the input. Proof. [sketch] With k; ....
N. Alon, P. Gibbons, Y. Matias, and M. Szegedy. Tracking Join and Self-Join Sizes in Limited Storage. In Proc. of the 18th Symp. on Principles of Database Systems, ACM Press, New York, pages 10--20, 1999.
No context found.
N. Alon, P. B. Gibbons, Y. Matias, and M. Szegedy. "Tracking Join and Self-Join Sizes in Limited Storage". ACM PODS, 1999.
No context found.
N. Alon, P. Gibbons, Y. Matias, and M. Szegedy. Tracking join and self-join sizes in limited storage. In Proc. ACM PODS Conf., pages 10--20, 1999.
No context found.
N. Alon, P. Gibbons, Y. Matias and M. Szegedy. Tracking join and self-join sizes in limited storage. ACM PODS, 1999, 10--20.
No context found.
N. Alon, P. Gibbons, Y. Matias, and M. Szegedy. Tracking join and self-join sizes in limited storage. In Proceedings of the Eighteenth ACM Symposium on Principles of Database Systems (PODS '99), pages 10--20, 1999.
No context found.
N. Alon, P. Gibbons, Y. Matias, and M. Szegedy, "Tracking join and self-join sizes in limited storage," in PODS, 1999, pp. 10--20.
No context found.
N. Alon, P. Gibbons, Y. Matias, and M. Szegedy. Tracking join and self-join sizes in limited storage. In Proc. of the 1999.
No context found.
N. Alon, P. Gibbons, Y. Matias, and M. Szegedy. Tracking join and self-join sizes in limited storage. In Proceedings of the Eighteenth ACM Symposium on Principles of Database Systems (PODS '99), pages 10--20, 1999.
No context found.
N. Alon, P. B. Gibbons, Y. Matias, and M. Szegedy. Tracking join and self-join sizes in limited storage. In Proc. PODS, pages 10--20, 1999.
No context found.
N. Alon, P. B. Gibbons, Y. Matias, and M. Szegedy. Tracking join and self-join sizes in limited storage. In Proceedings of the 18th ACM Principles of Database Systems, pages 10-20, 1999.
No context found.
N. Alon, P. Gibbons, Y. Matias, and M. Szegedy. Tracking join and self-join sizes in limited storage. JCSS, 64:719--747, 2002.
No context found.
N. Alon, P. Gibbons, Y. Matias and M. Szegedy. Tracking join and self-join sizes in limited storage. ACM PODS, 1999, 10--20.
No context found.
N. Alon, P. Gibbons, Y. Matias, and M. Szegedy, "Tracking Join and Self-Join Sizes in Limited Storage," Proc. ACM PODS, pp. 10--20, 1999.
No context found.
N. Alon, P. B. Gibbons, Y. Matias, and M. Szegedy, Tracking Join and Self-Join Size in Limited Storage, ACM PODS, 1999.