Mergeable Summaries

by Pankaj K. Agarwal, Jeff M. Phillips, Graham Cormode, Zhewei Wei, Zengfeng Huang, Ke Yi
Citations:22 - 7 self

BibTeX

@MISC{Agarwal_mergeablesummaries,
    author = {Pankaj K. Agarwal and Jeff M. Phillips and Graham Cormode and Zhewei Wei and Zengfeng Huang and Ke Yi},
    title = {Mergeable Summaries},
    year = {}
}

Abstract

We study the mergeability of data summaries. Informally speaking, mergeability requires that, given two summaries on two data sets, there is a way to merge the two summaries into a single summary on the union of the two data sets, while preserving the error and size guarantees. This property means that the summaries can be merged in a way like other algebraic operators such as sum and max, which is especially useful for computing summaries on massive distributed data. Several data summaries are trivially mergeable by construction, most notably all the sketches that are linear functions of the data sets. But some other fundamental ones, like those for heavy hitters and quantiles, are not (known to be) mergeable. In this paper, we demonstrate that these summaries are indeed mergeable or can be made mergeable after appropriate modifications. Specifically, we show that for ε-approximate heavy hitters, there is a deterministic mergeable summary of size O(1/ε); for ε-approximate quantiles, there is a deterministic summary of size O((1/ε) log(εn)) that has a restricted form of mergeability, and a randomized one of size O((1/ε) log^{3/2}(1/ε)) with full mergeability. We also extend our results to geometric summaries such as ε-approximations and ε-kernels. We also achieve two results of independent interest: (1) we provide the best known randomized streaming bound for ε-approximate quantiles that depends only on ε, of size O((1/ε) log^{3/2}(1/ε)), and (2) we demonstrate that the MG and the SpaceSaving summaries for heavy hitters are isomorphic. Supported by NSF under grants CNS-05-40347, IIS-07-
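
To make the heavy-hitters claim concrete, here is a minimal Python sketch of merging two MG (Misra-Gries) summaries. The abstract does not spell out the merge procedure, so the steps below are an assumption based on a common formulation: add the counters of both summaries, and if more than k remain, subtract the (k+1)-th largest value from every counter and drop those that are no longer positive. The function names `mg_update`, `mg_build`, and `mg_merge` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def mg_update(summary, item, k):
    """Process one item with a Misra-Gries summary of at most k counters."""
    if item in summary or len(summary) < k:
        summary[item] = summary.get(item, 0) + 1
    else:
        # Summary is full and the item is untracked: decrement every counter
        # and discard those that reach zero.
        for key in list(summary):
            summary[key] -= 1
            if summary[key] == 0:
                del summary[key]


def mg_build(stream, k):
    """Build a summary of at most k counters from an iterable of items."""
    summary = {}
    for item in stream:
        mg_update(summary, item, k)
    return summary


def mg_merge(a, b, k):
    """Merge two summaries into one with at most k counters (assumed procedure)."""
    combined = Counter(a)
    combined.update(b)  # adds counts key by key
    if len(combined) <= k:
        return dict(combined)
    # Subtract the (k+1)-th largest counter value from all counters and keep
    # only the strictly positive ones, so at most k counters survive.
    threshold = sorted(combined.values(), reverse=True)[k]
    return {key: c - threshold for key, c in combined.items() if c > threshold}


if __name__ == "__main__":
    left = mg_build(["a"] * 50 + ["b"] * 30 + [f"x{i}" for i in range(20)], k=4)
    right = mg_build(["a"] * 40 + ["c"] * 35 + [f"y{i}" for i in range(25)], k=4)
    print(mg_merge(left, right, k=4))
```

With k counters, each estimate in the merged summary undercounts the true frequency by at most n/(k+1), where n is the combined number of items, so choosing k on the order of 1/ε corresponds to the O(1/ε) size stated in the abstract.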

Keyphrases

data set    mergeable summary    approximate quantiles    heavy hitter    full merge ability    size guarantee    independent interest    linear function    several data summary    approximate heavy hitter    algebraic operator    grant cns-05-40347    deterministic mergeable summary    deterministic summary    geometric summary    data summary    fundamental one    appropriate modification    single summary    spacesaving summary   
