Buchsbaum, A.L., D. Caldwell, K.W. Church, G.S. Fowler, S. Muthukrishnan (2000), "Engineering the compression of massive tables: an experimental approach", in Proceedings of the 11th ACM-SIAM Symposium on Discrete Algorithms, pp. 175-184, Philadelphia, USA.


This paper is cited in the following contexts:
Network Data Mining and Analysis: The NEMESIS Project - Minos Garofalakis

.... hundred bytes of data that capture information on various (categorical and numerical) attributes of each call; this includes network-level information (e.g. endpoint exchanges), time stamp information (e.g. call start and end times), and billing information (e.g. applied tariffs), among others [4]. These CDRs are stored in tables that can grow to truly massive sizes, in the order of several Terabytes per year. A key observation is that these massive collections of network traffic and CDR data typically hide invaluable knowledge that enables several key network management tasks, ....

.... for Network Data Tables Data compression issues arise naturally in applications dealing with massive data sets, and effective solutions are crucial for optimizing the usage of critical system resources like storage space and I/O bandwidth, as well as network bandwidth (for transferring the data) [4, 7]. Several statistical and dictionary-based compression methods have been proposed for text corpora and multimedia data, some of which (e.g. Lempel-Ziv or Huffman) yield provably optimal asymptotic performance in terms of certain ergodic properties of the data source. These methods, however, fail ....

A.L. Buchsbaum, D.F. Caldwell, K. Church, G.S. Fowler, and S. Muthukrishnan. "Engineering the Compression of Massive Tables: An Experimental Approach". In Proc. of the 11th Annual ACM-SIAM Symp. on Discrete Algorithms, January 2000.


Data Mining Meets Network Management: The NEMESIS Project - Garofalakis, Rastogi (2001)   (1 citation)

.... hundred bytes of data that capture information on various (categorical and numerical) attributes of each call; this includes network-level information (e.g. endpoint exchanges), time stamp information (e.g. call start and end times), and billing information (e.g. applied tariffs), among others [5]. These CDRs are stored in tables that can grow to truly massive sizes, in the order of several Terabytes per year. A key observation is that these massive collections of network traffic and CDR data typically hide invaluable knowledge that enables several key network management tasks, ....

.... NETWORK DATA TABLES Data compression issues arise naturally in applications dealing with massive data sets, and effective solutions are crucial for optimizing the usage of critical system resources like storage space and I/O bandwidth, as well as network bandwidth (for transferring the data) [5, 10]. Several statistical and dictionary-based compression methods have been proposed for text corpora and multimedia data, some of which (e.g. Lempel-Ziv or Huffman) yield provably optimal asymptotic performance in terms of certain ergodic properties of the data source. These methods, however, fail ....

[Article contains additional citation context not shown here]

Adam L. Buchsbaum, Donald F. Caldwell, Kenneth Church, Glenn S. Fowler, and S. Muthukrishnan. "Engineering the Compression of Massive Tables: An Experimental Approach". In Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms, San Francisco, California, January 2000.


Towards Compressing Web Graphs - Adler, Mitzenmacher (2000)   (25 citations)

....distribution of indegrees in GW is Zipfian with ff 3, then = O(n) Branchings generally refer to the (equivalent) maximum weight problem. They are sometimes also referred to as arborescences. 5 We point out that we have found a similar idea to the algorithm Find Reference alluded to in [5], in the context of compressing tables of data, where one column can be used to compress another. The authors mention that the problem can be reduced to a minimum spanning tree problem (in their case, edges are undirected) 3.1 Additional improvements and related problems In practice, after we ....

....if we allow depth at most two, then the problem of finding the optimal directed minimum spanning tree is equivalent to the facility location problem. Indeed, it is this connection to the facility location problem that was used in the work on compressing tables of data mentioned earlier [5]. In the terminology of [15], each page is a possible facility; a page that is not compressed by a reference corresponds to an opened facility; and a page that is compressed using a reference corresponds to a location receiving shipment from a facility corresponding to the reference page. We ....

A. L. Buchsbaum, D. F. Caldwell, K. W. Church, G. S. Fowler, and S. Muthukrishnan. Engineering the compression of massive tables: an experimental approach. In Proceedings of 11th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 175-184, 2000.


Towards Compressing Web Graphs - Adler, Mitzenmacher (2000)   (25 citations)

....we examine here. Directed minimum spanning trees have been used previously in other scenarios to provide good compression. Tate [18] uses such trees to obtain a reordering of the bands of a multispectral image that allows for the optimal compression. More recently, a similar idea is alluded to in [6], in the context of compressing tables of data. There, the authors use one column to compress another, and mention that the problem can be reduced to a minimum spanning tree problem, although in their case, edges are undirected. 1.2 Framework When we discuss compressing Web-like graphs, there ....

....NP-hard; for example, if we allow depth at most two, then the problem of finding the optimal directed minimum spanning tree is equivalent to the facility location problem. This connection to the facility location problem was a major point in the work on compressing tables of data mentioned earlier [6]. In practice, we expect that using the FIND REFERENCE algorithm to initially find a directed tree and then chopping the tree to maintain a depth bound (by changing some nodes to be compressed without a reference and thus linking them to the root r) will be suitable. 4 Hardness results Since ....

A. L. Buchsbaum, D. F. Caldwell, K. W. Church, G. S. Fowler, and S. Muthukrishnan. Engineering the compression of massive tables: an experimental approach. In Proceedings of 11th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 175-184, 2000.
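
The depth-bounding step described in the second excerpt above (chopping a reference tree so that no node lies deeper than a bound, by re-linking offending nodes to the root r) can be sketched roughly as follows. This is only an illustrative sketch, not the procedure from the cited papers: the function name `chop_to_depth`, the parent-map representation, and the breadth-first greedy rule are all assumptions, and no attempt is made to choose which nodes to re-link so as to minimize compression cost.

```python
from collections import defaultdict, deque


def chop_to_depth(parent, root, max_depth):
    """Return a new parent map in which no node is deeper than max_depth.

    `parent` maps every non-root node to its reference node (hypothetical
    representation); a node compressed without a reference points at `root`.
    The input is assumed to be a tree in which every node is reachable
    from `root`.
    """
    if max_depth < 1:
        raise ValueError("max_depth must be at least 1")

    # Build child lists so the tree can be walked from the root downwards.
    children = defaultdict(list)
    for node, ref in parent.items():
        children[ref].append(node)

    new_parent = {}
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        for child in children[node]:
            if depth + 1 > max_depth:
                # Too deep: compress this node without a reference,
                # i.e. link it directly to the root.
                new_parent[child] = root
                queue.append((child, 1))
            else:
                new_parent[child] = node
                queue.append((child, depth + 1))
    return new_parent


# A chain r <- a <- b <- c <- d, chopped to depth 2.
chain = {"a": "r", "b": "a", "c": "b", "d": "c"}
chopped = chop_to_depth(chain, "r", 2)
```

In this toy chain, node c would sit at depth 3, so it is re-linked to r, and d then sits at depth 2 below c. A real implementation would presumably weigh the cost of losing each reference before deciding where to cut.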


Improving Table Compression with Combinatorial Optimization - Buchsbaum, Fowler, Giancarlo (2002)   (1 citation)  Self-citation (Buchsbaum Fowler)

No context found.

A. L. Buchsbaum, D. F. Caldwell, K. W. Church, G. S. Fowler, and S. Muthukrishnan. Engineering the compression of massive tables: An experimental approach. In Proc. 11th ACM-SIAM SODA, pages 175--84, 2000.


Design of an End-to-End Method to Extract Information From.. - Ana Costa Silva

No context found.

Buchsbaum, A.L., D. Caldwell, K.W. Church, G.S. Fowler, S. Muthukrishnan (2000), "Engineering the compression of massive tables: an experimental approach", in Proceedings of the 11th ACM-SIAM Symposium on Discrete Algorithms, pp. 175-184, Philadelphia, USA.


Compressing Large Boolean Matrices Using Reordering.. - Johnson, Krishnan.. (2004)   (1 citation)

No context found.

Buchsbaum, A. L., Caldwell, D. F., Church, K. W., Fowler, G. S., and Muthukrishnan, S. Engineering the compression of massive tables: an experimental approach. In Proc. 10th ACM-SIAM Symp. Discrete Algorithms (2000), Society for Industrial and Applied Mathematics, pp. 175--184.


Using Constrained Models for Guaranteed-Error Semantic.. - Babu, Garofalakis

No context found.

A.L. Buchsbaum, D.F. Caldwell, K. Church, G.S. Fowler, and S. Muthukrishnan. "Engineering the Compression of Massive Tables: An Experimental Approach". In Proc. of the 11th Annual ACM-SIAM Symp. on Discrete Algorithms, January 2000.

