R. D. Cameron. Source encoding using syntactic information source models. IEEE Transactions on Information Theory, 34(4):843-850, July 1988.
....a generic program, and not as the latest development in XML compression. Furthermore, we have not been able to obtain the executables or the source code of most of the existing XML compressors. Existing XML compressors. Structure-specific compression methods give much better compression results [4, 15, 14, 46] than conventional compression methods such as the Unix compress utility [52]. There exist many XML compressors; we know of XMLZip [12], XMill [36], ICT's XML-Xpress [25], Millau [16], XMLPPM [6], XGrind [48], and lossy XML compression [5]. We will not perform an exhaustive comparison between our ....
Robert D. Cameron. Source encoding using syntactic information source models. IEEE Transactions on Information Theory, 34(4):843--850, 1988.
.... code size while leaving the code directly executable [12], (b) schemes that compress object code by exploiting certain statistical properties of the underlying instruction format [14, 18, 25, 31], and (c) schemes that compress the abstract syntax tree (AST) of a program by using either statistical [7, 13] or dictionary-based approaches [17]. Our approach falls into the last category. The source code (modulo comments, layout, and names of internal identifiers) can easily be regenerated from an AST. Since the AST is composed according to a given abstract grammar (AG), we are using domain knowledge ....
....to the number of given alternatives. We want to use as few bits as possible for encoding the choice c. The two options are to use Huffman coding or arithmetic coding. Using Huffman code as discussed in Stone [35] is very fast, but is much less flexible compared to arithmetic coding. Cameron [7] shows that arithmetic coding is more appropriate for good compression results. An arithmetic coder is the best means to encode a number of choices if each alternative $i \in \{1, 2, \ldots, n\}$ has a certain probability $p_i$, where $\sum_{i=1}^{n} p_i = 1$ and $n$ is given by the kind of choice node. The ....
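The trade-off described in the excerpt above can be illustrated with a small sketch. It compares the average number of bits per choice under a Huffman code (whole bits per alternative) with the entropy bound that an arithmetic coder can approach. The function names and the example probabilities are assumptions made for illustration; they do not come from the cited papers.

```python
import heapq
from math import log2

def entropy_bits(probs):
    """Average bits per choice that an ideal arithmetic coder approaches."""
    return -sum(p * log2(p) for p in probs if p > 0)

def huffman_lengths(probs):
    """Return the Huffman code length (in bits) of each alternative."""
    if len(probs) == 1:
        return [1]
    # Heap entries: (probability, tie-breaker, alternatives in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, a = heapq.heappop(heap)
        p2, _, b = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every alternative inside them.
        for i in a + b:
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, counter, a + b))
        counter += 1
    return lengths

def huffman_bits(probs):
    """Expected bits per choice under the Huffman code."""
    return sum(p * n for p, n in zip(probs, huffman_lengths(probs)))

# Hypothetical choice node with three alternatives, one of them dominant.
probs = [0.9, 0.05, 0.05]
print(huffman_lengths(probs), huffman_bits(probs), entropy_bits(probs))
```

With a skewed distribution like this one, the Huffman code still spends at least one whole bit on the dominant alternative, whereas the entropy bound is well below one bit per choice.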
R. D. Cameron. Source encoding using syntactic information source models. IEEE Transactions on Information Theory, 34(4):843--850, July 1988.
....it should also be easy to learn and accessible to a wide variety of potential designers of computer languages. 4 Related Work. The initial research relevant to AST compression was conducted in the 1980s and focused on the reduction of storage requirements for Pascal source files. Cameron [5] was the first to combine tree encoding and arithmetic coding. He assigns fixed probabilities to alternatives appearing in the grammar. Tarhio [28, 29] uses PPM to drive the arithmetic coder in a fashion similar to ours, and Cheney [6] suggests similar ideas for term compression. Franz [11] was the ....
Cameron, R. D., Source encoding using syntactic information source models, IEEE Transactions on Information Theory 34 (1988), pp. 843-850.
.... executable [DEM99], (2) schemes that compress object code by exploiting certain statistical properties of the underlying instruction format [EEF 97, Fra99, Luc00, Pug99], and (3) schemes that compress the abstract syntax tree (AST) of a program by using either dictionary-based [FK97] or statistical [Cam88, ECM98] approaches. Our approach falls into the last category, or more precisely, we compress the AST of a program using novel statistical approaches. The source code (modulo comments, layout, and names of internal identifiers) can easily be regenerated from an AST. Since the AST is composed according ....
....equal to the number of given alternatives. We want to use as few bits as possible for encoding the choice c. The two options are to use Huffman coding or arithmetic coding. Using Huffman code as discussed in Stone [Sto86] is very fast, but is much less flexible compared to arithmetic coding. Cameron [Cam88] shows that arithmetic coding is more appropriate for good compression results. An arithmetic coder [WNC87] is the best means to encode a number of choices if each alternative $i \in \{1, 2, \ldots, n\}$ has a certain probability $p_i$, where $\sum_{i=1}^{n} p_i = 1$ and $n$ is given by the kind of choice node. The ....
Robert D. Cameron. Source encoding using syntactic information source models. IEEE Transactions on Information Theory, 34(4):843--850, July 1988.
....annotation, and the final column the delta that is added (in bytes) when annotations are incorporated into our encoding. 8 Related Work. The initial research on syntax-directed compression was conducted in the 1980s primarily to reduce the storage requirements for source text files. Cameron [5] introduced a combination of arithmetic coding with an encoding scheme similar to ours. And more recently Tarhio [24] suggests the application of PPM variants to the compression of parse trees. For a more detailed bibliography on compressing ASTs see [23]. None of these techniques attempt to ....
R. D. Cameron. Source encoding using syntactic information source models. IEEE Transactions on Information Theory, 34(4):843--850, July 1988.
....section 6. This reduces the size of Pascal source to at least 44% of its original size. Katajainen et al. [32] achieve similar results with automatically generated encoders and decoders. Al Hussaini [3] implemented another compression system based on probabilistic grammars and LR parsing. Cameron [9] introduces a combination of arithmetic coding with the encoding scheme from section 6. He assigns fixed probabilities to alternatives appearing in the grammar and uses these probabilities to arithmetically encode the pre-order representation of ASTs. Furthermore, he uses different pools of strings ....
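The idea summarized in the excerpt above, arithmetically encoding the pre-order sequence of production choices under fixed probabilities, can be sketched as follows. This is a minimal illustration and not Cameron's implementation: it uses exact fractions instead of a practical finite-precision coder, and the one-nonterminal grammar, its probabilities, and the names `GRAMMAR`, `ARITY`, `encode`, and `decode` are all hypothetical.

```python
from fractions import Fraction

# Hypothetical grammar with one nonterminal: each alternative has a fixed
# probability, and ARITY gives the number of subtrees it expands into.
GRAMMAR = [("num", Fraction(1, 2)), ("add", Fraction(1, 4)), ("neg", Fraction(1, 4))]
ARITY = {"num": 0, "add": 2, "neg": 1}

def preorder(tree):
    """Yield the alternative chosen at each node, parent before children."""
    yield tree[0]
    for child in tree[1:]:
        yield from preorder(child)

def encode(tree):
    """Narrow the interval [low, low + width) once per production choice."""
    low, width = Fraction(0), Fraction(1)
    for name in preorder(tree):
        cum = Fraction(0)
        for alt, p in GRAMMAR:
            if alt == name:
                low += width * cum
                width *= p
                break
            cum += p
    return low, width

def decode(value):
    """Rebuild the tree from any number inside the final interval."""
    def node(v):
        cum = Fraction(0)
        for alt, p in GRAMMAR:
            if cum <= v < cum + p:
                v = (v - cum) / p  # rescale the remainder to [0, 1)
                children = []
                for _ in range(ARITY[alt]):
                    child, v = node(v)
                    children.append(child)
                return (alt, *children), v
            cum += p
        raise ValueError("value outside [0, 1)")
    tree, _ = node(Fraction(value))
    return tree

tree = ("add", ("num",), ("neg", ("num",)))
low, width = encode(tree)
assert decode(low) == tree
```

The final interval width is the product of the probabilities of all choices made, so roughly -log2(width) bits suffice to name a number inside it. Because the grammar fixes how many children each alternative has, the decoder knows when the tree is complete without an explicit end marker.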
R. D. Cameron. Source encoding using syntactic information source models. IEEE Transactions on Information Theory, 34(4):843-850, July 1988.
....networking environment. Past annotation frameworks have reported file size increases ranging from 7% to 97% [22, 18, 16]. 8 Related Work. The initial research on syntax-directed compression was conducted in the 1980s primarily to reduce the storage requirements for source text files. Cameron [7] introduced a combination of arithmetic coding with an encoding scheme similar to ours. And more recently Tarhio [26] suggests the application of PPM variants to the compression of parse trees. For a more detailed bibliography on compressing ASTs see [25]. None of these techniques attempt to ....
R. D. Cameron. Source encoding using syntactic information source models. IEEE Transactions on Information Theory, 34(4):843-850, July 1988.
....combine that combines a shape and some contents into a datatype value. These programs are polytypic [15] programs: programs that work uniformly for large classes of datatypes. The construction proves that the two functions are each other's inverses. Note that shapes are at the heart of Jay's [14] theory of polytypism, but here we only use separate and combine as examples of simple data conversion programs. 1.1.2 Traversals. When working with structured data, one often performs operations on all elements in a structure. A traversal is an operation on a structured value that walks ....
....are databases, HTML files, and JavaScript programs and it pays to compress these structured files to obtain faster transmission or fewer CDs. Structure-specific compression methods give much better compression results than conventional compression methods such as the Unix compress utility [2, 4]. Structured compression is also used in heap compression and binary I/O [23]. The idea of designing structure-specific compression programs has been around since the beginning of the 1980s, but for many years only example instantiations appeared in the literature. This paper describes the ....
Robert D. Cameron. Source encoding using syntactic information source models. IEEE Transactions on Information Theory, 34(4):843-850, 1988.
.... or the degree of ambiguity of a context-free grammar [5]. For recent results concerning random generation of words in an ambiguous context-free grammar, see [9]. Applications of random generation of words in computational biology are mentioned in [3]. For compression of program files, see e.g. [1, 7]. Parallel algorithms for ranking context-free languages are studied in [6, 8]. 2. Preliminaries. If not otherwise stated we follow the notations and definitions of [4]. Let $G = (V, \Sigma, P, S)$ be a context-free grammar (hereafter simply a grammar) whose productions are uniquely labelled by the ....
Robert D. Cameron, Source encoding using syntactic information source models. IEEE Trans. Inf. Theor. IT-34 (1988), 843-850.
.... code size while leaving the code directly executable [12], (b) schemes that compress object code by exploiting certain statistical properties of the underlying instruction format [14, 18, 25, 31], and (c) schemes that compress the abstract syntax tree (AST) of a program by using either statistical [7, 13] or dictionary-based approaches [17]. Our approach falls into the last category. The source code (modulo comments, layout, and names of internal identifiers) can easily be regenerated from an AST. Since the AST is composed according to a given abstract grammar (AG), we are using domain knowledge ....
....equal to the number of given alternatives. We want to use as few bits as possible for encoding the choice c. The two options are to use Huffman coding or arithmetic coding. Using Huffman code as discussed in Stone [36] is very fast, but is much less flexible compared to arithmetic coding. Cameron [7] shows that arithmetic coding is more appropriate for good compression results. An arithmetic coder is the best means to encode a number of choices if each alternative $i \in \{1, 2, \ldots, n\}$ has a certain probability $p_i$, where $\sum_{i=1}^{n} p_i = 1$ and $n$ is given by the kind of choice node. The ....
R. D. Cameron. Source encoding using syntactic information source models. IEEE Transactions on Information Theory, 34(4):843--850, July 1988.
.... [DEM99], (2) schemes that compress object code by exploiting certain statistical properties of the underlying instruction format [EEF 97, Fra99, Luc00, Pug99], and (3) schemes that compress the abstract syntax tree (AST) of a program by using either dictionary-based [FK97] or statistical [Cam88, ECM98] approaches. Our approach falls into the last category, or more precisely, we compress the AST of a program using novel statistical approaches. Since the AST is composed according to a given abstract grammar (AG), we are using domain knowledge about the underlying language to achieve a ....
....is equal to the number of given alternatives. We want to use as few bits as possible for encoding the choice c. The two options are to use Huffman coding or arithmetic coding. Using Huffman code as discussed in Stone [Sto86] is very fast, but is much less flexible than arithmetic coding. Cameron [Cam88] shows that arithmetic coding is more appropriate for good compression results, and recent improved implementations [MNW98] make it also very fast. An arithmetic coder [WNC87] is a flexible means to encode a number of choices if each alternative $i \in \{1, 2, \ldots, n\}$ has a certain probability p ....
Robert D. Cameron. Source encoding using syntactic information source models. IEEE Transactions on Information Theory, 34(4):843--850, July 1988.
....compression techniques might be able to compress XML documents as well as or better than other representations. Fortuitously, XML's simple design makes testing this hypothesis easier than in previous structured data compression approaches such as syntax-based compression of program files [12, 3] or machine code compression [8, 16]. Liefke and Suciu [15] describe XMILL, an XML compressor that transforms documents to expose redundancy, then applies standard text compressors. XMILL combined with gzip compresses XML data about 10% better than gzip on equivalent non-XML forms; further ....
....compression. Kanne and Moerkotte [11] have addressed storing XML efficiently for database querying. Both grammar-conscious and grammar-inferring text compression might apply to XML compression, since XML has context-free structure containing unstructured text. Katajainen et al. [12] and Cameron [3] were the first to investigate grammar-based compression; more recently, Lake [14] combined PPM and grammar modeling. Nevill-Manning and Witten [18] and Yang and Kieffer [13] have investigated text compression using grammars learned from the text. Our model multiplexing approach resembles stream ....
R. D. Cameron. Source encoding using syntactic information source models. IEEE Trans. Inform. Theory, 34(4):843-850, July 1988.
.... [DEM99], (b) schemes that compress object code by exploiting certain statistical properties of the underlying instruction format [EEF 97, Fra99, Luc00, Pug99], and (c) schemes that compress the abstract syntax tree (AST) of a program by using either dictionary-based [FK97] or statistical [Cam88, ECM98] approaches. Our approach falls into the last category. The source code (modulo comments, layout, and names of internal identifiers) can easily be regenerated from an AST. Since the AST is composed according to a given abstract grammar (AG), we are using domain knowledge about the ....
....to the number of given alternatives. We want to use as few bits as possible for encoding the choice c. The two options are to use Huffman coding or arithmetic coding. Using Huffman code as discussed in Stone [Sto86] is very fast, but is much less flexible compared to arithmetic coding. Cameron [Cam88] shows that arithmetic coding is more appropriate for good compression results. An arithmetic coder [WNC87] is the best means to encode a number of choices if each alternative $i \in \{1, 2, \ldots, n\}$ has a certain probability $p_i$, where $\sum_{i=1}^{n} p_i = 1$ and $n$ is given by the kind of choice ....
Robert D. Cameron. Source encoding using syntactic information source models. IEEE Transactions on Information Theory, 34(4):843--850, July 1988.
.... code size while leaving the code directly executable [12], (b) schemes that compress object code by exploiting certain statistical properties of the underlying instruction format [14, 18, 25, 31], and (c) schemes that compress the abstract syntax tree (AST) of a program by using either statistical [7, 13] or dictionary-based approaches [17]. Our approach falls into the last category. The source code (modulo comments, layout, and names of internal identifiers) can easily be regenerated from an AST. Since the AST is composed according to a given abstract grammar (AG), we are using domain knowledge ....
....to the number of given alternatives. We want to use as few bits as possible for encoding the choice $c$. The two options are to use Huffman coding or arithmetic coding. Using Huffman code as discussed in Stone [35] is very fast, but is much less flexible compared to arithmetic coding. Cameron [7] shows that arithmetic coding is more appropriate for good compression results. An arithmetic coder is the best means to encode a number of choices if each alternative $i \in \{1, 2, \ldots, n\}$ has a certain probability $p_i$, where $\sum_{i=1}^{n} p_i = 1$ and $n$ is given by the kind of choice ....
R. D. Cameron. Source encoding using syntactic information source models. IEEE Transactions on Information Theory, 34(4):843--850, July 1988.
.... or the degree of ambiguity of a context-free grammar [5]. For recent results concerning random generation of words in an ambiguous context-free grammar, see [9]. Applications of random generation of words in computational biology are mentioned in [3]. For compression of program files, see e.g. [1, 7]. Parallel algorithms for ranking context-free languages are studied in [6, 8]. 2. Preliminaries. If not otherwise stated we follow the notations and definitions of [4]. Let $G = (V, \Sigma, P, S)$ be a context-free grammar (hereafter simply a grammar) whose productions are uniquely labelled by the ....
Robert D. Cameron, Source encoding using syntactic information source models. IEEE Trans. Inf. Theor. IT-34 (1988), 843-850.
....and so on. The Rush research group has a program that semi-automatically converts Tcl scripts to Rush scripts. This program is not publicly available, so we converted the corpus of test scripts to Rush syntax by hand. This manual conversion was the limiting factor on the size of the test corpus. [Cam88] and [KPT86] are the primary references for syntax-directed compression. Both of these researchers compress Pascal programs. [KPT86] constructs a parse tree for the Pascal program, removes all nodes that have only one child, and then linearizes the parse tree. All user-defined symbols are added to ....
....production that was used to expand the node. The output of the compression program is the symbol table followed by an encoding of the linearized parse tree, i.e. the symbol and production indices in the parse tree are coded with a fixed-length binary code. [KPT86] achieves 50% compression. [Cam88] constructs a parse tree for the Pascal program, performs a preorder walk through the tree and uses arithmetic coding to encode the tree during the course of the walk. For each nonterminal the compression algorithm encodes the probability of the production that was used to expand the ....
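The tree preparation attributed to [KPT86] in the excerpts above (drop nodes that have only one child, then linearize) can be sketched as follows. This is an illustrative assumption and not the published algorithm: the excerpts do not state the linearization order, so pre-order is assumed here, node labels stand in for the symbol and production indices, and the names `collapse`, `linearize`, and the toy parse tree are hypothetical.

```python
def collapse(tree):
    """Replace every node that has exactly one child by that child."""
    label, children = tree
    children = [collapse(c) for c in children]
    if len(children) == 1:
        return children[0]
    return (label, children)

def linearize(tree):
    """Flatten a tree into a list of labels (pre-order is an assumption)."""
    label, children = tree
    out = [label]
    for child in children:
        out.extend(linearize(child))
    return out

# Hypothetical parse tree for "id + id" with unary chains term -> factor -> id.
parse_tree = (
    "expr",
    [
        ("term", [("factor", [("id", [])])]),
        ("+", []),
        ("term", [("factor", [("id", [])])]),
    ],
)

print(linearize(collapse(parse_tree)))  # ['expr', 'id', '+', 'id']
```

Collapsing the unary chains shortens the sequence that has to be coded, which matters when each entry costs a fixed number of bits.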
Robert D. Cameron. Source encoding using syntactic information source models. IEEE Transactions on Information Theory, 34(4):843--850, July 1988.
....examples are databases, HTML files, and JavaScript programs and it pays to compress these structured files to obtain faster transmission or fewer CDs. Structure-specific compression methods give much better compression results than conventional compression methods such as the Unix compress utility [2, 4]. Structured compression is also used in heap compression and binary I/O [23]. The idea of designing structure-specific compression programs has been around since the beginning of the 1980s, but, as far as we are aware, there is no generic description of the program, only example instantiations ....
Robert D. Cameron. Source encoding using syntactic information source models. IEEE Transactions on Information Theory, 34(4):843-850, 1988.