Approximation of smallest linear tree grammar

by Artur Jeż, Markus Lohrey
Venue:CoRR

Results 1 - 4 of 4

CONTEXT UNIFICATION IS IN PSPACE

by Artur Jez, 2013
Cited by 3 (0 self)
Abstract not found

A really simple approximation of smallest grammar

by Artur Jeż - In Proc. 25th Annual Symposium on Combinatorial Pattern Matching (CPM), LNCS 8486, 2014
Cited by 2 (0 self)
In this paper we present a really simple linear-time algorithm constructing a context-free grammar of size O(g log(N/g)) for the input string, where N is the size of the input string and g is the size of the optimal grammar generating this string. The algorithm works for alphabets of arbitrary size, but the running time is linear only under the assumption that the alphabet Σ of the input string can be identified with numbers from {1, ..., N^c} for some constant c. Algorithms with such an approximation guarantee and running time are known; however, all of them are non-trivial and their analyses are involved. The algorithm presented here computes the LZ77 factorisation and transforms it in phases into a grammar. In each phase it maintains an LZ77-like factorisation of the word with at most ℓ factors as well as O(ℓ) additional letters, where ℓ is the size of the original LZ77 factorisation. In each phase we greedily choose (by a left-to-right sweep, with the help of the factorisation) a set of pairs of consecutive letters to be replaced with new symbols, i.e. nonterminals of the constructed grammar. We choose at least 2/3 of the letters in the word, and there are O(ℓ) many different pairs among them. Hence there are O(log N) phases, each of which introduces O(ℓ) nonterminals to the grammar. A more precise analysis yields a bound of O(ℓ log(N/ℓ)). As ℓ ≤ g, this yields the desired bound O(g log(N/g)).
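The idea of building a grammar in phases, each replacing pairs of consecutive letters by fresh nonterminals, can be illustrated with a deliberately simplified sketch. It is not the algorithm of the paper: it ignores the LZ77 factorisation, pairs letters blindly from left to right, and therefore carries no guarantee on the number of distinct pairs or on the grammar size. The function names and the nonterminal naming scheme (`X0`, `X1`, ...) are assumptions made for illustration, and the input is assumed to be a non-empty string of single-character terminals.

```python
def pair_phase(word, rules, pair_names):
    """One phase: sweep left to right and replace consecutive pairs by nonterminals."""
    out = []
    i = 0
    while i < len(word):
        if i + 1 < len(word):
            pair = (word[i], word[i + 1])
            if pair not in pair_names:
                # Hypothetical naming scheme: a fresh nonterminal per distinct pair.
                name = f"X{len(rules)}"
                pair_names[pair] = name
                rules[name] = pair
            out.append(pair_names[pair])
            i += 2
        else:
            # An unpaired last letter is carried over to the next phase.
            out.append(word[i])
            i += 1
    return out


def build_grammar(text):
    """Repeat phases until one symbol remains; assumes text is non-empty."""
    rules, pair_names = {}, {}
    word = list(text)
    phases = 0
    while len(word) > 1:
        word = pair_phase(word, rules, pair_names)
        phases += 1
    return word[0], rules, phases


def expand(symbol, rules):
    """Derive the string generated by a symbol (terminals expand to themselves)."""
    if symbol not in rules:
        return symbol
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)


start, rules, phases = build_grammar("abababab")
print(phases, len(rules), expand(start, rules))
```

Because every phase here roughly halves the word, the number of phases is logarithmic in the input length. The paper's contribution lies in choosing the pairs so that only O(ℓ) distinct pairs arise per phase, which this sketch does not attempt.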

Citation Context

... analysis based on the recompression technique, which allowed avoiding the connection of SLPs and LZ77 compression. This made it possible to generalise this approach also to grammars generating trees [10]. On the downside, the analysis is quite complex. Contribution of this paper. We present a very simple algorithm together with a straightforward and natural analysis. It chooses the pairs to be replac...

Constructing Small Tree Grammars and Small Circuits for Formulas

by Danny Hucke, Markus Lohrey, Eric Noeth, 2014
Abstract. It is shown that every tree of size n over a fixed set of σ different ranked symbols can be decomposed into O(n / log_σ n) = O(n log σ / log n) many hierarchically defined pieces. Formally, such a hierarchical decomposition has the form of a straight-line linear context-free tree grammar of size O(n / log_σ n), which can be used as a compressed representation of the input tree. This generalizes an analogous result for strings. Previous grammar-based tree compressors were not analyzed for the worst-case size of the computed grammar, except for the top dag of Bille et al., for which only the weaker upper bound of O(n / log^0.19 n) for unranked and unlabelled trees has been derived. The main result is used to show that every arithmetical formula of size n, in which only m ≤ n different variables occur, can be transformed (in time O(n log n)) into an arithmetical circuit of size O(n · log m / log n) and depth O(log n). This refines a classical result of Brent, according to which an arithmetical formula of size n can be transformed into a logarithmic depth circuit of size O(n). Missing proofs can be found in the long version.

ACM Subject Classification: E.4 Data compaction and compression

Keywords and phrases: grammar-based compression, tree compression, arithmetical circuits

Introduction

Grammar-based compression has emerged as an active field in string compression during the past 20 years. The idea is to represent a given string s by a small context-free grammar that generates only s; such a grammar is also called a straight-line program, briefly SLP. For instance, the word (ab)^1024 can be represented by the SLP with the productions A_0 → ab and A_i → A_{i−1} A_{i−1} for 1 ≤ i ≤ 10 (A_10 is the start symbol). The size of this grammar is much smaller than the size (length) of the string (ab)^1024. In general, an SLP of size n (the size of an SLP is usually defined as the total length of all right-hand sides of the productions) can produce a string of length 2^Ω(n).
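The SLP example above can be checked mechanically. The following minimal sketch builds the productions A_0 → ab and A_i → A_{i−1} A_{i−1}, computes the grammar size as the total length of all right-hand sides, and expands the start symbol. The dictionary representation and the function names are assumptions chosen for illustration; they do not come from the paper.

```python
def slp_for_ab_power(k):
    """Productions A_0 -> ab and A_i -> A_{i-1} A_{i-1} for 1 <= i <= k."""
    rules = {"A0": ["a", "b"]}
    for i in range(1, k + 1):
        rules[f"A{i}"] = [f"A{i - 1}", f"A{i - 1}"]
    return rules


def slp_size(rules):
    """Size of an SLP: total length of all right-hand sides."""
    return sum(len(rhs) for rhs in rules.values())


def derived_length(rules, start):
    """Length of the generated string, computed without expanding it."""
    memo = {}

    def length(symbol):
        if symbol not in rules:  # terminal symbol
            return 1
        if symbol not in memo:
            memo[symbol] = sum(length(s) for s in rules[symbol])
        return memo[symbol]

    return length(start)


def expand(rules, start):
    """Fully expand the start symbol into the string it generates."""
    memo = {}

    def go(symbol):
        if symbol not in rules:  # terminal symbol
            return symbol
        if symbol not in memo:
            memo[symbol] = "".join(go(s) for s in rules[symbol])
        return memo[symbol]

    return go(start)


rules = slp_for_ab_power(10)
print(slp_size(rules), derived_length(rules, "A10"))
```

With k = 10 the grammar has 11 productions of total size 22, while the derived string has 2048 letters. Each further production doubles the derived length and adds only 2 to the size, which is the exponential gap referred to in the text.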
Hence, an SLP can indeed be seen as a succinct representation of the generated string. The goal of grammar-based string compression is to construct from a given input string s a small SLP that produces s. Several algorithms for this have been proposed and analyzed. Prominent grammar-based string compressors are, for instance, LZ78, RePair, and BISECTION. To evaluate the compression performance of a grammar-based compressor C, two different approaches can be found in the literature. A first approach is to analyze the size of the SLP produced by C for an input string x compared to the size of a smallest SLP for x. This leads to the approximation ratio for C, see

APPROXIMATION OF GRAMMAR-BASED COMPRESSION VIA

by Artur Jeż
Abstract not found

Citation Context

...n recent work of Lohrey and the author the algorithm presented in this paper is generalised to the case of tree grammars, yielding a first provable approximation for the smallest tree grammar problem [8]. Comparison with Sakamoto’s algorithm. The general approach is similar to Sakamoto’s method; however, the pairing of letters seems more natural in the paper presented here. Also, the construction of nont...
