T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In Proc. 6th Intl. Symp. on String Processing and Information Retrieval (SPIRE'99), pages 89-96. IEEE CS Press, 1999.


This paper is cited in the following contexts:
A Dictionary-based Compressed Pattern Matching Algorithm - Ho, Yen (2002)   (Correct)

....a key role in today's Internet applications, provided that the matching can be carried out efficiently. As a consequence, compressed pattern matching has been receiving increasing attention from both theoretical and practical viewpoints in the computer science and engineering society. See, e.g. [1, 2, 4]. In particular, a compressed pattern matching (for LZW compression) algorithm has been reported in whose asymptotic running time outperforms the method of decompressing the text followed by an ordinary pattern matching. More recent work on compressed pattern matching can be found in [5]. The aim ....

....is added with its parent pointing to the parent node. char:null index:0 parent:NULL char: a char: b Now we are in a position to describe our compressed pattern matching algorithm in detail. The basic structure of many of the compressed pattern matching algorithm found in the literature [4]. Unlike the plain text pattern matching in which the text is scanned on a character by character basis, compressed pattern matching requires processing the compressed text in a block by block fashion. As a consequence, the notion of the so-called partial match is central to many of the ....

T. Kida, Y. Shibata, M. Takeda, A. Shinohara and S. Arikawa, A Unifying Framework for Compressed Pattern Matching, 6th IEEE Int'l Symp. on String Processing and Information Retrieval, pp. 89-96, 1999.


Regular Expression Searching on Compressed Text - Navarro   (Correct)

....and a new variant proposed that was competitive and convenient for search purposes. A similar result, restricted to the LZW format, was independently found and presented by Kida et al. [14]. The same group generalized the existing algorithms and nicely unified the concepts in a general framework [12]. Recently, Navarro and Tarhio [28] presented a new, faster, algorithm based on Boyer-Moore. Approximate string matching on compressed text aims at finding the pattern where a limited number of differences between the pattern and its occurrences are permitted. The problem, advocated in 1992 [2] ....

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In Proc. 6th Intl. Symposium on String Processing and Information Retrieval (SPIRE'99), pages 89-96. IEEE CS Press, 1999.


Approximate Matching of Run-Length Compressed Strings - Mäkinen, Navarro, Ukkonen (2001)   (Correct)

....deletions and substitutions that are needed to make the two strings equal. For this distance we are interested in k |P| errors. Many studies have been made around the subject of compressed pattern matching over different compression formats, starting with the work of Amir and Benson [1], e.g. [2, 10, 17, 16]. The only works addressing the approximate variant of the problem have been [14, 19, 22] on Ziv-Lempel [27]. Our focus is approximate matching over run-length encoded strings. In run-length encoding, a string that consists of repetitions of letters is compressed by encoding each repetition as a ....

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In Proc. 6th Symposium on String Processing and Information Retrieval (SPIRE'99), pages 89-96. IEEE CS Press, 1999.


Approximation Algorithms for Grammar-Based Data Compression - Lehman (2002)   (1 citation)  (Correct)

....an important attraction of grammar-based compression relative to otherwise competitive compression schemes. For example, the best pattern matching algorithm that operates on a string compressed as a grammar is asymptotically faster than the equivalent for the well-known LZ77 compression format [14]. 1.1.4 Hierarchical Approximation Finally, work on the smallest grammar problem qualitatively extends the study of approximation algorithms. In particular, we shift from problems on flat objects (such as graphs, CNF formulas, bins and weights, etc.) to a hierarchical object, context-free ....

....byte pair encoding algorithm [13]. Most of these procedures are described and analyzed in Chapter 4. Beyond the design of new algorithms for the smallest grammar problem, there has been an effort to develop algorithms that manipulate strings while still in compressed form. For example, Kida [14] and Shibata, et al. [32] have proposed pattern matching algorithms that run in time related not to the length of the searched string, but rather to the size of the grammar representing it. The good performance of such algorithms is emerging as a significant advantage of grammar-based compression ....

Takuya Kida, Yusuke Shibata, Masayuki Takeda, Ayumi Shinohara, and Setsuo Arikawa. A unifying framework for compressed pattern matching. In International Symposium on String Processing and Information Retrieval, pages 89-96, 1999.


Approximating the Smallest Grammar: Kolmogorov.. - Charikar, Liu.. (2002)   (4 citations)  (Correct)

....This comprehensibility is an important attraction of grammar-based compression relative to otherwise competitive compression schemes. For example, the best pattern matching algorithm that operates on a string compressed as a grammar is asymptotically faster than the equivalent for LZ77 [6]. Finally, work on the smallest grammar problem extends the study of approximation algorithms to hierarchical objects, such as grammars, as opposed to flat objects, such as graphs, CNF formulas, etc. This is a significant shift, since many real-world problems have a hierarchical nature, but ....

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In SPIRE/CRIWG, pages 89-96, 1999.


Path Matching in Compressed Control Flow Traces - Zhang, Gupta (2002)   (1 citation)  (Correct)

....the commonly used compression algorithm is the SEQUITUR algorithm [4]. When the need to search for a pattern in the data stream arises, it is highly desirable to avoid uncompressing the data. Therefore researchers have been developing algorithms for pattern matching that operate on compressed data [1, 2, 5, 9, 10]. This paper addresses the problem of finding an occurrence of a path composed of a sequence of basic blocks, which is the pattern, in a compressed control flow trace. A control flow trace captures the complete execution path taken by a program on a given input in the form of the complete sequence ....

.... X_2 → a_2; ...; X_k → a_k; X_{k+1} → X_{l(1)} X_{r(1)}; X_{k+2} → X_{l(2)} X_{r(2)}; ...; X_{k+s} → X_{l(s)} X_{r(s)}, where Σ = {a_i | 1 ≤ i ≤ k}, a_i can be one of B_j (block id j), F_j (entry to function F_j) and E (a function exit point). In dictionary-based compressed matching algorithms (e.g. [1] and [2]), each non-terminal symbol in the dictionary has an associated prefix flag, a suffix flag and an internal flag (see example in Figure 2a). When combining two non-terminal symbols, their combination will decide the internal, prefix, and suffix flags of the combined node. However, this is not sufficient for path ....

[Article contains additional citation context not shown here]

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa, "A Unifying Framework for Compressed Pattern Matching," 6th International Symposium on String Processing and Information Retrieval, pages 89-96. IEEE Computer Society, 1999.


Algorithms on Strings Based on the Compressed Suffix Arrays - Sadakane   (Correct)

....databases because the purpose of a search is to obtain a part of the text containing a given pattern. The text occupies n log # bits. Indeed, the data size is always larger than that of the original text size. Though some algorithms for finding words from a compressed text have been proposed [4, 11], the algorithms have to scan the whole compressed text. As a result, their query time is proportional to the size of the compressed text and they are not applicable to huge texts. Though a search index using the suffix array of a compressed text has also been proposed [15], it is difficult to ....

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A Unifying Framework for Compressed Pattern Matching. In Proc. IEEE String Processing and Information Retrieval Symposium (SPIRE'99), pages 89--96, September 1999.


Approximate Matching of Run-Length Compressed Strings - Mäkinen, Navarro, Ukkonen   (Correct)

....in k |P| errors. Many studies have been made around the subject of compressed pattern matching over different compression formats, starting with the work of Amir and Benson [1], e.g. [2, 8, 10, 9]. The only works addressing the approximate variant of the problem have been [11, 13, 15] on Ziv-Lempel [20]. Our focus is approximate matching over run-length encoded strings. In run-length encoding a string that consists of repetitions of letters is compressed by encoding each repetition as a ....

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In Proc. SPIRE'99, pages 89-96. IEEE CS Press, 1999.


Regular Expression Searching over Ziv-Lempel Compressed Text - Navarro (2001)   (Correct)

....and a new variant proposed which was competitive and convenient for search purposes. A similar result, restricted to the LZW format, was independently found and presented by Kida et al. [14]. The same group generalized the existing algorithms and nicely unified the concepts in a general framework [12]. Recently, Navarro and Tarhio [25] presented a new, faster, algorithm based on Boyer-Moore. Approximate string matching on compressed text aims at finding the pattern where a limited number of differences between the pattern and its occurrences are permitted. The problem, advocated in 1992 [2] ....

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In Proc. SPIRE'99, pages 89--96. IEEE CS Press, 1999.


Boyer-Moore String Matching over Ziv-Lempel Compressed Text - Navarro, Tarhio (2000)   (6 citations)  (Correct)

....compressed texts (simple and extended patterns) and specialized it for the particular cases of LZ77, LZ78 and a new variant proposed which was competitive and convenient for search purposes. A similar result, restricted to the LZW format, was independently found and presented in [14]. Finally, [12] generalized the existing algorithms and nicely unified the concepts in a general framework. 3 Basic Concepts 3.1 The Ziv-Lempel Compression Formats LZ78 and LZW The general idea of Ziv-Lempel compression is to replace substrings in the text by a pointer to a previous occurrence of them. If the ....

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In Proc. 6th Intl. Symp. on String Processing and Information Retrieval (SPIRE'99), pages 89--96. IEEE CS Press, 1999.


Compressed Text Databases with Efficient Query Algorithms based.. - Sadakane (2000)   (1 citation)  (Correct)

....to obtain a part of the text containing a given pattern. The text occupies n log # bits. We assume that the base of logarithm is two. Indeed, the data size is always larger than that of the original text size. Though some algorithms for finding words from a compressed text have been proposed [4, 12], the algorithms have to scan the whole compressed text. As a result, their query time is proportional to the size of the compressed text and they are not applicable to huge texts. Though a search index using the suffix array of a compressed text has also been proposed [16], it is difficult to search ....

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A Unifying Framework for Compressed Pattern Matching. In Proc. IEEE String Processing and Information Retrieval Symposium (SPIRE'99), pages 89--96, September 1999.


Multiple Pattern Matching Algorithms on Collage System - Kida, Matsumoto, Takeda.. (2001)   Self-citation (Kida Takeda Shinohara Arikawa)   (Correct)

No context found.

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In Proc. 6th International Symp. on String Processing and Information Retrieval, pages 89--96. IEEE Computer Society, 1999.


Speeding Up Pattern Matching by Text Compression - Shibata, TakuyaKida.. (2000)   (4 citations)  Self-citation (Kida Shibata Takeda Shinohara Arikawa)   (Correct)

No context found.

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In Proc. 6th International Symp. on String Processing and Information Retrieval, pages 89--96. IEEE Computer Society, 1999.


Multiple Pattern Matching in LZW Compressed Text - Kida, Takeda, Shinohara.. (1998)   (9 citations)  Self-citation (Kida Takeda Shinohara Arikawa)   (Correct)

....LZ78 [31], LZW, and so on). Amir, Benson, and Farach [4] presented algorithms for searching an LZW compressed text for a single pattern. We presented in [18] an extension of [4] to multiple pattern searching, together with the first experimental results in this area. Moreover, we introduced in [16] a unifying framework, named collage system, which abstracts various dictionary-based methods. We showed a general pattern matching algorithm for text string described in terms of collage system. The algorithm can be applied to the problem for any compression methods, such as the Ziv-Lempel fam ....

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In 6th International Symposium on String Processing and Information Retrieval, pages 89-96. IEEE Computer Society, 1999.


Multiple Pattern Matching Algorithms on Collage System - Kida, Matsumoto, Takeda.. (2001)   Self-citation (Kida Takeda Shinohara Arikawa)   (Correct)

....for any compression method covered by the framework. 1 Introduction The compressed pattern matching problem was first defined by Amir and Benson [2] and various compressed pattern matching algorithms have been proposed depending on underlying compression methods (see survey papers [19, 23]). In [7] we introduced a collage system, which is a formal system to represent a string by a pair of dictionary D and sequence S of phrases in D. The basic operations are concatenation, truncation, and repetition. Collage systems give us a unifying framework of various dictionary-based compression ....

....are concatenation, truncation, and repetition. Collage systems give us a unifying framework of various dictionary-based compression methods, such as Lempel-Ziv family (LZ77, LZSS, LZ78, LZW), RE-PAIR [11], SEQUITUR [16], and the static dictionary-based compression method. We also proposed in [7] the simple pattern matching algorithm on collage system, which simulates the move of the Knuth-Morris-Pratt automaton [10] running on the original text, by using the functions Jump and Output. In this paper we address the multiple pattern matching problem on collage system. That is, given a set ....

[Article contains additional citation context not shown here]

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In Proc. 6th International Symp. on String Processing and Information Retrieval, pages 89-96. IEEE Computer Society, 1999.
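
Editorial illustration (not taken from any of the cited papers): the contexts above describe a collage system as a pair of a dictionary D and a sequence S of phrases in D, built from concatenation, truncation, and repetition. A minimal Python sketch of that representation follows. The tuple encoding of assignments, the operation names, and the function `expand` are all assumptions made here for illustration, and the sketch naively decompresses the text, which is precisely what the compressed matching algorithms are designed to avoid.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: each dictionary entry is (name, rule), where rule is
#   ("char", c)            a single character
#   ("concat", x, y)       phrase x followed by phrase y
#   ("repeat", x, k)       phrase x repeated k times
#   ("drop_prefix", x, k)  phrase x with its first k characters removed
#   ("drop_suffix", x, k)  phrase x with its last k characters removed
Rule = Tuple
Dictionary = List[Tuple[str, Rule]]


def expand(dictionary: Dictionary, sequence: List[str]) -> str:
    """Naively expand a collage-system-like pair (D, S) into plain text."""
    phrases: Dict[str, str] = {}
    for name, rule in dictionary:  # entries may only refer to earlier names
        op = rule[0]
        if op == "char":
            phrases[name] = rule[1]
        elif op == "concat":
            phrases[name] = phrases[rule[1]] + phrases[rule[2]]
        elif op == "repeat":
            phrases[name] = phrases[rule[1]] * rule[2]
        elif op == "drop_prefix":
            phrases[name] = phrases[rule[1]][rule[2]:]
        elif op == "drop_suffix":
            base = phrases[rule[1]]
            phrases[name] = base[: len(base) - rule[2]]
        else:
            raise ValueError(f"unknown operation: {op}")
    # The text is the concatenation of the phrases named by S.
    return "".join(phrases[token] for token in sequence)


DEMO_D: Dictionary = [
    ("X1", ("char", "a")),
    ("X2", ("char", "b")),
    ("X3", ("concat", "X1", "X2")),    # "ab"
    ("X4", ("repeat", "X3", 3)),       # "ababab"
    ("X5", ("drop_prefix", "X4", 1)),  # "babab"
    ("X6", ("drop_suffix", "X4", 2)),  # "abab"
]
DEMO_S = ["X5", "X1", "X6"]

print(expand(DEMO_D, DEMO_S))  # prints "bababaabab"
```

The sketch only shows how the three operations compose; the algorithms discussed in the contexts work on D and S directly rather than on the expanded string.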


Compressed Pattern Matching for SEQUITUR - Mitarai, Hirao, Matsumoto.. (2000)   (1 citation)  Self-citation (Takeda Shinohara Arikawa)   (Correct)

....report all occurrences of multiple patterns for LZW compressed files. Followed this work, a good deal of practical effort has been made on compressed pattern matching for LZW [16], LZ77 [16], pattern substitution method [22], and so on. From a theoretical viewpoint, on the other hand, Kida et al. [6] introduced a collage system as a unifying framework which abstracts various dictionary-based compression methods. Through the collage systems, many dictionary-based compression methods can be categorized into some classes (Fig. 1) and we can capture the essence of each compression method in the ....

....the sequence is encoded by using the arithmetic coder. 3 A unifying framework for compressed pattern matching In a dictionary-based compression, a text string is described by a pair of a dictionary and a sequence of tokens, each of which represents a phrase defined in the dictionary. Kida et al. [6] introduced a unifying framework, named collage system, which abstracts various dictionary-based methods, such as the Lempel-Ziv family, Sequitur, Re-Pair [10], and static dictionary methods. They presented a general compressed pattern matching algorithm for the framework, which is based on the ....

[Article contains additional citation context not shown here]

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In Proc. 6th International Symp. on String Processing and Information Retrieval, pages 89--96. IEEE Computer Society, 1999.


Bit-Parallel Approach to Approximate String.. - Matsumoto, Kida.. (2000)   (6 citations)  Self-citation (Kida Takeda Shinohara Arikawa)   (Correct)

....concern where the goal is to perform the exact string matching in a compressed text without decompressing it. The problem has been studied in 1990s by several researchers mainly for dictionary-based compression methods, such as the Ziv-Lempel family (e.g. LZ77 [12], LZ78 [13], LZW [10]). In [5], we introduced collage systems, a formal system to represent a text string, that captures various dictionary-based compressions. Within this framework, we generalized the existing compressed pattern matching algorithms and unified the concepts into a general algorithm. Thus any compression ....

....u [i] u[1 : u i] Denote by D(x, y) the edit distance between two strings x and y. 3 Collage system In a dictionary-based compression, a text string is described by a pair of a dictionary and a sequence of tokens, each of which represents a phrase defined in the dictionary. Kida et al. [5] introduced a unifying framework, named collage system, which abstracts various dictionary-based methods, such as the Lempel-Ziv family, the RE-PAIR [6], and static dictionary methods. In [5] they presented a general ....

[Article contains additional citation context not shown here]

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In Proc. 6th International Symp. on String Processing and Information Retrieval, pages 89--96. IEEE Computer Society, 1999.


Speeding Up Pattern Matching By Text Compression - Shibata, Kida, Fukamachi.. (2000)   (4 citations)  Self-citation (Kida Shibata Takeda Shinohara Arikawa)   (Correct)

....scheme that requires no such extra effort. Thus we must re estimate the performance of existing compression methods or develop a new compression method in the light of the new criterion: Efficiency of compressed pattern matching. As an effective tool for such re estimation, we introduced in [17] a unifying framework, named collage system, which abstracts various dictionary based compression methods, such as Lempel Ziv family, and the static dictionary methods. We developed a general compressed pattern matching algorithm for strings described in terms of collage system. Therefore, any of ....

....encoded pattern is not unique. A solution due to Manber [21] was to devise a way to restrict the number of possible encodings for any string. The approach we take here is basically an instance of the general compressed pattern matching algorithm for strings described in terms of collage system [17]. As stated in Introduction, collage system is a unifying framework that abstracts most of existing dictionary based compression methods. In the framework, a string is described by a pair of a dictionary D and a sequence S of tokens representing phrases in D. A dictionary D is a sequence of ....

[Article contains additional citation context not shown here]

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In Proc. 6th International Symp. on String Processing and Information Retrieval. IEEE Computer Society, 1999.


Faster Fully Compressed Pattern Matching Algorithm.. - Hirao, Takeda.. (2000)   (1 citation)  Self-citation (Takeda Shinohara Arikawa)   (Correct)

....defined before. The length of the string represented by a straight-line program can be exponentially long with respect to the size of the straight-line program. We note that the class of straight-line programs is the central subclass of collage systems, that was recently introduced by Kida et al. [7] as a unifying framework for various (not fully) compressed pattern matching algorithms. Their results imply that the compressed pattern matching for straight-line programs can be solved in O(n + m^2 + r) time using O(n + m^2) space, where n and m are the length of text and pattern, ....

....to the set Occ(T, P) can be determined in O(n) time. 5 Conclusion We showed a fully compressed matching algorithm for the class of balanced straight-line programs, that runs in O(nm) time using O(nm) space. We summarize in Fig. 6

                                   Compressed Matching         Fully Compressed Matching
Collage systems [7]                O(n height(T) + m^2) [7]    unknown
Straight-line programs             O(n + m^2) [7]              O(n^2 m^2) [8]
Balanced straight-line programs    O(n + m^2) [7]              O(nm)

Figure 6: Comparison of the complexity the running times of related algorithms. In future works, we will try to extend our algorithm to the ....

[Article contains additional citation context not shown here]

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In Proc. 6th International Symposium on String Processing and Information Retrieval, pages 89--96, 1999.


A Boyer-Moore type algorithm for compressed pattern.. - Shibata, Matsumoto.. (2000)   (3 citations)  Self-citation (Shibata Takeda Shinohara Arikawa)   (Correct)

....2 Efficiency of compressed pattern matching In order to achieve a fast search in compressed files, we have to re-estimate the existing compression methods in the light of the new criterion: Efficiency of compressed pattern matching. As an effective tool for such re-estimation, we introduced in [11] a unifying framework, named collage system, which abstracts various dictionary-based compression methods, such as the Lempel-Ziv family, BPE, and the static dictionary methods. In the framework, a text string is described by a pair of a dictionary D and a sequence S of tokens, each of which ....

....a text string is described by a pair of a dictionary D and a sequence S of tokens, each of which represents a phrase defined in D. The dictionary D is given as a sequence of assignments where the basic operations are concatenation, repetition, and prefix (suffix) truncation. We developed in [11] a general compressed pattern matching algorithm for texts described in terms of collage system. Consequently, any of the compression methods that can be described within the framework has a compressed pattern matching algorithm as an instance. We denote by t.u the phrase represented by a token t. ....

[Article contains additional citation context not shown here]

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In Proc. 6th International Symp. on String Processing and Information Retrieval, pages 89--96. IEEE Computer Society, 1999.


Practical and Flexible Pattern Matching over Ziv-Lempel.. - Navarro, Raffinot   (Correct)

No context found.

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In Proc. 6th Intl. Symp. on String Processing and Information Retrieval (SPIRE'99), pages 89-96. IEEE CS Press, 1999.


Approximating the Smallest Grammar: Kolmogorov.. - Charikar, Lehman, .. (2002)   (4 citations)  (Correct)

No context found.

T. Kida, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. A unifying framework for compressed pattern matching. In SPIRE/CRIWG, pages 89--96, 1999.

