Results 1 - 4 of 4
Lempel-Ziv parsing in external memory, 2013

Cited by 2 (2 self)
Abstract. For decades, computing the LZ factorization (or LZ77 parsing) of a string has been a requisite and computationally intensive step in many diverse applications, including text indexing and data compression. Many algorithms for LZ77 parsing have been discovered over the years; however, despite the increasing need to apply LZ77 to massive data sets, no algorithm to date scales to inputs that exceed the size of internal memory. In this paper we describe the first algorithm for computing the LZ77 parsing in external memory. Our algorithm is fast in practice and will allow the next generation of text indexes to be realised for massive strings and string collections.
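
For readers unfamiliar with the LZ factorization named in this abstract, the following is a minimal in-memory sketch of what is being computed. It is a naive quadratic-time illustration, not the external-memory algorithm of the paper; the function name `lz77_factorize` and the output convention (a literal character for a new symbol, otherwise a `(source, length)` pair) are assumptions made for this example.

```python
def lz77_factorize(s):
    """Naive LZ77 factorization (illustrative only, O(n^2) or worse).

    Each factor is either a literal character that has not occurred before,
    or a pair (source, length) pointing at the leftmost earlier occurrence
    of the longest match. Matches may overlap the current position.
    """
    factors = []
    n = len(s)
    i = 0
    while i < n:
        best_len, best_src = 0, -1
        for j in range(i):  # candidate earlier starting positions
            length = 0
            while i + length < n and s[j + length] == s[i + length]:
                length += 1
            if length > best_len:
                best_len, best_src = length, j
        if best_len == 0:
            factors.append(s[i])  # new character
            i += 1
        else:
            factors.append((best_src, best_len))
            i += best_len
    return factors


print(lz77_factorize("abababb"))
```

Every candidate source position is compared against the current suffix, which is exactly the kind of random access over the whole input that stops being practical once the string no longer fits in internal memory.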
Crochemore’s String Matching Algorithm: Simplification, Extensions, Applications

Cited by 1 (1 self)
Abstract. We address the problem of string matching in the special case where the pattern is very long. First, constant extra space algorithms are desirable with long patterns, and we describe a simplified version of Crochemore’s algorithm retaining its linear time complexity and constant extra space usage. Second, long patterns are unlikely to occur in the text at all. Thus we define a generalization of string matching called Longest Prefix Matching that asks for the occurrences of the longest prefix of the pattern occurring in the text at least once, and modify the simplified Crochemore’s algorithm to solve this problem. Finally, we define and solve the problem of Sparse Longest Prefix Matching that is useful when the pattern has to be split into multiple pieces because it is too long to be processed in one piece. These problems are motivated by and have application in Lempel-Ziv (LZ77) factorization.
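
The Longest Prefix Matching problem defined in this abstract can be illustrated with a brute-force sketch. This is not the simplified Crochemore algorithm the paper describes (it is neither linear-time nor constant-space); it only shows the input and output of the problem. The function name `longest_prefix_matching` and the return shape `(length, occurrences)` are assumptions for illustration.

```python
def longest_prefix_matching(pattern, text):
    """Brute-force Longest Prefix Matching (illustrative only).

    Returns (length, occurrences): the length of the longest prefix of
    `pattern` that occurs in `text` at least once, and the sorted start
    positions of all occurrences of that prefix.
    """
    best = 0
    for i in range(len(text)):
        length = 0
        while (length < len(pattern) and i + length < len(text)
               and text[i + length] == pattern[length]):
            length += 1
        best = max(best, length)
    if best == 0:
        return 0, []  # not even the first pattern character occurs
    prefix = pattern[:best]
    occurrences = [i for i in range(len(text) - best + 1)
                   if text.startswith(prefix, i)]
    return best, occurrences


print(longest_prefix_matching("abcx", "zabcabcy"))
```

When the whole pattern occurs in the text, the result coincides with ordinary string matching, which is why the abstract calls the problem a generalization.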
Computing Reversed Lempel-Ziv Factorization Online
Abstract. Kolpakov and Kucherov proposed a variant of the Lempel-Ziv factorization, called the reversed Lempel-Ziv (RLZ) factorization (Theoretical Computer Science, 410(51):5365–5373, 2009). In this paper, we present an online algorithm that computes the RLZ factorization of a given string w of length n in O(n log^2 n) time using O(n log σ) bits of space, where σ ≤ n is the alphabet size. Also, we introduce a new variant of the RLZ factorization with self-references, and present two online algorithms to compute this variant, in O(n log σ) time using O(n log n) bits of space, and in O(n log^2 n) time using O(n log σ) bits of space.
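
As a rough illustration of the RLZ factorization mentioned in this abstract, the sketch below assumes the commonly stated definition without self-references: each factor is the longest prefix of the remaining string whose reversal occurs in the already factorized part, or a single character if no such prefix exists. That definition is an assumption here, since the abstract does not spell it out, and the code is a naive offline version, not the online algorithms of the paper. The name `reversed_lz_factorize` is hypothetical.

```python
def reversed_lz_factorize(w):
    """Naive reversed LZ factorization without self-references.

    Assumed definition: each factor is the longest non-empty prefix of the
    unprocessed suffix whose reversal is a substring of the processed
    prefix, or a single character when no such prefix exists.
    """
    factors = []
    i = 0
    while i < len(w):
        processed = w[:i]
        length = 0
        # Extending greedily is safe: if the reversal of a longer prefix
        # occurs in `processed`, so does the reversal of every shorter one.
        while i + length < len(w) and w[i:i + length + 1][::-1] in processed:
            length += 1
        length = max(length, 1)  # fall back to a single fresh character
        factors.append(w[i:i + length])
        i += length
    return factors


print(reversed_lz_factorize("abbaab"))
```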