Abstract. We present a new algorithm for regular expression searching that is able to skip text characters. The idea is to determine the minimum length ℓ of a string matching the regular expression, manipulate the original automaton so that it recognizes all the reverse prefixes of length up to ℓ of the strings accepted, and use it to skip text characters, as done for exact string matching in previous work. As we show experimentally, the resulting algorithm is fast, and it is the fastest one in many cases of interest.
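The first step named in the abstract, finding the minimum length ℓ of a string matching the regular expression, can be illustrated with a minimal sketch. The tuple-based expression tree and the function name `min_match_length` below are assumptions made for this illustration only; they are not taken from the paper, and the sketch does not cover the automaton manipulation or the skipping itself.

```python
# Hypothetical mini-AST for regular expressions (illustration only, not from the paper):
#   ("lit", c)      a single character c
#   ("eps",)        the empty string
#   ("cat", a, b)   a followed by b
#   ("alt", a, b)   a or b
#   ("star", a)     zero or more repetitions of a


def min_match_length(node):
    """Return the length of the shortest string matched by the expression."""
    kind = node[0]
    if kind == "lit":
        return 1
    if kind in ("eps", "star"):
        # The empty string and a starred expression can both match nothing.
        return 0
    if kind == "cat":
        # Concatenation: shortest matches of both parts are added.
        return min_match_length(node[1]) + min_match_length(node[2])
    if kind == "alt":
        # Alternation: the shorter of the two branches wins.
        return min(min_match_length(node[1]), min_match_length(node[2]))
    raise ValueError(f"unknown node kind: {kind!r}")


# The expression (ab|cde)f*g written in the hypothetical AST.
EXAMPLE = ("cat",
           ("cat",
            ("alt",
             ("cat", ("lit", "a"), ("lit", "b")),
             ("cat", ("lit", "c"), ("cat", ("lit", "d"), ("lit", "e")))),
            ("star", ("lit", "f"))),
           ("lit", "g"))

print(min_match_length(EXAMPLE))  # 3, reached by the string "abg"
```

For this example ℓ = 3, so every occurrence of the expression in the text is at least three characters long.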
|
760
|
A universal algorithm for sequential data compression
– Lempel, Ziv
- 1977
|
|
578
|
A method for the construction of minimum redundancy codes
– Huffman
- 1952
|
|
557
|
Text Compression
– Bell, Cleary, et al.
- 1990
|
|
523
|
Arithmetic coding for data compression
– Witten, Neal, et al.
- 1987
|
|
481
|
Compression of individual sequences via variable-rate coding
– Ziv, Lempel
- 1978
|
|
447
|
Fast pattern matching in strings
– Knuth, Morris, et al.
- 1977
|
|
377
|
A fast string searching algorithm
– Boyer, Moore
- 1977
|
|
328
|
Efficient string matching: An aid to bibliographic search
– Aho, Corasick
- 1975
|
|
325
|
A Technique for High-Performance Data Compression
– Welch
- 1984
|
|
307
|
Text Algorithms
– Crochemore, Rytter
- 1994
|
|
237
|
Fast Text Searching Allowing Errors
– Wu, Manber
- 1992
|
|
223
|
Universal codeword sets and representations of the integers
– Elias
- 1975
|
|
165
|
A new approach to Text searching
– Baeza-Yates, Gonnet
- 1992
|
|
106
|
A locally adaptive data compression scheme
– Bentley, Sleator, et al.
- 1986
|
|
101
|
A fast bit-vector algorithm for approximate string matching based on dynamic programming
– Myers
- 1998
|
|
100
|
AGREP-a fast approximate pattern-matching tool
– Wu, Manber
- 1992
|
|
87
|
The PROSITE database, its status
– Hofmann, Bucher, et al.
- 1999
|
|
82
|
Transducers and Repetitions
– Crochemore
- 1986
|
|
82
|
A very fast substring search algorithm
– Sunday
- 1990
|
|
77
|
Regular expression search algorithm
– Thompson
- 1968
|
|
72
|
Speeding up two string matching algorithms
– Crochemore, Czumaj, et al.
- 1994
|
|
69
|
From regular expressions to deterministic automata
– Berry, Sethi
- 1986
|
|
69
|
String matching in Lempel-Ziv compressed strings
– Farach, Thorup
- 1998
|
|
64
|
Efficient two-dimensional compressed matching
– Amir, Benson
- 1992
|
|
63
|
Let sleeping files lie: Pattern matching in Z-compressed files
– Amir, Benson, et al.
- 1996
|
|
60
|
Practical fast searching in strings
– Horspool
- 1980
|
|
58
|
Data compression with finite windows
– Fiala, Greene
- 1989
|
|
48
|
Theoretical and empirical comparisons of approximate string matching algorithms
– Chang, Lampe
- 1992
|
|
46
|
A string matching algorithm fast on the average
– Commentz-Walter
- 1979
|
|
44
|
Text retrieval: Theory and practice
– Baeza-Yates
- 1992
|
|
43
|
A text compression scheme that allows fast searching directly in the compressed file
– Manber
- 2001
|
|
42
|
A subquadratic algorithm for approximate limited expression matching
– Wu, Manber, et al.
- 1996
|
|
38
|
Regular expressions into finite automata
– Brüggemann-Klein
- 1993
|
|
38
|
The complexity of pattern matching for a random string
– Yao
- 1979
|
|
37
|
The abstract theory of automata
– Glushkov
- 1961
|
|
36
|
Taxonomies and toolkits of regular language algorithms
– Watson
- 1995
|
|
35
|
A general practical approach to pattern matching over Ziv-Lempel compressed text
– Navarro, Raffinot
- 1999
|
|
33
|
A Comparison of Approximate String Matching Algorithms
– Jokinen, Tarhio, et al.
- 1996
|
|
29
|
Fast and practical approximate pattern matching
– Baeza-Yates, Perleberg
- 1992
|
|
28
|
Fast and flexible string matching by combining bit-parallelism and suffix automata
– Navarro, Raffinot
- 2007
|
|
25
|
A four russians algorithm for regular expression pattern matching
– Myers
- 1992
|
|
24
|
A bit-parallel approach to suffix automata: Fast extended string matching
– Navarro, Raffinot
- 1998
|
|
22
|
Experimental results on string matching algorithms
– Lecroq
- 1995
|
|
21
|
Efficient algorithms for Lempel-Ziv encoding
– Gasieniec, Karpinski, et al.
- 1996
|
|
21
|
Faster Approximate String Matching
– Baeza-Yates, Navarro
- 1999
|
|
18
|
Fast searching on compressed text allowing errors
– Moura, Navarro, et al.
- 1998
|
|
17
|
Shift-And approach to pattern matching in LZW compressed text
– Kida, Takeda, et al.
- 1999
|
|
17
|
Direct pattern matching on compressed text
– Moura, Navarro, et al.
- 1998
|
|
15
|
Longest-match string searching for Ziv-Lempel compression
– Bell, Kulp
- 1993
|
|
14
|
Fast regular expression search
– Navarro, Raffinot
- 1999
|