56 citations found.
Sunday, D. A very fast substring search algorithm. Comm. ACM 33, 8 (August 1990), 132--142.


This paper is cited in the following contexts:


Text Searching: Theory and Practice - Baeza-Yates, Navarro   (Correct)

....we assume that P and the window are compared right to left. Preprocessing requires O(m + σ) time and O(σ) space, while the average search time is still O(n/min(m, σ)). The algorithm fits in less than ten lines of code in most programming languages. A close variant of this algorithm is due to Sunday [62], which uses the text position following the window instead of the last window position. This makes the search time closer to n/(m + 1) instead of Horspool's n/m, and is in practice faster than Horspool. However, if one takes into account that with probability (σ − 1)/σ even the first comparison in ....

D. Sunday. A very fast substring search algorithm. Communications of the ACM, 33(8):132-142, 1990.
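
The variant described in the context above (shifting on the text character just past the window) can be sketched roughly as follows. This is a minimal illustration, not code from the cited paper: the function name `quick_search` and the dictionary-based shift table are assumptions, and the window is compared with a plain slice comparison rather than right to left.

```python
def quick_search(pattern: str, text: str) -> list[int]:
    """Return all start positions of pattern in text using Sunday-style shifts."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # shift[c]: distance from the rightmost occurrence of c in the pattern
    # to the position just past the pattern's end. Characters that do not
    # occur in the pattern fall back to m + 1 (see shift.get below).
    shift = {c: m - i for i, c in enumerate(pattern)}
    hits = []
    pos = 0
    while pos <= n - m:
        # Compare the whole window (a slice comparison, for brevity).
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        if pos + m >= n:
            break  # no character follows the window, so nothing to shift on
        # Shift on the character immediately after the window.
        pos += shift.get(text[pos + m], m + 1)
    return hits


print(quick_search("abra", "abracadabra"))  # [0, 7]
```

The largest possible shift is m + 1, which is where the n/(m + 1) figure quoted above comes from.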


Compact Approximation of Lattice Functions with Applications.. - Boldi, Vigna (2002)   (Correct)

....algorithms that can be used with large alphabets, such as Unicode collation elements, with a small setup time. 1 Introduction One of the fastest known algorithms for searching a pattern (i.e. a string) in large texts is the Boyer-Moore algorithm [6] along with its many variants (e.g. [1, 8, 7, 12, 13]). All variants are based on the following simple idea: Let p be the pattern of length P and t be the text of length T to be searched. The pattern occurs in position k if p_i = t_{k+i} for all 0 ≤ i < P. Now, when examining a candidate position k, we compare the characters t_{k+m−1}, t_{k+m−2}, ....

....text character found in that position (the bad character) then we compute the shift as max(1, j # # (c) 1). Having a larger value for # # has the simple effect of reducing the shift. This is true even for variations of the algorithm that look at different characters to compute the shift (e.g. [13]). [Figure 3: The error probability of each summand in the exponential case with n 2n ln 2.] 4.1 Implementation Issues Suppose we want to find occurrences of pattern p in text t. Devising an approximator for # # requires choosing the various ....

[Article contains additional citation context not shown here]

Daniel M. Sunday. A very fast substring search algorithm. Comm. ACM, 33(8):132--142, 1990.


The Abstraction and Instantiation of String-Matching.. - Amtoft, Consel.. (2001)   (2 citations)  (Correct)

.... Partsch and Stomp's formal derivation [45] and just recently Hernandez and Rosenblueth's logic program derivation [31]. But as reviewed in Aho's chapter in the Handbook of Theoretical Computer Science [1], several variants of Boyer and Moore's string matcher exist (such as Sunday's variant [52] and Baeza-Yates, Choffrut, and Gonnet's variant [6]) with recurrent concerns about linearity in principle (e.g. Schaback's work [47]) and in practice (e.g. Horspool's work [33]). 7 Conclusion and issues We have abstracted a naive quadratic substring program with a static cache, and ....

Daniel M. Sunday. A very fast substring search algorithm. Communications of the ACM, 33(8):132--142, August 1990.


Rethinking Java Strings - Boldi, Vigna (2003)   (Correct)

....For instance, building data structures (and in general using new) leads to methods that are unsuitable for everyday usage. On the other hand, the brute force approach of String (a double loop) is definitely not very efficient. MutableString uses a relaxed version of Daniel Sunday's QuickSearch [7], a variant of the Boyer-Moore algorithm. In QuickSearch, a table records for each character in the search alphabet the last occurrence of the character in the search pattern (or the pattern length, if the character does not appear). After checking whether the pattern appears in positions t , ....

Daniel M. Sunday. A very fast substring search algorithm. Comm. ACM, 33(8):132--142, 1990.


Sharing of Computations - Amtoft (1993)   (1 citation)  (Correct)

....the KMP algorithm is derived by PE, after a naive substring matcher has been transformed into one more suitable for PE purposes. Contrary to the claims of the paper, this transformation cannot (in my opinion) be considered automatic; neither is it obvious that it preserves semantics. In [Sun90] some very efficient variations of the Boyer-Moore idea are presented (but not within the context of program derivation). The approach is characterized by two features: After a mismatch has been found in e.g. the situation . TEX. it is the symbol in the subject string immediate to the ....

Daniel M. Sunday. A very fast substring search algorithm. Communications of the ACM, 33(8):132--142, August 1990.


Compact Approximation of Lattice Functions with Applications.. - Boldi, Vigna (2002)   (Correct)

....algorithms that can be used with large alphabets, such as Unicode collation elements, with a small setup time. 1 Introduction One of the fastest known algorithms for searching a pattern (i.e. a string) in large texts is the Boyer-Moore algorithm [5] along with its many variants (e.g. [1, 7, 6, 11, 12]). All variants are based on the following simple idea: Let p be the pattern of length P and t be the text of length T to be searched. The pattern occurs in position k if p_i = t_{k+i} for all 0 ≤ i < P. Now, when examining a candidate position k, we compare the characters t_{k+m−1}, t_{k+m−2}, ....

.... property: Theorem 1 For all x #, f (x) # f (x) Note that in case L 0, 1 we obtain exactly a Bloom filter (by approximating the characteristic function of a subset of #) whereas in case d 1 we obtain the structure described in [8] As an example, consider the function f = [12] # N given by f (1) and f (x) 0 in all other cases. We let d 2, b 6, h 0 (x) #x 2# and h 1 (x) 5x mod 6 3 (these functions have been chosen for exemplification only) In the upper part of Figure 1, one can see how values are mapped and maximised into the buckets; in the ....

[Article contains additional citation context not shown here]

Daniel M. Sunday. A very fast substring search algorithm. Comm. ACM, 33(8):132--142, 1990.


Direct Pattern Matching on Compressed Text - de Moura, Navarro, Ziviani (1998)   (6 citations)  (Correct)

....presented in [18] which runs in O(kv w) time to search W . For exact search, after obtaining the compressed code (a sequence of bytes) we can choose any known algorithm to process the search. In the experimental results presented in this paper we used the Boyer-Moore-Horspool-Sunday (BMHS) [17] algorithm, which has good practical performance. If we are doing approximate search then the original pattern is represented by the set of lists L 1 ; L j , where L i has the compressed codes that matches the ....

....algorithms work faster on longer patterns. The multi-pattern search algorithm chosen to search the elements of each list is an efficient technique proposed by Baeza-Yates and Navarro [4, 5] to handle multiple patterns. This algorithm is an extension of the Boyer-Moore-Horspool-Sunday (BMHS) [17] algorithm, which has a cost of O(n log(c)/c) on average, where n is the size in bytes of the compressed text and c is the length of the smaller compressed pattern searched. We analyze the performance of our searching algorithm. The analysis considers a random text, which is very appropriate ....

D. Sunday. A very fast substring search algorithm. Communications of the ACM, 33(8):133--142, 1990.


Approximate String Matching in Musical Sequences - Crochemore, Iliopoulos.. (2001)   (5 citations)  (Correct)

....O(n). These algorithms use the bitwise technique. It is possible to adapt fast and practical exact pattern matching algorithms to these kinds of approximations. In this paper we will present the adaptations of the Tuned Boyer-Moore [8], the Skip Search algorithm [5] and the Maximal Shift algorithm [17], and present some experiments to assert that these adaptations are faster than the algorithms using the bitwise technique. The paper is organised as follows. In the next section we present some basic definitions for strings and background notions for approximate matching. In Sections 3-5 we ....

....starting point of p in t and checks if the pattern occurs at that position. To do approximate pattern matching, the buckets can be computed as follows: z[a] = {i | p_i = a}. Figure 2 shows the pseudo code for Skip Search algorithm. 5 Maximal Shift Approximate Pattern Matching Sunday [17] designed an exact string matching algorithm where the pattern positions are scanned from the one which will lead to a larger shift to the one which will lead to a shorter shift, in case of a mismatch. Doing so one may hope ....

D. M. Sunday, A very fast substring search algorithm, CACM, Vol 33, (1990), pp. 132-142.
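
The bucket definition z[a] = {i | p_i = a} quoted in the context above can be illustrated with a small exact-matching Skip Search sketch. This shows only the exact version, not the approximate adaptation the citing paper develops; the names `skip_search` and `buckets` are illustrative assumptions.

```python
from collections import defaultdict


def skip_search(pattern: str, text: str) -> list[int]:
    """Exact Skip Search: inspect every m-th text character and use the
    buckets z[a] = {i | p_i = a} to propose candidate start positions."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # buckets[a] lists every pattern position i with pattern[i] == a.
    buckets = defaultdict(list)
    for i, c in enumerate(pattern):
        buckets[c].append(i)
    hits = []
    # Any occurrence of length m covers exactly one of these sampled positions.
    for j in range(m - 1, n, m):
        for i in buckets.get(text[j], ()):
            start = j - i  # aligns pattern[i] with text[j]
            if start <= n - m and text[start:start + m] == pattern:
                hits.append(start)
    return sorted(hits)


print(skip_search("abra", "abracadabra"))  # [0, 7]
```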


Efficient Experimental String Matching by Weak Factor.. - Allauzen, Crochemore.. (2001)   (Correct)

....done during the backward search is bounded by n log and the total number by 2n n log . 2 4.3 Experimental results In this section, we present experimental results on the time complexity of our string matching algorithms, compared to the following algorithms: Sunday: the Sunday algorithm [10] is often considered as the fastest in practice; BM: the Boyer-Moore algorithm [6]; BDM: the classical Backward Dawg Matching with a suffix automaton [8]; Suff: the Backward Dawg Matching with a suffix automaton but without testing terminal states, this is equivalent to the basic approach with the factor ....

D. Sunday. A very fast substring search algorithm. CACM, 33(8):132-142, August 1990.


Factor Oracle: A New Structure for Pattern Matching - Allauzen, Crochemore, Raffinot (1999)   (11 citations)  (Correct)

....[Figure 11. Second case: the critical position is reached.] 3.2 Experimental results In this section, we present the experimental results obtained. More precisely, we compare the following algorithms. Sunday: the Sunday algorithm [15] is often considered as the fastest in practice; BM: the Boyer-Moore algorithm [6]; BDM: the classical Backward Dawg Matching with a suffix automaton [11]; Suff: the Backward Dawg Matching with a suffix automaton but without testing terminal states, this is equivalent to the basic ....

D. Sunday. A very fast substring search algorithm. CACM, 33(8):132--142, August 1990.


New and Faster Filters for Multiple Approximate String Matching - Baeza-Yates, Navarro   (Correct)

....a multipattern exact search for the pieces. Each occurrence of a piece is verified to check if it is surrounded by a complete match. If there are not too many verifications, this algorithm is extremely fast. From the many algorithms for multipattern search, an extension of Sunday's algorithm [27] gave us the best results. We build a trie with the sub-patterns. From each text position we search the text characters into the trie, until a leaf is found (match) or there is no path to follow (mismatch). The jump to the next text position is precomputed as the minimum of the jumps allowed in ....

D. Sunday. A very fast substring search algorithm. CACM, 33(8):132--142, August 1990.
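
A rough sketch of the trie-plus-minimum-jump idea in the context above might look like the following. The quoted passage is truncated, so the exact jump rule here is an assumption: each character's jump is taken as the smallest Sunday-style shift over all patterns, computed on their first `lmin` characters, where `lmin` is the shortest pattern length. The names `build_trie` and `multi_quick_search` and the nested-dict trie are illustrative choices.

```python
def build_trie(patterns):
    """Nested-dict trie; the key None marks the end of a stored pattern."""
    root = {}
    for p in patterns:
        node = root
        for c in p:
            node = node.setdefault(c, {})
        node[None] = p  # None can never collide with a text character
    return root


def multi_quick_search(patterns, text):
    """Return (position, pattern) pairs for every occurrence of any pattern."""
    patterns = [p for p in patterns if p]
    if not patterns:
        return []
    lmin = min(len(p) for p in patterns)
    # Assumed jump rule: minimum over all patterns of the Sunday shift,
    # using only each pattern's first lmin characters.
    shift = {}
    for p in patterns:
        for i, c in enumerate(p[:lmin]):
            shift[c] = min(shift.get(c, lmin + 1), lmin - i)
    trie = build_trie(patterns)
    n = len(text)
    hits = []
    pos = 0
    while pos <= n - lmin:
        # Walk the trie from this text position until no path remains.
        node, k = trie, pos
        while True:
            if None in node:
                hits.append((pos, node[None]))
            if k >= n or text[k] not in node:
                break
            node = node[text[k]]
            k += 1
        if pos + lmin >= n:
            break
        pos += shift.get(text[pos + lmin], lmin + 1)
    return hits


print(multi_quick_search(["abra", "cad"], "abracadabra"))
# [(0, 'abra'), (4, 'cad'), (7, 'abra')]
```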


Boyer-Moore String Matching over Ziv-Lempel Compressed Text - Navarro, Tarhio (2000)   (6 citations)  (Correct)

....amount of shifting obtained. Two main techniques are used: Occurrence heuristic: pick a character in the window and shift the window forward the minimum necessary to align the selected text character with the same character in the pattern. Horspool [9] uses the m-th window character and Sunday [19] the (m+1)-th (actually outside the window). These methods need a table d that for each character gives its last occurrence in the pattern (the details depend on the versions). The Simplified BM (SBM) method [5] uses the character at the position that failed while checking the window, which needs ....

....and our implementation of it works only until m = 32 (it would be slower, not faster, for longer patterns). We have also considered the naive approach of decompressing then searching. Two choices are shown: DS uses our LZ78 format and decompresses the file in memory while applying a Sunday [19] search algorithm over it; D Agrep first decompresses the text and then runs agrep over it. Agrep [21, 22] is considered the fastest text searching tool, and we recall that the decompression time of our format is the fastest. As can be seen, our algorithms are significantly faster than ....

D. Sunday. A very fast substring search algorithm. CACM, 33(8):132--142, 1990.
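
To make the contrast in the occurrence heuristic concrete, here is a hedged sketch in which the only difference between the two variants is which text character selects the shift and how the table d is built. The function name `occurrence_search` and its `variant` parameter are assumptions for illustration, and the table details follow one common formulation, since the context itself notes that they depend on the version.

```python
def occurrence_search(pattern: str, text: str, variant: str = "sunday") -> list[int]:
    """Search with the occurrence heuristic, Horspool or Sunday flavour."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    if variant == "horspool":
        # Shift on the m-th (last) window character; the pattern's last
        # position is excluded from the table so every shift is at least 1.
        d = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
        default, offset = m, m - 1
    elif variant == "sunday":
        # Shift on the (m+1)-th character, just outside the window.
        d = {c: m - i for i, c in enumerate(pattern)}
        default, offset = m + 1, m
    else:
        raise ValueError("variant must be 'horspool' or 'sunday'")
    hits = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        if pos + offset >= n:
            break  # only reachable for Sunday: no character past the window
        pos += d.get(text[pos + offset], default)
    return hits


print(occurrence_search("abra", "abracadabra", "horspool"))  # [0, 7]
print(occurrence_search("abra", "abracadabra", "sunday"))    # [0, 7]
```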


Fast and Flexible Word Searching on Compressed Text - de Moura, Navarro.. (2000)   (4 citations)  (Correct)

....search W . A simple word is searched in O(w) time using, e.g. a hash table. 4.2 Searching Phase For exact search, after obtaining the compressed codeword (a sequence of bytes) we can choose any known algorithm to process the search. In the experimental results presented in this paper we used the Sunday [Sunday 1990] algorithm, from the Boyer-Moore family, which has good practical performance. In the case of approximate or extended searching we convert the problem to the exact multipattern searching problem. We just obtain a set of codewords that match the pattern and use a multipattern search algorithm ....

Sunday, D. 1990. A very fast substring search algorithm. Communications of the ACM 33, 8, 133--142.


Indexing Multimedia Databases - Faloutsos   (Correct)

....queries
• Boolean: (data OR information) AND retrieval AND (NOT text)
• Additional features: data ADJACENT retrieval
• Keyword search: data, retrieval, information
5.1 Full text scanning
• single term: Knuth, Morris and Pratt [KMP77], Boyer and Moore [BM77], and improvements [Sun90]
• multiple terms: Aho and Corasick [AC75]
• randomized algorithm: Fingerprints [KR87]
• approximate match: agrep [WM92], [BYG92]
NO space overhead BUT slow.
5.2 Inversion
BUT:
• Space overhead
• ....

D.M. Sunday. A very fast substring search algorithm. Comm. of ACM (CACM), 33(8):132--142, August 1990.


Very Fast and Simple Approximate String Matching - Navarro, Baeza-Yates (1998)   (4 citations)  (Correct)

....case r = k + 1 and m' = ⌊m/(k + 1)⌋, the search cost is O(mn/w), which is the same cost for exact searching using Shift-Or. Later, in [3], the use of a multipattern extension of an algorithm of the Boyer-Moore (BM) family was proposed. In [1,2] we tested an extension of the BM-Sunday algorithm [10]: we split the pattern in pieces of length ⌊m/(k + 1)⌋ and ⌈m/(k + 1)⌉ and form a trie with the pieces. We also build a pessimistic d table with all the pieces (the longer pieces are pruned to build this table). This table stores, for each character, the smallest shift allowed among all the ....

D. Sunday. A very fast substring search algorithm. Communications of the ACM, 33(8):132--142, August 1990.
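
The splitting step and the pessimistic d table in the context above can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the names `split_pattern` and `pessimistic_shift_table` are hypothetical, placing the longer pieces first is an arbitrary choice, and the shifts are Sunday-style distances computed after pruning every piece to the shortest piece length.

```python
def split_pattern(pattern: str, k: int) -> list[str]:
    """Split pattern into k + 1 consecutive pieces whose lengths are
    floor(m/(k+1)) or ceil(m/(k+1)). Longer pieces come first (assumption)."""
    m = len(pattern)
    count = k + 1
    if k < 0 or m < count:
        raise ValueError("need 0 <= k and at least k + 1 pattern characters")
    base, extra = divmod(m, count)
    pieces, start = [], 0
    for j in range(count):
        length = base + (1 if j < extra else 0)
        pieces.append(pattern[start:start + length])
        start += length
    return pieces


def pessimistic_shift_table(pieces: list[str]) -> tuple[dict[str, int], int]:
    """For each character, keep the smallest shift allowed by any piece,
    with every piece pruned to the shortest piece length lmin."""
    lmin = min(len(p) for p in pieces)
    d: dict[str, int] = {}
    for p in pieces:
        for i, c in enumerate(p[:lmin]):
            d[c] = min(d.get(c, lmin + 1), lmin - i)
    return d, lmin  # characters absent from d shift by lmin + 1


pieces = split_pattern("abcdefgh", 2)
print(pieces)                           # ['abc', 'def', 'gh']
print(pessimistic_shift_table(pieces))
```

With at most k errors spread over k + 1 pieces, at least one piece must occur unchanged, which is why an exact multipattern search for the pieces can serve as a filter.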


Block Addressing Indices for Approximate Text Retrieval - Baeza-Yates, Navarro (1997)   (2 citations)  (Correct)

....this can be done better. We know (from the vocabulary) exactly which words matching the pattern are present in each block. Hence, we can search for those words only, instead of running a slower approximate search algorithm again. We use an extension of the Boyer-Moore-Horspool-Sunday algorithm [7] to multipattern search. This gave us better results than an Aho-Corasick machine, since a few words are searched on each block (this decision is also supported by [8]). We compared this strategy against Glimpse version 4.0. We used the small index provided by Glimpse, i.e. the one addressing ....

D. Sunday. A very fast substring search algorithm. CACM, 33(8):132--142, Aug 1990.


Experimental Results on String Matching Algorithms - Lecroq (1995)   (4 citations)  Self-citation (Sunday)   (Correct)

No context found.

D.M. Sunday, `A very fast substring search algorithm', Comm. ACM, 33, 132--142 (1990).


Pattern and Approximate-Pattern Matching for Program Compaction - Johnson, Mycroft   (Correct)

No context found.

Sunday, D. A very fast substring search algorithm. Comm. ACM 33, 8 (August 1990), 132--142.


Practical and Flexible Pattern Matching over Ziv-Lempel.. - Navarro, Raffinot   (Correct)

No context found.

D. Sunday. A very fast substring search algorithm. Communications of the ACM, 33(8):132-142, August 1990.


Mutable Strings in Java: Design, Implementation and.. - Boldi, Vigna   (Correct)

No context found.

D. M. Sunday, A very fast substring search algorithm, Comm. ACM 33 (8) (1990) 132--142.


Speeding Up Pattern Matching by Text Compression - Shibata, TakuyaKida.. (2000)   (4 citations)  (Correct)

No context found.

D. M. Sunday. A very fast substring search algorithm. Comm. ACM, 33(8):132--142, 1990.


Pattern Matching and Text Compression Algorithms - Crochemore, Lecroq (2003)   (3 citations)  (Correct)

No context found.

Sunday, D. M. 1990. A very fast substring search algorithm. Comm. ACM 33(8):132-142.


Fast String Matching using an n-gram Algorithm - Kim, Shawe-Taylor (1994)   (4 citations)  (Correct)

No context found.

D. M. Sunday, `A very fast substring search algorithm', Comm. ACM, 33, (8), 132--142 (1990).


Boyer-Moore String Matching - Single-Table Boyer-Moore The   (Correct)

No context found.

Sunday, D. M. A very fast substring search algorithm. Commun. ACM 33, 8 (Aug. 1990), 132-142.



CiteSeer.IST - Copyright Penn State and NEC