17 citations found.
A. Blumer, A. Ehrenfeucht and D. Haussler, Average Size of Suffix Trees and DAWGS, Discrete Applied Mathematics, 24, 37-45 (1989).


This paper is cited in the following contexts:
On Compact Directed Acyclic Word Graphs - Crochemore, Vérin (1997)   (15 citations)

....is steadily growing, and therefore saving memory space is wanted both for the construction of the index and for its use. The Compact Directed Acyclic Word Graph (CDAWG) keeps the direct access to information while requiring less memory space. The structure has been introduced by Blumer et al. [4, 5]. The implementation is obtained by deleting all states of outdegree one and their corresponding transitions (excepting terminal states). We present an algorithm that directly builds compact DAWGs. This construction avoids constructing the DAWG first, which makes it suitable for the presently ....

....made on biological DNA sequences, considering them as words over the alphabet $\Sigma = \{a, c, g, t\}$, we got that more than 60% of states have outdegree one. So, the deletion of these states is worthwhile; it provides an important saving. The average analysis of the number of states and edges is done in [5] in a Bernoulli model of probability. When a state p is deleted, the deletion of its outgoing edges is realized by concatenating their label to the labels of incoming edges. For example, let r and p be states linked by a transition (r, b, p). The edges (r, b, p) and (p, a, q) are replaced by the ....

[Article contains additional citation context not shown here]

A. Blumer, D. Haussler, and A. Ehrenfeucht. Average sizes of suffix trees and dawgs. Discrete Applied Mathematics, 24:37--45, 1989.
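The compaction step described in the excerpts above (delete every non-terminal state of outdegree one and concatenate its outgoing label onto the incoming edges) can be sketched as follows. This is only an illustration under stated assumptions: the function name `compact`, the dictionary representation, and the hand-written DAWG of the word "abc" are hypothetical and are not taken from the cited papers, which build the compact DAWG directly rather than by post-processing. The initial state is also kept here, which is an assumption of this sketch.

```python
from typing import Dict, List, Set, Tuple


def compact(transitions: Dict[int, Dict[str, int]],
            initial: int,
            terminals: Set[int]) -> Dict[int, List[Tuple[str, int]]]:
    """Remove non-terminal states of outdegree one from an acyclic automaton.

    `transitions[state][symbol]` is the target state. The result maps each
    kept state to a list of (label, target) edges with multi-symbol labels.
    """
    def kept(state: int) -> bool:
        # Keep the initial state, terminal states, and any state whose
        # outdegree is different from one.
        return (state == initial
                or state in terminals
                or len(transitions.get(state, {})) != 1)

    states = set(transitions)
    for edges in transitions.values():
        states.update(edges.values())

    result: Dict[int, List[Tuple[str, int]]] = {}
    for state in sorted(states):
        if not kept(state):
            continue
        new_edges = []
        for symbol, target in sorted(transitions.get(state, {}).items()):
            label = symbol
            # Follow the chain of deleted states, appending their single
            # outgoing symbol to the label (terminates because the graph
            # is acyclic).
            while not kept(target):
                (next_symbol, next_target), = transitions[target].items()
                label += next_symbol
                target = next_target
            new_edges.append((label, target))
        result[state] = new_edges
    return result


# Hand-written DAWG of the word "abc" (assumed example, state 3 terminal).
DAWG_ABC = {0: {"a": 1, "b": 2, "c": 3}, 1: {"b": 2}, 2: {"c": 3}, 3: {}}
CDAWG_ABC = compact(DAWG_ABC, initial=0, terminals={3})
```

Labels are stored as explicit strings here for readability; a space-conscious implementation would more likely store positions into the text.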


Factor Oracle: A New Structure for Pattern Matching - Allauzen, Crochemore, Raffinot (1999)   (11 citations)

....1 DAWGs, Directed Acyclic Word Graphs, are just suffix automata in which all states are terminal states. One of the oldest methods is to merge the compression techniques applied both by the suffix tree and the suffix automaton. It leads to the notion of compact suffix automaton (or compact DAWG) [5]. The direct construction of this structure is given in [12, 13]. A second method to reduce the size of indexes has been considered in the text compression method in [10]. It consists in representing the complement language of the factors (substrings) of the text. More precisely, only minimal ....

A. Blumer, A. Ehrenfeucht, and D. Haussler. Average size of suffix trees and DAWGS. Discret. Appl. Math., 24:37--45, 1989.


Suffix Trees and their Applications in String Algorithms - Grossi, Italiano (1993)   (2 citations)

....and Vishkin [86] and Farach and Muthukrishnan [35] have devised parallel constructions whose work is optimal also for a small alphabet. The statistical behavior of suffix trees has been studied under general and mild probabilistic frameworks by Apostolico and Szpankowski [12], Blumer et al. [18], Devroye et al. [32], Grassberger [51], Jacquet and Szpankowski [60], Shields [89], and Szpankowski [94, 95]. One of the main properties of the suffix tree is that its asymptotic expected depth is logarithmic in the length of the given string, even though it may be linear in the worst case. ....

Blumer, A., Ehrenfeucht, A., and Haussler, D., Average size of suffix trees and DAWGs, Discrete Appl. Math. 24, 37--45, (1989).


A Generalized Suffix Tree and Its (Un)Expected Asymptotic.. - Szpankowski (1996)

.... obtained by Devroye et al. [11], and the limiting distribution of the typical depth in a suffix tree is reported in Jacquet and Szpankowski [19]. Szpankowski [38] extended some of these results to a more general probabilistic model (still for b = 1). Heuristic arguments were used by Blumer et al. [6] to show that the average number of internal nodes in a suffix tree is a linear function of n, and a rigorous proof of this can be found in [19]. Finally, Shields [34] recently established the almost sure behavior of the external path length of a noncompact suffix tree in the Bernoulli model and ....

....(2.7a) $h_3 = \lim_{n \to \infty} \frac{\min\{\log P^{-1}(X_1^n),\ P(X_1^n) > 0\}}{n} = \lim_{n \to \infty} \frac{\log(1/\max\{P(X_1^n),\ P(X_1^n) > 0\})}{n}$ (2.7b), as already defined in Pittel [30]. Note also that h 3 h (b) 2 h h 1 . Remark 1. (i) Bernoulli Model. In this widely used model (cf. [4], [6], [9], [11], [16], [17], [23], [31], [36], and [37]), symbols from the alphabet $\Sigma$ are generated independently, that is, $P(X_1^n) = P^n(X_1^1)$. In particular, the $i$th symbol from the alphabet $\Sigma$ is generated according to the probability $p_i$, where $1 \le i \le V$ and $\sum_{i=1}^{V} p_i = 1$. ....

A. Blumer, A. Ehrenfeucht and D. Haussler, Average Size of Suffix Trees and DAWGS, Discrete Applied Mathematics, 24, 37-45 (1989).
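The Bernoulli model named in the excerpt above (symbols drawn independently, the i-th symbol with probability p_i) can be sketched in a few lines. The function names `bernoulli_text` and `text_probability` and the example probabilities are assumptions made for illustration; they do not come from the cited paper.

```python
import random
from typing import Dict, Optional


def bernoulli_text(n: int, probs: Dict[str, float],
                   seed: Optional[int] = None) -> str:
    """Draw n symbols independently, symbol s with probability probs[s]."""
    rng = random.Random(seed)
    symbols = list(probs)
    weights = [probs[s] for s in symbols]
    return "".join(rng.choices(symbols, weights=weights, k=n))


def text_probability(text: str, probs: Dict[str, float]) -> float:
    """Probability of a given text: the product of its symbol probabilities,
    which is what independence of the symbols implies."""
    p = 1.0
    for symbol in text:
        p *= probs[symbol]
    return p


# Assumed example: a biased binary alphabet.
PROBS = {"a": 0.75, "b": 0.25}
SAMPLE = bernoulli_text(20, PROBS, seed=1)
```

Fixing the seed only makes the sample reproducible; it has no bearing on the model itself.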


Asymptotic estimation of the average number of terminal states.. - Raffinot (1997)

....estimation of the average number of terminal states in DAWGs (extended abstract). Mathieu Raffinot, Institut Gaspard Monge. Abstract. Following the work of A. Blumer, A. Ehrenfeucht and D. Haussler [1], we obtain an asymptotic estimation of the average number of terminal states in the suffix directed acyclic word graph (DAWG) under a Bernoulli model. This estimation is useful to understand the average behavior of algorithms which attach special action to terminal nodes of DAWGs, like BDM [4] ....

.... matching and lead to very fast algorithms, like BDM for one pattern, or MultiBDM for several patterns (see [3]). Studies have been undertaken to calculate their sizes in terms of number of nodes and edges, so as to predict the maximal or average space needed by the algorithms that use them (see [1]) and to demonstrate some of their properties. This paper takes place in this context. We give an asymptotic estimation of the number of terminal states of a DAWG under a Bernoulli model. This estimation is useful to understand the average behavior of algorithms which attach special action to ....

[Article contains additional citation context not shown here]

Anselm Blumer, Andrzej Ehrenfeucht, and David Haussler. Average sizes of suffix trees and dawgs. Discrete Applied Mathematics, 24(1):37--45, 1989.


Fast and Flexible String Matching by Combining.. - Navarro, Raffinot (1998)   (3 citations)

....automaton of the word baabbaa is given in Figure 2. [Figure 2: Deterministic suffix automaton of the word baabbaa.] The (deterministic) suffix automaton is a well known structure [9, 7, 11, 23], and we do not prove any of its properties here (neither the correctness of the previous construction). The size of DAWG(p) is linear in m (counting both nodes and edges) and a linear on-line construction algorithm exists [9]. A very important fact for our algorithm is that this automaton can ....

Anselm Blumer, Andrzej Ehrenfeucht, and David Haussler. Average sizes of suffix trees and dawgs. Discrete Applied Mathematics, 24(1):37--45, 1989.


Complexity of Sequential Pattern Matching Algorithms - Regnier, Szpankowski (1999)

....n = $\alpha n + o(n)$. All previous results have been able only to show that $c_n = \Theta(n)$, but they did not exclude some bounded fluctuation of the coefficient at n. We should point out that in the analysis of algorithms on words such a fluctuation can occur in some problems involving suffix trees (cf. [4, 13]). But, in this paper we prove that such a fluctuation cannot take place for the complexity function of the strongly sequential pattern matching algorithms. For example, in the worst case we prove here that for any given pattern p, any $\varepsilon > 0$ and any $n \ge n_\varepsilon$, one can find a text $t_1^n$ such ....

A. Blumer, A. Ehrenfeucht and D. Haussler, Average Size of Suffix Trees and DAWGS, Discrete Applied Mathematics, 24, 37-45 (1989).


Autocorrelation On Words And Its Applications - Analysis.. - Jacquet, Szpankowski (1994)   (9 citations)

....results for some other suffix tree parameters such as the typical depth, the depth of insertion, and the shortest path. Shields [25] proved almost sure convergence of the external path length in the Markovian model. The size of a suffix tree was investigated by Blumer, Ehrenfeucht and Haussler [4] using a mixture of analytical and simulation tools. In Section 4, we present a rigorous proof of such a result. The limiting distribution of the depth in a suffix tree was left open, and we intend to fill this gap. Preliminary results of this paper were presented in [16]. In passing, we note that ....

.... although it looks more complicated than necessary, is the right approach, as we shall prove below. We should also stress that this methodology gives a unified approach for analyzing some other digital structures (e.g., independent tries, digital search trees, directed acyclic word graphs (DAWG) [4], etc.). Using the above idea we shall compute respectively in Sections 3.2 and 3.3 the generating functions of the depths for an independent trie (easy) and a suffix tree (difficult). These two generating functions are asymptotically compared to show that they do not differ too much for large n ....

A. Blumer, A. Ehrenfeucht and D. Haussler, Average Size of Suffix Trees and DAWGS, Discrete Applied Mathematics, 24, 37-45 (1989).


A Bit-parallel Approach to Suffix Automata: Fast Extended.. - Navarro, Raffinot (1998)   (4 citations)

....suffix automaton of the word baabbaa is given in Figure 2. [Fig. 2. Deterministic suffix automaton of the word baabbaa. The largest node is the initial state.] The (deterministic) suffix automaton is a well known structure [8, 5, 11, 18], and we do not prove any of its properties here (neither the correctness of the previous construction). The size of DAWG(p) is linear in m (counting both nodes and edges) and can be built in linear time [8]. A very important fact for our algorithm is that this automaton can not only be used to ....

A. Blumer, A. Ehrenfeucht, and D. Haussler. Average sizes of suffix trees and dawgs. Discrete Applied Mathematics, 24(1):37--45, 1989.


Complexity of Sequential Pattern Matching Algorithms - Regnier, Szpankowski

....= $\alpha n + o(n)$. All previous results have been able only to show that $c_n = \Theta(n)$, but they did not exclude some bounded fluctuation of the coefficient at n. We should point out that in the analysis of algorithms on words such a fluctuation can occur in some problems involving suffix trees (cf. [4, 14, 20]). But, in this paper we prove that such a fluctuation cannot take place for the complexity function of the strongly sequential pattern matching algorithms. For example, in the worst case we prove here that for any given pattern p, any $\varepsilon > 0$ and any $n \ge n_\varepsilon$, one can find a text $t_1^n$ such ....

A. Blumer, A. Ehrenfeucht and D. Haussler, Average Size of Suffix Trees and DAWGS, Discrete Applied Mathematics, 24, 37-45, 1989.


Asymptotic Properties Of Data Compression And Suffix Trees - Szpankowski (1993)   (16 citations)

....case of a mismatch between T and P, say at position n + 1 of P, the next attempt to match depends on the internal structure (i.e., repeated substrings) of the first n symbols of the pattern P. It turns out that this problem can be efficiently solved by means of a suffix tree (cf. [1], [4], [5], [9], [14], [18], [28], [40]). In particular, recently Chang and Lawler [11] used suffix trees to design an algorithm that on average needs $O((|T|/|P|) \log |P|)$ steps to find all occurrences of the pattern P of length $|P|$ in the text T of length $|T|$. From the above discussion, one concludes ....

....Arratia and Waterman [6] investigated a related problem, namely the longest contiguous matching within a single sequence, and obtained several interesting results in this direction. Their findings are related to the height of a suffix tree. Finally, heuristic arguments were used by Blumer et al. [9] to show that the average number of internal nodes in a slightly different model of suffix trees is a linear function of n (more precisely, the coefficient of n contains an oscillating term). Jacquet and Szpankowski [18] established rigorously the latter result regarding the average size of ....

A. Blumer, A. Ehrenfeucht and D. Haussler, Average Size of Suffix Trees and DAWGS, Discrete Applied Mathematics, 24, 37-45 (1989).


Reducing the Space Requirement of Suffix Trees - Kurtz (1998)   (21 citations)

....would reduce the space requirement for each branch record to 4 integers, and the space complexity would be 5n integers, independent of the actual number q of branching nodes. However, q is usually considerably smaller than 0.8n (q = 0.62n is the theoretical average value for random strings, see [BEH89]), so that this worst case improvement would result in a larger space usage in practice. Therefore we do not further consider it. Note that storing the nodes of the suffix tree in breadth first or depth first order to save the space for the brother or child references does not make sense here. ....

A. Blumer, A. Ehrenfeucht, and D. Haussler. Average Size of Suffix Trees and DAWGS. Discrete Applied Mathematics, 24:37--45, 1989.
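The ratio q/n of branching nodes to text length mentioned in the excerpt above can be measured empirically on small inputs. The sketch below is a naive illustration, not the method of any cited paper: it sorts all suffixes of the text plus a sentinel, computes the longest common prefix of neighbouring suffixes, and counts the distinct nested LCP intervals, each of which corresponds to one branching node (the root included). The function name, the `$` sentinel, the alphabet, and the string length are assumptions. No particular ratio is claimed for the sample.

```python
import random


def branching_nodes(text: str, sentinel: str = "$") -> int:
    """Count branching (internal) nodes of the suffix tree of text+sentinel.

    Naive approach for small inputs: sorting the suffixes explicitly costs
    far more than linear time. The sentinel must not occur in `text`.
    """
    s = text + sentinel
    suffixes = sorted(s[i:] for i in range(len(s)))

    def lcp(a: str, b: str) -> int:
        k = 0
        while k < len(a) and k < len(b) and a[k] == b[k]:
            k += 1
        return k

    stack = [0]  # string depths of the open intervals; depth 0 is the root
    count = 1    # the root
    for prev, cur in zip(suffixes, suffixes[1:]):
        h = lcp(prev, cur)
        while stack[-1] > h:      # close intervals deeper than h
            stack.pop()
        if stack[-1] < h:         # open a new interval: a new branching node
            stack.append(h)
            count += 1
    return count


# Assumed experiment: a uniformly random string over a 4-letter alphabet.
rng = random.Random(0)
n = 500
random_text = "".join(rng.choice("acgt") for _ in range(n))
q = branching_nodes(random_text)
ratio = q / n
```

For the word "banana" the function counts the root and the nodes for "a", "ana", and "na".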


Suffix Trees and String Complexity - O'Connor, Snider

....information concerning the span from considering the DAWG since enumerating automatons is difficult in general. Jansen proved several of his results using statistical and combinatorial arguments which did not refer to the DAWG (see §3.4 of the thesis). Some properties of the DAWG are presented in [6, 12]. 2.2 Pattern matching algorithms. Pattern matching algorithms are concerned with methods for finding and/or retrieving a pattern or substring w from a given piece of text Y [1]. If there are to be repeated searches on the text Y, then Y may be preprocessed and stored in a particular data ....

A. Blumer, A. Ehrenfeucht, and D. Haussler. Average sizes of suffix trees and DAWGs. Discrete Applied Mathematics, 24:37--45, 1989.


Direct construction of Compact Directed Acyclic Word Graphs - Crochemore, Vérin (1997)   (9 citations)

....methods are of no use to reduce the memory space of such indexes because they eliminate the direct access to substrings. On the contrary, the Compact Directed Acyclic Word Graph (CDAWG) keeps the direct access while requiring less memory space. The structure has been introduced by Blumer et al. [4, 5]. The automaton is based on the concatenation of factors issued from a same context. This concatenation induces the deletion of all states of outdegree one and of their corresponding transitions, excepting terminal states. This saves 50% of memory space. At the same time, the reduction of the ....

....of biological DNA sequences, considering them as words over the alphabet $\Sigma = \{a, c, g, t\}$, we got that more than 60% of states have an outdegree one. So, the deletion of these states is worthwhile; it provides an important saving. The average analysis of the number of states and edges is done in [5] in a Bernoulli model of probability. When a state p is deleted, the deletion of outgoing edges is realized by adding the label of the outgoing edge of the deleted state to the labels of its incoming edges. For example, let r, p and q be states linked by transitions (r, b, p) and (p, a, q). We ....

[Article contains additional citation context not shown here]

A. Blumer, D. Haussler, and A. Ehrenfeucht. Average sizes of suffix trees and dawgs. Discrete Applied Mathematics, 24:37--45, 1989.


Complexity Of Sequential Pattern Matching Algorithms - May Mireille Egnier

No context found.

A. Blumer, A. Ehrenfeucht and D. Haussler, Average Size of Suffix Trees and DAWGS, Discrete Applied Mathematics, 24, 37-45 (1989).


Practical Suffix Tree Construction - Tata, Hankins, Patel (2004)

No context found.

A. Blumer, A. Ehrenfeucht, and D. Haussler. Average Sizes of Suffix Trees and DAWGs. Discrete Applied Mathematics, 24(1):37--45, 1989.


