| FLAJOLET, P. On the performance evaluation of extendible hashing and trie searching. Acta Informatica 20 (1983), 345--369. |
.... [2] for IP routing tables [34] and text compression [9]. Balanced tries (represented in a single array) are used as a data structure for dynamic hashing [18]. This broad range of applications justifies the view of tries as a general purpose data structure, whose properties are well understood [17, 19, 29, 25, 41, 47]. As our experiments have confirmed, tries are fast but space intensive. The high memory usage of tries has long been recognized as a serious problem [13, 31] and many techniques have been proposed to reduce their size. Proposals for modifications to tries that address the issue of high memory ....
P. Flajolet. On the performance evaluation of extendible hashing and trie searching. Acta Informatica, 20:345--369, 1983.
....were proposed for these algorithms. Undoubtedly, the most popular data structures for algorithms on words are digital trees [18, 21] (e.g., tries, PATRICIA tries, digital search trees) and suffix trees [14, 30]. The importance of digital trees stems from their applications in sorting and searching [5, 10, 15, 18, 21, 23, 26, 28, 29, 31], data compression [16, 30, 32, 33], pattern matching [14], the shortest common superstring problem [12], searching for a leader [9], estimating the number of questions necessary to identify many distinct objects [25], prediction [8], and so forth. These problems recently became very important due ....
P. Flajolet, On the Performance Evaluation of Extendible Hashing and Trie Searching, Acta Informatica, 20, 345--369, 1983.
.... list of references can be found in Handbook of Theoretical Computer Science [18]. The expected average depth of a trie containing n independent random strings from a distribution with density function $f \in L^2$ is $\Theta(\log n)$ [6]. This result holds also for data from a Bernoulli type process [8, 9]. The best known compression technique for tries is path compression. The idea is simple: paths consisting of a sequence of single child nodes are compressed, as shown in Figure 1b. A path compressed binary trie is often referred to as a Patricia tree. Path compression may reduce the size of the ....
P. Flajolet. On the performance evaluation of extendible hashing and trie searching. Acta Informatica, 20:345--369, 1983.
....set X contains zero or one element. The advantage of the trie is that it only maintains the minimal prefix set of symbols that is necessary to distinguish all the elements of X. Digital trees are a common data structure found in a number of algorithms on words such as sorting and searching [18, 5, 6, 14], data compression [16, 27, 28], and pattern matching [12]. The need for efficient storage and transmission of multimedia [13] and applications to DNA sequencing [12] point out the importance of such data structures. Patricia tries were introduced in 1968 by Morrison [21]. This structure is a ....
Flajolet, P., On the Performance Evaluation of Extendible Hashing and Trie Searching, Acta Informatica, 20, 345--369, 1983.
....analyzed. Knuth's book (vol. 3, [15]) contains the first analyses of parameters of tries, though these are restricted to additive parameters (number of internal nodes and external path length) in an essential way. The first works regarding trie height are due to Yao [30], Régnier [21, 22], Flajolet [5], and Szpankowski [25, 26]. When it appeared in 1992, Mahmoud's book [17] gave a general synthesis on trie analyses and the current state of knowledge. This paper is devoted to the average case analysis of another important parameter, namely the stack size. When choosing an order on the symbols of ....
Flajolet, P., On the performance evaluation of extendible hashing and trie searching. Acta Informatica 20, 1983, 345--369.
....p 2 ) P k (n k 1 =2) e 1=2 and consequently n k = p 2 n k 1 =2: This completes the proof of Proposition 1. 5. Tries. In this section we want to show that the height of tries can be handled in the same fashion as above to obtain the same result (which is well known, compare with [10, 4, 2, 11, 12]). Let $H_n = H_n^{TR}$ denote the random variable describing the height of tries of size n. Then it is known that the generating functions $G_k(x) = \sum_{n \ge 0} P[H_n \le k]\, \frac{x^n}{n!}$ (18) satisfy the recurrence relation $G_{k+1}(x) = G_k(x/2)^2$ (19) with $G_0(x) = 1 + x$. Thus, $G_k(x)$ 1 ....
Ph. Flajolet, On the performance evaluation of extendible hashing and trie searching, Acta Inf. 20 (1983), 345-369.
.... list of references can be found in Handbook of Theoretical Computer Science [22]. The expected average depth of a trie containing n independent random strings from a distribution with density function $f \in L^2$ is $\Theta(\log n)$ [8]. This result holds also for data from a Bernoulli type process [10, 11]. The best known compression technique for tries is path compression. The idea is simple: paths consisting of a sequence of single child nodes are compressed, as shown in Figure 1b. A path compressed binary trie is often referred to as a Patricia trie. Path compression may reduce the size of ....
P. Flajolet. On the performance evaluation of extendible hashing and trie searching. Acta Informatica, 20:345-369, 1983.
....would need to be made in order for the expected value of the maximum queue length to be r. This was not the growth pattern we were interested in (we wanted to know what happens when m and n grow together) and these papers did not investigate the error terms nor the speed of convergence. Flajolet [1983] sharpened and extended the estimates made by these authors. The application addressed by Flajolet relates to trie searching, particularly the maximum depth of the trie directories. Like the previous papers, Flajolet [1983] obtains the number of memories as a function of the number of processors ....
....papers did not investigate the error terms nor the speed of convergence. Flajolet [1983] sharpened and extended the estimates made by these authors. The application addressed by Flajolet relates to trie searching, particularly the maximum depth of the trie directories. Like the previous papers, Flajolet [1983] obtains the number of memories as a function of the number of processors and the maximum queue length, whereas we obtain the queue length as a function of the number of processors and memories. Otherwise, the two problems are the same. However, Flajolet [1983] uses methods entirely different ....
[Article contains additional citation context not shown here]
Flajolet, P. "On the Performance Evaluation of Extendible Hashing and Trie Searching," Acta Informatica, vol. 20, 1983, pp. 345--369.
.... of Theoretical Computer Science [21]. The expected average depth of a trie containing n independent random strings from a distribution with density function $f \in L^2$ is $\Theta(\log n)$ [7]. This result holds also for data from a Bernoulli type process [9, 10]. The best known compression technique for tries is path compression. The idea is simple: paths consisting of a sequence of single child nodes are compressed, as shown in Figure 1b. A path compressed binary trie is often referred to as a Patricia trie. Path compression may reduce the size of the ....
P. Flajolet. On the performance evaluation of extendible hashing and trie searching. Acta Informatica, 20:345--369, 1983.
.... other work, we observe that traditional version control systems such as SCCS [68] or RCS [76] are different in the following respects: Applications cannot control the use of deltas, modeling of version control is mixed up with its realization, and deltas are † Analysis and simulation results of [29] and [53] indicate that the above mentioned dynamic hashing techniques might be superior to our approach in the case of large databases, which consist of many thousands of pages and which would require deep tries. ‡ Research is under way to allow client processes to define complex events which ....
Ph. Flajolet. On the Performance Evaluation of Extendible Hashing and Trie Searching. Acta Informatica, vol. 20, Springer Press, pp. 345--369 (1983).
....the degree of fan-out dependent on key length. Binary tree access is commonly ignored in practice. Access paths of length greater than 14 in files exceeding 30K records are intolerable. However, one can attain large fan-out n-ary trees in tries, which label the edges with search key substrings [Bur76, Fla81]. Finding an effective set of edge labels (it must be a reasonably large disjoint set of key subsegments that appear in most keys) that is appropriate in a variety of applications is difficult. Consequently, trie retrieval tends to be reserved for special purpose applications. The access method ....
P. Flajolet, On the Performance Evaluation of Extendible Hashing and Trie Searching, RJ3258, IBM, San Jose CA, Oct. 1981.
....The string ruler method is used to show that the depth of a suffix tree does not differ significantly from the depth of an independent trie built over the same probabilistic model. We should mention here that such independent tries have been recently extensively analyzed, most notably in [8, 9, 15, 17, 21--24, 26--28]. In particular, Pittel [22] and Jacquet and Régnier [14] derived the limiting distribution for the depth in the independent model, while recently Jacquet and Szpankowski [15] extended this result to the Markovian model. Finally, in Section 4 we apply the string ruler approach to prove another ....
....the generating function $E z^{D_n} = E z^{D_n(i)}$ for all $i = 1, \ldots, n$, and it becomes (cf. [15]) $E z^{D_n} = 1 - \frac{1 - z}{n} \sum_{r=2}^{n} (-1)^r \binom{n}{r} \frac{r}{1 - z(p^r + q^r)}$ (2.4). Asymptotics of (2.4) were extensively studied in the past through the Mellin transform [8, 9, 14, 15, 17, 23, 24, 26, 27] and through probabilistic methods [6, 22]. For instance, the average depth $E D_n$ is equal to $E D_n = \frac{1}{h_1} \log n + \frac{1}{h_1}\left(\gamma + \frac{h_2}{2 h_1}\right) + P(\log n) + O(n^{-1})$, where $h_1 = -p \log p - q \log q$ is the entropy of the alphabet, $h_2 = p \log^2 p + q \log^2 q$, and $P(\log n)$ ....
[Article contains additional citation context not shown here]
P. Flajolet, On the Performance Evaluation of Extendible Hashing and Trie Searching, Acta Informatica, 20, 345-369 (1983).
....statistically independent. One may ask how close a random shape of a suffix tree resembles the shape of independent tries. In order to compare suffix trees and independent tries we need some results for tries. Fortunately, in recent years independent tries have been studied very extensively (cf. [FL, KN, PI, SZ1, SZ2]). In particular, it is proved that the average depth for tries is $E D_n = \frac{1}{E} \log n + O(1)$, where $E$ is the entropy of the alphabet, that is, $E = -\sum_{i=1}^{V} p_i \log p_i$. For the average height the following result is known (cf. [FL, PI, SZ2]): $E H_n = \frac{2}{\log P^{-1}} \log n + O(1)$, where $P = \sum$ ....
....have been studied very extensively (cf. [FL, KN, PI, SZ1, SZ2]). In particular, it is proved that the average depth for tries is $E D_n = \frac{1}{E} \log n + O(1)$, where $E$ is the entropy of the alphabet, that is, $E = -\sum_{i=1}^{V} p_i \log p_i$. For the average height the following result is known (cf. [FL, PI, SZ2]): $E H_n = \frac{2}{\log P^{-1}} \log n + O(1)$, where $P = \sum_{i=1}^{V} p_i^2$. Moreover, for independent keys it is proved that the variance of the depth $\operatorname{var} D_n$ is either $O(1)$ for symmetric alphabet (e.g., $\operatorname{var} D_n = 3.507\ldots$ for $V = 2$; cf. [SZ1]) or $\operatorname{var} D_n = \frac{E_2 - E^2}{E^3} \log n + O(1)$ for the asymmetric ....
P. Flajolet, On the Performance Evaluation of Extendible Hashing and Trie Searching, Acta Informatica, 20, 345--369 (1983).
....of a leaf node is greater than the global depth after a split, the directory doubles to make room to point to the split leaf nodes. Let b be the maximum number of RIDs in a leaf node. When b 1, Flajolet has investigated the expected number D of directory entries of EH with a coarse approximation [4], where m is the number of inserted records. Therefore, the directory size of EH with sufficiently large node size increases almost linearly as m grows. In fact, the expected directory size is negligible compared to the storage for leaf nodes. For example, if b = 1024, then after a million inserts, the number ....
P. Flajolet, "On the Performance Evaluation of Extendible Hashing and Trie Searching," Acta Informatica, Vol. 20, 1983, pp. 345--369.
.... [Figure 2. Extendible hashing structure.] investigated the expected number D of directory entries of EH with a coarse approximation (Flajolet 1983), where m is the number of inserted records. Therefore, the directory size of EH with sufficiently large node size increases almost linearly as m grows. In fact, the expected directory size is negligible compared to the storage for leaf nodes. For example, if b = 1024, then after a million inserts, the number ....
Flajolet, P. (1983) "On the Performance Evaluation of Extendible Hashing and Trie Searching."
....are based on easy and straightforward approximations. Keywords: Analysis of algorithms, data management, data and file structures, hashing. 1 Introduction. Extendible hashing (EH) was introduced and analyzed by Fagin et al. [3] and has been further studied by Mendelson [8], Yao [12], and Flajolet [4]. Their analysis concentrated on the asymptotic behavior for the different EH and Trie file characteristics, but followed the usual fixed bucket capacity for the leaf pages. This paper extends the analysis of EH found in [8] to cover Partial Expansions (PE) with elastic buckets [7]. Partial ....
Flajolet, P. On the Performance Evaluation of Extendible Hashing and Trie Searching, Acta Informatica 20, (1983), 345--369.
....height of a map of cardinal n. Then, as n tends to $\infty$, $h_n \sim 2 \log_2 n$. Recall that $1 + \lceil \log_2 n \rceil$ is the smallest achievable height of such a tree. Hence the tree is not balanced, but not too much out of balance either. Proof: This property expresses the average height of a radix exchange tree [15][47]. We give here another, basic proof. To find the average height of a tree, we must compute $\sum_{d=0}^{\infty} d \cdot P_n^{=d}$, where $P_n^{=d}$ is the probability that the height of the tree is exactly d. But $P_n^{=d} = P_n^{\ge d} - P_n^{\ge d+1}$, so: $h_n = \sum_{d=1}^{\infty} P_n^{\ge d}$ and we shall find fine bounds ....
Philippe Flajolet. On the performance evaluation of extendible hashing and trie searching. Acta Informatica, 20:345--369, 1983.
....issue). Ramanujan's Q-function plays a vital role again. We will not describe this here, since the present issue is somehow built around this fascinating research (see also Knuth's paper in this issue). 6. Generalized Digital Trees. Tries and digital search trees were covered in several papers [36, 37, 41, 42, 54, 59, 62]. We have already discussed digital search trees in Section 4. However, the paper [103] is perhaps the most influential and original contribution in this area, and therefore we will briefly sketch it here. The recursion $f_{n+b} = 1 + \sum_{k=0}^{n} 2^{-n} \binom{n}{k} (f_k + f_{n-k})$, where b = ....
P. Flajolet. On the performance evaluation of extendible hashing and trie searching. Acta Informatica, 20:345--369, 1983.
No context found.
Flajolet, P. On the performance evaluation of extendible hashing and trie searching. Acta Inf. 20 (1983), 345--369.
....were in the context of memoryless sources. For the additive parameters of size and path length, they were performed by De Bruijn and Knuth around 1965 and reported in the first edition of [34] published in 1973. Height was later analysed under Poisson and Bernoulli models in a series of papers [16, 20, 46, 63]. Asymmetric memoryless sources and Markov chain models were then treated systematically by Pittel, as well as Szpankowski and his collaborators: see for instance [26, 27, 43, 54, 55]. Devroye [11] has been the first to consider the effect on tries of a nonuniform density but only in conjunction ....
....built on an n-tuple of random items has height at most k under the Bernoulli model. The exponential generating function $\Pi_k(z)$ of the $\pi_{k,n}$ has value $\Pi_k(z) = \prod_{|h|=k} (1 + z u_h)$. In the case of a memoryless source, the analysis can be conducted from there using the saddle point method [16, 20] and it even becomes completely elementary in the case of unbiased binary tries; see [39]. We propose to follow precisely the same approach in order to analyse height in the case of a general dynamical source. The reader is also referred to Jacquet and Szpankowski's interesting paper [29] for a ....
[Article contains additional citation context not shown here]
Flajolet, P. On the performance evaluation of extendible hashing and trie searching. Acta Inf. 20 (1983), 345--369.
....They concern the size of the directory which exhibits a non linear growth of the form $n^{\beta}$, $\beta > 1$. However, the non linearity factor is of the rough form $n^{1/b}$, so that the observed behaviour is practically linear provided small values of b are avoided. The estimates are due to Flajolet [8] and Régnier [22]. They are based on occupancy statistics, saddle point estimates and Mellin transforms. Results in this paper indicate that, under paging conditions, trees of low degree (binary search trees and tries, quadtrees and quadtries, generalized digital trees) compare very favorably to ....
Flajolet, P. On the performance evaluation of extendible hashing and trie searching. Acta Inf. 20 (1983), 345--369.
No context found.
FLAJOLET, P. On the performance evaluation of extendible hashing and trie searching. Acta Informatica 20 (1983), 345--369.
No context found.
P. Flajolet. On the performance evaluation of extendible hashing and trie searching. Acta Informatica, 20(4):345--369, 1983.
No context found.
Flajolet, P. (1983). On the performance evaluation of extendible hashing and trie searching. Acta Informatica, vol. 20, pp. 345--369.