| P. Ferragina and R. Grossi. A fully-dynamic data structure for external substring search. In Proceedings of the 27th Annual ACM Symposium on the Theory of Computing (STOC '95), pages 693-702. ACM, May 1995. |
....(b) corresponding nested intervals; (c) corresponding disjoint intervals and the equivalent set of disjoint prefixes. The first two characteristics mean that certain theoretically appealing solutions based on, e.g., suffix trees [21], string prefix matching [3, 4], or dynamic string searching [12] are not applicable, as their performance would not scale. Fortunately, the third characteristic means that specialized data structures can be designed with the desired performance levels. There are many papers in the literature proposing schemes to solve the IP routing problem [7, 8, 9, 10, 11, ....
P. Ferragina and R. Grossi. A fully-dynamic data structure for external substring search. In Proc. 27th ACM STOC, pages 693--702, 1995.
....Such elements cannot be manipulated efficiently in standard B-trees, which assume that elements (and thus routing elements) are of unit size. Ferragina and Grossi developed the elegant string B-tree, where a query string q is routed through a node using a so-called blind trie data structure [78]. A blind trie is a variant of the compacted trie [104, 117] which fits in one disk block. In this way a query can be answered in O(log_B N + |q|/B) I/Os. See [64, 80, 77, 20] for other results on string B-trees and external string processing. 3 Buffer trees In internal memory, an N element ....
P. Ferragina and R. Grossi. A fully-dynamic data structure for external substring search. In Proc. ACM Symp. on Theory of Computation, pages 693--702, 1995.
.... of sparse suffix trees of Kärkkäinen and Ukkonen [20] that consider a reduced set of suffixes, the suffix cactus of Kärkkäinen [19], which degenerates the suffix tree structure without overcharging the access time too much, and the version dedicated to external memory (SB-trees) by Ferragina and Grossi [14], but several others exist (see [3] and [18] for example). An excellent solution to save size of suffix structures is to simultaneously compact and minimize the suffix trie. Compaction and minimization are commutative operations, and when both are applied, they yield the compact suffix automaton, denoted by ....
P. Ferragina and R. Grossi. A fully-dynamic data structure for external substring search. Proceedings of the 27th ACM Symposium on the Theory of Computing, ACM Press, 1995.
.... sparse suffix trees due to Kärkkäinen and Ukkonen [38], which consider a reduced set of suffixes, the suffix cactus due to Kärkkäinen [39], which degenerates the suffix tree structure without increasing the access time too much, and the version dedicated to external memory (SB-trees) by Ferragina and Grossi [40], but several other variations exist (see [41] and [42] for example). An excellent solution to save on the size of suffix structures is to simultaneously compact and minimize the suffix trie. Compaction and minimization are commutative operations, and when both are applied, they yield the compact suffix ....
P. Ferragina and R. Grossi. A fully-dynamic data structure for external substring search. Proceedings of the 27th ACM Symposium on the Theory of Computing, ACM Press, 1995.
.... The problem of implementing various classes of permutations has been addressed in [47, 48, 50]. More recently researchers have moved on to more specialized problems in the computational geometry [11, 15, 34, 40, 67, 74, 79, 110, 121, 130, 137], graph [12, 40, 42, 97], and string areas [44, 56, 57]. As already mentioned, the number of I/O operations needed to read the entire input is N/B, and for convenience we call this quotient n. We use the term scanning to describe the fundamental primitive of reading (or writing) all elements in a set stored contiguously in external memory by reading ....
P. Ferragina and R. Grossi. A fully-dynamic data structure for external substring search. In Proc. ACM Symp. on Theory of Computation, pages 693--702, 1995.
.... eighties by Aggarwal and Vitter [6], and subsequently I/O algorithms have been developed for several problem domains, including computational geometry [29, 7, 13, 14, 4, 15, 31, 38, 39, 41, 3, 44, 2, 12, 13, 16, 28, 30, 44], graph algorithms [17, 7, 33, 1, 21, 8, 27, 35, 40], and string processing [25, 26, 11, 20]. Also, I/O performance can often be improved if many disks can efficiently be used in parallel, and the use of parallel disks has received a lot of theoretical attention. Recent surveys of theoretical results in the area of I/O-efficient algorithms can be found in [10, 9, 42, 43]. TPIE, a ....
P. Ferragina and R. Grossi. A fully-dynamic data structure for external substring search. In Proc. ACM Symp. on Theory of Computation, pages 693--702, 1995.
....with external keys we have a problem. Such a scan means that every pointer to an external key would have to be dereferenced, causing a horrendous amount of I/O. A solution that has been proposed to this problem, in the 1-dimensional case, is to store a small in-memory structure: an elided trie [4, 6]. An elided trie for a set of strings is obtained as follows. First, construct a compacted trie on the strings in question [14]. Many edges in this trie will have multiple symbols, in situations where the first of these symbols determined the path to be taken down the trie, and there was no ....
....achieved by our implementation and one for the ideal ratio. It is evident that due to the design of the string B-tree, the space efficiency of R-trees improves when used in the string domain. 5 Related Work The work presented in this paper is based on the idea of a string B-tree proposed in [6, 5, 4]. In these papers, the authors introduce the notion of an index structure designed for unbounded-length strings, and use elided tries in the leaf pages of a B-tree for this purpose. However, the work described is very specific to a (1-dimensional) B-tree. Our contribution in this paper is to extend ....
P. Ferragina and R. Grossi. A Fully Dynamic Data Structure For External Substring Search. Proceedings of the 27th Annual ACM Symposium on the Theory of Computing, pages 693--702, May 1995.
....a stratified index over the PATRICIA tries. However, applying this savings to might lead us to read the wrong block, however, such errors are immediately detected, and the correct block is read. In Section 5 we compare our data structure with the String B-tree proposed by Ferragina and Grossi [4, 3]. Finally in Section 6 we compare the performance of the new access method to B-trees. 2 Tries Let Σ be a finite set of characters (to be referred to as the alphabet); a string over Σ is a finite sequence of characters, and Σ* denotes the set of all such strings. Let T be a ....
....PT, the search might lead us to a key K ≠ P, so one has to check if K = P. For example, when searching the key P = 10010, one follows the edge labeled 1 = P[0] to reach a vertex at depth 3, then the edge labeled 1 = P[3] to reach a vertex at depth 4, and finally the edge labeled 0 = P[4] to reach key K = 11110. node PT_search(key P, PT T) { v = blind_search(P, T); if Key(v) = P then return SUCCESS at v; else return FAIL at v; } Program PT Search 4.1.2 Insertion To insert a record R with a new key P, we first find its PT parent, i.e. a PT vertex y that points to ....
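The blind search followed by a key check described in this excerpt can be sketched in Python as below. This is a minimal illustration, not the cited authors' implementation: the `Node` class, the `build` helper, and the key set `KEYS` are hypothetical, chosen only so that the walk for P = 10010 passes through branching positions 0, 3 and 4 and ends at key 11110, as in the excerpt. It assumes distinct keys of equal length.

```python
import os
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """Blind-trie vertex: internal nodes store only a branching position."""
    index: int = -1                      # position of the branching character
    children: Dict[str, "Node"] = field(default_factory=dict)
    key: Optional[str] = None            # full key, set only on leaves


def build(keys):
    """Build a blind trie over distinct, equal-length keys (an assumption)."""
    keys = sorted(set(keys))
    if len(keys) == 1:
        return Node(key=keys[0])
    d = len(os.path.commonprefix(keys))  # first position where the keys differ
    node = Node(index=d)
    groups = {}
    for k in keys:
        groups.setdefault(k[d], []).append(k)
    for ch, group in groups.items():
        node.children[ch] = build(group)
    return node


def blind_search(pattern, root):
    """Descend by inspecting only branching characters; skipped ones are not compared."""
    v = root
    while v.key is None:
        ch = pattern[v.index] if v.index < len(pattern) else None
        # With no matching edge, any child will do: the final check rejects it.
        v = v.children.get(ch) or next(iter(v.children.values()))
    return v


def pt_search(pattern, root):
    """Blind search, then one full comparison against the key that was reached."""
    v = blind_search(pattern, root)
    return ("SUCCESS" if v.key == pattern else "FAIL", v.key)


KEYS = ["00000", "11101", "11110", "11111"]  # hypothetical key set
root = build(KEYS)
print(pt_search("10010", root))  # ('FAIL', '11110')
print(pt_search("11110", root))  # ('SUCCESS', '11110')
```

The search for 10010 reaches 11110 because positions 1 and 2 are never inspected; only the final comparison with the stored key reveals the mismatch.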
P. Ferragina and R. Grossi. A fully-dynamic data structure for external substring search. In ACM Symposium on the Theory of Computing, pages 693--702, 1995.
.... The problem of implementing various classes of permutations has been addressed in [47, 48, 50]. More recently researchers have moved on to more specialized problems in the computational geometry [11, 15, 34, 40, 67, 74, 79, 110, 121, 130, 137], graph [12, 40, 42, 97], and string areas [44, 56, 57]. As already mentioned, the number of I/O operations needed to read the entire input is N/B, and for convenience we call this quotient n. We use the term scanning to describe the fundamental primitive of reading (or writing) all elements in a set stored contiguously in external memory by reading ....
P. Ferragina and R. Grossi. A fully-dynamic data structure for external substring search. In Proc. ACM Symp. on Theory of Computation, pages 693--702, 1995.
....is applied to a contour before it is displayed. 1.2 Previous results In the last few years, considerable attention has been given to the development of I/O-efficient algorithms in many problem domains, including sorting and permuting [1, 38], computational geometry [2, 7, 22], string algorithms [6, 19], and graph algorithms [3, 11, 21, 27, 34]. There has recently been growing interest in developing I/O-efficient geometric algorithms with applications in GIS [5, 7, 22]; see also the recent survey by Arge [4]. There has also been a lot of work in the database community on I/O algorithms for GIS ....
P. Ferragina and R. Grossi. A fully-dynamic data structure for external substring search. In Proc. ACM Symp. on Theory of Computation, pages 693--702, 1995.
.... 71]. The problem of implementing various classes of permutations has been addressed in [38, 39, 41]. More recently researchers have moved on to more specialized problems in the computational geometry [12, 14, 19, 32, 53, 99], graph theoretical [13, 14, 32, 34, 52, 66], and string processing areas [15, 35, 45, 46]. As already mentioned, the number of I/O operations needed to read the entire input is N/B, and for convenience we call this quotient n. One normally uses the term scanning to describe the fundamental primitive of reading (or writing) all elements in a set stored contiguously in external memory by ....
P. Ferragina and R. Grossi. A fully-dynamic data structure for external substring search. In Proc. ACM Symp. on Theory of Computation, pages 693--702, 1995.
.... matrix algebra and related problems arising in scientific computation [3, 51, 52]. More recently, researchers have designed external memory algorithms for a number of problems in different areas, such as in computational geometry [32, 5, 53, 31, 2, 11, 34, 44, 47, 12, 50, 17, 1], string processing [28, 29, 9], and graph theoretic computation [6, 24, 38, 35]. Some encouraging experimental results regarding the practical merits of the developed algorithms have also been obtained [23, 51, 11, 33]. Recent surveys can be found in [7, 8]. 1.3 Our Results In this paper, we combine and modify in novel ways ....
P. Ferragina and R. Grossi. A fully-dynamic data structure for external substring search. In Proc. ACM Symp. on Theory of Computation, pages 693--702, 1995.
....h incoming pointers could be in different blocks and, when no specific block strategy is adopted, moving u might cause the explicit update of all these pointers. If h is not bounded a priori, this single update becomes costly as we have to pay Θ(h) I/Os. More details on this problem can be found in [14]. 4 Related Work. There is a great deal of work on decomposable searching problems. They were introduced by Bentley [8] for dynamizing static data structures. The initial goal was to support insertions with low amortized times, without affecting much of the query efficiency. Other dynamization ....
P. Ferragina and R. Grossi. A fully dynamic data structure for external substring search. Proc. 27th ACM Symp. on Theory of Computing (1995), 693-702. Full version in: An external-memory indexing data structure and its applications. To appear in Journal of the ACM.
....tree [3, 65, 75], the repetition finder [82], and the subword tree [8, 24]. The ability of the suffix tree to represent all the substrings in linear space has inspired several variations. The suffix array [76], cactus suffix array [63], dynamic suffix array [37], PAT array [50], and SB-tree [38] are examples of arrays or trees containing the suffixes of the given string in the lexicographic order obtained by visiting the leaves of the corresponding suffix tree. The directed acyclic word graph (DAWG) and minimal suffix and factor automata [16, 28, 30] are either labeled graphs or automata ....
Ferragina, P., and Grossi, R., A fully-dynamic data structure for external substring search, Proc. ACM Symposium on Theory of Computing (1995).
....Among them, the SB-tree is the most I/O-efficient in the worst case; it can be employed to sort a set of strings in O(K log_B K + N/B) I/Os. This sorting algorithm can be converted into an optimal Θ(K log_2 K + N) time algorithm for the internal comparison model by fixing B to a constant [21]. Very recently, Bentley and Sedgwick [14] emphasized the practical importance of string sorting and presented a version of quicksort that also achieves the optimal Θ(K log_2 K + N) comparison bound. Analyzed in the I/O model, however, the algorithm uses O(K log_2 K + N) I/Os, which is worse ....
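To make the gap between the two I/O bounds in this excerpt concrete, the following Python sketch evaluates K log_B K + N/B (SB-tree sorting) and K log_2 K + N (quicksort analyzed in the I/O model) for one set of sample values. Several things here are assumptions: K is read as the number of strings, N as their total length, and B as the block size; the function names and sample values are hypothetical; and the constants hidden by the O-notation are ignored, so the numbers indicate growth only, not measured I/O counts.

```python
import math


def sb_tree_sort_ios(K, N, B):
    """Leading term of the SB-tree string-sorting bound, K log_B K + N/B."""
    return K * math.log(K, B) + N / B


def quicksort_ios(K, N):
    """Leading term of the quicksort bound in the I/O model, K log_2 K + N."""
    return K * math.log2(K) + N


# Hypothetical workload: about a million strings, a billion characters, 1024-item blocks.
K, N, B = 2**20, 2**30, 2**10
sb = sb_tree_sort_ios(K, N, B)
qs = quicksort_ios(K, N)
print(f"SB-tree bound:   {sb:.3e}")
print(f"quicksort bound: {qs:.3e}")
print(f"ratio:           {qs / sb:.1f}")
```

With these values the N term dominates the quicksort bound, whereas the SB-tree bound divides that term by B and shrinks the logarithm's contribution by a factor of log_2 B.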
P. Ferragina and R. Grossi. An external-memory indexing data structure with applications. Full version of STOC'95 paper "A fully-Dynamic data structure for external substring search", 1996.
.... done on matrix algebra and related problems arising in scientific computation [2, 45, 46]. More recently, researchers have designed I/O algorithms for a number of problems in different areas, such as in computational geometry [6, 10, 28], graph theoretic computation [6, 7, 16], and string matching [11, 17, 20, 22, 23]. Aggarwal and Vitter [2] proved that the number of I/Os needed to sort N indivisible elements is Ω((N/B) log_{M/B} (N/B)) in the comparison I/O model (where the order between two elements can be inferred only by their comparison or by transitivity). They also proved that ....
....is developed, and using this technique the buffer tree is designed. Using this structure in the external insertion sort yields an optimal sorting algorithm. As far as the general string sorting problem is concerned, there are a number of data structures like prefix B-trees [13], SB-trees [20], compacted tries [38], suffix trees [17, 36], and suffix arrays [35] that can be used to sort arbitrarily long strings in external memory. Among them, the SB-tree is the most I/O-efficient in the worst case; it can be employed to sort a set of strings in O(K log_B K + N/B) I/Os. This sorting ....
P. Ferragina and R. Grossi. A fully-dynamic data structure for external substring search. In Proc. ACM Symp. on Theory of Computation, pages 693--702, 1995.
....pointers could be in different blocks and, when no specific block strategy is adopted, moving u might cause the explicit update of all these pointers. If h is not bounded a priori, this single update becomes costly as we have to pay Θ(h) I/Os. More details on this problem can be found in [12]. Related Work. There is a great deal of work on decomposable searching problems. They were introduced by Bentley [8] for dynamizing static data structures. The initial goal was to support insertions with low amortized times, without affecting much of the query efficiency. Other dynamization ....
P. Ferragina and R. Grossi. A Fully Dynamic Data Structure for External Substring Search, 27th ACM Symp. on Theory of Computing (1995). Full version in: An External-Memory
....if any, needed in D(U, N/K) for the sampled intervals from V_i and V_j (there are at most eight such intervals) take O(id(U, N/K)) time in all. Finally, all amortized bounds can be converted to worst-case ones by performing lazy split and merge operations (details in the full paper, see also [9]). Processing a Stab(p) query is more involved. If p lies between two consecutive EV s sets, then Stab(p) returns the smallest interval (if any) containing p in D(U, N/K), which can be retrieved in O(sq(U, N/K)) time. For the rest of this proof, assume p lies within some set EV i : this set can be ....
P. Ferragina and R. Grossi. A fully-dynamic data structure for external substring search. In ACM Symp. on Theory of Computing, 693--702, 1995. Full version in Technical Report 18/96, Dipartimento di Sistemi e Informatica, Università di Firenze, Italy.
....this dynamic situation and the consequent fragmentation of the text, we need to design novel techniques and data structures that allow us to discard those spurious occurrences without examining all of them. In particular, we show how the String B tree data structure, recently proposed in [7], can be successfully applied to solve some of the subproblems arising in the implementation of the operations above. We start out by describing our representation of the changing text in Section 2, and presenting the high level ideas underlying the design of our algorithms in Section 3. We go on ....
....particular, Sections 7.1.1 and 7.1.2 present previous work while Section 7.1.3 presents an original result. We then give the solutions to Problems 1-4 in Sections 7.3.1-7.3.4. 7.1 Some useful tools In this section, we describe the suffix tree [17, 14] and an application of the String B-tree [7] called dynamic suffix array [5]. We then introduce a new labelling technique different from Karp et al.'s [11]. 7.1.1 The Suffix Tree Definition. The suffix tree [14, 17] for a string X[1, m], denoted by ST_X, is a compacted trie that stores the suffixes of the augmented string X[1, m+1] in ....
P. Ferragina, R. Grossi. A fully-dynamic data structure for external substring search. ACM Symposium on the Theory of Computing, pages 693--702, 1995. Full version in http://www.di.unipi.it/~ferragin and http://www.dsi.unifi.it/~grossi
....and optical disks, prefix and range searching, string searching and sorting, suffix array, suffix tree, text indexing. AMS(MOS) subject classifications: 68P05, 68P10, 68P20, 68Q20, 68Q25. The results described in this paper were presented at the ACM Symposium on Theory of Computing (1995); see [18]. The first author was supported by MURST of Italy and by a Post-Doctoral Fellowship at the Max-Planck-Institut für Informatik, Saarbrücken, Germany (ferragin@di.unipi.it). The second author was supported by MURST of Italy (grossi@dsi.unifi.it). 1 Introduction Large scale heterogeneous ....
....external memory text indexing data structures whose performance is provably good in the worst case is important. In this paper, we introduce a new data structure, the String B-Tree, which achieves this goal. In a short phrase, it is a combination of B-trees and Patricia tries for internal node indices that is made more effective by adding extra pointers to speed up search and update operations. In a certain sense, String B-trees link external memory data structures to string matching data .... [Footnote 1: The original name of the data structure was SB-tree [18, 19]. Recently, Don Knuth pointed out the ....]
Ferragina, P., and Grossi, R. A fully-dynamic data structure for external substring search. In ACM Symposium on the Theory of Computing (1995), pp. 693--702.
....in scientific literature under different names and studied in various forms (e.g. see [Apo85, CR94]). The following is a brief list of them. 1. The first group is made up of data structures whose underlying topology is a trie. Among other things, they include compacted tries, blind tries [FG95a], Patricia trees [Mor68], subword trees [Apo85], and suffix trees [McC76]. Given a string set S, these data structures have the following form: Each arc is labeled by a substring (possibly a character). Sibling arcs are ordered according to their first character (which are distinct). Each node ....
....and minimal suffix and factor automata allow us to perform simple pattern queries, whereas complete inverted files enable more complex queries. 3. Finally, the third group comprises lexicographically ordered data structures such as the suffix array [MM93], the PAT array [BG91], and the SB-tree [FG95a]. They maintain strings (and possibly their suffixes) in lexicographic order. The space required is still optimal but the construction time is suboptimal by a logarithmic factor in the worst case (which disappears on average for suffix arrays [MM93]). Searching for a pattern string takes optimal ....
P. Ferragina and R. Grossi. A fully-dynamic data structure for external substring search. In Proc 27th STOC, pages 693--702, 1995.
....expected bounds. 1 Introduction There is a growing interest in algorithms working on sets of data that are too large to fit in the internal memory of computers, and that consequently need to perform input/output accesses to external storage devices, like disks and CD-ROMs (see e.g. [4, 11, 19, 21, 29, 37]). These devices are roughly 10^6 times slower than internal memory in terms of access time. In many applications, this disparity has given rise to an input/output (or I/O) bottleneck, in which the time spent on moving data between internal and external memory dominates the overall ....
P. Ferragina and R. Grossi. A fully-dynamic data structure for external substring search. In Proc. 27th Annu. ACM Sympos. Theory Comput., 693--702, 1995.
No context found.
P. Ferragina and R. Grossi. A fully-dynamic data structure for external substring search. In Proceedings of the 27th Annual ACM Symposium on the Theory of Computing (STOC '95), pages 693-702. ACM, May 1995.