
## External Memory Algorithms and Data Structures (1998)

Citations: | 349 - 23 self |

### Citations

3138 | Modern Information Retrieval
- Baeza-Yates, Ribeiro-Neto
- 1999
Citation Context ...id method was introduced as the basis of the widely used GLIMPSE search tool [227]. Another way to index text is to use hashing to get small signatures for portions of text. The reader is referred to =-=[154; 61]-=- for more background on the above methods. 13.2 String B-Trees In a conventional B-tree, Θ(B) unit-sized keys are stored in each internal node to guide the searching, and thus the entire node fits int... |

1243 | The R*-tree: An Efficient and Robust Access Method for Points and Rectangles
- Beckmann, Kriegel, et al.
- 1990
Citation Context ...he dynamic setting, there are several popular heuristics for where to insert new items into an R-tree and how to rebalance it; see [18; 158; 170] for a survey. The R*-tree variant of Beckmann et al. =-=[74]-=- seems to give the best overall query performance. To insert an item, we start at the root and recursively insert the item into the subtree whose bounding box would expand the least in order to accommodat... |
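
The subtree choice described in this context can be sketched as follows. This is a minimal illustration only: the names `Rect`, `enlargement`, and `choose_subtree` are hypothetical, the tie-break on smaller area is an assumption, and the full R*-tree insertion uses further criteria that are not shown here.

```python
from typing import List, Tuple

# Axis-aligned rectangle as (xmin, ymin, xmax, ymax); a hypothetical representation.
Rect = Tuple[float, float, float, float]


def area(r: Rect) -> float:
    """Area of a rectangle."""
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle that contains both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def enlargement(box: Rect, item: Rect) -> float:
    """How much the area of box must grow to accommodate item."""
    return area(union(box, item)) - area(box)


def choose_subtree(child_boxes: List[Rect], item: Rect) -> int:
    """Index of the child whose bounding box expands the least.

    Ties are broken by smaller current area (an assumption of this sketch).
    """
    return min(
        range(len(child_boxes)),
        key=lambda i: (enlargement(child_boxes[i], item), area(child_boxes[i])),
    )
```

Applied recursively from the root, this rule selects one child per level until a leaf is reached.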

744 | A Fast String Searching Algorithm
- Boyer, Moore
- 1977
Citation Context ...n inverted file is used to specify the chunks containing each word; the search within a chunk can be carried out by using a fast sequential method, such as the Knuth-Morris-Pratt [204] or Boyer-Moore =-=[84]-=- methods. This particular hybrid method was introduced as the basis of the widely used GLIMPSE search tool [227]. Another way to index text is to use hashing to get small signatures for portions of te... |

591 | The input/output complexity of sorting and related problems
- Aggarwal, Vitter
- 1988
Citation Context ...rix transposition. Hong and Kung [185] developed a pebbling model of I/O for straightline computations, and Savage and Vitter [281] extended the model to deal with block transfer. Aggarwal and Vitter =-=[23]-=- generalized Floyd’s I/O model to allow D simultaneous block transfers, but the model was unrealistic in that the D simultaneous transfers were allowed to take place on a single disk. They developed m... |

264 | Geometric range searching and its relatives
- Agarwal, Erickson
- 1999
Citation Context ...ry performance is no better than that of cross trees, and for some methods, such as grid files, queries can require Θ(n) I/Os, even if there are no points satisfying the query. We refer the reader to =-=[18; 158; 246]-=- for a broad survey of these and other interesting methods. Space-filling curves arise again in connection with R-trees, which we describe next. 11.2 R-trees The R-tree of Guttman [176] and its many v... |

255 | Data structures for mobile data
- Basch, Guibas, et al.
- 1999
Citation Context ...e same sort of data structure as for nonmoving data, but to update it whenever items move sufficiently far so as to trigger important combinatorial events that are relevant to the application at hand =-=[70]-=-. For example, an event relevant for range search might be triggered when two items move to the same horizontal displacement (which happens when the x-ordering of the two items is about to switch). A ... |
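
The event mentioned in this context (two items reaching the same horizontal displacement) can be computed in closed form if each item moves with constant velocity along the x-axis. Linear motion is an assumption of this sketch, not something the context states, and the function name `next_x_swap` is hypothetical.

```python
from typing import Optional


def next_x_swap(x1: float, v1: float, x2: float, v2: float,
                now: float = 0.0) -> Optional[float]:
    """Time after `now` at which two linearly moving items share an x-coordinate.

    Item i is assumed to be at position x_i + v_i * t at time t.
    Returns None if the x-ordering of the two items never switches after `now`.
    """
    if v1 == v2:
        return None  # equal velocities: the gap between the items never closes
    t = (x2 - x1) / (v1 - v2)  # solve x1 + v1*t == x2 + v2*t
    return t if t > now else None
```

A structure that handles such events would typically keep these times in a priority queue and process the earliest one first.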

199 | Active disks: programming model, algorithms and evaluation
- Acharya, Uysal, et al.
- 1998
Citation Context ...r parallel computation [122; 121; 215; 294]. Models of “active disks” augmented with processing capabilities to reduce data traffic to the host, especially during streaming applications, are given in =-=[4; 267]-=-. Models of microelectromechanical systems (MEMS) for mass storage appear in [171]. Some authors have studied problems that can be solved efficiently by making only... |

183 | Indexing moving points
- Agarwal, Arge, et al.
- 2000
Citation Context ... achieve the same bounds for closest pair by maintaining a well-separated pair decomposition. For finding nearest neighbors and approximate nearest neighbors, two other approaches are partition trees =-=[9; 10]-=- and locality-sensitive hashing [164]. Planar point location is studied in [37; 306], and the dual problem of planar point... |

183 | An asymptotically optimal multiversion B-tree
- Becker, Gschwind, et al.
- 1996
Citation Context ... performance in internal memory. Reducing the number of pointers allows a higher branching factor and thus faster search. Partially persistent versions of B-trees have been developed by Becker et al. =-=[73]-=-, Varman and Verma [309], and Arge et al. [37]. By persistent data structure, we mean that searches can be done with respect to any timestamp y [133; 134]. In a partially persistent data structure, on... |

149 | The Buffer Tree: A New Technique for Optimal I/O-Algorithms - Arge - 1995 |

149 | Cache-oblivious B-trees
- Bender, Demaine, et al.
Citation Context ...archical models can be run in the PDM setting. Frigo et al. [156] introduce the notion of cache-oblivious algorithms, which require no knowledge of the storage parameters, like M and B. Bender et al. =-=[75]-=- and Bender et al. [76] develop cache-oblivious versions of B-trees. We refer the reader to [33] for a survey of cache-oblivious algorithms and data structures. The match between theory and practice i... |

132 | A model for hierarchical memory
- Aggarwal, Alpern, et al.
- 1987
Citation Context ... from registers at the small end to tertiary storage at the large end. Optimal algorithms for PDM often generalize in a recursive fashion to yield optimal algorithms in the hierarchical memory models =-=[20; 21; 317; 319]-=-. Conversely, the algorithms for hierarchical models can be run in the PDM setting. Frigo et al. [156] introduce the notion of cache-oblivious algorithms, which require no knowledge of the storage par... |

113 | Decomposable searching problems I: Static-to-dynamic transformation
- Bentley, Saxe
- 1980
Citation Context ...ta and then computing the final result from the solutions to each subset. Dictionary search and range searching are obvious examples of decomposable problems. Bentley developed the logarithmic method =-=[79; 254]-=- to convert efficient static data structures for decomposable problems into general dynamic ones. In the internal memory setting, the logarithmic method consists of maintaining a series of static subs... |
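
The logarithmic method named in this context can be illustrated with dictionary search, one of the decomposable problems it mentions. The sketch below is an internal-memory illustration under stated assumptions: the class name `LogarithmicDict` is hypothetical, a sorted Python list stands in for the static structure, and only insertions and membership queries are handled.

```python
import bisect
from typing import List, Optional


class LogarithmicDict:
    """Insert-only dictionary built from static sorted arrays.

    levels[i] is either None or a sorted list of exactly 2**i keys,
    so the occupied levels mirror the binary representation of the size.
    """

    def __init__(self) -> None:
        self.levels: List[Optional[List[int]]] = []

    def insert(self, key: int) -> None:
        carry = [key]
        i = 0
        while True:
            if i == len(self.levels):
                self.levels.append(None)
            if self.levels[i] is None:
                self.levels[i] = carry  # empty slot: store the rebuilt structure
                return
            # Slot occupied: merge and rebuild one level up, like a binary carry.
            carry = sorted(carry + self.levels[i])
            self.levels[i] = None
            i += 1

    def contains(self, key: int) -> bool:
        # Decomposability: query every static subset and combine the answers.
        for level in self.levels:
            if level is not None:
                j = bisect.bisect_left(level, key)
                if j < len(level) and level[j] == key:
                    return True
        return False
```

After five insertions the structure holds one array of size 1 and one of size 4, matching the binary representation 101.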

110 | Data Streams – Models and Algorithms - Aggarwal - 2007 |

107 | The uniform memory hierarchy model of computation. Algorithmica
- Alpern, Carter, et al.
- 1994
Citation Context ....1. algorithms as well as the following references: Aggarwal et al. [20] define an elegant hierarchical memory model, and Aggarwal et al. [21] augment it with block transfer capability. Alpern et al. =-=[29]-=- model levels of memory in which the memory size, block size, and bandwidth grow at uniform rates. Vitter and Shriver [319] and Vitter and Nodine [317] discuss parallel versions and variants of the hi... |

107 | Prefix B-trees
- Bayer, Unterauer
- 1977
Citation Context ...ings could be stored instead in each node, but access to the strings during search would require more than a constant number of I/Os per node. In order to save space in each node, Bayer and Unterauer =-=[72]-=- investigated the use of prefix representations of keys. Ferragina and Grossi [148; 149] recently developed an elegant generalization of the B-tree called the String B-tree or simply SB-tree (not to b... |

103 | A functional approach to external graph algorithms. Algorithmica
- Abello, Buchsbaum, et al.
- 2002
Citation Context ...key values among the N items, but all the items have different secondary information that must be maintained, and therefore items cannot be aggregated with a count. Abello et al. =-=[2]-=- and Matias et al. [229] develop optimal distribution sort algorithms for bundle sorting using $\mathrm{BundleSort}(N, K) = O\bigl(n \cdot \max\{1, \log_m \min\{K, n\}\}\bigr)$ (7) I/Os, and Matias et al. [229] prove the matching lowe... |

102 | Hierarchical memory with block transfer
- Aggarwal, Chandra, et al.
- 1987
Citation Context ... from registers at the small end to tertiary storage at the large end. Optimal algorithms for PDM often generalize in a recursive fashion to yield optimal algorithms in the hierarchical memory models =-=[20; 21; 317; 319]-=-. Conversely, the algorithms for hierarchical models can be run in the PDM setting. Frigo et al. [156] introduce the notion of cache-oblivious algorithms, which require no knowledge of the storage par... |

85 | On two-dimensional indexability and optimal range search indexing
- Arge, Samoladas, et al.
- 1999
Citation Context ...hich in addition has optimal output complexity, appears in [11]. Range-max and stabbing-max queries are studied in [14; 16]. 11.4 Bootstrapping for Three-Sided Orthogonal 2-D Range Search Arge et al. =-=[49]-=- provide another example of the bootstrapping paradigm by developing an optimal dynamic EM data structure for three-sided orthogonal 2-D range searching (see Figure 9(c)) that meets all three design c... |

80 | Optimal dynamic interval management in external memory - Arge, Vitter - 1996 |

78 | The Priority R-tree: a practically efficient and worst-case optimal R-tree
- Arge, Berg, et al.
Citation Context ... as range searching and point location. Similar questions apply to the methods discussed in Section 11.1. New R-tree partitioning methods by de Berg et al. [119], Agarwal et al. [17], and Arge et al. =-=[38]-=- provide some provable bounds on overlap and query performance. In the static setting, in which there are no updates, constructing the R*-tree by repeated insertions, one by one, is extremely slow. A ... |

75 | The buffer tree: A technique for designing batched external data structures
- Arge
Citation Context ...ality of the EM algorithm given in [100] for list ranking assumes that √m log m = Ω(log n), which is usually true in practice. That assumption can be removed by use of the buffer tree data structure =-=[32]-=- discussed in Section 10.4. A practical randomized implementation of list ranking appears in [292]. Dehne et al. [122; 121] and Sibeyn and Kaufmann [294] use a related approach and get efficient I/O b... |

73 | External memory data structures
- Arge
- 2001
Citation Context ... equivalent in the randomized sense. Deterministic simulations on the other hand require a factor of log(N/D)/ log log(N/D) more I/Os [58]. Surveys of I/O models, algorithms, and challenges appear in =-=[3; 31; 163; 235; 290]-=-. Several versions of PDM have been developed for parallel computation [122; 121; 215; 294]. Models of “active disks” augmented with processing capabilities to reduce data traffic to the host, especia... |

69 | Multidimensional Divide and Conquer
- Bentley
- 1980
Citation Context ...sed as a subroutine in the standard sweep line algorithm in order to get an optimal EM algorithm for orthogonal segment intersection. Arge showed how to extend buffer trees to implement segment trees =-=[78]-=- in external memory in a batched dynamic setting by reducing the node degrees to Θ(√m) and by introducing multislabs in each node, which were explained in Section 7 for the related batched problem ... |

68 | Scalable sweeping-based spatial join
- Arge, Procopiuc, et al.
- 1998
Citation Context ... preprocessing, each of the $O(\log_m n)$ sweeps (one per level of recursion) takes O(n) I/Os, yielding the desired bound (13). The resulting algorithm, called Scalable Sweeping-Based Spatial Join (SSSJ) =-=[46; 47]-=-, outperforms other techniques for rectangle intersection. It was tested against two other sweep line algorithms: the Partition-Based Spatial-Merge (QPBSM) used in Paradise [259] and a faster version ... |

67 | External-memory algorithms for processing line segments in geographic information systems
- Arge, Vengroff, et al.
Citation Context ...ries in 2-D Constructive Solid Geometry (CSG) models of size N . The parameters Q and Z are set to 0 if they are not relevant for the particular problem. Goodrich et al. [166], Zhu [335], Arge et al. =-=[55]-=-, Arge et al. [47], and Crauser et al. [115; 116] develop EM algorithms for those problems using these EM paradigms for batched problems: Distribution sweeping, a generalization of the distribution pa... |

63 | Simple Randomized Mergesort on Parallel Disks
- Barve, Grove, et al.
- 1996
Citation Context ...ing algorithms that use disks independently. The algorithms are based upon the important distribution and merge paradigms, which are two generic approaches to sorting. The SRM method and its variants =-=[65; 69; 126; 188]-=-, which are based upon a randomized merge technique, outperform disk striping in practice for reasonable values of D. All the algorithms use online load balancing strategies so that the data items acc... |

55 | Efficient searching with linear constraints - Agarwal, Arge, et al. |

51 | Improving the Query Performance of High-Dimensional Index Structures Using Bulk-Load Operations
- Berchtold, Kriegel
- 1998
Citation Context ... method related to [192] was presented in [193]. The quality of the Hilbert R-tree in terms of query performance is generally not as good as that of an R*-tree, especially for higher-dimensional data =-=[80; 194]-=-. In order to get the best of both worlds—the query performance of R*-trees and the bulk construction efficiency of Hilbert R-trees—Arge et al. [40] and van den Bercken et al. [307] independently devi... |

48 | Efficient Bulk Operations on Dynamic R-Trees
- Arge, Hinrichs, et al.
- 1999
Citation Context ...*-tree, especially for higher-dimensional data [80; 194]. In order to get the best of both worlds—the query performance of R*-trees and the bulk construction efficiency of Hilbert R-trees—Arge et al. =-=[40]-=- and van den Bercken et al. [307] independently devised fast bulk loading methods based upon buffer trees that do top-down construction in $O(n \log_m n)$ I/Os, which matches the performance of the bottom... |

46 | Optimal external memory interval management
- Arge, Vitter
- 2003
Citation Context ...g data. Lehman and Yao [212], Mohan [238], Lomet and Salzberg [220], and Bender et al. [77] explore mechanisms to add concurrency and recovery to B-trees. 10.2 Weight-Balanced B-trees Arge and Vitter =-=[56]-=- introduce a powerful variant of B-trees called weight-balanced B-trees, with the property that the weight of any subtree at level h (i.e., the number of nodes in the subtree rooted at a node of heigh... |

42 | The R*-tree: An efficient and robust access method for points and rectangles - Beckmann, Kriegel, et al. - 1990 |

37 | (Almost) optimal parallel block access for range queries
- Atallah, Prabhakar
- 2000
Citation Context ...st the k − 2 ones of interest, but we can discard the starting points we don’t need.) The total number of I/Os to answer the range query is thus $O(\log_B N + z)$, which is optimal. Atallah and Prabhakar =-=[59]-=- and Bhatia et al. [82] consider the problem of how to tile a multidimensional array of blocks onto parallel disks so that range queries can be answered in near-optimal time... |

35 | A general lower bound on the I/O-complexity of comparison-based algorithms. Personal communication
- Arge, Knudsen, et al.
Citation Context ...er space to serve as write queues. The read and write bounds can be improved with a corresponding tradeoff in redundancy and internal memory space. 5.5 Handling Duplicates: Bundle Sorting Arge et al. =-=[41]-=- describe a single-disk merge sort algorithm for the problem of duplicate removal, in which there are a total of K distinct items among the N items. When duplicates get grouped together during a merge... |

35 | Worst-case efficient external-memory priority queues
- Brodal, Katajainen
- 1998
Citation Context ...air decomposition of N points in d dimensions in $O(\mathrm{Sort}(N))$ I/Os, and they apply it to the problems of finding the K nearest neighbors for each point and the K closest pairs. Brodal and Katajainen =-=[88]-=- provide a worst-case optimal priority queue, in the sense that every sequence of B insert and delete min operations requires only $O(\log_m n)$ I/Os. Practical implementations of priority queues based up... |

33 | Cache-Oblivious String B-trees
- Bender, Farach-Colton, et al.
- 2006
Citation Context ...run in the PDM setting. Frigo et al. [156] introduce the notion of cache-oblivious algorithms, which require no knowledge of the storage parameters, like M and B. Bender et al. [75] and Bender et al. =-=[76]-=- develop cache-oblivious versions of B-trees. We refer the reader to [33] for a survey of cache-oblivious algorithms and data structures. The match between theory and practice is harder to establish f... |

31 | Maintaining approximate extent measures of moving points
- Agarwal, Har-Peled
- 2002
Citation Context ...o account for the movement of the items stored within. They maintain an outer approximation of the true bounding box, which they periodically update to refine the approximation. Agarwal and Har-Peled =-=[19]-=- show how to maintain a provably good approximation of the minimum bounding box with need for only a constant number of refinement events. Agarwal et al. [13] develop persistent data structures where ... |

31 | On sorting strings in external memory
- Arge, Ferragina, et al.
- 1997
Citation Context ... et al. [102] demonstrate a duality between text indexing and range searching and use it to derive improved EM algorithms and stronger lower bounds for text indexing. 13.4 Sorting Strings Arge et al. =-=[39]-=- consider several models for the problem of sorting K strings of total length N in external memory. They develop efficient sorting algorithms in these models, making use of the SB-tree, buffer tree te... |

31 | Organization of large ordered indexes
- Bayer, McCreight
- 1972
Citation Context ...exploit block transfer, trees in external memory generally use a block for each node, which can store Θ(B) pointers and data values. The well-known balanced multiway B-tree due to Bayer and McCreight =-=[71; 108; 202]-=-, is the most widely used nontrivial EM data structure. The degree of each node in the B-tree (with the exception of the root) is required to be Θ(B), which guarantees that the height of a B-tree stor... |

31 | Implementing I/O-efficient data structures using TPIE
- Arge, Procopiuc, et al.
- 2002
Citation Context ...system views computation as a continuous process during which a program is fed streams of data from an outside source and leaves trails of results behind. TPIE (Transparent Parallel I/O Environment; the TPIE software distribution is available free of charge at http://www.cs.duke.edu/TPIE/ on the World Wide Web) =-=[40; 48; 304; 311]-=- provides a framework-oriented interface for batched computation, a... |

30 | On external-memory MST, SSSP and multi-way planar graph separation
- Arge, Brodal, et al.
- 2000
Citation Context ...s for binary heaps and tournament trees. Munagala and Ranade [242] give improved graph algorithms for connectivity and undirected breadth-first search (BFS). Their approach is extended by Arge et al. =-=[34]-=- to compute the minimum spanning forest (MSF) and by Mehlhorn and... |

29 | I/O-efficient dynamic planar point location
- Arge, Vahrenhold
- 2004
Citation Context ...rtized per insertion. By global rebuilding we can do deletions in $O(\log_B N)$ I/Os amortized. As in the internal memory case, the amortized updates can typically be made worst-case. Arge and Vahrenhold =-=[54]-=- obtain I/O bounds for dynamic point location in general planar subdivisions similar to those of [7], but without use of level-balanced trees. Their method uses a weight-balanced base structure at the... |

28 | Box-trees and R-trees with near-optimal query time
- Agarwal, Berg, et al.
Citation Context ...rees for problems such as range searching and point location. Similar questions apply to the methods discussed in Section 11.1. New R-tree partitioning methods by de Berg et al. [119], Agarwal et al. =-=[17]-=-, and Arge et al. [38] provide some provable bounds on overlap and query performance. In the static setting, in which there are no updates, constructing the R*-tree by repeated insertions, one by one,... |

27 | External-memory algorithms with applications in geographic information systems - Arge - 1997 |

26 | I/O-efficient algorithms for problems on grid-based terrains
- Arge, Toma, et al.
- 2000
Citation Context ...s Other batched geometric problems studied in the PDM model include range counting queries [221], constrained Delaunay triangulation [15], and a host of problems on terrains and grid-based GIS models =-=[6; 8; 35; 36; 52; 178]-=-. Breimann and Vahrenhold [85] survey several EM problems in computational geometry. 8. BATCHED PROBLEMS ON GRAPHS We adopt the convention that the edges of the input graph, each of the form (u, v) fo... |

25 | A computational study of external-memory BFS algorithms
- Ajwani, Dementiev, et al.
- 2006
Citation Context ...forest, respectively. We use w and W to denote the minimum and maximum weights in a weighted graph. The lower bounds are discussed in Section 5.8. Meyer [232] for BFS. Ajwani et al. =-=[26]-=- do a computational study of EM algorithms for BFS. Dementiev et al. [127] implement practical EM algorithms for MSF, and Dementiev [123] gives practical EM implementations for approximate graph color... |

24 | On External-Memory Planar Depth First Search
- Arge, Meyer, et al.
- 2001
Citation Context ... several of these problems can be solved substantially faster in $O(\mathrm{Sort}(E))$ I/Os [11; 53; 100; 225; 222; 223; 303]. Other EM algorithms for planar, near-planar, and bounded-degree graphs appear in =-=[43; 50; 51; 57; 177; 234]-=-. 8.3 Sequential Simulation of Parallel Algorithms Chiang et al. [100] exploit the key idea that efficient EM algorithms can often be developed by a sequential simulation of a parallel algorithm for t... |

23 | I/O-efficient batched union-find and its applications to terrain analysis
- Agarwal, Arge, et al.
- 2006
Citation Context ...s Other batched geometric problems studied in the PDM model include range counting queries [221], constrained Delaunay triangulation [15], and a host of problems on terrains and grid-based GIS models =-=[6; 8; 35; 36; 52; 178]-=-. Breimann and Vahrenhold [85] survey several EM problems in computational geometry. 8. BATCHED PROBLEMS ON GRAPHS We adopt the convention that the edges of the input graph, each of the form (u, v) fo... |

22 | Hash functions for priority queues
- Ajtai, Fredman, et al.
- 1984
Citation Context ...he pointers to the B children of the SB-tree node are also stored at the leaves. is based upon a variant of the Patricia trie character-based data structure [202; 239] along the lines of Ajtai et al. =-=[25]-=-. It achieves B-way branching with a total storage of O(B) characters, which fit in O(1) blocks. Each of its internal nodes stores an index (a number from 0 to L, where L is the maximum length of a st... |

22 | The I/O-complexity of ordered binary-decision diagram manipulation
- Arge
- 1995
Citation Context ... the remaining case of the lower bound for BundleSort(N,K) by a potential argument based upon the transposition lower bound. Dividing by D gives the lower bound for D disks. Chiang et al. [100], Arge =-=[30]-=-, Arge and Miltersen [44], Munagala and Ranade [242], and Erickson [141] give models and lower bound reductions for several computational geometry and graph problems. The geometry problems discussed i... |

20 | On Showing Lower Bounds for External-Memory Computational Geometry Problems. External Memory Algorithms and Visualization
- Arge, Miltersen
- 1999
Citation Context ...e lower bound for BundleSort(N,K) by a potential argument based upon the transposition lower bound. Dividing by D gives the lower bound for D disks. Chiang et al. [100], Arge [30], Arge and Miltersen =-=[44]-=-, Munagala and Ranade [242], and Erickson [141] give models and lower bound reductions for several computational geometry and graph problems. The geometry problems discussed in Section 7 are equivalen... |

20 | Theory and Practice of I/O-Efficient Algorithms for Multidimensional Batched Searching Problems
- Arge, Procopiuc, et al.
- 1998
Citation Context ...uctive Solid Geometry (CSG) models of size N . The parameters Q and Z are set to 0 if they are not relevant for the particular problem. Goodrich et al. [166], Zhu [335], Arge et al. [55], Arge et al. =-=[47]-=-, and Crauser et al. [115; 116] develop EM algorithms for those problems using these EM paradigms for batched problems: Distribution sweeping, a generalization of the distribution paradigm of Section ... |

20 | A framework for index bulk loading and dynamization
- Agarwal, Arge, et al.
- 2001
Citation Context ...rst-case) way with only O(1) I/Os. Such applications are very common when the nodes have secondary structures, as in multidimensional search trees, or when rebuilding is expensive. Agarwal et al. =-=[12]-=- apply weight-balanced B-trees to convert partition trees such as kd-trees, BBD trees, and BAR trees, which were designed for internal memory, into efficient EM data structures. Weight-balanced trees ... |

20 | A Unified Approach for Indexed and Non-Indexed Spatial Joins
- Arge, Procopiuc, et al.
Citation Context ... join that access preexisting index structures (and thus do random I/O) can often be slower in practice than algorithms that access substantially more data but in a sequential order (as in streaming) =-=[45]-=-. It is thus helpful not only to consider the number of block transfers, but also to distinguish between the I/Os that are random versus those that are sequential. In some applications, automated dyna... |

19 | Optimal parallel sorting in multi-level storage
- Aggarwal, Plaxton
- 1994
Citation Context ...on for error correction and recovery. Chaudhry and Cormen [92] show experimentally that oblivious algorithms such as Columnsort work well in the context of cluster-based sorting. Aggarwal and Plaxton =-=[22]-=- developed an optimal deterministic merge sort based upon the Sharesort hypercube parallel sorting algorithm [118]. To guarantee even distribution during the merging, it employs two high-level merging... |

19 | I/O-efficient construction of constrained Delaunay triangulations
- Agarwal, Arge, et al.
- 2005
Citation Context ...ly increasing by a factor of m. 7.2 Other Batched Geometric Problems Other batched geometric problems studied in the PDM model include range counting queries [221], constrained Delaunay triangulation =-=[15]-=-, and a host of problems on terrains and grid-based GIS models [6; 8; 35; 36; 52; 178]. Breimann and Vahrenhold [85] survey several EM problems in computational geometry. 8. BATCHED PROBLEMS ON GRAPHS... |

19 | On the Average Number of Rebalancing Operations in Weight-Balanced Trees
- Blum, Mehlhorn
- 1980
Citation Context ...ced B-trees to convert partition trees such as kd-trees, BBD trees, and BAR trees, which were designed for internal memory, into efficient EM data structures. Weight-balanced trees called BB[α]-trees =-=[83; 245]-=- have been designed for internal memory; they maintain balance via rotations, which is appropriate for binary trees, but ... |

18 | Cache-oblivious data structures
- Arge, Brodal, et al.
- 2005
Citation Context ...oblivious algorithms, which require no knowledge of the storage parameters, like M and B. Bender et al. [75] and Bender et al. [76] develop cache-oblivious versions of B-trees. We refer the reader to =-=[33]-=- for a survey of cache-oblivious algorithms and data structures. The match between theory and practice is harder to establish for hierarchical models and caches than for disks. Generally, the most sig... |

16 | A B+-tree structure for large quadtrees
- Abel
- 1984
Citation Context ...nize the points into a B-tree [161; 193; 253]. Linearization can also be used to represent nonpoint data, in which the data items are partitioned into one or more multidimensional rectangular regions =-=[1; 252]-=-. All the methods described in this paragraph use linear space, and they work well in certain situations; however, their worst-case range query performance is no better than that of cross trees, and f... |

16 | I/O-Efficient dynamic point location in monotone planar subdivisions
- Agarwal, Arge, et al.
- 1999
Citation Context ...e B-tree from the leaves for x and y until we reach their common ancestor. Order queries arise in online algorithms for planar point location and for determining reachability in monotone subdivisions =-=[7]-=-. If we augment a conventional B-tree with parent pointers, then each split operation costs Θ(B) I/Os to update parent pointers, although the I/O cost is only O(1) when amortized over the updates to t... |

16 | Modeling and optimizing I/O throughput of multiple disks on a bus
- Barve, Shriver, et al.
- 1999
Citation Context ...= 2. The input data items are initially striped block-by-block across the disks. For example, data items 16 and 17 are stored in the second block (i.e., in stripe 1) of disk D3. ...al. =-=[67]-=-, and Farach-Colton et al. [145], distinguish between sequential reads and random reads and consider the effects of features such as disk buffer caches and shared buses, which can reduce the time per ... |

15 | Integrated prefetching and caching in single and parallel disk systems
- Albers, Büttner
- 2003
Citation Context ...efetching We can get further improvements in merge sort by a more careful prefetching schedule for the runs. Barve et al. [66], Kallahalla and Varman [190; 191], Albers and Büttner =-=[27; 28]-=-, Shah et al. [287; 288], and Hon et al. [184] have developed competitive and optimal methods for prefetching blocks in parallel I/O systems. Hutchinson et al. [188] have demonstrated a powerful duali... |

14 | I/O-efficient point location using persistent B-trees
- Arge, Danner, et al.
Citation Context ...number of pointers allows a higher branching factor and thus faster search. Partially persistent versions of B-trees have been developed by Becker et al. [73], Varman and Verma [309], and Arge et al. =-=[37]-=-. By persistent data structure, we mean that searches can be done with respect to any timestamp y [133; 134]. In a partially persistent data structure, only the most recent version of the data structu... |

13 | A theoretical framework for memory-adaptive algorithms
- Barve, Vitter
- 1999
Citation Context ...previous sections assume a fixed memory allocation; they must resort to virtual memory if the memory allocation is reduced, often causing a severe degradation in performance. Barve and Vitter =-=[68]-=- discuss the design and analysis of EM algorithms that adapt gracefully to changing memory allocations. In their model, without loss of generality, an algorithm (or program) P is allocated internal me... |

11 | New coding techniques for improved bandwidth utilization
- Adler
- 1996
(Show Context)
Citation Context ...he desired output. It is conjectured that the sorting lower bound (6) remains valid even if the indivisibility assumption is lifted. However, for an artificial problem related to transposition, Adler =-=[5]-=- showed that removing the indivisibility assumption can lead to faster algorithms. A similar result is shown by Arge and Miltersen [44] for the decision problem of determining if N data item values ar... |

11 | I/O-efficient algorithms for contour line extraction and planar graph blocking
- Agarwal, Arge, et al.
- 1998
(Show Context)
Citation Context ...tion. Arge [30] gives efficient algorithms for constructing ordered binary decision diagrams. Techniques for storing graphs on disks for efficient traversal and shortest path queries are discussed in =-=[11; 165; 187; 247]-=-. Computing wavelet decompositions and histograms [320; 321; 323] is an EM graph problem related to transposition that arises in On-Line Analytical Processing (OLAP). Wang et al. [322] give an I/Oeffi... |

10 | I/O-efficient topological sorting of planar dags
- Arge, Toma, et al.
- 2003
(Show Context)
Citation Context ...ort(V) + V). For special cases, such as trees, planar graphs, outerplanar graphs, and graphs of bounded tree width, several of these problems can be solved substantially faster in O(Sort(E)) I/Os =-=[11; 53; 100; 225; 222; 223; 303]-=-. Other EM algorithms for planar, near-planar, and bounded-degree graphs appear in [43; 50; 51; 57; 177; 234]. 8.3 Sequential Simulation of Parallel Algorithms Chiang et al. [100] exploit the key idea... |

10 | A simple and efficient parallel disk mergesort
- Barve, Vitter
- 1999
(Show Context)
Citation Context ...ing algorithms that use disks independently. The algorithms are based upon the important distribution and merge paradigms, which are two generic approaches to sorting. The SRM method and its variants =-=[65; 69; 126; 188]-=-, which are based upon a randomized merge technique, outperform disk striping in practice for reasonable values of D. All the algorithms use online load balancing strategies so that the data items acc... |

9 | External Memory Algorithms for Diameter and All-Pairs Shortest-Paths on Sparse Graphs
- Arge, Meyer, et al.
- 2004
(Show Context)
Citation Context ...[89; 100; 208]; Transitive closure: O(V v √(e/m)) [100]; Undirected all-pairs shortest paths: O(V √(V e) + V e log E) [103]; Diameter, undirected unweighted all-pairs shortest paths: O(V Sort(E)) =-=[42; 103]-=-. Table 4. Best known I/O bounds for batched graph problems for the single-disk case D = 1. The number of vertices is denoted by V = vB and the number of edges by E = eB; for simplicity, we assume that... |

8 | A functional approach to external memory graph algorithms - Abello, Buchsbaum, et al. - 1998 |

8 | An optimal dynamic interval stabbing-max data structure
- Agarwal, Arge, et al.
- 2005
(Show Context)
Citation Context ...f the isosurface (or contour) of a surface. A data structure for a related problem, which in addition has optimal output complexity, appears in [11]. Range-max and stabbing-max queries are studied in =-=[14; 16]-=-. 11.4 Bootstrapping for Three-Sided Orthogonal 2-D Range Search Arge et al. [49] provide another example of the bootstrapping paradigm by developing an optimal dynamic EM data structure for three-sid... |

8 | I/O-efficient strong connectivity and depth-first search for directed planar graphs - Arge, Zeh - 2003 |

7 | External memory algorithms with dynamically changing memory allocations, tech - Barve, Vitter - 1998 |

7 | Optimal external memory planar point enclosure
- Arge, Samoladas, et al.
(Show Context)
Citation Context ...sensitive hashing [164]. Planar point location is studied in [37; 306], and the dual problem of planar point enclosure is studied in =-=[50]-=-. Numerous other data structures have been developed for range queries and related problems on spatial data. We refer to [18; 31; 158; 246] for a broad survey. 11.7 Lower Bounds for Orthogonal Range S... |

6 | Expected behaviour of B+-trees under random insertions - Baeza-Yates - 1989 |

6 | I/O-efficient dynamic point location in monotone planar subdivisions - Agarwal, Brodal, et al. - 1999 |

6 | I/O-efficient hierarchical watershed decomposition of grid terrain models
- Arge, Danner, et al.
- 2006
(Show Context)
Citation Context ...s Other batched geometric problems studied in the PDM model include range counting queries [221], constrained Delaunay triangulation [15], and a host of problems on terrains and grid-based GIS models =-=[6; 8; 35; 36; 52; 178]-=-. Breimann and Vahrenhold [85] survey several EM problems in computational geometry. 8. BATCHED PROBLEMS ON GRAPHS We adopt the convention that the edges of the input graph, each of the form (u, v) for some vertices u and v, are g... |

5 | Theory and practice of I/O-efficient algorithms for multidimensional batched searching problems - Arge, Procopiuc, et al. - 1998 |

5 | Time responsive external data structures for moving points
- Agarwal, Arge, et al.
- 2001
(Show Context)
Citation Context ... the approximation. Agarwal and Har-Peled [19] show how to maintain a provably good approximation of the minimum bounding box with need for only a constant number of refinement events. Agarwal et al. =-=[13]-=- develop persistent data structures where query time degrades the further the time frame of the query is from the current time. 13. STRING PROCESSING In this section we survey methods used to process s... |

5 | Simplified external memory algorithms for planar DAGs
- Arge, Toma
- 2004
(Show Context)
Citation Context ... several of these problems can be solved substantially faster in O(Sort(E)) I/Os [11; 53; 100; 225; 222; 223; 303]. Other EM algorithms for planar, near-planar, and bounded-degree graphs appear in =-=[43; 50; 51; 57; 177; 234]-=-. 8.3 Sequential Simulation of Parallel Algorithms Chiang et al. [100] exploit the key idea that efficient EM algorithms can often be developed by a sequential simulation of a parallel algorithm for t... |

4 | The Lanczos algorithm - Scott - 1981 |

4 | I/O-efficient algorithms for contour line extraction and planar graph blocking - Agarwal, Arge, et al. - 1998 |

4 | Worst-case efficient external-memory priority queues - Brodal, Katajainen |

4 | Performance of B+-trees with partial expansions
- Baeza-Yates, Larson
- 1989
(Show Context)
Citation Context ...can be increased further by sharing among several siblings, at the cost of more complicated insertions and deletions. Some helpful space-saving techniques borrowed from hashing are partial expansions =-=[64]-=- and use of overflow nodes [295]. A cross between B-trees and hashing, where each subtree rooted at a certain level of the B-tree is instead organized as an external hash table, was developed by Litwi... |

3 | Efficient bulk operations on dynamic R-trees - Arge, Hinrichs, et al. - 1999 |

3 | Bounds on the separation of two parallel disk models
- Armen
- 1996
(Show Context)
Citation Context ...onstant factor more I/Os, thus making the two models theoretically equivalent in the randomized sense. Deterministic simulations on the other hand require a factor of log(N/D)/ log log(N/D) more I/Os =-=[58]-=-. Surveys of I/O models, algorithms, and challenges appear in [3; 31; 163; 235; 290]. Several versions of PDM have been developed for parallel computation [122; 121; 215; 294]. Models of “active disks... |

3 | Analysis of linear hashing revisited
- Baeza-Yates, Soza-Pollman
- 1998
(Show Context)
Citation Context ...chnique called spiral storage (or spiral hashing) [228; 241] combines constrained bucket splitting and overflowing buckets. More detailed surveys and analysis of methods for dynamic hashing appear in =-=[62; 138]-=-. The above hashing schemes and their many variants work very well for dictionary applications in the average case, but have poor worst-case performance. They also do not support sequential search, su... |

3 | External memory computational geometry revisited
- Breimann, Vahrenhold
(Show Context)
Citation Context ... PDM model include range counting queries [221], constrained Delaunay triangulation [15], and a host of problems on terrains and grid-based GIS models [6; 8; 35; 36; 52; 178]. Breimann and Vahrenhold =-=[85]-=- survey several EM problems in computational geometry. 8. BATCHED PROBLEMS ON GRAPHS We adopt the convention that the edges of the input graph, each of the form (u, v) for some vertices u and v, are g... |

2 | From LIDAR to GRID DEM: A scalable approach
- Agarwal, Arge, et al.
- 2006
(Show Context)
Citation Context |

2 | I/O-efficient structures for orthogonal range-max and stabbing-max queries
- Agarwal, Arge, et al.
- 2003
(Show Context)
Citation Context ...f the isosurface (or contour) of a surface. A data structure for a related problem, which in addition has optimal output complexity, appears in [11]. Range-max and stabbing-max queries are studied in =-=[14; 16]-=-. 11.4 Bootstrapping for Three-Sided Orthogonal 2-D Range Search Arge et al. [49] provide another example of the bootstrapping paradigm by developing an optimal dynamic EM data structure for three-sid... |

2 | Deterministic load balancing and dictionaries in the parallel disk model
- Berger, Hansen, et al.
- 2006
(Show Context)
Citation Context ...eteriorate because of unwanted collisions. See Gaede and Günther [158] for a survey and in External Memory Algorithms and Data Structures: Dealing with Massive Data · 39 addition more recent work in =-=[81; 131; 189; 255]-=-, A more effective approach for sequential search is to use multiway trees, which we explore next. 10. MULTIWAY TREE DATA STRUCTURES An advantage of search trees over hashing methods is that the data ... |

1 | A functional approach to external graph - Abello, Buchsbaum, et al. - 1998 |

1 | A framework for index dynamization - Agarwal, Arge, et al. |

1 | Efficient searching with linear constraints
- Agarwal, Arge, et al.
- 2000
(Show Context)
Citation Context ... data structures for fat orthogonal 2-D range search. By the reduction, one possible approach would be to develop optimal linear-sized data structures for three-sided 3-D range search. Agarwal et al. =-=[10]-=- consider halfspace range searching, in which a query is specified by a hyperplane and a bit indicating one of its two sides, and the output of the query consists of all the points on that side of the... |

1 | Integrated prefetching and caching with read and write requests
- Albers, Büttner
- 2003
(Show Context)
Citation Context ...efetching We can get further improvements in merge sort by a more careful prefetching schedule for the runs. Barve et al. [66], Kallahalla and Varman [190; 191], Albers and Büttner =-=[27; 28]-=-, Shah et al. [287; 288], and Hon et al. [184] have developed competitive and optimal methods for prefetching blocks in parallel I/O systems. Hutchinson et al. [188] have demonstrated a powerful duali... |

1 | Efficient flow computation on massive grid datasets
- Arge, Chase, et al.
- 2003
(Show Context)
Citation Context |

1 | Bounded Disorder: The Effect of the Index
- Baeza-Yates
- 1989
(Show Context)
Citation Context ... between B-trees and hashing, where each subtree rooted at a certain level of the B-tree is instead organized as an external hash table, was developed by Litwin and Lomet [217] and further studied in =-=[60; 218]-=-. O’Neil [251] proposed a B-tree variant called the SB-tree that clusters together on the disk symmetrically ordered nodes from the same level so as to optimize range queries and sequential access. Ra... |

1 | Expected behaviour of B+-trees under random insertions
- Baeza-Yates
- 1989
(Show Context)
Citation Context ...), Yao [331] shows that nodes are roughly ln 2 ≈ 69% full on the average, assuming random insertions. With sharing (as in B*-trees), the average storage utilization increases to about 2 ln(3/2) ≈ 81% =-=[63; 209]-=-. Storage utilization can be increased further by sharing among several siblings, at the cost of more complicated insertions and deletions. Some helpful space-saving techniques borrowed from hashing a... |

1 | Competitive analysis of buffer management algorithms
- Barve, Kallahalla, et al.
- 2000
(Show Context)
Citation Context ...tz CPU with six fast disk drives, as reported by Barve and Vitter [69]. 5.3 Prefetching We can get further improvements in merge sort by a more careful prefetching schedule for the runs. Barve et al. =-=[66]-=-, Kallahalla and Varman [190; 191], Albers and Büttner [27; 28], Shah et al. [287; 288], and Hon et al. [184] have developed competitive and optimal methods for prefetching blocks i... |

1 | A hierarchical technique for constructing efficient declustering schemes for range queries
- Bhatia, Sinha, et al.
(Show Context)
Citation Context ...terest, but we can discard the starting points we don’t need.) The total number of I/Os to answer the range query is thus O(log_B N + z), which is optimal. Atallah and Prabhakar [59] and Bhatia et al. =-=[82]-=- consider the problem of how to tile a multidimensional array of blocks onto parallel disks so that range queries can be answered in near-optimal time. 11.6 Other ... |