| J. M. Hellerstein, E. Koutsoupias, C. H. Papadimitriou, On the Analysis of Indexing Schemes. In Symposium on Principles of Database Systems (PODS), Tucson, Arizona, USA, May, 1997. |
....and Singh [KS99] prove that the best possible query time using any structure consuming linear O(N/B) space is Ω(√(N/B) + K/B) I/Os, where K is the number of objects retrieved. This bound is tight and has been realized by the O-tree [KS99] and the cross-tree [GI99]. Applying the theory of indexability [HKP97], Arge et al. [ASV99] show that a structure achieving optimal query cost O(log_B(N/B) + K/B) must occupy Ω((N/B) log_B(N/B) / log log_B(N/B)) space. They propose the external range tree that achieves these bounds. A special RS is the so-called 3-sided query, where an edge of the query rectangle ....
Hellerstein, J., Koutsoupias, E., Papadimitriou, C. On the Analysis of Indexing Schemes. ACM PODS, 1997.
....Brief Note on Path Indexability. Raghav Kaushik, Jeffrey F. Naughton, University of Wisconsin-Madison, {raghav,naughton}@cs.wisc.edu. In [1], a theory of indexability is developed. The authors essentially classify all indexing problems as one of blocking up the data elements (from set I) into fixed-size blocks for a given query workload. The performance of the indexing scheme is then measured through two metrics: the access ....
....of an indexing scheme corresponds to the average number of blocks in which an element of I occurs. The maximum storage redundancy is the maximum number of blocks in which an element of I occurs. The authors show that for a workload of set containment queries, the following holds. Theorem 1: [1] For each (maximum) redundancy r, there exists a set inclusion workload such that the (maximum) access overhead is B. Their proof can be easily modified to arrive at the following stronger version. Theorem 2: For each function r(n) that is o(n) there exists a set inclusion workload such that ....
J.M.Hellerstein, E.Koutsoupias, and C.H.Papadimitriou. On the analysis of indexing schemes. In Proceedings of the Sixteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, May 12-14, 1997.
....of them are very common in querying XML documents, thus an index structure to facilitate these will be very helpful in improving the performance of XML database. This report is organized as follows. In Section 3, we present the related work. Design and implementation details are given in Section 4 and Section 5, respectively. In Section 6, we present the result and analysis of the common ancestor index extension. Future work is discussed in Section 7, and we finally conclude in Section 8. 3 Related Work. As XML database management is becoming more and more important these days due to ....
....et al. developed the Generalized Search Tree (GiST) [8] at Berkeley. It provides good extensibility and flexibility that allow us to easily implement user-defined index extensions. Some indexing extensions have been successfully built on the GiST software framework, including B+-tree, R-tree [14] for spatial data, and RD-tree [9] for sets. Since no indexing structure has been developed to exploit common-parent or common-ancestor relationship, it becomes our main motivation to implement a novel common-ancestor index by extending the GiST framework. 4 Design Issues. The idea of the ....
J. Hellerstein, E. Koutsoupias, and C. Papadimitriou. On the Analysis of Indexing Schemes. In Proceedings of the Sixteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, Tucson, Arizona, pages 249-256, 1997.
....Finally, these techniques had substantially higher implementation complexity than their heuristic counterparts. For these reasons, these techniques have not yet been adopted into practice. The call for a theory of indexability was answered by Hellerstein, Koutsoupias and Papadimitriou [HKP97] with the introduction of a new model of indexing, which strived to reconcile the theoretical work with some of the practical concerns that it had raised. The new model was particularly suited for the study of lower bounds, which was the focus of most of the results in [HKP97] but it also ....
....and Papadimitriou [HKP97] with the introduction of a new model of indexing, which strived to reconcile the theoretical work with some of the practical concerns that it had raised. The new model was particularly suited for the study of lower bounds, which was the focus of most of the results in [HKP97] but it also introduced a bold departure from previous theoretical models. Whereas previous models incorporated both the locality and the search aspects of indexing, indexability focused on the locality exclusively, abstracting away the search aspects of the problem. 1.3 Our thesis In this ....
J.M. Hellerstein, E. Koutsoupias, and C.H. Papadimitriou. On the analysis of indexing schemes. In Proc. ACM Symp. Principles of Database Systems, 1997.
....requirements small. There are two basic parameters that affect performance: (i) the number of insertions N, (ii) the number of records that fit in a page, B. We assume that one I/O transfers one page. Ideally, we would like our index solutions to use linear storage, i.e. O(N/B) disk pages [18]. Note that for the On-Line problem an additional cost measure is the index update time (the time to process an update). This is not critical in the Off-Line setting since the whole set of updates is known in advance and the index is built once. To further exemplify the above costs, consider two ....
J.M. Hellerstein, E. Koutsoupias, and C. Papadimitriou. On the Analysis of Indexing Schemes. In Proc. 16th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 249--256, Tucson, May 1997.
.... silver bullet for multivariate density estimation that simultaneously provides useful bounds on space, time and precision. Furthermore, formal bounds of any kind are difficult in the framework of non space partitioning trees; a better understanding may arise from theoretical work on indexability [HELL97b]. Sampling variations. Unlike simple random sampling, sampling from a ranked index has structure that might be exploitable. It would be interesting to formulate and investigate a sequential application of importance (nonequiprobable) sampling. Access path selection. The information obtained ....
....perceives two indices of the same type built over the same column as being identical. This is essentially true for B trees but plainly untrue in general, as we can see if we consider the case of two R trees which have been optimized for queries with inverse aspect ratios (cf. Proposition 1 of [HELL97b]) In the latter example, an estimator would only need to descend a short distance into each index to determine (on a heuristic basis, at least) that one is much better suited to a given query. Join selectivity estimation. Just as histograms on compatible domains can be joined to provide ....
J.M. Hellerstein, E. Koutsoupias and C.H. Papadimitriou, "On the Analysis of Indexing Schemes," Proc. 16th ACM SIGACT-SIGMOD-SIGART Symp. on Principles of Database Sys., Tucson, AZ, May 1997, 249-256.
....and object models) can be reduced to special cases of two-dimensional indexing. (Refer to Figure 1.) In particular they identified the 3-sided range searching problem (Figure 1(c)) to be of major importance. In the first part of this paper (Section 2) we apply the theory of indexability [10] to two-dimensional range searching problems. In indexability theory the focus is on bounding the number of disk blocks containing the answers to a query (access overhead) given a bound on the number of blocks used to store the data points (redundancy). The search cost of computing which blocks ....
....in units of blocks rather than points. An I/O operation (or simply I/O) is defined as the transfer of a block of data between internal and external memory. Indexability: Background and outline of results. The theory of indexability was formalized by Hellerstein, Koutsoupias, and Papadimitriou [10]. As mentioned, it studies an abstraction of an indexing problem where search cost is omitted. An instance of a problem is described by a workload W, which is a simple hypergraph (I, Q) where I is the set of instances, and Q is a set of subsets of I. The elements of Q are called queries. For a ....
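The excerpts above describe a workload as a pair (I, Q) and measure an indexing scheme by its storage redundancy and access overhead. The following is a minimal sketch of those two measures on a toy workload, not the cited authors' formulation. Storage redundancy follows the definition quoted earlier on this page (average and maximum number of blocks in which an element occurs). The access overhead formula used here (fewest blocks covering a query, divided by ceil(|q|/B), maximized over queries) is an assumption, since the excerpts are truncated before defining it. All function names and the example data are hypothetical.

```python
from itertools import combinations
from math import ceil

def average_redundancy(instance, blocks):
    """Average number of blocks in which an element of the instance occurs."""
    return sum(len(b) for b in blocks) / len(instance)

def max_redundancy(instance, blocks):
    """Maximum number of blocks in which any single element occurs."""
    return max(sum(1 for b in blocks if x in b) for x in instance)

def min_cover_size(query, blocks):
    """Fewest blocks whose union contains the query (brute force, toy sizes only)."""
    for k in range(1, len(blocks) + 1):
        for combo in combinations(blocks, k):
            if query <= set().union(*combo):
                return k
    raise ValueError("query is not covered by the blocks")

def access_overhead(queries, blocks, B):
    """Assumed definition: worst ratio of blocks read to the ideal ceil(|q|/B).

    Queries are assumed to be non-empty.
    """
    return max(min_cover_size(q, blocks) / ceil(len(q) / B) for q in queries)

# Hypothetical toy workload: six elements, block size B = 2.
B = 2
instance = {1, 2, 3, 4, 5, 6}
queries = [{1, 2}, {2, 3}, {3, 4, 5}]
# Elements 2 and 3 are stored twice; the extra block {2, 3} is the redundancy.
blocks = [{1, 2}, {3, 4}, {5, 6}, {2, 3}]

print(average_redundancy(instance, blocks))
print(max_redundancy(instance, blocks))
print(access_overhead(queries, blocks, B))
print(access_overhead(queries, blocks[:3], B))  # same scheme without the extra block
```

Dropping the redundant block in the last call raises the access overhead for the query {2, 3}, which is the redundancy/overhead trade-off the excerpts refer to. The brute-force cover search is exponential and is only meant for illustration.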
J. M. Hellerstein, E. Koutsoupias, and C. H. Papadimitriou. On the analysis of indexing schemes. In Proc. ACM Symp. Principles of Database Systems, pages 249--256, 1997.
....O(log N) internal memory search term, and an O(T/B) reporting term accounting for the O(T/B) I/Os needed to report T elements. Recently, the above bounds have been obtained for a number of problems (e.g. [30, 26, 149, 5, 47, 87]), but higher lower bounds have also been established for some problems [141, 26, 93, 101, 106, 135, 102]. We discuss these results in later sections. B-trees come in several variants, like B+ and B* trees (see e.g. [35, 63, 95, 30, 104, 3] and their references). A basic B-tree is a Θ(B)-ary tree (with the root possibly having smaller degree) built on top of Θ(N/B) leaves. The degree of ....
.... [141] 1 This lower bound holds in a natural external memory version of the pointer machine model [53]. A similar bound in a slightly different model where the search component of the query is ignored was proved by Arge et al. [26]. This indexability model was defined by Hellerstein et al. [93] and considered by several authors [101, 106, 135]. Based on a sub-optimal but linear space structure for answering 3-sided queries, Subramanian and Ramaswamy developed the P-range tree that uses optimal O((N/B) log(N/B) / log log_B N) space but uses more than the optimal O(log_B N + T/B) 1 In ....
J. M. Hellerstein, E. Koutsoupias, and C. H. Papadimitriou. On the analysis of indexing schemes. In Proc. ACM Symp. Principles of Database Systems, pages 249--256, 1997.
....data replication to improve external searching in static tree search structures. Trees are fundamental data structures for efficient query processing [9, 12] for which mappings to external storage have been developed [2, 4, 7, 8]. The use of data replication in static environments has been studied in [5, 7, 10]. We present general and efficient techniques for mapping trees to external storage. Our techniques determine what data to replicate, how to control the amount of replication, and how to ensure good block utilization. Let T be a rooted tree consisting of N nodes. When the data associated with the ....
J. Hellerstein, E. Koutsoupias, and C. Papadimitriou. On the analysis of indexing schemes. In Proc. of 16th ACM Symp. on Principles of Database Systems, pages 249-256, 1997.
....and its Application to Multidimensional Range Queries. Vasilis Samoladas, The University of Texas at Austin, vsam@cs.utexas.edu. Daniel P. Miranker, The University of Texas at Austin, miranker@cs.utexas.edu. Abstract: Indexing schemes were proposed by Hellerstein, Koutsoupias and Papadimitriou [7] to model data indexing on external memory. Using indexing schemes, the complexity of indexing is quantified by two parameters: storage redundancy and access overhead. There is a tradeoff between these two parameters, in the sense that for some problems it is not possible for both of these to be ....
....two parameters, in the sense that for some problems it is not possible for both of these to be low. In this paper we derive a lower bounds theorem for arbitrary indexing schemes. We apply our theorem to the particular problem of d dimensional range queries. We first resolve the open problem of [7] for a tight lower bound for 2 dimensional range queries and extend our lower bound to d dimensional range queries. We then show, how, the construction in our lower bounds proof may be exploited to derive indexing schemes for d dimensional range queries, whose asymptotic complexity matches our ....
J.M. Hellerstein, E. Koutsoupias, and C.H. Papadimitriou. On the analysis of indexing schemes. In Proceedings of the Sixteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), 1997.
....infrastructure follows from two observations. First, a tree structured index is a partitioning of an arbitrary data set at an arbitrary resolution. That is, the index recursively divides the indexed data into clusters; these clusters support efficient search, assuming that the data is indexable [24] and the index design is effective. Efficient indexed search over a given workload means that we examine a minimal number of extraneous objects over that workload. Second, in the process of implementing indices for their new data types, database extenders necessarily provide code to partition ....
....can be unstable under reasonable conditions. There are many possible areas for additional work. These are summarized in the full paper; perhaps the most significant lies in the investigation of formal performance bounds, perhaps arising from the ongoing work on the theory of indexability [24]. ....
J.M. Hellerstein, E. Koutsoupias and C.H. Papadimitriou, "On the Analysis of Indexing Schemes," Proc. 16th PODS, Tucson, AZ, May 1997, 249-256.
....requirements small. There are two basic parameters that affect performance: (i) the number of insertions N, (ii) the number of records that fit in a page, B. We assume that one I/O transfers one page. Ideally, we would like our index solutions to use linear storage, i.e. O(N/B) disk pages [18]. Note that for the On-Line problem an additional cost measure is the index update time (the time to process an update). This is not critical in the Off-Line setting since the whole set of updates is known in advance and the index is built once. To further exemplify the above costs, consider ....
J.M. Hellerstein, E. Koutsoupias, and C. Papadimitriou. On the Analysis of Indexing Schemes. In Proc. 16th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 249-256, Tucson, May 1997.
....B, the page size in records. We assume that one I/O transfers one page. Parameter N corresponds to the minimal amount of information needed to store the frame evolution. Ideally, we would like our index solutions to use space that is linear to the number of updates, i.e. O(N/B) disk pages [18]. Note that for the On-Line problem an additional cost measure is the index update time (the time to process an update). This is not critical in the Off-Line mode since the whole set of updates is known in advance and the index is built once. To further exemplify the above costs, consider two ....
J.M. Hellerstein, E. Koutsoupias, and C. Papadimitriou. On the Analysis of Indexing Schemes. In Proc. 16th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 249--256, Tucson, May 1997.
.... has attracted a lot of interest, and thus there have been many proposals for spatial access methods geared specifically towards high dimensional indexing [25, 9, 5, 6, 8, 34] Formal worst case analyses of performance of R tree queries in high dimensions have yielded some very pessimistic results [18, 34, 10]. The focus of the latter two are on nearest neighbor queries, where the analysis in [34] makes assumptions of uniformity and independence, and where the analysis in [10] applies to very restricted conditions of dependence between dimensions. 2.3 Selectivity Estimation Selectivity estimation in ....
Joseph Hellerstein, Elias Koutsoupias, and Christos Papadimitriou. On the analysis of indexing schemes. In PODS '97, pages 249--256, Tucson, Arizona, May 1997.
....indexing structure, we can execute a range search with radius D for each object O. Once there are at least (M+1) neighbours in the D-neighbourhood, we stop the search and declare O a non-outlier. Otherwise, we report O as an outlier. Analyses of multidimensional indexing schemes [18] reveal that, for variants of R-trees [15], k-d trees [4, 27] and X-trees [6], the lower bound complexity for a range search is Ω(N^{1-1/k}). As k increases, a range search quickly reduces to O(N), giving at best a constant time improvement reflecting sequential search. Thus, the ....
Hellerstein J, Koutsoupias E, Papadimitriou C (1997) On the analysis of indexing schemes. In: Proc PODS, pp 249--256
.... neighborhood problem create a bucket (or color class) for each code word, and map each point to the bucket of the nearest code word. Error correcting codes seem like an interesting idea to be explored further. 8.4 The indexing model. A model of similar flavor was introduced by Hellerstein et al. [52] for the study of more general indexing schemes. Indexes in this model are evaluated in the context of a specific workload. A workload is defined by a triplet W = (D, I, Q), where D is the domain of the workload (e.g. the space R^d), I is a finite subset of the domain of size n, called an ....
....B) and small intersection (that is, O( B ff 2 ) then the storage redundancy is Omega Gamma MB n ) They apply this theorem for the case of d dimensional range queries, and derive a lower bound Omega Gamma logB log ff ) for the storage redundancy. Range queries are also considered in [52, 60]. Another interesting family of workloads is derived by the set inclusion queries. For some m 1, the domain D is the set of all possible subsets of the set f1; 2; mg. The instance I is a subset of D. Given a set S 2 D, we define a query as Q S = fX 2 I : X Sg, that is, all the sets in I ....
J. M. Hellerstein, E. Koutsoupias, and C. H. Papadimitriou. On the analysis of indexing schemes. In Proceedings of the Sixteenth ACM Symposium on Principles of Database Systems, pages 249--256, 1997.
....have been referenced within the body of this text, as appropriate. An interesting generalization of entire classes of data partitioning access methods has been proposed by Hellerstein, Naughton and Pfeffer [HNP95]. The indexability results of Hellerstein, Koutsoupias and Papadimitriou [HKP97] are directly related to the analysis of Section 2. For range queries, these authors presented a minimum bound on the access overhead of B^{1-1/d}, which tends toward B as dimensionality increases (B denotes the size of a block). This result is consistent with the average case results ....
J.M. Hellerstein, E. Koutsoupias, and C.H. Papadimitriou. On the analysis of indexing schemes. In Proceedings of the ACM Symposium on Principles of Database Systems (PODS), 1997.
....are established between the number of buckets containing database points satisfying the query and the maximum size of a bucket. Another lower bound is by Indyk [31] He shows a lower bound for approximate NNS under the L1 norm in the indexing model of Hellerstein, Koutsoupias, and Papadimitriou [30]. The indexing model tries to capture the cost of using external memory devices for large data sets, and appears to be computationally more restricted than the cell probe model. Indyk shows that the superset query problem of Hellerstein et al. reduces to (1 ) approximate NNS under the L1 norm, ....
J. Hellerstein, E. Koutsoupias, and C.H. Papadimitriou. On the analysis of indexing schemes. In Proc. of PODS, 1997.
....at most 2^P - 1. The worst case occurs if and only if all sets containing e_max are contained in S. 4 Empirical Evaluation. To demonstrate the effectiveness of the approach, we discuss examples taken from the use of UBTree in the IPP planning system [Koehler et al. 1997]. Following [Hellerstein et al. 1997], the workload for UBTree is determined by the following factors: (a) the domain, which is the set of all possible sets, i.e. given P logical atoms to characterize states, the domain comprises 2^P sets, (b) an instance of the domain is the finite subset S ⊆ 2^P that is currently stored ....
J. Hellerstein, C. Papadimitriou, and E. Koutsoupias. On the analysis of indexing schemes. In PODS-97, pages 249--256. ACM, 1997.
....the analytical models for range and join queries using R tree based structures, proposed in [TS96] and in the current paper, respectively, provide average case estimation of access cost. Their appropriate extension and formalization, according to the indexability theory issues presented in [HKP97], constitute main goals for further research. Acknowledgements The research was partially supported by the European Commission funded TMR project CHOROCHRONOS: A Research Network for Spatiotemporal Database Systems . ....
J.M. Hellerstein, E. Koutsoupias, C.H. Papadimitriou, "On the Analysis of Indexing Schemes", Proc. 16th ACM PODS Symposium, 1997.
....a uniform shape for the chunks that reduces the average number of blocks fetched for a specified access pattern. The chunks are always stored in axis order, and [21] additionally determines a good ordering of the array axes to reduce average seek time, given the access pattern. It is well known [11, 18, 22] that there is no good ordering of data points in a multi-dimensional space that will permit arbitrary range queries to be answered efficiently. [22] established, given a uniform distribution of key values, that a k-attribute selection on a database with N records has a file access cost of O(N ....
J. M. Hellerstein, E. Koutsoupias, and C. H. Papadimitriou. On the analysis of indexing schemes. In Proceedings of the ACM Symposium on Principles of Database Systems, pages 249--256, 1997.
....models for index structures with aggregated data and three models for index structures without aggregated data. In all models only access to leaf nodes is considered. The main goal of the new structure is to minimize the number of disk accesses without enlarging the structure too much. In [HKP97] the tradeoff between redundant data and access overhead is investigated in detail. It is assumed that all non leaf nodes are stored in main memory and all leaf nodes have to be read from secondary memory. Without loss of generality we assume in the following that every leaf node is mapped to ....
Joseph M. Hellerstein, Elias Koutsoupias, and Christos H. Papadimitriou. On the analysis of indexing schemes. In Proc. 16th ACM SIGACT-SIGMOD-SIGART Symp. on Principles of Database Systems, pages 249--256, May 1997.
....which can be used to decrease the effective dimensionality of a data set [1]. Faloutsos and Kamel [17] have shown that fractal dimensionality is a useful measure of the inherent dimensionality of a data set. We will further discuss this below. The indexability results of Hellerstein et al. [22] are based on data sets that can be seen as regular meshes of extension n in each dimension. For range queries, these authors presented a minimum bound on the access overhead of B^{1-1/d}, which tends toward B as dimensionality d increases (B denotes the size of a block). This access ....
J. Hellerstein, E. Koutsoupias, and C. Papadimitriou. On the analysis of indexing schemes. In Proc. of the ACM Symposium on Principles of Database Systems, 1997.
....if all sets out of 2^P containing the greatest element with respect to the total ordering have been inserted into the tree. 4 Empirical Evaluation. To demonstrate the effectiveness of the approach, we discuss examples taken from the use of UBTree in the IPP planning system [6]. Following [4], the workload for UBTree is determined by the following factors: (a) the domain, which is the set of all possible sets, i.e. given P logical atoms to characterise states, the domain comprises 2^P sets, 3 Constructing worst case search trees for |q| elements, we get at most 2^{|q|} nodes. ....
J. Hellerstein, C. Papadimitriou, and E. Koutsoupias. On the analysis of indexing schemes. In Proceedings of the Conference on Principles of Database Systems. ACM, 1997.
....are established between the number of buckets containing database points satisfying the query and the maximum size of a bucket. Another lower bound is by Indyk [29] He shows a lower bound for approximate NNS under the L1 norm in the indexing model of Hellerstein, Koutsoupias, and Papadimitriou [28]. The indexing model tries to capture the cost of using external memory devices for large data sets, and appears to be computationally more restricted than the cell probe model. Indyk shows that the superset query problem of Hellerstein et al. reduces to (1 ) approximate NNS under the L1 norm, ....
J. Hellerstein, E. Koutsoupias, and C.H. Papadimitriou. On the analysis of indexing schemes. In Proc. of PODS, 1997.
....Aside from the improvements previously discussed, there are many possible areas for additional work. These are summarized in [AOKI98b] perhaps the most significant lies in the investigation of formal performance bounds, perhaps arising from the ongoing work on the theory of indexability (cf. [HELL97b]) ....
J.M. Hellerstein, E. Koutsoupias and C.H. Papadimitriou, "On the Analysis of Indexing Schemes," Proc. 16th ACM SIGACT-SIGMOD-SIGART Symp. on Principles of Database Sys., Tucson, AZ, May 1997, 249-256.
No context found.
J. Hellerstein, E. Koutsoupias and C. Papadimitriou. On the Analysis of Indexing Schemes. In Proc. 16th ACM Symposium on Principles of Database Systems, pp. 249-256, Tucson, May 1997.
....INSERT, which adds a (key, RID) pair to the tree; and DELETE, which removes such a pair from the tree. It implements these operations 4 This structural requirement excludes R+-trees [SRF87] from conforming to the GiST structure, and generally precludes AMs that store data redundantly [HKP97].
search(search_pred)
  push(stack, root)
  while (stack is not empty)
    child = pop(stack)
    for each page entry e on child:
      if (consistent(search_pred, e.pred))
        if (child is leaf)
          add e to search result set;
        else
          push(stack, e.child_ptr)
    end
  end
end
Figure 2.3: Original GiST search ....
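The stack-based search quoted in the excerpt above can be sketched in runnable form as follows. This is an illustrative sketch under stated assumptions, not code from the cited work or from any GiST library: the Entry and Node classes, the gist_search function, and the interval-overlap consistent test are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional

@dataclass
class Entry:
    pred: Any                        # predicate describing everything below this entry
    child: Optional["Node"] = None   # set on entries of internal nodes
    rid: Any = None                  # record id, set on leaf entries

@dataclass
class Node:
    is_leaf: bool
    entries: List[Entry] = field(default_factory=list)

def gist_search(root: Node, search_pred: Any,
                consistent: Callable[[Any, Any], bool]) -> List[Entry]:
    """Depth-first search: follow every entry whose predicate is consistent."""
    results: List[Entry] = []
    stack = [root]
    while stack:
        node = stack.pop()
        for e in node.entries:
            if consistent(search_pred, e.pred):
                if node.is_leaf:
                    results.append(e)      # matching leaf entry
                else:
                    stack.append(e.child)  # descend into the subtree later
    return results

def overlaps(q, p):
    """Hypothetical consistent() for closed intervals given as (lo, hi) pairs."""
    return q[0] <= p[1] and p[0] <= q[1]

# Hypothetical two-level tree over point intervals.
leaf1 = Node(True, [Entry((1, 1), rid="a"), Entry((3, 3), rid="b")])
leaf2 = Node(True, [Entry((7, 7), rid="c"), Entry((9, 9), rid="d")])
root = Node(False, [Entry((1, 3), child=leaf1), Entry((7, 9), child=leaf2)])

hits = sorted(e.rid for e in gist_search(root, (2, 8), overlaps))
print(hits)
```

The consistent test is passed in as a parameter because, as the excerpt describes, the search loop itself is independent of the key type; only the predicate logic changes between index extensions.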
J. Hellerstein, E. Koutsoupias, and C. Papadimitriou. On the Analysis of Indexing Schemes. In Proceedings of the Sixteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, Tucson, Arizona, pages 249--256, 1997.
....(equal to the block size B) We also show that this result is almost tight and that the trade off even exhibits a threshold behavior: on a random set of points, expected storage redundancy O(log d Gamma1 n) achieves access overhead 2d Gamma 1. 1 Introduction Indexing schemes introduced in [3] attempt to capture the intrinsic difficulty of storing large database workloads for efficient retrieval of requested data from secondary memory. Informally, an indexing scheme is a way to organize the data into a collection of disk blocks that facilitates efficient retrieval of data. In an ideal ....
....are allowed Computer Science Department, University of California, Los Angeles, CA 90095, USA. Email: elias cs.ucla.edu y Computer Science Department, University of California, Los Angeles, CA 90095, USA. Email: dstaylor cs.ucla.edu to be stored in multiple blocks. An important proposal of [3] was that an indexing scheme can be evaluated in terms of two simple parameters: the first parameter is the storage redundancy which measures how many times an element is stored in disk blocks (there are two kinds, the maximum redundancy and the average redundancy) The blocks are chosen so that ....
J. M. Hellerstein, E. Koutsoupias, and C. H. Papadimitriou. On the analysis of indexing schemes. In Proceedings of the Sixteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 249--256, Tucson, Arizona, 12--15 May 1997.
.... indexing schemes. Elias Koutsoupias, University of California, Los Angeles, elias@cs.ucla.edu. David Scot Taylor, University of California, Los Angeles, dstaylor@cs.ucla.edu. Abstract: We study the trade-off between storage redundancy and access overhead for range queries, using the framework of [6]. We show that the Fibonacci workload of size n, which is the regular 2-dimensional grid rotated by the golden ratio, does not admit an indexing scheme with access overhead less than the block size B (the worst possible access overhead) even for storage redundancy as high as c log n, for some ....
....the lower bound to random point sets and show that if the maximum storage redundancy is less than c log log n, the access overhead is B. Finally, we explore the relation between indexability and fractal (Hausdorff) dimension of point sets. 1 Introduction. In this paper we continue the work of [6] towards a theory of indexability, that is, towards a better understanding of the storage redundancy/access overhead tradeoff in the indexing problem for complex workloads, in a secondary memory device with a fixed block size B. Since data items are stored and retrieved in blocks of B, a query ....
J. M. Hellerstein, E. Koutsoupias, and C. H. Papadimitriou. On the analysis of indexing schemes. Proceedings of the Sixteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, Tucson, Arizona, 12--15 May 1997.
....of GiSTs. 2 System Features Amdb was developed with the entire AM design and implementation process in mind and supports the designer in three areas: 1. Analysis of the dataset (i.e. the search keys of the data) and the index tree structure to evaluate the general indexability of the dataset ([HKP97]) and the effectiveness of the design. 2. Debugging of dynamic tree operations to pinpoint implementation flaws. 3. Profiling of a defined query workload to measure the level of end user performance. Central to the user interface of amdb is a graphical display of the tree structure, which greatly ....
J. Hellerstein, E. Koutsoupias, and C. Papadimitriou. On the Analysis of Indexing Schemes. In Proceedings of the Sixteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, Tucson, Arizona, pages 249--256, 1997.
.... empirical performance methodologies in this domain has been noted with increasing urgency in recent years [20, 6]. Recent work on generalized indexing schemes presents software and analytic frameworks for indexing that are domain independent, i.e. applicable to arbitrary sets of data and queries [11, 10]. As noted in [10], there is a simple logical characterization of the space of queries supported by an index over a data set D: they form a set S ⊆ P(D), i.e. a set of subsets of the data being indexed. Note that this logical view of the query space abstracts away the semantics of the data ....
.... methodologies in this domain has been noted with increasing urgency in recent years [20, 6]. Recent work on generalized indexing schemes presents software and analytic frameworks for indexing that are domain independent, i.e. applicable to arbitrary sets of data and queries [11, 10]. As noted in [10], there is a simple logical characterization of the space of queries supported by an index over a data set D: they form a set S ⊆ P(D), i.e. a set of subsets of the data being indexed. Note that this logical view of the query space abstracts away the semantics of the data domain and considers ....
Joseph M. Hellerstein, Elias Koutsoupias, and Christos H. Papadimitriou. On the Analysis of Indexing Schemes. In Proc. 16th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 249--256, Tucson, May 1997.
.... performance methodologies in this domain has been noted with increasing urgency in recent years [SRF97, GOP 97]. Recent work on generalized indexing schemes presents software and analytic frameworks for indexing that are domain independent, i.e. applicable to arbitrary sets of data and queries [HNP95, HKP97]. As noted in [HKP97], there is a simple logical characterization of the space of queries supported by an index over a data set D: they form a set S ⊆ P(D), i.e. a set of subsets of the data being indexed (here P(D) denotes the power set of D). Note that this logical view of the query space ....
.... in this domain has been noted with increasing urgency in recent years [SRF97, GOP 97]. Recent work on generalized indexing schemes presents software and analytic frameworks for indexing that are domain independent, i.e. applicable to arbitrary sets of data and queries [HNP95, HKP97]. As noted in [HKP97], there is a simple logical characterization of the space of queries supported by an index over a data set D: they form a set S ⊆ P(D), i.e. a set of subsets of the data being indexed (here P(D) denotes the power set of D). Note that this logical view of the query space abstracts away the ....
Joseph M. Hellerstein, Elias Koutsoupias, and Christos H. Papadimitriou. On the Analysis of Indexing Schemes. In Proc. 16th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 249--256, Tucson, May 1997.
....8.7.4 High Dimensional Data. Some index trees have been developed explicitly to address high dimensional problems (e.g. TV-trees [LJF94] and X-trees [BKK96]). The efficacy of these structures remains in doubt, especially in light of recent results on the hardness of indexing high dimensional space [HKP97]. Most known successful approaches involve projecting (based on some heuristics) to a space of lower, more manageable dimensionality. 9 Sampling. The notion that a large set of data can be represented by a small random sample of the data elements goes back to the end of the nineteenth century ....
Joseph M. Hellerstein, Elias Koutsoupias, and Christos H. Papadimitriou. On the Analysis of Indexing Schemes. In Proc. 16th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 249--256, Tucson, May 1997.
No context found.
J. M. Hellerstein, E. Koutsoupias, C. H. Papadimitriou, On the Analysis of Indexing Schemes. In Symposium on Principles of Database Systems (PODS), Tucson, Arizona, USA, May, 1997.
No context found.
J. Hellerstein, E. Koutsoupias, and C.H. Papadimitriou. On the analysis of indexing schemes. In Proc. of PODS, 1997.
No context found.
J.M. Hellerstein, E. Koutsoupias, C.H. Papadimitriou, "On the Analysis of Indexing Schemes," SIGMOD-SIGART Symposium on Principles of Database Systems, pp 249-256, 1997.