A succinct physical storage scheme for efficient evaluation of path queries in XML. (2004)

by N Zhang, V Kacholia, M T Ozsu
Venue: In Proc. ICDE

Results 1 - 10 of 38

Efficient Memory Representation of XML Document Trees.

by Giorgio Busatto, Markus Lohrey, Sebastian Maneth - Inf. Syst., 2008
Cited by 51 (14 self)
Implementations that load XML documents and give access to them via, e.g., the DOM, suffer from huge memory demands: the space needed to load an XML document is usually many times larger than the size of the document. A considerable amount of memory is needed to store the tree structure of the XML document. Here a technique is presented that allows to represent the tree structure of an XML document in an efficient way. The representation exploits the high regularity in XML documents by "compressing" their tree structure; the latter means to detect and remove repetitions of tree patterns. The functionality of basic tree operations, like traversal along edges, is preserved in the compressed representation. This allows to directly execute queries (and in particular, bulk operations) without prior decompression. For certain tasks like validation against an XML type or checking equality of documents, the representation allows for provably more efficient algorithms than those running on conventional representations.

Citation Context

...like, e.g., XGrind [28]), it is unclear how well it compares to other approaches (like ours) when documents are loaded into memory. It is also possible to use strings to represent XML trees in memory [30]; their experiments show that this offers good compression, while still being able to query efficiently the representation. XQueC uses a queriable XML representation that is based on compression of da...
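
The Busatto, Lohrey and Maneth abstract above describes compressing a tree by detecting and removing repeated tree patterns while keeping traversal possible. A minimal sketch of the simplest form of that idea, sharing identical subtrees so the tree becomes a DAG, might look as follows. The tuple-based tree shape and the names `share_subtrees`, `expand`, and `count_nodes` are assumptions made for illustration; this is not the algorithm from the paper, which handles more general patterns.

```python
# Sketch only: a tree is (label, (child, child, ...)). This shape is an assumption.

def share_subtrees(tree, table=None):
    """Return (node_id, table); table maps each distinct subtree to a single id."""
    if table is None:
        table = {}
    label, children = tree
    # Children are replaced by their ids, so equal subtrees produce equal keys.
    child_ids = tuple(share_subtrees(child, table)[0] for child in children)
    key = (label, child_ids)
    if key not in table:
        table[key] = len(table)
    return table[key], table


def expand(node_id, table):
    """Rebuild the original tree from the shared form (shows nothing was lost)."""
    by_id = {i: key for key, i in table.items()}
    label, child_ids = by_id[node_id]
    return (label, tuple(expand(c, table) for c in child_ids))


def count_nodes(tree):
    """Number of nodes in the uncompressed tree."""
    label, children = tree
    return 1 + sum(count_nodes(child) for child in children)


book = ("book", (("title", ()), ("author", ())))
doc = ("library", (book, book, book))
root_id, table = share_subtrees(doc)
```

In this toy document the three identical `book` subtrees collapse into one shared entry, so the table holds far fewer entries than the tree has nodes, and `expand` can still walk the shared form edge by edge.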

Statistical Learning Techniques for Costing XML Queries

by Ning Zhang, Peter J. Haas, Vanja Josifovski, Guy M. Lohman, Chun Zhang - In VLDB, 2005
Cited by 36 (4 self)
Developing cost models for query optimization is significantly harder for XML queries than for traditional relational queries. The reason is that XML query operators are much more complex than relational operators such as table scans and joins. In this paper, we propose a new approach, called Comet, to modeling the cost of XML operators; to our knowledge, Comet is the first method ever proposed for addressing the XML query costing problem. As in relational cost estimation, Comet exploits a set of system catalog statistics that summarizes the XML data; the set of "simple path" statistics that we propose is new, and is well suited to the XML setting.

Citation Context

...ocus of considerable research and development activity over the past few years. A wide variety of join-based, navigational, and hybrid XPath processing techniques are now available; see, for example, [3, 4, 11, 25]. Each of these techniques can exploit structural and/or value-based indexes. An XML query optimizer can therefore ...

XSEED: Accurate and fast cardinality estimation for XPath queries

by Ning Zhang, M. Tamer Özsu, Ashraf Aboulnaga, Ihab F. Ilyas - In Proc. 22nd Int. Conf. on Data Engineering (ICDE), 2006
Cited by 22 (1 self)
We propose XSEED, a synopsis of path queries for cardinality estimation that is accurate, robust, efficient, and adaptive to memory budgets. XSEED starts from a very small kernel, and then incrementally updates information of the synopsis. With such an incremental construction, a synopsis structure can be dynamically configured to accommodate different memory budgets. Cardinality estimation based on XSEED can be performed very efficiently and accurately. Extensive experiments on both synthetic and real data sets show that even with less memory, XSEED could achieve accuracy that is an order of magnitude better than that of other synopsis structures. The cardinality estimation time is under 2% of the actual querying time for a wide range of queries in all test cases.

Citation Context

...tructing and maintaining the XSEED kernel and HET, and utilizing them to predict the cardinality. In the construction phase, the XML document is first parsed to generate the NoK XML storage structure [14], the path tree [1], and the XSEED kernel. The HET is constructed based on these three data structures if it is pre-computed. In the estimation phase, the XML documents ...

FIX: Feature-based Indexing Technique for XML Documents

by Ning Zhang, M. Tamer Özsu, Ihab F. Ilyas, Ashraf Aboulnaga, 2006
Cited by 19 (2 self)
Indexing large XML databases is crucial for efficient evaluation of XML twig queries. In this paper, we propose a feature-based indexing technique, called FIX, based on spectral graph theory. The basic idea is that for each twig pattern in a collection of XML documents, we calculate a vector of features based on its structural properties. These features are used as keys for the patterns and stored in a B+ tree. Given an XPath query, its feature vector is first calculated and looked up in the index. Then a further refinement phase is performed to fetch the final results. We experimentally study the indexing technique over both synthetic and real data sets. Our experiments show that FIX provides great pruning power and could gain an order of magnitude performance improvement for many XPath queries over existing evaluation techniques.
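
The lookup-then-refine flow described in the FIX abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: the feature vector here is a stand-in (node count and height) rather than the spectral features the paper uses, a sorted list searched with `bisect` stands in for the B+ tree, and refinement is plain structural equality. The names `features`, `FeatureIndex`, `candidates`, and `lookup` are hypothetical.

```python
import bisect

# Sketch only: a twig pattern is (label, (child, child, ...)).


def features(tree):
    """Stand-in feature vector (node count, height); NOT FIX's spectral features."""
    label, children = tree
    if not children:
        return (1, 1)
    child_feats = [features(child) for child in children]
    size = 1 + sum(f[0] for f in child_feats)
    height = 1 + max(f[1] for f in child_feats)
    return (size, height)


class FeatureIndex:
    """Sorted list of (feature key, pattern); a stand-in for a B+ tree."""

    def __init__(self):
        self._keys = []      # feature vectors, kept sorted
        self._patterns = []  # patterns, parallel to _keys

    def insert(self, pattern):
        key = features(pattern)
        pos = bisect.bisect_left(self._keys, key)
        self._keys.insert(pos, key)
        self._patterns.insert(pos, pattern)

    def candidates(self, query):
        """Filter phase: every pattern whose feature key equals the query's key."""
        key = features(query)
        lo = bisect.bisect_left(self._keys, key)
        hi = bisect.bisect_right(self._keys, key)
        return self._patterns[lo:hi]

    def lookup(self, query):
        """Refinement phase: keep only candidates that really match the query."""
        return [p for p in self.candidates(query) if p == query]


twig_bc = ("a", (("b", ()), ("c", ())))
twig_bd = ("a", (("b", ()), ("d", ())))
twig_chain = ("a", (("b", (("c", ()),)),))
twig_leaf = ("x", ())

index = FeatureIndex()
for twig in (twig_bc, twig_bd, twig_chain, twig_leaf):
    index.insert(twig)
```

Because identical patterns always have identical feature keys, the filter phase never drops a true match; it only lets through false positives (here `twig_bd` when querying for `twig_bc`), which the refinement phase then discards.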

Efficient updates in dynamic XML data: from binary string to quaternary string

by Changqing Li, Tok Wang Ling, Min Hu - The VLDB Journal
Cited by 18 (4 self)
Abstract not found

System RX: one part relational, one part XML

by Kevin Beyer, Roberta J. Cochrane, Vanja Josifovski, Jim Kleewein, George Lapis, Guy Lohman, Bob Lyle, Fatma Özcan, Hamid Pirahesh, Normen Seemann, Tuong Truong, Bert Van der Linden, Brian Vickery, Chun Zhang - In Proceedings of ACM SIGMOD International Conference on Management of Data, 2005
Cited by 17 (0 self)
This paper describes the overall architecture and design aspects of a hybrid relational and XML database system called System RX. We believe that such a system is fundamental in the evolution of enterprise data management solutions: XML and relational data will co-exist and complement each other in enterprise solutions. Furthermore, a successful XML repository requires much of the same infrastructure that already exists in a relational database management system. Finally, XML query languages have considerable conceptual and functional overlap with relational data-flow engines. System RX is the first truly hybrid system that co-mingles XML and relational data, giving them equal footing. The new support for XML includes native support for storage and indexing as well as query compilation and evaluation support for the latest industry-standard query languages, SQL/XML and XQuery. By building a hybrid system, we leverage more than 20 years of data management research to advance XML technology to the same standards expected from mature relational systems.

Citation Context

...d list indexes are created to enable efficient structural join algorithms for ancestor/descendant paths. More recently, other schemes for native storage of XML documents has been proposed in [25] and [41] that are similar to ours. The XML data is stored in a native tree format in which document nodes are in most cases clustered together on a page. Bulk processing is performed using indexes, while the ...

Query optimization in XML structured-document databases

by Dunren Che, Karl Aberer, M. Tamer Özsu - The VLDB Journal, 2006
Cited by 16 (0 self)
While the information published in the form of XML-compliant documents keeps fast mounting up, efficient and effective query processing and optimization for XML have now become more important than ever. This article reports our recent advances in XML structured-document query optimization. In this article, we elaborate on a novel approach and the techniques developed for XML query optimization. Our approach performs heuristic-based algebraic transformations on XPath queries, represented as PAT algebraic expressions, to achieve query optimization. This article first presents a comprehensive set of general equivalences with regard to XML documents and XML queries. Based on these equivalences, we developed a large set of deterministic algebraic transformation rules for XML query optimization. Our approach is unique, in that it performs exclusively deterministic transformations on queries for fast optimization. The deterministic nature of the proposed approach straightforwardly renders high optimization efficiency and simplicity in implementation. Our approach is a logical-level one, which is independent of any particular storage model. Therefore, the optimizers developed based on our approach can be easily adapted to a broad range of XML data/information servers to achieve fast query optimization. Experimental study confirms the validity and effectiveness of the proposed approach.

Tree-Pattern Queries on a Lightweight XML Processor

by Mirella M. Moro, Zografoula Vagena, Vassilis J. Tsotras - In VLDB, 2005
Cited by 16 (1 self)
Popular XML languages, like XPath, use "tree-pattern" queries to select nodes based on their structural characteristics. While many processing methods have already been proposed for such queries, none of them has found its way to any of the existing "lightweight" XML engines (i.e. engines without optimization modules). The main reason is the lack of a systematic comparison of query methods under a common storage model. In this work, we aim to fill this gap and answer two important questions: what the relative similarities and important differences among the tree-pattern query methods are, and if there is a prominent method among them in terms of effectiveness and robustness that an XML processor should support.

Citation Context

...posed using binary operators, and then some optimization is applied to produce an efficient plan [29]). As a result, numerous proposals attempt to handle tree-pattern queries with holistic techniques [5, 11, 28, 25, 32]. Additionally, index structures have also been introduced [14, 23, 10, 7, 5, 17, 18, 8] to further improve performance. A common characteristic for all holistic approaches is that some preprocessing i...

Efficient Processing of XML Path Queries Using the Disk-Based F&B Index

by Wei Wang, Hongzhi Wang, Hongjun Lu, Haifeng Jiang, Xuemin Lin, Jianzhong Li - In VLDB, 2005
Cited by 14 (0 self)
Abstract not found

Compact Access Control Labeling for Efficient Secure XML Query Evaluation

by Huaxin Zhang, Ning Zhang, Kenneth Salem, Donghui Zhuo, 2004
Cited by 9 (2 self)
Fine-grained access controls for XML define access privileges at the granularity of individual XML nodes. In this paper, we present a fine-grained access control mechanism for XML data. This mechanism exploits the structural locality of access rights as well as correlations among the access rights of different users to produce a compact physical encoding of the access control data. This encoding can be constructed using a single pass over a labeled XML database. It is block-oriented and suitable for use in secondary storage. We show how this access control mechanism can be integrated with a next-of-kin (NoK) XML query processor to provide efficient, secure query evaluation. The key idea is that the structural information of the nodes and their encoded access controls are stored together so the access privileges can be checked efficiently. Our evaluation shows that the access control mechanism introduces little overhead into the query evaluation process.

Citation Context

...’s access control data in memory. The DOL access control representation is highly compatible with next-of-kin (NoK) pattern matching, which is an efficient technique for processing XML twig queries [19]. NoK query processing uses a compact representation of document structure to evaluate some kinds of structural query constraints (e.g. parent/child relationships) very efficiently. In this paper, we ...
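
The citation context above mentions evaluating parent/child constraints directly on a compact representation of document structure. As a rough sketch of what that can look like, the example below flattens a tree into a token list in document order, with a closing marker after each subtree, and follows a child-axis path without rebuilding the tree. The encoding, the `CLOSE` marker, and all function names are assumptions for illustration; this is not claimed to be the actual NoK storage format or matching algorithm.

```python
CLOSE = ")"  # assumed end-of-subtree marker; element labels must never equal it


def encode(tree):
    """Flatten (label, children) into tokens: label, child tokens..., CLOSE."""
    label, children = tree
    tokens = [label]
    for child in children:
        tokens.extend(encode(child))
    tokens.append(CLOSE)
    return tokens


def matching_close(tokens, i):
    """Index of the CLOSE token that ends the subtree starting at position i."""
    depth = 0
    for j in range(i, len(tokens)):
        depth += -1 if tokens[j] == CLOSE else 1
        if depth == 0:
            return j
    raise ValueError("unbalanced encoding")


def first_child(tokens, i):
    """Position of the first child of node i, or None if i is a leaf."""
    return i + 1 if tokens[i + 1] != CLOSE else None


def next_sibling(tokens, i):
    """Position of the following sibling of node i, or None if there is none."""
    j = matching_close(tokens, i) + 1
    return j if j < len(tokens) and tokens[j] != CLOSE else None


def children(tokens, i):
    """Yield the positions of all children of node i, in document order."""
    child = first_child(tokens, i)
    while child is not None:
        yield child
        child = next_sibling(tokens, child)


def match_child_path(tokens, labels):
    """Positions reached by the child-axis path /labels[0]/labels[1]/..."""
    frontier = [0] if tokens[0] == labels[0] else []
    for label in labels[1:]:
        frontier = [c for n in frontier for c in children(tokens, n)
                    if tokens[c] == label]
    return frontier


doc = ("lib", (("book", (("title", ()),)),
               ("book", (("title", ()), ("author", ())))))
tokens = encode(doc)
titles = match_child_path(tokens, ["lib", "book", "title"])
authors = match_child_path(tokens, ["lib", "book", "author"])
```

Here `match_child_path` plays the role of evaluating a simple path such as `/lib/book/title`; the returned integers are token positions, which a real system would map back to stored node data.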
