XBench benchmark and performance testing of XML DBMSs
- In ICDE, 2004
Cited by 35 (1 self)
XML support is being added to existing database management systems (DBMSs) and native XML systems are being developed both in industry and in academia. The individual performance characteristics of these approaches as well as the relative performance of various systems are an ongoing concern. In this paper we discuss the XBench XML benchmark and report on the relative performance of various DBMSs. XBench is a family of XML benchmarks which recognizes that the XML data that DBMSs manage are quite varied and no one database schema and workload can properly capture this variety. Thus, the members of this benchmark family have been defined for capturing diverse application domains.
A Succinct Physical Storage Scheme for Efficient Evaluation of Path Queries in XML
- In ICDE’04, pages 54–65, 2004
Cited by 27 (12 self)
Path expressions are ubiquitous in XML processing languages. Existing approaches evaluate a path expression by selecting nodes that satisfy the tag-name and value constraints and then joining them according to the structural constraints. In this paper, we propose a novel approach, next-of-kin (NoK) pattern matching, to speed up the node-selection step, and to reduce the join size significantly in the second step. To efficiently perform NoK pattern matching, we also propose a succinct XML physical storage scheme that is adaptive to updates and streaming XML as well. Our performance results demonstrate that the proposed storage scheme and path evaluation algorithm are highly efficient and outperform the other tested systems in most cases.
Full-fledged algebraic XPath processing in Natix
- In 21st International Conference on Data Engineering (ICDE’05). IEEE Computer Society, 2005
Cited by 25 (11 self)
We present the first complete translation of XPath into an algebra, paving the way for a comprehensive, state-of-the-art XPath (and later on, XQuery) compiler based on algebraic optimization techniques. Our translation includes all XPath features such as nested expressions, position-based predicates and node-set functions. The translated algebraic expressions can be executed using the proven, scalable, iterator-based approach, as we demonstrate in the form of a corresponding physical algebra in our native XML DBMS Natix. A first glance at performance results shows that even without further optimization of the expressions, we provide a competitive evaluation technique for XPath queries.
Nested Queries and Quantifiers in an Ordered Context
- In ICDE, 2004
Cited by 23 (12 self)
We present algebraic equivalences that allow nested algebraic expressions to be unnested for order-preserving algebraic operators. We illustrate how these equivalences can be applied successfully to unnest nested queries given in the XQuery language. Measurements illustrate the performance gains possible by unnesting.
An analysis of XML database solutions for the management of MPEG-7 media descriptions
- ACM Computing Surveys, 2003
Cited by 23 (0 self)
MPEG-7 constitutes a promising standard for the description of multimedia content. It can be expected that a lot of applications based on MPEG-7 media descriptions will be set up in the near future. Therefore, means for the adequate management of large amounts of MPEG-7-compliant media descriptions are certainly desirable. Essentially,
Incorporating XSL Processing Into Database Engines
- 2002
Cited by 21 (1 self)
The two observations that 1) many XML documents are stored in a database or generated from data stored in a database and 2) processing these documents with XSL stylesheet processors is an important, often recurring task justify a closer look at the current situation. Typically, the XML document is retrieved or constructed from the database, exported, parsed, and then processed by a special XSL processor. This cumbersome process clearly sets the goal to incorporate XSL stylesheet processing into the database engine.
Efficient Processing of XML Twig Queries with OR-Predicates
- 2004
Cited by 18 (0 self)
An XML twig query, represented as a labeled tree, is essentially a complex selection predicate on both structure and content of an XML document. Twig query matching has been identified as a core operation in querying tree-structured XML data. A number of algorithms have been proposed recently to process a twig query holistically. Those algorithms, however, only deal with twig queries without OR-predicates. A straightforward approach that first decomposes a twig query with OR-predicates into multiple twig queries without OR-predicates and then combines their results is obviously not optimal in most cases. In this paper, we study novel holistic-processing algorithms for twig queries with OR-predicates without decomposition. In particular, we present a merge-based algorithm for sorted XML data and an index-based algorithm for indexed XML data. We show that holistic processing is much more efficient than the decomposition approach. Furthermore, we show that using indexes can significantly improve the performance for matching twig queries with OR-predicates, especially when the queries have large inputs but relatively small outputs.
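The decomposition baseline mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the `Node` and `Or` classes, the `decompose` and `show` helpers, and the bracketed rendering are hypothetical names chosen for illustration, and only the expansion step is shown (evaluating each OR-free query and taking the union of the results is left out).

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Or:
    """Hypothetical OR-predicate: any one of the alternative branches may match."""
    options: Tuple["Node", ...]


@dataclass(frozen=True)
class Node:
    """Hypothetical twig query node: a tag plus child branches (Node or Or)."""
    tag: str
    children: Tuple[Union["Node", Or], ...] = ()


def decompose(q: Node) -> List[Node]:
    """Expand q into the list of OR-free twig queries it is equivalent to."""
    per_child = []
    for child in q.children:
        if isinstance(child, Or):
            # Each alternative contributes its own OR-free expansions.
            alts = [t for opt in child.options for t in decompose(opt)]
        else:
            alts = decompose(child)
        per_child.append(alts)
    # Pick one expansion per child branch, in every possible combination.
    return [Node(q.tag, combo) for combo in product(*per_child)]


def show(q: Node) -> str:
    """Render an OR-free twig in a compact bracket notation."""
    return q.tag + "".join(f"[{show(c)}]" for c in q.children)


# Query a[b or c][d or e], written with the hypothetical classes above.
query = Node("a", (Or((Node("b"), Node("c"))), Or((Node("d"), Node("e")))))
expanded = [show(t) for t in decompose(query)]
print(expanded)
```

In this sketch, two binary OR-predicates already yield four OR-free queries, since the number of queries multiplies with each OR-predicate. That growth is one plausible reading of why the abstract calls decomposition non-optimal.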
On Distributing XML Repositories
- 2003
Cited by 14 (1 self)
XML is increasingly used not only for data exchange but also to represent arbitrary data sources as virtual XML repositories. In many application scenarios, fragments of such a repository are distributed over the Web. However, design and query models for distributed XML data have not yet been studied in detail.
Query optimization in XML structured-document databases
- The VLDB Journal, 2006
Cited by 13 (0 self)
While the information published in the form of XML-compliant documents keeps mounting rapidly, efficient and effective query processing and optimization for XML have now become more important than ever. This article reports our recent advances in XML structured-document query optimization. In this article, we elaborate on a novel approach and the techniques developed for XML query optimization. Our approach performs heuristic-based algebraic transformations on XPath queries, represented as PAT algebraic expressions, to achieve query optimization. This article first presents a comprehensive set of general equivalences with regard to XML documents and XML queries. Based on these equivalences, we developed a large set of deterministic algebraic transformation rules for XML query optimization. Our approach is unique, in that it performs exclusively deterministic transformations on queries for fast optimization. The deterministic nature of the proposed approach straightforwardly renders high optimization efficiency and simplicity in implementation. Our approach is a logical-level one, which is independent of any particular storage model. Therefore, the optimizers developed based on our approach can be easily adapted to a broad range of XML data/information servers to achieve fast query optimization. Experimental study confirms the validity and effectiveness of the proposed approach.
Path Summaries and Path Partitioning in Modern XML Databases
- World Wide Web (2008) 11:117–151, 2008
Cited by 11 (2 self)
XML path summaries are compact structures representing all the simple parent-child paths of an XML document. Such paths have also been used in many works as a basis for partitioning the document’s content in a persistent store, under the form of path indices or path tables. We revisit the notions of path summaries and path-driven storage model in the context of current-day XML databases. This context is characterized by complex queries, typically expressed in an XQuery subset, and by the presence of efficient encoding techniques such as structural node identifiers. We review a path summary’s many uses for query optimization, and give them a common basis, namely relevant paths. We discuss summary-based tree pattern minimization and present some efficient summary-based minimization heuristics. We consider relevant path computation and provide a time- and memory-efficient computation algorithm. We combine the principle of path partitioning with the presence of structural identifiers in a simple path-partitioned storage model, which allows for selective data access and efficient query plans. This model improves the efficiency of twig query processing by up to two orders of magnitude over the similar

