XPathMark: an XPath benchmark for the XMark generated data
- In: Proceedings of the International XML Database Symposium (XSym). Volume 3671 of LNCS. (2005) 129–143. http://www.science.uva.nl/~francesc/xpathmark
Cited by 59 (3 self)
Abstract. We propose XPathMark, an XPath benchmark on top of the XMark generated data. It consists of a set of queries which covers the main aspects of the language XPath 1.0. These queries have been designed for XML documents generated under XMark, a popular benchmark for XML data management. We suggest a methodology to evaluate the XPathMark on a given XML engine and, by way of example, we evaluate two popular XML engines using the proposed benchmark.
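To give a feel for the kind of query such a benchmark contains, here is a minimal sketch in Python. The tiny document is a hypothetical stand-in loosely modeled on the XMark auction layout (`site`, `people`, `person`), and the two queries are illustrative, not taken from XPathMark. Note that the standard library's `xml.etree.ElementTree` supports only a subset of XPath 1.0, so it could not run a full XPath 1.0 benchmark.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document, loosely modeled on XMark's layout.
XMARK_LIKE = """<site><people>
<person id="person0"><name>Ann</name></person>
<person id="person1"><name>Bob</name><emailaddress>bob@example.org</emailaddress></person>
</people></site>"""

root = ET.fromstring(XMARK_LIKE)  # root is the <site> element

# Child-axis path: the name of every person.
all_names = [n.text for n in root.findall("./people/person/name")]

# Descendant axis plus a predicate: persons that have an emailaddress child.
with_email = [p.get("id") for p in root.findall(".//person[emailaddress]")]

print(all_names)
print(with_email)
```

A benchmark run would time queries like these over much larger generated documents and over several engines.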
XCheck: a platform for benchmarking XQuery engines
- In Proceedings of the 32nd International Conference on Very Large Data Bases
, 2006
Cited by 14 (6 self)
XCheck is a tool for assessing the relative performance of different XQuery/XPath engines by means of benchmarks consisting of a set of XML queries and a set of XML documents. Given a benchmark and a set of engines, XCheck runs the benchmark on these engines and produces highly informative performance output. The current version of XCheck contains the most popular XQuery and XPath benchmarks and the following
XQuery Streaming à la Carte
, 2007
Cited by 10 (2 self)
Existing work on XML query evaluation has either focused on algebraic optimization techniques suitable for XML databases, or on algorithms to efficiently process XML messages represented as a stream of parsing events. In practice, complex applications often must handle both. In this paper, we develop a physical algebra that combines streaming operators with other standard relational and XML operators. Our physical model includes marked XML streams, which permit efficient XPath evaluation, but can only be consumed once. This constraint restricts the use of streaming operators to fragments of a query plan that only access data using depth-first traversal. We develop static analysis techniques to decide which fragment of a plan can be streamed. Our experiments demonstrate the benefits of blending streaming with other evaluation techniques.
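The consume-once, depth-first constraint described above can be illustrated with a small Python sketch. This is not the paper's physical algebra or its marked streams. It uses the standard library's `ET.iterparse` as a stand-in event stream, and `stream_match` and the sample document are hypothetical names introduced for illustration.

```python
import io
import xml.etree.ElementTree as ET

def stream_match(source, path):
    """Yield the text of elements whose root-to-node tag path equals `path`,
    reading the parse events once, in document (depth-first) order."""
    stack = []  # tags of the currently open elements
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
        else:
            if stack == path:
                yield (elem.text or "").strip()
            stack.pop()
            elem.clear()  # discard content that has already been processed

# Hypothetical sample input.
SAMPLE = (b"<site><people><person><name>Ann</name></person>"
          b"<person><name>Bob</name></person></people></site>")

matches = stream_match(io.BytesIO(SAMPLE), ["site", "people", "person", "name"])
first = list(matches)   # consumes the stream
second = list(matches)  # the generator is exhausted, so nothing is left
print(first, second)
```

A plan fragment that needs to revisit the same data cannot be served by such a stream, which is the kind of case the static analysis in the abstract has to detect.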
An Analysis of the Current XQuery Benchmarks
- In International Workshop on Performance and Evaluation of Data Management Systems (EXPDB)
, 2006
Cited by 10 (0 self)
This paper presents an extensive survey of the currently publicly available XQuery benchmarks (XMach-1, XMark, XOO7, the Michigan benchmark, and XBench) from different perspectives. We address three simple questions about these benchmarks: How are they used? What do they measure? What can one learn from using them? Our conclusions are based on a usage analysis, on an in-depth analysis of the benchmark queries, and on experiments run
Towards microbenchmarking XQuery
- In ExpDB
, 2006
Cited by 9 (2 self)
A substantial part of the database research field focusses on optimizing XQuery evaluation. However, optimization techniques are rarely validated by means of cross-platform benchmarking. The reason for this is that there is a lack of tools that allow one to easily compare different implementations of isolated language features. This implies that there is no overview of which engines perform best at certain XQuery aspects, which in turn makes it hard to pick a reference platform for an objective comparison. This paper is a first step in a larger effort to bring an overview of the available implementations along with their strengths and weaknesses. It is meant to guide implementors in benchmarking and improving their products.
Fast answering of XPath query workloads on web collections
- Springer LNCS
, 2007
Cited by 7 (4 self)
Several web applications (such as processing RSS feeds or web service messages) rely on XPath-based data manipulation tools. Web developers need to use XPath queries effectively on increasingly larger web collections containing hundreds of thousands of XML documents. Even when tasks only need to deal with a single document at a time, developers benefit from understanding the behaviour of XPath expressions across multiple documents (e.g., what will a query return when run over the thousands of hourly feeds collected during the last few months?). Dealing with the (highly variable) structure of such web collections poses additional challenges. This paper introduces DescribeX, a powerful framework that is capable of describing arbitrarily complex XML summaries of web collections, enabling the efficient evaluation of XPath workloads (supporting all the axes and language constructs in XPath). Experiments validate that DescribeX enables existing document-at-a-time XPath tools to scale up to multi-gigabyte XML collections.
Benchmarking XML data warehouses
Cited by 3 (2 self)
Abstract. With the emergence of XML as a new standard for representing business data, new decision-support applications (namely, XML data warehouses) are being developed. To ensure their feasibility, the issue of performance must be addressed. Performance in general, and the efficiency of performance optimization techniques in particular, is usually assessed with the help of benchmarks. However, there is, to the best of our knowledge, no XML decision-support benchmark. In this paper, we present the XML Warehouse Benchmark (XWB), which aims at filling this gap. XWB is based on an original reference model for XML data warehouses, and proposes a test XML data warehouse and its associated XQuery decision-support workload that are derived from the well-known relational decision-support benchmark TPC-H. Though at an early stage of development, XWB has been successfully used to test the efficiency of indexing and view materialization techniques in XML data warehouses.
An Analysis of Relational Storage Strategies for Partially Structured XML
Cited by 1 (1 self)
Abstract: This paper presents a performance analysis of strategies for storing XML data sets in relational databases, focusing on XML datasets that are a combination of structured and semi-structured data. The analysis demonstrates advantages of a hybrid approach combining structure mapping and XML data type instances. However, problems remain with current technology with regard to scaling the approach to large data sets. Anomalous results are also identified, as is a threshold at which the cost of data shredding outweighs the advantages of structure mapping.
A Scheme for Evaluating XML Engine on
Abstract. As XML use widens, an increasing number of DBMS vendors are considering integrating XML data management into traditional relational databases. A comprehensive methodology is therefore needed to evaluate the XML engine in an RDBMS correctly. In this paper, we analyze the characteristics of XML engines and propose a strategy for evaluating the XML engine in an RDBMS. We believe that the evaluation should include functional evaluation and performance evaluation, and cover several major aspects of a database such as storage, query, and update. We then design an evaluation scheme for the XML engine in an RDBMS according to the strategy. The scheme describes an evaluation scenario and contains a data set, a workload, and an index set. The data set reflects the characteristics of both data-centric and document-centric XML data. The workload covers all of the W3C requirements for XQuery. The index set covers the aspects of storage, indexing, query, and update. Finally, we run an experiment that tests an actual computer system using the proposal. The results show that the proposal is suitable. Index Terms: DBMS; XML; W3C; XQuery; Evaluation
Benchmark Frameworks and τBench
, 2012
Software engineering frameworks tame the complexity of large collections of classes by identifying structural invariants, regularizing interfaces, and increasing sharing across the collection. We wish to appropriate these benefits for families of closely related benchmarks, say for evaluating query engine implementation strategies. We introduce the notion of a benchmark framework, an ecosystem of benchmarks that are related in semantically-rich ways and enabled by organizing principles. A benchmark framework is realized by iteratively changing one individual benchmark into another, say by modifying the data format, adding schema constraints, or instantiating a different workload. Paramount to our notion of benchmark frameworks are the ease of describing the differences between individual benchmarks and the utility of methods to validate the correctness of each benchmark component by exploiting the overarching ecosystem. As a detailed case study, we introduce τBench, a benchmark framework consisting of ten individual benchmarks, spanning XML, XQuery, XML Schema, and PSM, along with temporal extensions to each. A second case study examines the Mining Unstructured Data benchmark framework and a third examines the potential benefits of rendering the TPC family as a benchmark framework.