Rewriting Regular XPath Queries on XML Views
- In Proc. ICDE
, 2007
Cited by 37 (3 self)
We study the problem of answering queries posed on virtual views of XML documents, a problem commonly encountered when enforcing XML access control and integrating data. We approach the problem by rewriting queries on views into equivalent queries on the underlying document, and thus avoid the overhead of view materialization and maintenance. We consider possibly recursively defined XML views and study the rewriting of both XPath and regular XPath queries. We show that while rewriting is not always possible for XPath over recursive views, it is for regular XPath; however, the rewritten query may be of exponential size. To avoid this prohibitive cost we propose a rewriting algorithm that characterizes rewritten queries as a new form of automata, and an efficient algorithm to evaluate the automaton-represented queries. These allow us to answer queries on views in linear time. We have fully implemented a prototype system, SMOQE, which yields the first regular XPath engine and a practical solution for answering queries over possibly recursively defined XML views.
Structured Materialized Views for XML Queries
, 2007
Cited by 29 (7 self)
The performance of XML database queries can be greatly enhanced by rewriting them using materialized views. We study the problem of rewriting a query using materialized views, where both the query and the views are described by a tree pattern language, appropriately extended to capture a large XQuery subset. The pattern language features optional nodes and nesting, which make it possible to capture the data needs of nested XQueries. The language also allows describing storage features such as structural identifiers, which enlarge the space of rewritings. We study pattern containment and equivalent rewriting under the constraints expressed in a structural summary, whose enhanced form also entails integrity constraints. Our approach is implemented in the ULoad [6] prototype and we present a performance analysis.
Query Translation from XPath to SQL in the Presence of Recursive DTDs
- In VLDB
, 2005
Cited by 13 (1 self)
The interaction between recursion in XPATH and recursion in DTDs makes it challenging to answer XPATH queries on XML data that is stored in an RDBMS via schema-based shredding. We present a new approach to translating XPATH queries into SQL queries with a simple least fixpoint (LFP) operator, which is already supported by most commercial RDBMS. The approach is based on our algorithm for rewriting XPATH queries into regular XPATH expressions, which are capable of capturing both DTD recursion and XPATH queries in a uniform framework. Furthermore, we provide an algorithm for translating regular XPATH queries to SQL queries with LFP, and optimization techniques for minimizing the use of the LFP operator.
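The role of a least fixpoint operator can be illustrated with a small sketch. It is not the paper's translation algorithm: it assumes a hypothetical edge table `node(id, parent, tag)` rather than schema-based shredding, and it uses SQL's `WITH RECURSIVE` (available in SQLite through Python's `sqlite3` module) as the fixpoint operator to answer a descendant step over a recursively nested document.

```python
import sqlite3

# Hypothetical shredded document in which "part" elements nest recursively.
# Each row is (id, parent id, tag); the layout is an assumption for illustration.
NODES = [
    (1, None, "part"),
    (2, 1, "part"),
    (3, 2, "part"),
    (4, 2, "name"),
    (5, 1, "name"),
]


def descendants_with_tag(nodes, root_id, tag):
    """Return ids of all descendants of root_id that carry the given tag."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, tag TEXT)")
    conn.executemany("INSERT INTO node VALUES (?, ?, ?)", nodes)
    rows = conn.execute(
        """
        WITH RECURSIVE reach(id) AS (
            -- base case: children of the context node
            SELECT id FROM node WHERE parent = ?
            UNION
            -- recursive case: children of nodes reached so far
            SELECT n.id FROM node n JOIN reach r ON n.parent = r.id
        )
        SELECT n.id FROM node n JOIN reach r ON n.id = r.id
        WHERE n.tag = ?
        ORDER BY n.id
        """,
        (root_id, tag),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

Calling `descendants_with_tag(NODES, 1, "part")` corresponds roughly to evaluating `//part` below node 1. The recursion depth is not fixed in the query text, which is why a fixpoint operator is needed once the DTD is recursive.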
Teaching an Old Elephant New Tricks
- In Conf. on Innovative Data Systems Research (CIDR)
, 2009
Cited by 9 (0 self)
ABSTRACT
In recent years, column stores (or C-stores for short) have emerged as a novel approach to deal with read-mostly data warehousing applications. Experimental evidence suggests that, for certain types of queries, the new features of C-stores result in orders of magnitude improvement over traditional relational engines. At the same time, some C-store proponents argue that C-stores are fundamentally different from traditional engines, and therefore their benefits cannot be incorporated into a relational engine short of a complete rewrite. In this paper we challenge this claim and show that many of the benefits of C-stores can indeed be simulated in traditional engines with no changes whatsoever. We then identify some limitations of our "pure-simulation" approach for the case of more complex queries. Finally, we predict that traditional relational engines will eventually leverage most of the benefits of C-stores natively, as is currently happening in other domains such as XML data.

MOTIVATION
In the last couple of decades, new database applications have emerged with different requirements than those in traditional OLTP scenarios. A prominent example of this trend is data warehouses, which are characterized by read-mostly workloads, snowflake-like schemas, and ad-hoc complex aggregate queries. To address these scenarios, the database industry reacted in different ways. On one hand, traditional database vendors (e.g., Microsoft, IBM, and Oracle) augmented traditional database systems with new functionality, such as support for more complex execution plans, multicolumn index support, and the ability to automatically store, query and maintain materialized views defined over the original data. On the other hand, new players in the database market devised a different way to store and process read-mostly data. This line of work was pioneered by Sybase IQ [1] in the mid-nineties and subsequently adopted in other systems ...

... data by column results in better compression than what is possible in a row-store. Some compression techniques used in C-stores (such as dictionary or bitmap encoding) can also be applied to row-stores. However, RLE encoding, which replaces a sequence of the same value by a pair (value, count), is a technique that cannot be directly used in a row-store, because wide tuples rarely agree on all attributes. The final ingredient in a C-store is the ability to perform query processing over compressed data as much as possible (see ...).

C-stores claim to be much more efficient than traditional row-stores. The experimental evaluation in ... In this paper we challenge this claim by investigating ways to simulate C-stores inside row-stores. In Section 2 we show how to exploit some of the distinguishing characteristics of C-stores inside a row-store without any engine changes. Then, in Section 3 we discuss some limitations of this approach and predict how row-stores would eventually incorporate most of the benefits of a C-store without losing the ability to process non data-warehouse workloads.

Experimental Setting
All our experiments were conducted using an Intel Xeon 3.2GHz CPU with 2GB of RAM and a 250GB 7200RPM SATA hard drive running Windows Server 2003 and Microsoft SQL Server 2005. To validate our results, we use the same data set and workload proposed in the original C-store paper.
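The run-length encoding mentioned in the excerpt above, which replaces a run of equal values by a (value, count) pair, can be sketched in a few lines. This is a generic illustration and not code from the paper; the function names and sample data are hypothetical.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse each run of equal adjacent values into a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]


# A sorted single column compresses well because equal values are adjacent.
country_column = ["DE", "DE", "DE", "FR", "FR", "US"]

# Whole rows rarely repeat, so encoding tuples yields runs of length one.
rows = [("DE", 1), ("DE", 2), ("DE", 3), ("FR", 4), ("FR", 5), ("US", 6)]
```

Here `rle_encode(country_column)` produces three pairs for six values, while `rle_encode(rows)` produces six pairs of count one, which mirrors the excerpt's remark that wide tuples rarely agree on all attributes.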
An Efficient XML Schema Typing System
, 2003
Cited by 4 (0 self)
An XML Schema Typing System based on finite state automata is presented. Existing
Vassalos: “Improving the Efficiency of XPath Execution on Relational Systems”. EDBT
, 2006
Cited by 3 (1 self)
This work describes a method for processing XPath on a relational back-end that significantly limits the number of SQL joins required, takes advantage of the strengths of modern SQL query processors, exploits XML schema information and has low implementation complexity. The method is based on the splitting of XPath expressions into Primary Path Fragments (PPFs) and their subsequent combination using an efficient structural join method, and is applicable to all XPath axes. A detailed description of the method is followed by an experimental study that shows our technique yields significant efficiency improvements over other XPath processing techniques and systems.
Reconstructing XML Subtrees from Relational Storage of XML Documents
- Proceedings of the Second IEEE International Workshop on XML Schema and Data Management (XSDM05), in conjunction with ICDE05
, 2005
Cited by 3 (3 self)
databases to store and query XML documents. One important component of such systems is the XML subtree reconstruction, which reconstructs the subtrees rooted at the matching nodes of an XML query and returns them to the user as the query result. Existing reconstruction algorithms either do not support recursive XML view schema, or require expensive nested queries or joins of multiple relations. In this paper, we propose an efficient XML subtree reconstruction algorithm, Reconstruct, which overcomes these limitations and uses an efficient stack-based structural join algorithm to recover all the parent-child relationships between elements. One salient advantage of this algorithm is that it employs the inlining feature of the inlining-based storage of XML documents, which is known as one of the best relational XML storage schemes. Both our algorithmic analysis and experimental study show that Reconstruct is efficient and scalable.
Querying and Maintaining Ordered XML Data Using Relational Databases
Cited by 3 (0 self)
Although data stored in XML is of increasing importance, most existing data repositories are still managed by relational database systems. In light of this, recent XML database research has focused on extending relational database systems to handle XML data efficiently. While there are many issues in processing XML data efficiently, containment queries appear often and need to be optimized. Recently, structural joins have been proposed to process containment queries efficiently. To date, structural join algorithms are mostly based on stacks and/or external B-Tree indices. Most of these prototypes have been implemented on object databases. This paper proposes an efficient structural join algorithm that can be implemented on top of existing relational databases. Experiments show that our method performs far better than previous work in both queries and updates.
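For orientation, the kind of stack-based structural join that the abstract refers to as prior work can be sketched as follows. This is not the relational algorithm the paper proposes. The sketch assumes each element is labelled with a hypothetical (start, end) region, that both input lists are sorted by start, and that regions from one document are either nested or disjoint.

```python
def structural_join(ancestors, descendants):
    """Return (a, d) pairs where region a strictly contains region d.

    Both inputs are lists of (start, end) tuples sorted by start position.
    """
    result = []
    stack = []  # chain of nested ancestor regions that may still be open
    i = 0
    for d in descendants:
        # Push every ancestor candidate that starts before d.
        while i < len(ancestors) and ancestors[i][0] < d[0]:
            a = ancestors[i]
            # Discard regions that ended before this candidate starts.
            while stack and stack[-1][1] < a[0]:
                stack.pop()
            stack.append(a)
            i += 1
        # Discard regions that ended before d starts.
        while stack and stack[-1][1] < d[0]:
            stack.pop()
        # Every region left on the stack encloses d.
        result.extend((a, d) for a in stack)
    return result


# Regions for <a><b/><a><b/></a></a><b/>, numbered by tag position.
A_REGIONS = [(1, 10), (4, 9)]
B_REGIONS = [(2, 3), (5, 6), (11, 12)]
```

With these inputs, `structural_join(A_REGIONS, B_REGIONS)` answers the containment query `//a//b`. Each region is pushed and popped at most once, so the cost is linear in the input plus the output size.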
XML subtree reconstruction from relational storage of XML documents
, 2007
Cited by 2 (2 self)
Numerous researchers have proposed using relational databases to store and query XML documents. In these systems, the elements selected by an XML query are returned to an application either by select mode or by reconstruct mode. For the reconstruct mode, the XML subtrees that are rooted at the selected elements need to be extracted and reconstructed from the relational storage of XML documents. Therefore, XML subtree reconstruction is an important problem since its efficiency has a significant impact on XML query response time. In this paper, we propose (i) a linear XML subtree reconstruction algorithm Reconstruct to reconstruct an XML subtree from the structure-encoded sequence of the subtree that is extracted from the relational database by a structure-encoded sequence retrieval algorithm, (ii) a generic efficient structure-encoded sequence retrieval algorithm RD-SB for a schema-based relational XML storage, and (iii) a generic efficient structure-encoded sequence retrieval algorithm RD-SL for a schema-less relational XML storage. To the best of our knowledge, our algorithms provide the first generic solutions to the XML subtree reconstruction problem that are applicable to all relational XML storage schemes proposed in the literature. Finally, our experiments show that our algorithms are efficient and scalable.
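The idea of rebuilding a subtree in linear time from a sequence that encodes its structure can be illustrated with a minimal sketch. The abstract does not define the structure-encoded sequence, so the encoding here is an assumption: a list of hypothetical (depth, tag, text) triples in document order. The function is not the paper's Reconstruct algorithm.

```python
from xml.sax.saxutils import escape


def reconstruct(sequence):
    """Rebuild XML text from (depth, tag, text) triples given in document order."""
    out = []
    open_tags = []  # stack of elements whose end tag has not been written yet
    for depth, tag, text in sequence:
        # Close every open element that sits at this depth or deeper.
        while len(open_tags) > depth:
            out.append(f"</{open_tags.pop()}>")
        out.append(f"<{tag}>{escape(text)}")
        open_tags.append(tag)
    # Close whatever is still open once the sequence is exhausted.
    while open_tags:
        out.append(f"</{open_tags.pop()}>")
    return "".join(out)


# Hypothetical sequence for a small subtree; the root is at depth 0.
SEQUENCE = [
    (0, "book", ""),
    (1, "title", "XML & SQL"),
    (1, "author", "Doe"),
]
```

Each element is pushed onto the stack once and popped once, so the running time grows linearly with the length of the sequence.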
XTRON: An XML Data Management System using Relational Databases
Cited by 2 (0 self)
Recently, there has been plenty of interest in XML. Since the amount of data in XML format has rapidly increased, the need for effective storage and retrieval of XML data has arisen. Many database researchers and vendors have proposed various techniques and tools for XML data storage and retrieval in recent years. In this paper, we present an XML data management system using a relational database as a repository. Our XML management system stores XML data in a schema-independent manner, and translates a comprehensive subset of XQuery expressions into a single SQL statement. Also, our system does not modify the relational engine. In this paper, we also present the experimental results in order to demonstrate the efficiency and scalability of our system compared with well-known XML processing systems.