XML-to-SQL Query Translation Literature: The State of the Art and Open Problems
- In XSym
, 2003
Cited by 32 (0 self)
Recently, the database research literature has seen an explosion of publications with the goal of using an RDBMS to store and/or query XML data. The problems addressed and solved in this area are diverse. This diversity renders it difficult to know how the various results presented fit together, and even makes it hard to know what open problems remain. As a first step to rectifying this situation, we present a classification of the problem space and discuss how almost 40 papers fit into this classification. As a result of this study, we find that some basic questions are still open. In particular, for the XML publishing of relational data and for "schema-based" shredding of XML documents into relations, there is no published algorithm for translating even simple path expression queries (with the // axis) into SQL when the XML schema is recursive.
BLAS: an efficient XPath processing system
- PROCEEDINGS OF THE ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (SIGMOD
, 2004
Cited by 28 (2 self)
We present BLAS, a Bi-LAbeling based System, for efficiently processing complex XPath queries over XML data. BLAS uses P-labeling to process queries involving consecutive child axes, and D-labeling to process queries involving descendant axis traversal. The XML data is stored in labeled form, and indexed to optimize descendant axis traversals. Three algorithms are presented for translating complex XPath queries to SQL expressions, and two alternate query engines are provided. Experimental results demonstrate that the BLAS system has a substantial performance improvement compared to traditional XPath processing using D-labeling.
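As a rough illustration of labeling for descendant-axis queries, the sketch below assigns each node a (start, end, level) interval during a depth-first traversal, so that an ancestor-descendant test reduces to interval containment. This is a generic interval-labeling sketch, not the BLAS implementation: the tuple-based tree representation and the names `d_label` and `is_descendant` are assumptions made for illustration.

```python
from itertools import count


def d_label(tree):
    """Assign interval labels by depth-first traversal.

    `tree` is a hypothetical (tag, [children]) tuple. Returns a list of
    (tag, start, end, level) rows in document order.
    """
    counter = count(1)
    rows = []

    def visit(node, level):
        tag, children = node
        start = next(counter)
        row_index = len(rows)
        rows.append(None)  # reserve the slot to keep document order
        for child in children:
            visit(child, level + 1)
        end = next(counter)
        rows[row_index] = (tag, start, end, level)

    visit(tree, 1)
    return rows


def is_descendant(d, a):
    """True if row d lies strictly inside the interval of row a."""
    return a[1] < d[1] and d[2] < a[2]


doc = ("book", [("chapter", [("title", []),
                             ("section", [("title", [])])]),
                ("title", [])])
rows = d_label(doc)
chapters = [r for r in rows if r[0] == "chapter"]
# Evaluate //chapter//title as a join on interval containment.
titles_in_chapters = [d for a in chapters for d in rows
                      if d[0] == "title" and is_descendant(d, a)]
```

In a relational setting the same containment predicate would appear as a join condition over the stored labels, which is why an index on the interval bounds helps descendant-axis traversal.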
Designing and Evaluating an XPath Dialect for Linguistic Queries
- In Proc. of the 22nd Int. Conf. on Data Engineering (ICDE 2006
, 2006
Cited by 27 (7 self)
Linguistic research and natural language processing employ large repositories of ordered trees. XML, a standard ordered tree model, and XPath, its associated language, are natural choices for linguistic data and queries. However, several important expressive features required for linguistic queries are missing or hard to express in XPath. In this paper, we motivate and illustrate these features with a variety of linguistic queries. Then we propose extensions to XPath to support linguistic queries, and design an efficient query engine based on a novel labeling scheme. Experiments demonstrate that our language is not only sufficiently expressive for linguistic trees but also efficient for practical usage.
Approximate matching of hierarchical data using pq-grams
- IN PROC. OF VLDB
, 2005
Cited by 26 (6 self)
When integrating data from autonomous sources, exact matches of data items that represent the same real world object often fail due to a lack of common keys. Yet in many cases structural information is available and can be used to match such data. As a running example we use residential address information. Addresses are hierarchical structures and are present in many databases. Often they are the best, if not only, relationship between autonomous data sources. Typically the matching has to be approximate since the representations in the sources differ. We propose pq-grams to approximately match hierarchical information from autonomous sources. We define the pq-gram distance between ordered labeled trees as an effective and efficient approximation of the well-known tree edit distance. We analyze the properties of the pq-gram distance and compare it with the edit distance and alternative approximations. Experiments with synthetic and real world data confirm the analytic results and the scalability of our approach.
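To make the idea of a pq-gram profile concrete, here is a minimal sketch that collects the bag of pq-grams of an ordered labeled tree and compares two profiles. It assumes a hypothetical (label, [children]) tuple representation, pads missing ancestors and children with a dummy label `*`, and uses the normalization 1 - 2|P1 ∩ P2| / |P1 ⊎ P2| over bags. The function names and the default values of `p` and `q` are illustrative choices, and the exact definition should be checked against the paper.

```python
from collections import Counter

DUMMY = "*"


def pq_profile(tree, p=2, q=3):
    """Return the bag (Counter) of pq-gram label tuples of `tree`."""
    profile = Counter()

    def visit(node, ancestors):
        label, children = node
        # Stem: this node preceded by its p-1 nearest ancestors.
        stem = (ancestors + (label,))[-p:]
        base = (DUMMY,) * q
        if not children:
            profile[stem + base] += 1
            return
        # Slide a window of q consecutive children across the child list.
        for child in children:
            base = base[1:] + (child[0],)
            profile[stem + base] += 1
            visit(child, stem)
        for _ in range(q - 1):
            base = base[1:] + (DUMMY,)
            profile[stem + base] += 1

    visit(tree, (DUMMY,) * (p - 1))
    return profile


def pq_gram_distance(t1, t2, p=2, q=3):
    """Normalized bag distance between the two pq-gram profiles."""
    a, b = pq_profile(t1, p, q), pq_profile(t2, p, q)
    shared = sum((a & b).values())          # bag intersection
    total = sum(a.values()) + sum(b.values())  # bag union size
    return 1 - 2 * shared / total


addr1 = ("address", [("street", []), ("city", [])])
addr2 = ("address", [("street", []), ("town", [])])
distance = pq_gram_distance(addr1, addr2)
```

Identical trees yield a distance of 0, and trees with no pq-gram in common yield 1, so small structural differences between two address records show up as values in between.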
Why Off-the-Shelf RDBMSs are Better at XPath Than You Might Expect
, 2007
Cited by 24 (3 self)
To compensate for the inherent impedance mismatch between the relational data model (tables of tuples) and XML (ordered, unranked trees), tree join algorithms have become the prevalent means to process XML data in relational databases, most notably the TwigStack [6], structural join [1], and staircase join [13] algorithms. However, the addition of these algorithms to existing systems depends on a significant invasion of the underlying database kernel, an option intolerable for most database vendors. Here, we demonstrate that we can achieve comparable XPath performance without touching the heart of the system. We carefully exploit existing database functionality and accelerate XPath navigation by purely relational means: partitioned B-trees bring access costs to secondary storage to a minimum, while aggregation functions avoid an expensive computation and removal of duplicate result nodes to comply with the XPath semantics. Experiments carried out on IBM DB2 confirm that our approach can turn off-the-shelf database systems into efficient XPath processors.
Recursive XML Schemas, Recursive XML Queries, and Relational Storage: XML-to-SQL Query Translation
- In ICDE
, 2004
Cited by 23 (0 self)
We consider the problem of translating XML queries into SQL when XML documents have been stored in an RDBMS using a schema-based relational decomposition. Surprisingly, there is no published XML-to-SQL query translation algorithm for this scenario that handles recursive XML schemas. We present a generic algorithm to translate path expression queries into SQL in the presence of recursion in the schema and queries. This algorithm handles a general class of XML-to-Relational mappings, which includes all techniques proposed in the literature. Some of the salient features of this algorithm are: (i) it translates a path expression query into a single SQL query, irrespective of how complex the XML schema is; (ii) it uses the "with" clause in SQL99 to handle recursive queries even over non-recursive schemas; (iii) it reconstructs recursive XML subtrees with a single SQL query; and (iv) it shows that the support for linear recursion in SQL99 is sufficient for handling path expression queries over arbitrarily complex recursive XML schemas.
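The role of linear recursion can be sketched with a recursive common table expression. The example below is not the paper's algorithm and does not use a schema-based decomposition: for brevity it assumes a generic, hypothetical `node(id, parent, tag)` edge table in an in-memory SQLite database and evaluates the path `/catalog/part//part` with a single SQL query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, tag TEXT);
    -- A recursive schema: a part may contain parts to any depth.
    INSERT INTO node VALUES
        (1, NULL, 'catalog'),
        (2, 1, 'part'),
        (3, 2, 'part'),
        (4, 3, 'part'),
        (5, 3, 'name'),
        (6, 1, 'name');
""")

# /catalog/part//part: the base case selects the children of each
# /catalog/part node; the recursive step walks down one level at a time.
# The recursive table is referenced once, so the recursion is linear.
query = """
WITH RECURSIVE descendants(id, tag) AS (
    SELECT c.id, c.tag
    FROM node r
    JOIN node p ON p.parent = r.id
    JOIN node c ON c.parent = p.id
    WHERE r.parent IS NULL AND r.tag = 'catalog' AND p.tag = 'part'
    UNION
    SELECT n.id, n.tag
    FROM node n
    JOIN descendants d ON n.parent = d.id
)
SELECT id FROM descendants WHERE tag = 'part' ORDER BY id;
"""
part_ids = [row[0] for row in conn.execute(query)]
```

Because the nesting depth of `part` elements is unbounded, no fixed number of joins could answer the `//` step; the recursive clause is what lets one query cover every depth.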
Extending XPath to support linguistic queries
- In Workshop on Programming Language Technologies for XML (PLAN-X
, 2005
Cited by 22 (3 self)
Linguistic research and language technology development employ large repositories of ordered trees. XML, a standard ordered tree model, and XPath, its associated language, are natural choices for linguistic data storage and queries. However, several important expressive features required for linguistic queries are missing in XPath. In this paper, we motivate and illustrate these features with a variety of linguistic queries. Then we define extensions to XPath which support linguistic tree queries, and describe an efficient query engine based on a novel labeling scheme. Experiments demonstrate that our language is not only sufficiently expressive for linguistic trees but also efficient for practical usage.
Dependencies: Making Ontology Based Data Access Work in Practice
Cited by 22 (9 self)
Query answering in Ontology Based Data Access (OBDA) exploits the knowledge of an ontology’s TBox to deal with incompleteness of the ABox (or data source). Current query-answering techniques with DL-Lite require exponential size query reformulations, or expensive data pre-processing, and hence may not be suitable for data intensive scenarios. Also, these techniques present severe redundancy issues when dealing with ABoxes that are already (partially) complete. It has been shown that addressing redundancy is not only required for tractable implementations of decision procedures, but may also allow for sizable improvements in execution times. Considering the previous observations, in this paper we present two complementary sets of results that aim at improving query answering performance in OBDA systems. First, we show that we can characterize completeness of an ABox by means of dependencies, and that we can use these to optimize DL-Lite TBoxes. Second, we show that in OBDA systems we can create ABox repositories that appear to be complete w.r.t. a significant portion of any DL-Lite TBox. The combination of these results allows us to design OBDA systems in which redundancy is minimal, the exponential aspect of query answering is notably reduced and that can be implemented efficiently using existing RDBMSs.
Native XQuery processing in Oracle XMLDB
- in: Proceedings of ACM SIGMOD International Conference on Management of Data, 2005
Cited by 20 (0 self)
With XQuery becoming the standard language for querying XML, and the relational SQL platform being recognized as an important platform to store and process XML, the SQL/XML standard is integrating XML query capability into the SQL system by introducing new SQL functions and constructs such as XMLQuery() and XMLTable. This paper discusses the Oracle XMLDB XQuery architecture for supporting XQuery in the Oracle ORDBMS kernel which has the XQuery processing tightly integrated with the SQL/XML engine using native XQuery compilation, optimization and execution techniques.
Relational Algebra: Mother Tongue—XQuery: Fluent
- In Proc. of the 1st Twente Data Management Workshop (TDM
, 2004
Cited by 17 (8 self)
This work may be seen as a further proof of the versatility of the relational database model. Here, we add XQuery to the catalog of languages which RDBMSs are able to "speak" fluently. Given suitable relational encodings of sequences and ordered, unranked trees (the two data structures that form the backbone of the XML and XQuery data models), we describe a compiler that translates XQuery expressions into a simple and quite standard relational algebra which we expect to be efficiently implementable on top of any relational query engine. The compilation procedure is fully compositional and emits algebraic code that strictly adheres to the XQuery language semantics: document and sequence order as well as node identity are obeyed. We exercise special care in translating arbitrarily nested XQuery FLWOR iteration constructs into equi-joins, an operation which RDBMSs can perform particularly fast. The resulting purely relational XQuery processor shows promising performance figures in experiments.