Results 1 - 10
of
154
Structural Properties of XPath Fragments
- IN ICDT
, 2003
"... We study structural properties of each of the main sublanguages of XPath [8] commonly used in practice. First, we characterize the expressive power of these language fragments in terms of both logics and tree patterns. Second, we investigate closure properties, focusing on the ability to perform bas ..."
Abstract
-
Cited by 80 (7 self)
- Add to MetaCart
We study structural properties of each of the main sublanguages of XPath [8] commonly used in practice. First, we characterize the expressive power of these language fragments in terms of both logics and tree patterns. Second, we investigate closure properties, focusing on the ability to perform basic Boolean operations while remaining within the fragment. We give a complete picture of the closure properties of these fragments, treating XPath expressions both as functions of arbitrary nodes in a document tree, and as functions that are applied only at the root of the tree. Finally, we provide sound and complete axiom systems and normal forms for several of these fragments. These results are useful for simplification of XPath expressions and optimization of XML queries.
The Complexity of XPath Query Evaluation
, 2003
"... In this paper, we study the precise complexity of XPath 1.0 query processing. Even though heavily used by its incorporation into a variety of XML-related standards, the precise cost of evaluating an XPath query is not yet wellunderstood. The first polynomial-time algorithm for XPath processing (with ..."
Abstract
-
Cited by 79 (5 self)
- Add to MetaCart
In this paper, we study the precise complexity of XPath 1.0 query processing. Even though heavily used by its incorporation into a variety of XML-related standards, the precise cost of evaluating an XPath query is not yet wellunderstood. The first polynomial-time algorithm for XPath processing (with respect to combined complexity) was proposed only recently, and even to this day all major XPath engines take time exponential in the size of the input queries. From the standpoint of theory, the precise complexity of XPath query evaluation is open, and it is thus unknown whether the query evaluation problem can be parallelized.
XPath satisfiability in the presence of DTDs
- In PODS ’05: Proceedings of the twenty-fourth ACM Symposium on Principles of Database Systems
, 2005
"... We study the satisfiability problem associated with XPath in the presence of DTDs. This is the problem of determining, given a query p in an XPath fragment and a DTD D, whether or not there exists an XML document T such that T conforms to D and the answer of p on T is nonempty. We consider a variety ..."
Abstract
-
Cited by 79 (5 self)
- Add to MetaCart
We study the satisfiability problem associated with XPath in the presence of DTDs. This is the problem of determining, given a query p in an XPath fragment and a DTD D, whether or not there exists an XML document T such that T conforms to D and the answer of p on T is nonempty. We consider a variety of XPath fragments widely used in practice, and investigate the impact of different XPath operators on satisfiability analysis. We first study the problem for negation-free XPath fragments with and without upward axes, recursion and data-value joins, identifying which factors lead to tractability and which to NP-completeness. We then turn to fragments with negation but without data values, establishing lower and upper bounds in the absence and in the presence of upward modalities and recursion. We show that with negation the complexity ranges from PSPACE to EXPTIME. Moreover, when both data values and negation are in place, we find that the complexity ranges from NEXPTIME to undecidable. Finally, we give a finer analysis of the problem for particular classes of DTDs, exploring the impact of various DTD constructs, identifying tractable cases, as well as providing the complexity in the query size alone. 1.
Containment and equivalence for a fragment of XPath
- Journal of the ACM
, 2004
"... Abstract. XPath is a language for navigating an XML document and selecting a set of element nodes. XPath expressions are used to query XML data, describe key constraints, express transformations, and reference elements in remote documents. This article studies the containment and equivalence problem ..."
Abstract
-
Cited by 74 (0 self)
- Add to MetaCart
Abstract. XPath is a language for navigating an XML document and selecting a set of element nodes. XPath expressions are used to query XML data, describe key constraints, express transformations, and reference elements in remote documents. This article studies the containment and equivalence problems for a fragment of the XPath query language, with applications in all these contexts. In particular, we study a class of XPath queries that contain branching, label wildcards and can express descendant relationships between nodes. Prior work has shown that languages that combine any two of these three features have efficient containment algorithms. However, we show that for the combination of features, containment is coNP-complete. We provide a sound and complete algorithm for containment that runs in exponential time, and study parameterized PTIME special cases. While we identify one parameterized class of queries for which containment can be decided efficiently, we also show that even with some bounded parameters, containment remains coNP-complete. In response to these negative results, we describe a sound algorithm that is efficient for all queries, but may return false negatives in some cases.
Monadic Queries over Tree-Structured Data
, 2002
"... Monadic query languages over trees currently receive considerable interest in the database community, as the problem of selecting nodes from a tree is the most basic and widespread database query problem in the context of XML. Partly a survey of recent work done by the authors and their group on log ..."
Abstract
-
Cited by 62 (7 self)
- Add to MetaCart
Monadic query languages over trees currently receive considerable interest in the database community, as the problem of selecting nodes from a tree is the most basic and widespread database query problem in the context of XML. Partly a survey of recent work done by the authors and their group on logical query languages for this problem and their expressiveness, this paper provides a number of new results related to the complexity of such languages over so-called axis relations (such as "child" or "descendant") which are motivated by their presence in the XPath standard or by their utility for data extraction (wrapping).
Secure XML Querying with Security Views
, 2004
"... The prevalent use of XML highlights the need for a generic, flexible access-control mechanism for XML documents that supports efficient and secure query access, without revealing sensitive information to unauthorized users. This paper introduces a novel paradigm for specifying XML security constrain ..."
Abstract
-
Cited by 62 (6 self)
- Add to MetaCart
The prevalent use of XML highlights the need for a generic, flexible access-control mechanism for XML documents that supports efficient and secure query access, without revealing sensitive information to unauthorized users. This paper introduces a novel paradigm for specifying XML security constraints and investigates the enforcement of such constraints during XML query evaluation. Our approach is based on the novel concept of security views, which provide for each user group (a) an XML view consisting of all and only the information that the users are authorized to access, and (b) a view DTD that the XML view conforms to. Security views effectively protect sensitive data from access and potential inferences by unauthorized users, and provide authorized users with necessary schema information to facilitate effective query formulation and optimization. We propose an efficient algorithm for deriving security view definitions from security policies (defined on the original document DTD) for different user groups. We also develop novel algorithms for XPath query rewriting and optimization such that queries over security views can be efficiently answered without materializing the views. Our algorithms transform a query over a security view to an equivalent query over the original document, and effectively prune query nodes by exploiting the structural properties of the document DTD in conjunction with approximate XPath containment tests. Our work is the first to study a flexible, DTD-based access-control model for XML and its implications on the XML query-execution engine. Furthermore, it is among the first efforts for query rewriting and optimization in the presence of general DTDs for a rich class of XPath queries. An empirical study based on real-life DTDs verifies the ef...
Dynamic XML Documents with Distribution and Replication
- In Proc. of ACM SIGMOD
, 2003
"... The advent of XML as a universal exchange format, and of Web services as a basis for distributed computing, has fostered the apparition of a new class of documents: dynamic XML documents. These are XML documents where some data is given explicitly while other parts are given only intensionally by me ..."
Abstract
-
Cited by 61 (9 self)
- Add to MetaCart
The advent of XML as a universal exchange format, and of Web services as a basis for distributed computing, has fostered the apparition of a new class of documents: dynamic XML documents. These are XML documents where some data is given explicitly while other parts are given only intensionally by means of embedded calls to web services that can be called to generate the required information. By the sole presence of Web services, dynamic documents already include inherently some form of distributed computation. A higher level of distribution that also allows (fragments of) dynamic documents to be distributed and/or replicated over several sites is highly desirable in today's Web architecture, and in fact is also relevant for regular (non dynamic) documents.
Conjunctive Queries over Trees
, 2004
"... We study the complexity and expressive power of conjunctive queries over unranked labeled trees, where the tree structures are represented using "axis relations" such as "child", "descendant", and "following" (we consider a superset of the XPath axes) as well as unary relations for node labels. (Cyc ..."
Abstract
-
Cited by 53 (7 self)
- Add to MetaCart
We study the complexity and expressive power of conjunctive queries over unranked labeled trees, where the tree structures are represented using "axis relations" such as "child", "descendant", and "following" (we consider a superset of the XPath axes) as well as unary relations for node labels. (Cyclic) conjunctive queries over trees occur in a wide range of data management scenarios related to XML, the Web, and computational linguistics. We establish a framework for characterizing structures representing trees for which conjunctive queries can be evaluated e#- ciently. Then we completely chart the tractability frontier of the problem for our axis relations, i.e., we find all subsetmaximal sets of axes for which query evaluation is in polynomial time. All polynomial-time results are obtained immediately using the proof techniques from our framework. Finally, we study the expressiveness of conjunctive queries over trees and compare it to the expressive power of fragments of XPath. We show that for each conjunctive query, there is an equivalent acyclic positive query (i.e., a set of acyclic conjunctive queries), but that in general this query is not of polynomial size.
XPath Query Evaluation: Improving Time and Space Efficiency
- IN PROCEEDINGS OF THE 19TH IEEE INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE’03
, 2003
"... Contemporary XPath query engines evaluate queries in time exponential in the sizes of input queries, a fact that has gone unnoticed for a long time. Recently, the first mainmemory evaluation algorithm for XPath 1.0 with polynomial time combined complexity, i.e., which runs in polynomial time both wi ..."
Abstract
-
Cited by 51 (5 self)
- Add to MetaCart
Contemporary XPath query engines evaluate queries in time exponential in the sizes of input queries, a fact that has gone unnoticed for a long time. Recently, the first mainmemory evaluation algorithm for XPath 1.0 with polynomial time combined complexity, i.e., which runs in polynomial time both with respect to the size of the data and the queries, has been published (cf. [11]). In this paper
Accelerating XPath Evaluation in any RDBMS
- ACM Transactions on Database Systems
, 2003
"... This article is a proposal for a database index structure, the XPath accelerator, that has been specifically designed to support the evaluation of XPath path expressions. As such, the index is capable to support all XPath axes (including ancestor, following, preceding-sibling, descendant-or-self, et ..."
Abstract
-
Cited by 50 (8 self)
- Add to MetaCart
This article is a proposal for a database index structure, the XPath accelerator, that has been specifically designed to support the evaluation of XPath path expressions. As such, the index is capable to support all XPath axes (including ancestor, following, preceding-sibling, descendant-or-self, etc.). This feature lets the index stand out among related work on XML indexing structures which had a focus on the child and descendant axes only. The index has been designed with a close eye on the XPath semantics as well as the desire to engineer its internals so that it can be supported well by existing relational database query processing technology: the index (a) permits set-oriented (or, rather, sequence-oriented) path evaluation, and (b) can be implemented and queried using well-established relational index structures, notably B-trees and R-trees

