Results 1 – 6 of 6
Semantics for Non-Monotone Queries in Data Exchange and Data Integration
Abstract

Cited by 1 (0 self)
A fundamental question in data exchange and data integration is how to answer queries that are posed against the target schema, or the global schema, respectively. While the certain answers semantics has proved to be adequate for answering monotone queries, the question concerning an appropriate semantics for non-monotone queries turned out to be more difficult. This article surveys approaches and semantics for answering non-monotone queries in data exchange and data integration.
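As a toy illustration of the certain answers semantics mentioned in this abstract (a hypothetical sketch, not code from the surveyed article): a tuple is a certain answer exactly when the query returns it over *every* possible target instance.

```python
# Hypothetical sketch: certain answers as the intersection of a query's
# results over all candidate target instances (possible worlds).
def certain_answers(query, solutions):
    """Keep only the tuples the query returns in every instance."""
    results = [set(query(inst)) for inst in solutions]
    return set.intersection(*results) if results else set()

# Two possible completions of an incomplete source: bob's department
# is unknown, so the two candidate instances disagree on it.
sol1 = {("alice", "sales"), ("bob", "sales")}
sol2 = {("alice", "sales"), ("bob", "hr")}

# Monotone query: who works in sales?
q = lambda inst: {name for (name, dept) in inst if dept == "sales"}

print(certain_answers(q, [sol1, sol2]))  # prints {'alice'}
```

For monotone queries this intersection semantics behaves well; for non-monotone queries (e.g. involving negation) it can yield counter-intuitive answers, which is precisely the difficulty the survey addresses.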
XML Schema Mappings: Data Exchange and Metadata Management
Abstract

Cited by 1 (0 self)
Relational schema mappings have been extensively studied in connection with data integration and exchange problems, but mappings between XML schemas have not received the same amount of attention. Our goal is to develop a theory of expressive XML schema mappings. Such mappings should be able to use various forms of navigation in a document, and specify conditions on data values. We develop a language for XML schema mappings, and study both data exchange with such mappings and metadata management problems. Specifically, we concentrate on four types of problems: complexity of mappings, query answering, consistency issues, and composition. We first analyze the complexity of mappings, i.e., recognizing pairs of documents such that one can be mapped into the other, and provide a classification based on sets of features used in mappings. Next, we chart the tractability frontier for the query answering problem. We show that the problem is tractable for expressive schema mappings and simple queries, but not vice versa. Then we move to static analysis. We study the complexity of the consistency problem, i.e., deciding whether it is possible to map some document of a source schema into a document of the target schema. Finally, we look at composition of XML schema mappings. We analyze its complexity and show that it is harder to achieve closure under composition for XML than for relational mappings. Nevertheless, we find a robust class of XML schema mappings that, in …
Solutions and Query Rewriting in Data Exchange
, 2013
Abstract
Data exchange is the problem of taking data structured under a source schema and creating an instance of a target schema. Given a source instance, there may be many solutions – target instances that satisfy the constraints of the data exchange problem. Previous work has identified two classes of desirable solutions: canonical universal solutions, and their cores. Query answering in data exchange amounts to rewriting a query over the target schema to another query that, over a materialized target instance, gives the result that is semantically consistent with the source (specifically, the “certain answers”). Basic questions are then: (1) how do these solutions compare in terms of query rewriting? and (2) how can we determine whether a query is rewritable over a particular solution? Our goal is to answer these questions. Our first main result is that, in terms of rewritability by relational algebra queries, the core is strictly less expressive than the canonical universal solution, which in turn is strictly less expressive than the source. To develop techniques for proving queries nonrewritable, we establish structural properties of solutions; in fact they are derived from the technical machinery developed in the rewritability proofs. Our second result is that both the canonical universal solution and the core preserve the local structure of the data, and that every target query …
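The relationship between a canonical universal solution and its core can be made concrete with a small brute-force sketch (hypothetical illustration, not the paper's algorithm): the core is the smallest subinstance onto which the instance maps homomorphically while fixing constants. Facts are tuples whose first component is the relation name, and labeled nulls are strings starting with `_`.

```python
from itertools import product

def is_null(t):
    return isinstance(t, str) and t.startswith("_")  # labeled nulls

def core(instance):
    """Shrink the instance by endomorphisms that fix constants, until no
    proper retraction exists (brute force, exponential in the nulls)."""
    inst = set(instance)
    while True:
        terms = sorted({t for f in inst for t in f[1:]})
        nulls = [t for t in terms if is_null(t)]
        for values in product(terms, repeat=len(nulls)):
            h = dict(zip(nulls, values))  # candidate mapping of the nulls
            image = {(f[0],) + tuple(h.get(t, t) for t in f[1:])
                     for f in inst}
            if image <= inst and image != inst:  # proper endomorphic image
                inst = image
                break
        else:
            return inst  # no proper retraction: this is the core

# A canonical universal solution with a labeled null _n1; mapping _n1 to
# the constant "sales" folds the redundant fact away.
canonical = {("Emp", "alice", "_n1"), ("Emp", "alice", "sales")}
print(core(canonical))  # prints {('Emp', 'alice', 'sales')}
```

The sketch makes the paper's setting tangible: the core is smaller (here, one fact instead of two), but, as the abstract's first main result states, it is also strictly less expressive as a basis for query rewriting.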
Optimizing the Chase: Scalable Data Integration under Constraints
Abstract
We are interested in scalable data integration and data exchange under constraints/dependencies. In data exchange the problem is how to materialize a target database instance, satisfying the source-to-target and target dependencies, that provides the certain answers. In data integration, the problem is how to rewrite a query over the target schema into a query over the source schemas that provides the certain answers. In both these problems we make use of the chase algorithm, the main tool to reason with dependencies. Our first contribution is to introduce the frugal chase, which produces smaller universal solutions than the standard chase, still remaining polynomial in data complexity. Our second contribution is to use the frugal chase to scale up query answering using views under LAV weakly acyclic target constraints, a useful language capturing RDF/S. The latter problem can be reduced to query rewriting using views without constraints by chasing the source-to-target mappings with the target constraints. We construct a compact graph-based representation of the mappings and the constraints and develop an efficient algorithm to run the frugal chase on this representation. We show experimentally that our approach scales to large problems, speeding up the compilation of the dependencies into the mappings by close to 2 and 3 orders of magnitude, compared to the standard and the core chase, respectively. Compared to the standard chase, we improve online query rewriting time by a factor of 3, while producing equivalent, but smaller, rewritings of the original query.
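To make the role of the chase concrete, here is a minimal restricted-chase sketch for tuple-generating dependencies (a hypothetical illustration; the paper's frugal chase refines this by reusing existing facts more aggressively). Atoms are tuples with the relation name first; variables are uppercase-initial identifiers, and existential head variables become fresh labeled nulls.

```python
import itertools

_fresh = itertools.count()

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()  # e.g. "X", "D"

def match(atoms, inst, subst):
    """Yield every substitution extending `subst` that maps the atoms
    (relation name first, then arguments) onto facts of the instance."""
    if not atoms:
        yield dict(subst)
        return
    (rel, *args), rest = atoms[0], atoms[1:]
    for fact in inst:
        if fact[0] != rel or len(fact) != len(args) + 1:
            continue
        s, ok = dict(subst), True
        for a, v in zip(args, fact[1:]):
            if is_var(a):
                if s.setdefault(a, v) != v:
                    ok = False
                    break
            elif a != v:
                ok = False
                break
        if ok:
            yield from match(rest, inst, s)

def chase(instance, tgds, max_rounds=100):
    """Restricted chase: fire a TGD (body, head) only when the head is
    not already satisfied; unmapped head variables get fresh nulls."""
    inst = set(instance)
    for _ in range(max_rounds):
        fired = False
        for body, head in tgds:
            for s in list(match(body, inst, {})):
                if next(match(head, inst, s), None) is not None:
                    continue  # head already satisfied: do not fire
                for a in {a for atom in head for a in atom[1:]}:
                    if is_var(a) and a not in s:
                        s[a] = f"_n{next(_fresh)}"  # fresh labeled null
                for atom in head:
                    inst.add((atom[0],)
                             + tuple(s.get(a, a) for a in atom[1:]))
                fired = True
        if not fired:
            break
    return inst

# Source-to-target TGD: Emp(X) -> exists D. Dept(X, D)
tgds = [([("Emp", "X")], [("Dept", "X", "D")])]
result = chase({("Emp", "alice")}, tgds)
print(result)  # e.g. {('Emp', 'alice'), ('Dept', 'alice', '_n0')}
```

The frugal chase of the paper improves on this baseline by producing smaller universal solutions while keeping data complexity polynomial.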
Computing universal models …
, 2012
Abstract
A universal model of a database D and a set Σ of integrity constraints is a database that extends D, satisfies Σ, and is most general in the sense that it contains sound and complete information. Universal models have a number of applications including answering conjunctive queries, and deciding containment of conjunctive queries, with respect to databases with integrity constraints. Furthermore, they are used in slightly modified form as solutions in data exchange. In general, it is undecidable whether a database possesses a universal model, but in the past few years researchers identified various settings where this problem is decidable, and even efficiently solvable. This paper focuses on computing universal models under finite sets of guarded TGDs, non-conflicting keys, and negative constraints. Such constraints generalize inclusion dependencies, and were recently shown to be expressive enough to capture certain members of the DL-Lite family of description logics. The main result is an algorithm that, given a database without null values and a finite set Σ of such constraints, decides whether there is a universal model, and if so, outputs such a model. If Σ is fixed, the algorithm runs in polynomial time. The algorithm can be extended to cope with databases containing nulls; however, in this case, polynomial running time can be guaranteed only for databases with bounded block size.
Representing and Querying Incomplete Information: a Data Interoperability Perspective
, 2014
Abstract
This thesis is intended to be a succinct and rather informal presentation of some of my most recent work, which has been done in collaboration with several other people. In particular this thesis concentrates on our contributions to the study of incomplete information in the context of data interoperability. In this scenario data is heterogeneous and decentralized, and needs to be integrated from several sources and exchanged between different applications. Incompleteness, i.e., the presence of “missing” or “unknown” portions of data, is naturally generated in data exchange and integration, due to data heterogeneity. The management of incomplete information poses new challenges in this context. The focus of our study is the development of models of incomplete information suitable to data interoperability tasks, and the study of techniques for efficiently querying several forms of incompleteness. The work presented in Chapter 4 is ongoing in the context of Nadime Francis's PhD, whom I am co-supervising together with Luc Segoufin.