Quality-aware Service-Oriented Data Integration: Requirements, State of the Art and Open Challenges
Cited by 7 (0 self)
With a multitude of data sources available online, data consumers might find it hard to select the best combination of sources for their needs. Aspects such as price, licensing, service and data quality play a major role in selecting data sources. We therefore advocate quality-aware data services as a natural data source model for complex data integration tasks and mash-ups. This paper focuses on requirements, state of the art, and the main research challenges on the way to the realization of such services.
Multidimensional Contexts for Data Quality Assessment
Cited by 3 (3 self)
The notion of data quality cannot be separated from the context in which the data is produced or used. Recently, a conceptual framework for capturing context-dependent data quality assessment has been proposed. According to it, a database D is assessed with respect to a context which is modeled as an external system containing additional data, metadata, and definitions of quality predicates. The instance D is “put in context” via schema mappings; and after contextual processing of the data, a collection of alternative clean versions D′ of D is produced. The quality of D is measured in terms of its distance to this class. In this work we extend contexts for data quality assessment by including multidimensional data, which allows data to be analyzed from multiple perspectives and at different degrees of granularity. It is possible to navigate through dimensional hierarchies in order to reach the data that is needed for quality assessment. More precisely, we introduce contextual hierarchies as components of contexts for data quality assessment. The resulting contexts are later represented as ontologies written in description logic.
Generic and Declarative Approaches to Data Cleaning: Some Recent Developments
Cited by 2 (0 self)
Data assessment and data cleaning tasks have traditionally been addressed through procedural solutions. Most of the time, those solutions have been applicable to specific problems and domains. In the last few years we have seen the emergence of more generic solutions; and also of declarative and rule-based specifications of the intended solutions of data cleaning processes. In this chapter we review some of those recent developments.
Contexts and Data Quality Assessment
Cited by 1 (1 self)
The quality of data is context dependent. Starting from this intuition and experience, we propose and develop a conceptual framework that captures in formal terms the notion of context-dependent data quality. We start by proposing a generic and abstract notion of context, and also of its uses, in general and in data management in particular. On this basis, we investigate data quality assessment and quality query answering as context-dependent activities. A context for the assessment of a database D at hand is modeled as an external database schema, with possibly materialized or virtual data, and connections to external data sources. The database D is put in context via mappings to the contextual schema, which produces a collection C of alternative clean versions of D. The quality of D is measured in terms of its distance to C. The class C is also used to define and do quality query answering. The proposed model allows for natural extensions, like the use of data quality predicates, the optimization of the access by the context to external data sources, and also the representation of contexts by means of more expressive ontologies.
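As a rough illustration of the distance-based measure described in this abstract, the sketch below treats a database instance as a set of tuples and scores D by its smallest symmetric difference to any clean version in C. The abstract does not say which distance is used; the symmetric-difference count, the normalization, and the names `distance_to_class` and `quality_score` are assumptions made for this example only.

```python
from typing import FrozenSet, Iterable, Tuple

Row = Tuple[str, ...]
Instance = FrozenSet[Row]


def distance_to_class(d: Instance, clean_versions: Iterable[Instance]) -> int:
    """Smallest symmetric-difference size between d and any clean version.

    Assumption: set-based symmetric difference stands in for the
    (unspecified) distance between D and the class C.
    """
    versions = list(clean_versions)
    if not versions:
        raise ValueError("the class of clean versions must not be empty")
    return min(len(d ^ c) for c in versions)


def quality_score(d: Instance, clean_versions: Iterable[Instance]) -> float:
    """Map the distance into [0, 1]; 1.0 means d is itself a clean version."""
    versions = list(clean_versions)
    dist = distance_to_class(d, versions)
    # The largest union of d with a clean version is an upper bound on dist.
    largest = max(len(d | c) for c in versions)
    return 1.0 if largest == 0 else 1.0 - dist / largest


# Hypothetical instance D and two alternative clean versions of it.
d = frozenset({("t1", "37.2"), ("t2", "41.0"), ("t3", "36.8")})
c1 = frozenset({("t1", "37.2"), ("t3", "36.8")})
c2 = frozenset({("t1", "37.2")})

dist = distance_to_class(d, [c1, c2])
score = quality_score(d, [c1, c2])
```

Here the closest clean version is `c1`, which differs from `d` by a single tuple, so a lower distance (and a score nearer 1.0) indicates an instance that needs less repair.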
Detecting the Temporal Context of Queries
Cited by 1 (1 self)
Business intelligence and reporting tools rely on a database that accurately mirrors the state of the world. Yet, even if the schema and queries are constructed in exacting detail, assumptions about the data made during extraction, transformation, and schema and query creation of the reporting database may be (accidentally) ignored by end users, or may change as the database evolves over time. As these assumptions are typically implicit (e.g., assuming that a sales record relation is append-only), it can be hard to even detect that a mistaken assumption has been made. In this paper, we argue that such errors are consequences of unintended contextual dependence, i.e., query outputs dependent on a variable characteristic of the database. We characterize contextual dependence, and explore several strategies for efficiently detecting and quantifying the effects of contextual dependence on query outputs. We present and evaluate our findings in the context of a concrete case study: detecting temporal dependence using a database management system with versioning capabilities.
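One simple way to picture temporal dependence is to evaluate the same query against several versions of a database and check whether the result changes. The sketch below does this over in-memory snapshots. It is not the paper's method; the snapshot representation, the `detect_temporal_dependence` helper, and the sample query and data are hypothetical.

```python
from typing import Any, Callable, Dict, Hashable, List, Sequence

Row = Dict[str, Any]
Snapshot = Sequence[Row]


def detect_temporal_dependence(
    query: Callable[[Snapshot], Hashable],
    versions: Dict[str, Snapshot],
) -> Dict[Hashable, List[str]]:
    """Group version labels by the query result they produce.

    More than one group means the query output depends on which
    version of the database it was evaluated against.
    """
    groups: Dict[Hashable, List[str]] = {}
    for label, snapshot in versions.items():
        groups.setdefault(query(snapshot), []).append(label)
    return groups


def january_total(snapshot: Snapshot) -> float:
    """Example query: total sales amount recorded for January 2024."""
    return sum(float(r["amount"]) for r in snapshot if r["month"] == "2024-01")


# The sales relation is assumed to be append-only, so the January total
# should be stable. Version v3 breaks that assumption by revising a record.
versions = {
    "v1": [{"month": "2024-01", "amount": 100.0}],
    "v2": [{"month": "2024-01", "amount": 100.0},
           {"month": "2024-02", "amount": 50.0}],
    "v3": [{"month": "2024-01", "amount": 80.0},
           {"month": "2024-02", "amount": 50.0}],
}

groups = detect_temporal_dependence(january_total, versions)
is_dependent = len(groups) > 1
```

Re-running a query on every version is the brute-force baseline; it makes the dependence visible but says nothing about how to detect it efficiently, which is the question the abstract raises.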
PAY-AS-YOU-GO DATA QUALITY IMPROVEMENT FOR MEDICAL CENTERS
Medical practitioners often link their databases to support new use cases of the medical sector, e.g., in economic planning or treatment coordination. Data quality requirements for these use cases differ from the original requirements on the databases. We argue that any system seeking to support data quality in this scenario requires significant evolutionary power. We suggest an approach to continuously improve data quality which scales with arising requirements in a pay-as-you-go manner.
Extending Contexts with Ontologies for Multidimensional Data Quality Assessment
Data quality and data cleaning are context-dependent activities. Starting from this observation, in previous work a context model for the assessment of the quality of a database instance was proposed. In that framework, the context takes the form of a possibly virtual database or data integration system into which a database instance under quality assessment is mapped, for additional analysis and processing, enabling quality assessment. In this work we extend contexts with dimensions, and by doing so, we make possible a multidimensional assessment of data quality. Multidimensional contexts are represented as ontologies written in Datalog. We use this language for representing dimensional constraints and dimensional rules, and also for doing query answering based on dimensional navigation, which becomes an important auxiliary activity in the assessment of data. We show ideas and mechanisms by means of examples.
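To make dimensional navigation concrete, the sketch below rolls a value up a small hierarchy by following child-to-parent links, which is the kind of step a dimensional rule would express. The paper uses Datalog; this Python version only mimics the idea, and the hierarchy (ward, unit, hospital), the member names, the quality condition, and the `roll_up` function are all invented for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical hierarchy: each member maps to (parent member, parent level).
PARENT: Dict[str, Tuple[str, str]] = {
    "W1": ("Standard", "unit"),
    "W2": ("Standard", "unit"),
    "W3": ("Intensive", "unit"),
    "Standard": ("H1", "hospital"),
    "Intensive": ("H1", "hospital"),
}


def roll_up(member: str, target_level: str) -> Optional[str]:
    """Follow parent links from member until target_level is reached.

    Returns None when the member has no ancestor at that level.
    """
    current = member
    while current in PARENT:
        current, level = PARENT[current]
        if level == target_level:
            return current
    return None


# Assumed quality condition: a measurement is acceptable only if it was
# taken in a ward that belongs to the Standard unit.
measurements: List[Tuple[str, float]] = [("W1", 37.2), ("W3", 38.5), ("W2", 36.9)]
acceptable = [m for m in measurements if roll_up(m[0], "unit") == "Standard"]
```

The table under assessment records only wards; the unit needed by the quality condition is reached by navigating upward in the dimension, which is why navigation acts as an auxiliary step in the assessment.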
Problems and Opportunities in Context-Based Personalization
In a world of global networking, the increasing amount of heterogeneous information, available through a variety of channels, has made it difficult for users to find the information they need in the current situation, at the right level of detail. This is true not only when accessing information from mobile devices, characterized by limited – although growing – resources and by high connection costs, but also when using powerful systems, since the amount of “out-of-context” answers to a given user request may be overwhelming. The knowledge of the context in which the data are going to be used can support the process of focussing on currently useful, personalized information. The activity needed for context-aware information personalization provides material for stimulating research, briefly illustrated in this paper.