Integrated access to information that is spread over multiple, distributed, and heterogeneous sources is an important problem in many scientific and commercial domains. Typically there are many ways to answer a global query, using data from different sources in different combinations, but in general it is prohibitively expensive to obtain all answers. While much work has been done on query processing and on choosing plans under cost criteria, very little is known about incorporating information quality into query planning. In this paper we describe a framework for multidatabase query processing that accounts for information quality along many dimensions, such as completeness, timeliness, and accuracy. We integrate information quality into a multidatabase query processor based on a view-rewriting mechanism, and we model it at three levels. First, we perform a quality-driven source selection and continue only with the best sources. Second, we compute the query-dependent information quality of the view definitions that describe the content of the sources. Finally, we determine the overall quality of plan alternatives by aggregating these information quality scores, which yields a set of high-quality query-answering plans.
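The source-selection and plan-ranking steps named above can be illustrated with a small sketch. It is not the method of the paper: the source names, the quality scores, the weights, the weighted-sum scoring, and the merge rules (product for completeness and accuracy, minimum for timeliness) are all assumptions chosen for illustration, and the query-dependent scoring of view definitions is left out entirely.

```python
from math import prod

# Hypothetical per-source quality scores in [0, 1]; all values are made up.
SOURCES = {
    "S1": {"completeness": 0.9, "timeliness": 0.8, "accuracy": 0.7},
    "S2": {"completeness": 0.4, "timeliness": 0.9, "accuracy": 0.5},
    "S3": {"completeness": 0.7, "timeliness": 0.6, "accuracy": 0.9},
}

# Hypothetical user weights for the criteria; they sum to 1.
WEIGHTS = {"completeness": 0.5, "timeliness": 0.2, "accuracy": 0.3}


def weighted_score(quality, weights=WEIGHTS):
    """Collapse a criteria vector into one number (assumed: weighted sum)."""
    return sum(weights[c] * quality[c] for c in weights)


def select_sources(sources, k):
    """Source selection: keep the k sources with the highest weighted score."""
    ranked = sorted(sources, key=lambda s: weighted_score(sources[s]), reverse=True)
    return ranked[:k]


def plan_quality(plan, sources):
    """Aggregate the scores of the sources used by a plan (assumed merge rules)."""
    return {
        "completeness": prod(sources[s]["completeness"] for s in plan),
        "timeliness": min(sources[s]["timeliness"] for s in plan),
        "accuracy": prod(sources[s]["accuracy"] for s in plan),
    }


def rank_plans(plans, sources):
    """Order plan alternatives by the weighted score of their aggregated quality."""
    return sorted(
        plans,
        key=lambda p: weighted_score(plan_quality(p, sources)),
        reverse=True,
    )


# Keep the two best sources, then rank three hypothetical plans built from them.
kept = select_sources(SOURCES, 2)
plans = [(kept[0],), (kept[1],), tuple(kept)]
ranked = rank_plans(plans, SOURCES)
print(kept)
print(ranked)
```

Under these assumed merge rules a plan that combines two sources can score lower than either source alone, since products and minimums never increase. A real aggregation would depend on how the plan combines its sources.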