Results 1 - 10 of 216
Querying Heterogeneous Information Sources Using Source Descriptions
, 1996
"... We witness a rapid increase in the number of structured information sources that are available online, especially on the WWW. These sources include commercial databases on product information, stock market information, real estate, automobiles, and entertainment. We would like to use the data stored ..."
Abstract
-
Cited by 724 (34 self)
- Add to MetaCart
We witness a rapid increase in the number of structured information sources that are available online, especially on the WWW. These sources include commercial databases on product information, stock market information, real estate, automobiles, and entertainment. We would like to use the data stored in these databases to answer complex queries that go beyond keyword searches. We face the following challenges: (1) Several information sources store interrelated data, and any query-answering system must understand the relationships between their contents. (2) Many sources are not full-featured database systems and can answer only a small set of queries over their data (for example, forms on the WWW restrict the set of queries one can ask). (3) Since the number of sources is very large, effective techniques are needed to prune the set of information sources accessed to answer a query. (4) The details of interacting with each source vary greatly. We describe the Information Manifold, an imp...
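To make the source-description idea concrete, here is a minimal sketch (my own illustration, not the Information Manifold's code; all source names, relations, and attributes are hypothetical) of how declared coverage and form-input restrictions can prune the set of sources consulted for a query:

```python
# Minimal sketch of source descriptions with capability restrictions.
# All source names, relations, and attributes are hypothetical.
from dataclasses import dataclass

@dataclass
class SourceDescription:
    name: str
    relations: frozenset                      # mediated-schema relations covered
    required_inputs: frozenset = frozenset()  # attributes a WWW form demands

SOURCES = [
    SourceDescription("cars-r-us", frozenset({"Car", "Price"}),
                      required_inputs=frozenset({"model"})),
    SourceDescription("stock-feed", frozenset({"Quote"})),
    SourceDescription("homes-db", frozenset({"House", "Price"})),
]

def relevant_sources(query_relations, bound_attrs):
    """Challenge (3): keep only sources that mention a queried relation.
    Challenge (2): drop sources whose required form inputs are not bound."""
    return [s.name for s in SOURCES
            if s.relations & query_relations
            and s.required_inputs <= bound_attrs]

# A query over Car and Price where the user supplies a model name:
print(relevant_sources({"Car", "Price"}, {"model"}))
# -> ['cars-r-us', 'homes-db']; a real planner would next check join
#    compatibility before contacting homes-db at all.
```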
Answering queries using views
- In PODS Conference
, 1995
"... We consider the problem of computing answers to queries by using materialized views. Aside from its potential in optimizing query evaluation, the problem also arises in applications such as Global Information Systems, Mobile Computing and maintaining physical data independence. We consider the probl ..."
Abstract
-
Cited by 447 (32 self)
- Add to MetaCart
(Show Context)
We consider the problem of computing answers to queries by using materialized views. Aside from its potential in optimizing query evaluation, the problem also arises in applications such as Global Information Systems, Mobile Computing and maintaining physical data independence. We consider the problem of finding a rewriting of a query that uses the materialized views, the problem of finding minimal rewritings, and finding complete rewritings (i.e., rewritings that use only the views). We show that all the possible rewritings can be obtained by considering containment mappings from the views to the query, and that the problems we consider are NP-complete when both the query and the views are conjunctive and don't involve built-in comparison predicates. We show that the problem has two independent sources of complexity (the number of possible containment mappings, and the complexity of deciding which literals from the original query can be deleted). We describe a polynomial time algorithm for finding rewritings, and show that under certain conditions, it will find the minimal rewriting. Finally, we analyze the complexity of the problems when the queries and views may be disjunctive and involve built-in comparison predicates.
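The core construction is easy to see on a toy case. Below is a small sketch (my own, far simpler than the paper's algorithms and without the minimality analysis) that enumerates containment mappings from a view's body onto a query's body, which is where candidate rewritings come from:

```python
# Toy enumeration of containment mappings from a view body onto a query body.
# Atoms are (predicate, (variables...)) pairs; this is my own sketch.
def containment_mappings(view_atoms, query_atoms):
    def unify(va, qa, h):
        if va[0] != qa[0] or len(va[1]) != len(qa[1]):
            return None
        h = dict(h)
        for v, q in zip(va[1], qa[1]):
            if h.setdefault(v, q) != q:   # same view variable, two targets
                return None
        return h

    def extend(remaining, h):
        if not remaining:
            yield h
            return
        for qa in query_atoms:
            h2 = unify(remaining[0], qa, h)
            if h2 is not None:
                yield from extend(remaining[1:], h2)

    yield from extend(list(view_atoms), {})

# Query q(x,z) :- edge(x,y), edge(y,z)
# View  v(a,c) :- edge(a,b), edge(b,c)
query = [("edge", ("x", "y")), ("edge", ("y", "z"))]
view  = [("edge", ("a", "b")), ("edge", ("b", "c"))]
print(list(containment_mappings(view, query)))
# -> [{'a': 'x', 'b': 'y', 'c': 'z'}], so the view head v(a,c) instantiates
#    to v(x,z) and the query can be rewritten as q(x,z) :- v(x,z).
```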
A Scalable Comparison-Shopping Agent for the World-Wide Web
- In Proceedings of the First International Conference on Autonomous Agents
, 1997
"... The Web is less agent-friendly than we might hope. Most information on the Web is presented in loosely structured natural language text with no agent-readable semantics. HTML annotations structure the display of Web pages, but provide virtually no insight into their content. Thus, the designers of i ..."
Abstract
-
Cited by 327 (19 self)
- Add to MetaCart
The Web is less agent-friendly than we might hope. Most information on the Web is presented in loosely structured natural language text with no agent-readable semantics. HTML annotations structure the display of Web pages, but provide virtually no insight into their content. Thus, the designers of intelligent Web agents need to address the following questions: (1) To what extent can an agent understand information published at Web sites? (2) Is the agent's understanding sufficient to provide genuinely useful assistance to users? (3) Is site-specific hand-coding necessary, or can the agent automatically extract information from unfamiliar Web sites? (4) What aspects of the Web facilitate this competence? In this paper we investigate these issues with a case study using the ShopBot. ShopBot is a fully implemented, domain-independent comparison-shopping agent. Given the home pages of several on-line stores, ShopBot autonomously learns how to shop at those vendors. After its learning is com...
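The learning step can be pictured with a drastically simplified sketch. The following toy code (my own, in the spirit of ShopBot's template induction rather than its actual procedure; the pages and prices are invented) finds the HTML context that consistently surrounds known values in training pages:

```python
# Toy template induction: given pages for products whose prices are already
# known, find the text that consistently surrounds the price. Not ShopBot's code.
def common_prefix(strings):
    out = []
    for chars in zip(*strings):
        if len(set(chars)) != 1:
            break
        out.append(chars[0])
    return "".join(out)

def common_suffix(strings):
    return common_prefix([s[::-1] for s in strings])[::-1]

def surrounding_pattern(pages, known_values, width=12):
    """Return the (prefix, suffix) shared by every known value's occurrence."""
    before, after = [], []
    for page, value in zip(pages, known_values):
        i = page.find(value)
        if i < 0:
            return None                       # training example doesn't match
        before.append(page[max(0, i - width):i])
        after.append(page[i + len(value):i + len(value) + width])
    return common_suffix(before), common_prefix(after)

pages = ["<b>Price: $19.99</b> In stock", "<b>Price: $5.00</b> In stock"]
print(surrounding_pattern(pages, ["19.99", "5.00"]))
# -> ('<b>Price: $', '</b> In stoc')   (suffix clipped to width=12);
#    this pair then serves as an extraction pattern on unseen product pages.
```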
OBSERVER: An Approach for Query Processing in Global Information Systems based on Interoperation across Pre-existing Ontologies
, 1996
"... The huge number of autonomousand heterogeneous data repositories accessible on the “global information infrastructure” makes it impossible for users to be aware of the locations, structure/organization, query languages and semantics of the data in various repositories. There is a critical need to co ..."
Abstract
-
Cited by 295 (36 self)
- Add to MetaCart
The huge number of autonomous and heterogeneous data repositories accessible on the “global information infrastructure” makes it impossible for users to be aware of the locations, structure/organization, query languages and semantics of the data in various repositories. There is a critical need to complement current browsing, navigational and information retrieval techniques with a strategy that focuses on information content and semantics. In any strategy that focuses on information content, the most critical problem is that of different vocabularies used to describe similar information across domains. We discuss a scalable approach for vocabulary sharing. The objects in the repositories are represented as intensional descriptions by pre-existing ontologies expressed in Description Logics characterizing information in different domains. User queries are rewritten by using interontology relationships to obtain semantics-preserving translations across the ontologies.
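As a toy picture of the rewriting step (my sketch; OBSERVER's actual machinery works over Description Logic descriptions, and the mapping table and ontology names below are invented):

```python
# Term-by-term query translation across ontologies via synonym relationships.
# My sketch; the mapping table and ontology names are hypothetical.
SYNONYMS = {
    ("bib-onto", "author"): ("library-onto", "creator"),
    ("bib-onto", "title"):  ("library-onto", "name"),
}

def translate(query_terms, src_onto, dst_onto):
    out = []
    for term in query_terms:
        target = SYNONYMS.get((src_onto, term))
        if target is None or target[0] != dst_onto:
            # A fuller system might fall back to broader/narrower terms;
            # this sketch simply refuses rather than lose semantics.
            raise ValueError(f"no semantics-preserving translation for {term!r}")
        out.append(target[1])
    return out

print(translate(["author", "title"], "bib-onto", "library-onto"))
# -> ['creator', 'name']
```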
Query Reformulation for Dynamic Information Integration
- JOURNAL OF INTELLIGENT INFORMATION SYSTEMS
, 1996
"... The standard approach to integrating heterogeneous information sources is to build a global schema that relates all of the information in the different sources, and to pose queries directly against it. The problem is that schema integration is usually difficult, and as soon as any of the information ..."
Abstract
-
Cited by 274 (32 self)
- Add to MetaCart
The standard approach to integrating heterogeneous information sources is to build a global schema that relates all of the information in the different sources, and to pose queries directly against it. The problem is that schema integration is usually difficult, and as soon as any of the information sources change or a new source is added, the process may have to be repeated. The SIMS system uses an alternative approach. A domain model of the application domain is created, establishing a fixed vocabulary for describing data sets in the domain. Using this language, each available information source is described. Queries to SIMS against the collection of available information sources are posed using terms from the domain model, and reformulation operators are employed to dynamically select an appropriate set of information sources and to determine how to integrate the available information to satisfy a query. This approach results in a system that is more flexible than existing ones, more easily scalable, and able to respond dynamically to newly available or unexpectedly missing information sources.
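A compressed sketch of the reformulation idea (mine, not SIMS's operator set; the domain terms and source names are invented) shows how source selection can happen at query time rather than through a fixed global schema:

```python
# Rough sketch: replace domain-model terms with concrete source relations,
# choosing sources at query time. My illustration, not SIMS's operators.
DOMAIN_TO_SOURCES = {   # hypothetical mapping derived from source descriptions
    "Airport": ["geo_db.airports", "faa_feed.airfields"],
    "Seaport": ["geo_db.seaports"],
}

def reformulate(query_terms, available):
    plan = {}
    for term in query_terms:
        candidates = [s for s in DOMAIN_TO_SOURCES.get(term, [])
                      if s in available]
        if not candidates:
            raise LookupError(f"no available source covers {term!r}")
        plan[term] = candidates[0]
    return plan

# If geo_db goes down, the same query is re-planned against faa_feed:
print(reformulate(["Airport"], available={"faa_feed.airfields"}))
# -> {'Airport': 'faa_feed.airfields'}
```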
On the Decidability of Query Containment under Constraints
"... Query containment under constraints is the problem of checking whether for every database satisfying a given set of constraints, the result of one query is a subset of the result of another query. Recent research points out that this is a central problem in several database applications, and we addr ..."
Abstract
-
Cited by 256 (56 self)
- Add to MetaCart
Query containment under constraints is the problem of checking whether for every database satisfying a given set of constraints, the result of one query is a subset of the result of another query. Recent research points out that this is a central problem in several database applications, and we address it within a setting where constraints are specified in the form of special inclusion dependencies over complex expressions, built by using intersection and difference of relations, special forms of quantification, regular expressions over binary relations, and cardinality constraints. These types of constraints capture a great variety of data models, including the relational, the entity-relationship, and the object-oriented model. We study the problem of checking whether q is contained in q′ with respect to the constraints specified in a schema S, where q and q′ are nonrecursive Datalog programs whose atoms are complex expressions. We present the following results on query containment. For the case where q does not contain regular expressions, we provide a method for deciding query containment, and analyze its computational complexity. We do the same for the case where neither S nor q, q′ contain number restrictions. To the best of our knowledge, this yields the first decidability result on containment of conjunctive queries with regular expressions. Finally, we prove that the problem is undecidable for the case where we admit inequalities in q′.
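As a concrete instance of the problem statement (my own worked example, not one from the paper), an inclusion dependency can make a containment hold that fails without it:

```latex
% My own worked instance of containment under constraints.
\[
  S \;=\; \{\, \mathit{Manager} \subseteq \mathit{Employee} \,\}
\]
\[
  q(x) \;\leftarrow\; \mathit{Manager}(x) \wedge \mathit{WorksIn}(x,y)
  \qquad
  q'(x) \;\leftarrow\; \mathit{Employee}(x)
\]
% Without S, q is not contained in q': a database may list a manager who is
% not recorded as an employee. Under S, every answer x to q satisfies
% Manager(x) and hence Employee(x), so q is contained in q' w.r.t. S.
```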
Context Interchange: New Features and Formalisms for the Intelligent Integration of Information
- ACM TOIS
, 1999
"... The Context Interchange strategy presents a novel perspective for mediated data access in which semantic conflicts among heterogeneous systems are not identified a priori, but are detected and reconciled by a context mediator through comparison of contexts axioms corresponding to the systems engaged ..."
Abstract
-
Cited by 238 (96 self)
- Add to MetaCart
The Context Interchange strategy presents a novel perspective for mediated data access in which semantic conflicts among heterogeneous systems are not identified a priori, but are detected and reconciled by a context mediator through comparison of context axioms corresponding to the systems engaged in data exchange. In this article, we show that queries formulated on shared views, export schemas, and shared “ontologies” can be mediated in the same way using the Context Interchange framework. The proposed framework provides a logic-based object-oriented formalism for representing and reasoning about data semantics in disparate systems, and has been validated in a prototype implementation providing mediated data access to both traditional and web-based information sources. Categories and Subject Descriptors: H.2.4 [Database Management]: Systems—Query processing; H.2.5 [Database Management]: Heterogeneous Databases—Data translation
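A cartoon of the mediation step (my sketch; the paper's formalism is logic-based rather than procedural, and the context axioms, source names, and scale conventions below are invented):

```python
# Context mediation in miniature: convert a value only as dictated by the
# source's and receiver's context axioms. My sketch; all conventions invented.
CONTEXTS = {
    "stock_source": {"currency": "USD", "scale": 1000},  # reports in thousands
    "user":         {"currency": "USD", "scale": 1},
}
RATES = {("USD", "USD"): 1.0}

def mediate(value, src, dst):
    s, d = CONTEXTS[src], CONTEXTS[dst]
    value *= s["scale"]                                  # undo source scaling
    value *= RATES[(s["currency"], d["currency"])]       # currency conversion
    return value / d["scale"]                            # apply receiver scaling

# The source reports revenue = 5200, meaning $5,200,000; the user sees:
print(mediate(5200, "stock_source", "user"))             # -> 5200000.0
```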
Scaling Heterogeneous Databases and the Design of Disco
, 1995
"... Access to large numbers of data sources introduces new problems for users of heterogeneous distributed databases. End users and application programmers must deal with unavailable data sources. Database administrators must deal with incorporating each new data source into the system. Database impleme ..."
Abstract
-
Cited by 146 (15 self)
- Add to MetaCart
(Show Context)
Access to large numbers of data sources introduces new problems for users of heterogeneous distributed databases. End users and application programmers must deal with unavailable data sources. Database administrators must deal with incorporating each new data source into the system. Database implementors must deal with the transformation of queries between query languages and schemas. The Distributed Information Search COmponent (DISCO) addresses these problems. Query processing semantics give meaning to queries that reference unavailable data sources. Data modeling techniques manage connections to data sources. The component interface to data sources flexibly handles different query languages and different interface functionalities. This paper describes in detail (a) the distributed mediator architecture of DISCO, (b) its query processing semantics, (c) the data model and its modeling of data source connections, and (d) the interface to underlying data sources. We describe several advantages of our system and describe the internal architecture of our planned prototype.
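One way to read the "semantics for unavailable sources" idea (my sketch, not DISCO's actual query processing semantics): evaluation returns the answers obtainable now, together with a residual query naming what is still pending, instead of failing outright.

```python
# Sketch of partial-answer semantics over unavailable sources (my reading of
# the idea, not DISCO's implementation; names and queries are invented).
def evaluate(subqueries, sources):
    """sources: dict name -> callable returning rows, or None if unavailable."""
    answers, residual = [], []
    for name, q in subqueries:
        fetch = sources.get(name)
        if fetch is None:
            residual.append((name, q))       # defer: source is down
        else:
            answers.extend(fetch(q))
    return answers, residual

rows, pending = evaluate(
    [("flights", "select ..."), ("hotels", "select ...")],
    {"flights": lambda q: [("AF123",)], "hotels": None},
)
print(rows)     # -> [('AF123',)]   a usable partial answer
print(pending)  # -> [('hotels', 'select ...')]   retry when the source returns
```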
Query Folding
- In ICDE
, 1996
"... Query folding refers to the activity of determining if and how a query can be answered using a given set of resources, which might be materialized views, cached results of previous queries, or queries answerable by another database. We investigate query folding in the context where queries and resou ..."
Abstract
-
Cited by 144 (2 self)
- Add to MetaCart
(Show Context)
Query folding refers to the activity of determining if and how a query can be answered using a given set of resources, which might be materialized views, cached results of previous queries, or queries answerable by another database. We investigate query folding in the context where queries and resources are conjunctive queries. We develop an exponential-time algorithm that finds all foldings, and a polynomial-time algorithm for the subclass of acyclic queries. Our results can be applied to query optimization in centralized databases, to query processing in distributed databases, and to query answering in federated databases.
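The acyclic special case is what buys the polynomial-time algorithm. A standard way to test acyclicity of a conjunctive query is GYO ear removal, sketched below (my code, not the paper's):

```python
# GYO ear removal: repeatedly delete "ears", i.e. atoms whose variables that
# occur elsewhere are all covered by a single other atom. My sketch.
def is_acyclic(atoms):
    """atoms: list of variable sets, one per subgoal of the conjunctive query."""
    atoms = [frozenset(a) for a in atoms]
    changed = True
    while changed and len(atoms) > 1:
        changed = False
        for i, a in enumerate(atoms):
            others = atoms[:i] + atoms[i + 1:]
            shared = a & frozenset().union(*others)
            if any(shared <= b for b in others):
                atoms = others                # a is an ear; remove it
                changed = True
                break
    return len(atoms) <= 1                    # fully reduced <=> acyclic

print(is_acyclic([{"x", "y"}, {"y", "z"}]))              # path: True
print(is_acyclic([{"x", "y"}, {"y", "z"}, {"z", "x"}]))  # triangle: False
```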
InfoSleuth: Agent-Based Semantic Integration of Information in Open and Dynamic Environments
, 1997
"... The goM of the InfoSleuth project at MCC is to exploit and synthesize new technologies into a unified system that retrieves and processes information in an ever-changing net- work of information sources. InfoSleuth has its roots in the Carnot project at MCC, which specialized in integrating heteroge ..."
Abstract
-
Cited by 111 (4 self)
- Add to MetaCart
The goal of the InfoSleuth project at MCC is to exploit and synthesize new technologies into a unified system that retrieves and processes information in an ever-changing network of information sources. InfoSleuth has its roots in the Carnot project at MCC, which specialized in integrating heterogeneous information bases. However, recent emerging technologies such as internetworking and the World Wide Web have significantly expanded the types, availability, and volume of data available to an information management system. Furthermore, in these new environments, there is no formal control over the registration of new information sources, and applications tend to be developed without complete knowledge of the resources that will be available when they are run. Federated database projects such as Carnot that do static data integration do not scale up and do not cope well with this ever-changing environment. On the other hand, recent Web technologies, based on keyword search engines, are scalable but, unlike federated databases, are incapable of accessing information based on concepts. In this experience paper, we describe the architecture, design, and implementation of a working version of InfoSleuth. We show how InfoSleuth integrates new technological developments such as agent technology, domain ontologies, brokerage, and internet computing, in support of mediated interoperation of data and services in a dynamic and open environment. We demonstrate the use of information brokering and domain ontologies as key elements for scalability.
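A minimal sketch of the brokering pattern described above (mine, not InfoSleuth's agent infrastructure; the agent names and ontology concepts are invented): resource agents advertise the concepts they cover, and a broker matches queries against the current advertisements, so sources can join or leave without static schema integration.

```python
# Toy broker: agents advertise ontology concepts; queries are matched against
# whatever is currently registered. My illustration; all names hypothetical.
class Broker:
    def __init__(self):
        self.ads = {}                       # agent name -> set of concepts

    def advertise(self, agent, concepts):   # called when an agent registers
        self.ads[agent] = set(concepts)

    def unadvertise(self, agent):           # agents may vanish in an open network
        self.ads.pop(agent, None)

    def match(self, concepts):
        need = set(concepts)
        return [a for a, c in self.ads.items() if need & c]

broker = Broker()
broker.advertise("env-db-agent", {"WaterQuality", "Site"})
broker.advertise("gis-agent", {"Site", "Region"})
print(broker.match({"WaterQuality"}))       # -> ['env-db-agent']
broker.unadvertise("env-db-agent")
print(broker.match({"WaterQuality"}))       # -> []   (the system adapts)
```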