Results 1 - 10 of 56
The Lorel Query Language for Semistructured Data
- International Journal on Digital Libraries, 1997
Cited by 631 (25 self)
We present the Lorel language, designed for querying semistructured data. Semistructured data is becoming more and more prevalent, e.g., in structured documents such as HTML and when performing simple integration of data from multiple sources. Traditional data models and query languages are inappropriate, since semistructured data often is irregular, some data is missing, similar concepts are represented using different types, heterogeneous sets are present, or object structure is not fully known. Lorel is a user-friendly language in the SQL/OQL style for querying such data effectively. For wide applicability, the simple object model underlying Lorel can be viewed as an extension of ODMG and the language as an extension of OQL. The main novelties of the Lorel language are: (i) extensive use of coercion to relieve the user from the strict typing of OQL, which is inappropriate for semistructured data
Lore: A database management system for semistructured data
- SIGMOD Record, 1997
Cited by 297 (21 self)
Lore (for Lightweight Object Repository) is a DBMS designed specifically for managing semistructured information. Implementing Lore has required rethinking all aspects of a DBMS, including storage management, indexing, query processing and optimization, and user interfaces. This paper provides an overview of these aspects of the Lore system, as well as other novel features such as dynamic structural summaries and seamless access to data from external sources.
Adding structure to unstructured data
- In 6th Int. Conf. on Database Theory (ICDT ’97), LNCS 1186, 336–350, 1997
Cited by 195 (22 self)
We develop a new schema for unstructured data. Traditional schemas resemble the type systems of programming languages. For unstructured data, however, the underlying type may be much less constrained and hence an alternative way of expressing constraints on the data is needed. Here, we propose that both data and schema be represented as edge-labeled graphs. We develop notions of conformance between a graph database and a graph schema and show that there is a natural and efficiently computable ordering on graph schemas. We then examine certain subclasses of schemas and show that schemas are closed under query applications. Finally, we discuss how they may be used in query decomposition and optimization.
RQL: A Declarative Query Language for RDF
Cited by 174 (19 self)
Real-scale Semantic Web applications, such as Web Portals and E-Marketplaces, require the management of voluminous metadata repositories containing descriptive information (i.e., metadata) about the available Web resources and services. Better knowledge about the meaning, usage, accessibility or quality of these resources and services will considerably facilitate the automated processing of both Web content and services. In this context, the Resource Description Framework (RDF) enables the creation and exchange of metadata as any other Web data. Although large volumes of RDF descriptions are already appearing (e.g., as exported Portal catalogs or service descriptions), sufficiently expressive declarative languages for querying both RDF descriptions and schemas are still missing. In this paper, we propose RQL, a new RDF query language, relying on a formal graph model that permits the interpretation of superimposed resource descriptions. RQL is an OQL-inspired adaptation of XML query languages to the peculiarities of RDF but, foremost, is an extension of this functionality for uniformly querying both descriptions and schemas. We illustrate the syntax, semantics and core functionality of RQL by means of a set of benchmark queries and report on the performance of RSSDB, our persistent RDF Store, for storing and querying voluminous RDF descriptions.
Query Optimization for XML
- In Proceedings of VLDB, 1999
Cited by 173 (2 self)
XML is an emerging standard for data representation and exchange on the World-Wide Web. Due to the nature of information on the Web and the inherent flexibility of XML, we expect that much of the data encoded in XML will be semistructured: the data may be irregular or incomplete, and its structure may change rapidly or unpredictably. This paper describes the query processor of Lore, a DBMS for XML-based data supporting an expressive query language. We focus primarily on Lore's cost-based query optimizer. While all of the usual problems associated with cost-based query optimization apply to XML-based query languages, a number of additional problems arise, such as new kinds of indexing, more complicated notions of database statistics, and vastly different query execution strategies for different databases. We define appropriate logical and physical query plans, database statistics, and a cost model, and we describe plan enumeration including heuristics for reducing the large search space. Our optimizer is fully implemented in Lore and preliminary performance results are reported.
Optimizing Regular Path Expressions Using Graph Schemas
- 1998
Cited by 136 (5 self)
Several languages, such as LOREL and UnQL, support querying of semi-structured data. Others, such as WebSQL and WebLog, query Web sites. All these languages model data as labeled graphs and use regular path expressions to express queries that traverse arbitrary paths in graphs. Naive execution of path expressions is inefficient, however, because it often requires exhaustive graph search. We describe two optimization techniques for queries with regular path expressions, which we call regular queries. Both rely on graph schemas, which specify partial knowledge of a graph's structure. Query pruning restricts search to a fragment of the graph; we give an efficient algorithm for rewriting any regular query into a pruned one. Query rewriting using state extents can entirely eliminate or substantially reduce graph traversal; it is reminiscent of optimizing relational queries using indices. There may be several ways to optimize a query using state extents; we give an exponential-time algorith...
TIMBER: A Native XML Database
- The VLDB Journal, 2002
Cited by 113 (10 self)
This paper describes the overall design and architecture of the Timber XML database system currently being implemented at the University of Michigan. The system is based upon a bulk algebra for manipulating trees, and natively stores XML. New access methods have been developed to evaluate queries in the XML context, and new cost estimation and query optimization techniques have also been developed. We present performance numbers to support some of our design decisions. We believe that the key intellectual contribution of this system is a comprehensive set-at-a-time query processing ability in a native XML store, with all the standard components of relational query processing, including algebraic rewriting and a cost-based optimizer.
A fast index for semistructured data
- In VLDB, 2001
Cited by 110 (5 self)
Queries navigate semistructured data via path expressions, and can be accelerated using an index. Our solution encodes paths as strings, and inserts those strings into a special index that is highly optimized for long and complex keys. We describe the Index Fabric, an indexing structure that provides the efficiency and flexibility we need. We discuss how "raw paths" are used to optimize ad hoc queries over semistructured data, and how "refined paths" optimize specific access paths. Although we can use knowledge about the queries and structure of the data to create refined paths, no such knowledge is needed for raw paths. A performance study shows that our techniques, when implemented on top of a commercial relational database system, outperform the more traditional approach of using the commercial system’s indexing mechanisms to query the XML.
The RDFSuite: Managing Voluminous RDF Description Bases
- 2000
Cited by 87 (10 self)
Metadata are widely used in order to fully exploit information resources available on corporate intranets or the Internet. The Resource Description Framework (RDF) aims at facilitating the creation and exchange of metadata as any other Web data. The growing number of available information resources and the proliferation of description services in various user communities lead nowadays to large volumes of RDF metadata. Managing such RDF resource descriptions and schemas with existing low-level APIs and file-based implementations does not ensure fast deployment and easy maintenance of real-scale RDF applications. In this paper, we advocate the use of database technology to support declarative access, as well as logical and physical independence, for voluminous RDF description bases. We present RDFSuite, a suite of tools for RDF validation, storage and querying. Specifically, we introduce a formal data model for RDF description bases created using multiple schemas. Compared to ...
Querying Documents in Object Databases
- 1997
Cited by 82 (13 self)
We consider the problem of storing and accessing documents (SGML and HTML, in particular) using database technology. To specify the database image of documents, we use structuring schemas that consist of grammars annotated with database programs. To query documents, we introduce an extension of OQL, the ODMG standard query language for object databases. Our extension (named OQL-doc) allows documents to be queried without precise knowledge of their structure, using in particular generalized path expressions and pattern matching. This allows us to introduce, in a declarative language (in the style of SQL or OQL), navigational and information-retrieval styles of accessing data. Query processing in the context of documents and path expressions leads to challenging implementation issues. We extend an object algebra with new operators to deal with generalized path expressions. We then consider two essential complementary optimization techniques: 1. we show that almost standard database optim...

