Results 1 - 10
of
572
Agents and the Semantic Web
- IEEE INTELLIGENT SYSTEMS
, 2001
"... Many challenges of bringing communicating multiagent systems to the Web require ontologies. The integration of agent technology and ontologies could significantly affect the use of Web services and the ability to extend programs to perform tasks for users more efficiently and with less human interve ..."
Abstract
-
Cited by 2352 (18 self)
- Add to MetaCart
(Show Context)
Many challenges of bringing communicating multiagent systems to the Web require ontologies. The integration of agent technology and ontologies could significantly affect the use of Web services and the ability to extend programs to perform tasks for users more efficiently and with less human intervention.
Indexing and Querying XML Data for Regular Path Expressions
- IN VLDB
, 2001
"... With the advent of XML as a standard for data representation and exchange on the Internet, storing and querying XML data becomes more and more important. Several XML query languages have been proposed, and the common feature of the languages is the use of regular path expressions to query XML ..."
Abstract
-
Cited by 343 (9 self)
- Add to MetaCart
With the advent of XML as a standard for data representation and exchange on the Internet, storing and querying XML data becomes more and more important. Several XML query languages have been proposed, and the common feature of the languages is the use of regular path expressions to query XML data. This poses a new challenge concerning indexing and searching XML data, because conventional approaches based on tree traversals may not meet the processing requirements under heavy access requests. In this paper, we propose a new system for indexing and storing XML data based on a numbering scheme for elements. This numbering scheme quickly determines the ancestor-descendant relationship between elements in the hierarchy of XML data. We also propose several algorithms for processing regular path expressions, namely, (1) ##-Join for searching paths from an element to another, (2) ##-Join for scanning sorted elements and attributes to find element-attribute pairs, and (3) ##-Join for finding Kleene-Closure on repeated paths or elements. The ##-Join algorithm is highly effective particularly for searching paths that are very long or whose lengths are unknown. Experimental results from our prototype system implementation show that the proposed algorithms can process XML queries with regular path expressions by up to an or- # This work was sponsored in part by National Science Foundation CAREER Award (IIS-9876037) and Research Infrastructure program EIA-0080123. The authors assume all responsibility for the contents of the paper. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its...
Lore: A database management system for semistructured data
- SIGMOD Record
, 1997
"... Lore (for Lightweight Object Repository) is a DBMS designed specifically for managing semistructured information. Implementing Lore has required rethinking all aspects of a DBMS, including storage management, indexing, query processing and optimization, and user interfaces. This paper provides an ov ..."
Abstract
-
Cited by 339 (24 self)
- Add to MetaCart
(Show Context)
Lore (for Lightweight Object Repository) is a DBMS designed specifically for managing semistructured information. Implementing Lore has required rethinking all aspects of a DBMS, including storage management, indexing, query processing and optimization, and user interfaces. This paper provides an overview of these aspects of the Lore system, as well as other novel features such as dynamic structural summaries and seamless access to data from external sources.
Index Structures for Path Expressions
, 1997
"... In recent years there has been an increased interest in managing data which does not conform to traditional data models, like the relational or object oriented model. The reasons for this non-conformance are diverse. One one hand, data may not conform to such models at the physical level: it may be ..."
Abstract
-
Cited by 333 (7 self)
- Add to MetaCart
In recent years there has been an increased interest in managing data which does not conform to traditional data models, like the relational or object oriented model. The reasons for this non-conformance are diverse. One one hand, data may not conform to such models at the physical level: it may be stored in data exchange formats, fetched from the Internet, or stored as structured les. One the other hand, it may not conform at the logical level: data may have missing attributes, some attributes may be of di erent types in di erent data items, there may be heterogeneous collections, or the data may be simply specified by a schema which is too complex or changes too often to be described easily as a traditional schema. The term semistructured data has been used to refer to such data. The data model proposed for this kind of data consists of an edge-labeled graph, in which nodes correspond to objects and edges to attributes or values. Figure 1 illustrates a semistructured database providing information about a city. Relational databases are traditionally queried with associative queries, retrieving tuples based on the value of some attributes. To answer such queries efciently, database management systems support indexes for translating attribute values into tuple ids (e.g. B-trees or hash tables). In object-oriented databases, path queries replace the simpler associative queries. Several data structures have been proposed for answering path queries e ciently: e.g., access support relations 14] and path indexes 4]. In the case of semistructured data, queries are even more complex, because they may contain generalized path expressions 1, 7, 8, 16]. The additional exibility is needed in order to traverse data whose structure is irregular, or partially unknown to the user.
Semistructured data
, 1997
"... In semistructured data, the information that is normally as-sociated with a schema is contained within the data, which is sometimes called “self-describing”. In some forms of semi-structured data there is no separate schema, in others it exists but only places loose constraints on the data. Semi-str ..."
Abstract
-
Cited by 281 (0 self)
- Add to MetaCart
(Show Context)
In semistructured data, the information that is normally as-sociated with a schema is contained within the data, which is sometimes called “self-describing”. In some forms of semi-structured data there is no separate schema, in others it exists but only places loose constraints on the data. Semi-structured data has recently emerged as an important topic of study for a variety of reasons. First, there are data sources such as the Web, which we would like to treat as databases but which cannot be constrained by a schema. Second, it may be desirable to have an extremely flexible format for data exchange between disparate databases. Third, even when dealing with structured data, it may be helpful to view it. as semistructured for the purposes of browsing. This tu-torial will cover a number of issues surrounding such data: finding a concise formulation, building a sufficiently expres-sive language for querying and transformation, and opti-mizat,ion problems. 1 The motivation The topic of semistructured data (also called unstructured data) is relatively recent, and a tutorial on the topic may well be premature. It represents, if anything, the conver-gence of a number of lines of thinking about new ways to represent and query data that do not completely fit with conventional data models. The purpose of this tutorial is to to describe this motivation and to suggest areas in which further research may be fruitful. For a similar exposition, the reader is referred to Serge Abiteboul’s recent survey pa-per PI. The slides for this tutorial will be made available from a section of the Penn database home page
Accelerating XPath location steps
- ACM SIGMOD Int. Conference on Management of Data
, 2002
"... This work is a proposal for a database index structure that has been specifically designed to support the evaluation of XPath queries. As such, the index is capable to support all XPath axes (including ancestor, following, precedingsibling, descendant-or-self, etc.). This feature lets the index stan ..."
Abstract
-
Cited by 261 (18 self)
- Add to MetaCart
(Show Context)
This work is a proposal for a database index structure that has been specifically designed to support the evaluation of XPath queries. As such, the index is capable to support all XPath axes (including ancestor, following, precedingsibling, descendant-or-self, etc.). This feature lets the index stand out among related work on XML indexing structures which had a focus on regular path expressions (which correspond to the XPath axes children and descendantor-self plus name tests). Its ability to start traversals from arbitrary context nodes in an XML document additionally enables the index to support the evaluation of path traversals embedded in XQuery expressions. Despite its flexibility, the new index can be implemented and queried using purely relational techniques, but it performs especially well if the underlying database host provides support for R-trees. A performance assessment which shows quite promising results completes this proposal. 1.
XMill: an Efficient Compressor for XML Data
, 1999
"... We describe a tool for compressing XML data, with applications in data exchange and archiving, which usually achieves about twice the compression ratio of gzip at roughly the same speed. The compressor, called XMill, incorporates and combines existing compressors in order to apply them to heterogene ..."
Abstract
-
Cited by 230 (0 self)
- Add to MetaCart
We describe a tool for compressing XML data, with applications in data exchange and archiving, which usually achieves about twice the compression ratio of gzip at roughly the same speed. The compressor, called XMill, incorporates and combines existing compressors in order to apply them to heterogeneous XML data: it uses zlib, the library function for gzip, a collection of datatype specific compressors for simple data types, and, possibly, user defined compressors for application specific data types. 1 Introduction We have implemented a compressor/decompressor for XML data, to be used in data exchange and archiving, that achieves about twice the compression rate of general-purpose compressors (gzip), at about the same speed. The tool can be downloaded from www.research.att.com/sw/tools/xmill/. XML is now being adopted by many organizations and industry groups, like the healthcare, banking, chemical, and telecommunications industries. The attraction in XML is that it is a self-describi...