Results 1 - 10 of 118
Aurora: a new model and architecture for data stream management
2003
Cited by 237 (26 self)
This paper describes the basic processing model and architecture of Aurora, a new system to manage data streams for monitoring applications. Monitoring applications differ substantially from conventional business data processing. The fact that a software system must process and react to continual inputs from many sources (e.g., sensors) rather than from human operators requires one to rethink the fundamental architecture of a DBMS for this application area. In this paper, we present Aurora, a new DBMS currently under construction at Brandeis University, Brown University, and M.I.T. We first provide an overview of the basic Aurora model and architecture and then describe in detail a stream-oriented set of operators.
The CQL Continuous Query Language: Semantic Foundations and Query Execution
VLDB Journal, 2003
Cited by 183 (4 self)
CQL, a Continuous Query Language, is supported by the STREAM prototype Data Stream Management System at Stanford. CQL is an expressive SQL-based declarative language for registering continuous queries against streams and updatable relations. We begin by presenting an abstract semantics that relies only on "black box" mappings among streams and relations.
A Foundation for Representing and Querying Moving Objects
2000
Cited by 143 (35 self)
Spatio-temporal databases deal with geometries changing over time. The goal of our work is to provide a DBMS data model and query language capable of handling such time-dependent geometries, including those changing continuously which describe moving objects. Two fundamental abstractions are moving point and moving region, describing objects for which only the time-dependent position, or position and extent, are of interest, respectively. We propose to represent such time-dependent geometries as attribute data types with suitable operations, that is, to provide an abstract data type extension to a DBMS data model and query language. This paper presents a design of such a system of abstract data types. It turns out that besides the main types of interest, moving point and moving region, a relatively large number of auxiliary data types is needed. For example, one needs a line type to represent the projection of a moving point into the plane, or a "moving real" to represent the time-dependent distance of two moving points. It then becomes crucial to achieve (i) orthogonality in the design of the type system, i.e., type constructors can be applied uniformly, (ii) genericity and consistency of operations, i.e., operations range over as many types as possible and behave consistently, and (iii) closure and consistency between structure and operations of non-temporal and related temporal types. Satisfying these goals leads to a simple and expressive system of abstract data types that may be integrated into a query language to yield a powerful language for querying spatio-temporal data, including moving objects. The paper formally defines the types and operations, offers detailed insight into the considerations that went into the design, and exemplifies the use of the abstract data types using SQL. The paper offers a precise and conceptually clean foundation for implementing a spatio-temporal DBMS extension.
Operator Scheduling in a Data Stream Manager
In VLDB, 2003
Cited by 70 (10 self)
Many stream-based applications have sophisticated data processing requirements and real-time performance expectations that need to be met under asynchronous, time-varying data streams. In order to address these challenges, we propose novel operator scheduling approaches that specify (1) which operators to schedule, (2) in which order to schedule the operators, and (3) how many tuples to process at each execution; and study them in the context of the Aurora data stream manager. We argue and provide experimental evidence that a fine-grained scheduling approach in combination with various scheduling techniques (such as batching of operators and tuples) can significantly improve the efficiency by reducing various system overheads. We also discuss application-aware extensions that address Quality of Service (QoS) issues by making scheduling decisions according to tuple processing delays and per-application QoS specifications. Finally, we present prototype-based experimental results that characterize the efficiency and effectiveness of our approaches under various stream workloads and processing scenarios.
Efficient Management of Multiversion Documents by Object Referencing
2001
Cited by 51 (12 self)
Traditional approaches to versioning semistructured information are edit-based, i.e., subsequent document versions are represented by using edit scripts. This paper proposes a reference-based version management scheme that preserves the logical structure of the evolving document through the use of object references. By preserving the document structure among versions the new scheme facilitates more efficient query support. In particular, we examine queries involving projections and selections on the document versions, as well as queries on the document evolution history. Moreover, we show that the proposed scheme provides an effective representation of multiversioned XML documents, both at the transport and exchange levels. In fact, with the reference-based scheme, a document's history can also be viewed and processed as yet another XML document. Furthermore, we demonstrate the effectiveness of the new scheme at the storage level. In particular, the scheme is enhanced with a usefulness-based page management policy that extends and adapts techniques used in transaction-time databases to ensure efficient clustering of information among versions. An extensive comparison of the reference-based versioning against representations used in temporal databases and persistent object managers depicts the performance advantages of the new approach. Finally it should be noted that reference-based versioning is applicable to other kinds of semistructured information (besides XML documents), and can be used to replace traditional version control schemes, such as the edit-based RCS and the timestamp-based SCCS.
An Abstract Semantics and Concrete Language for Continuous Queries over Streams and Relations
Cited by 39 (4 self)
Despite the recent surge of research in query processing over data streams, little attention has been devoted to defining precise semantics for continuous queries over streams. We first present an abstract semantics based on several building blocks: formal definitions for streams and relations, mappings among them, and any relational query language. From these basics we define a precise interpretation for continuous queries over streams and relations. We then propose a concrete language, CQL (for Continuous Query Language), which instantiates the abstract semantics using SQL as the relational query language and window specifications derived from SQL-99 to map from streams to relations. We identify some equivalences that can be used to rewrite CQL queries for optimization, and we discuss some additional implementation issues arising from the language and its semantics. We are implementing CQL as part of a general-purpose Data Stream Management System at Stanford.
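As a rough illustration of the stream-to-relation mapping this abstract mentions, the minimal Python sketch below applies a time-based sliding window to a stream of timestamped tuples. The names `StreamElement` and `time_window`, the integer clock, and the half-open window convention are assumptions made for illustration; they are not taken from CQL or the STREAM system.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List


@dataclass(frozen=True)
class StreamElement:
    timestamp: int  # arrival time on a hypothetical integer clock
    value: Any      # tuple payload


def time_window(stream: Iterable[StreamElement], now: int, width: int) -> List[Any]:
    """Map a stream to a relation at time `now`.

    The relation holds the payloads of all elements whose timestamp
    falls in the interval (now - width, now].
    """
    return [e.value for e in stream if now - width < e.timestamp <= now]


# A small hypothetical stream and the relation it yields at time 6.
stream = [StreamElement(1, "a"), StreamElement(3, "b"), StreamElement(6, "c")]
relation_at_6 = time_window(stream, now=6, width=4)
```

Evaluating `time_window` at successive values of `now` gives a time-varying relation, over which an ordinary relational query could then be run.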
Misconceptions About Real-Time Databases
IEEE Computer, 1998
Cited by 38 (10 self)
More and more databases are being used in situations where real-time constraints exist. A set of misconceptions has arisen regarding what a real-time database is and the appropriateness of using conventional databases in real-time applications. Nine misconceptions are identified and discussed. Various research challenges are enumerated and explained. In total, the paper articulates the special nature of real-time databases. 1 Introduction In 1988 a paper entitled Misconceptions of Real-Time Computing was published [11]. This paper articulated the key differences between general purpose computing and real-time computing. The impact of the paper was significant in that it spurred a lot of research that specifically focussed on real-time issues. We believe that there is now a significant body of scientific and technological results in real-time computing, in part, due to the careful definition of real-time computing and the articulation of the important differences between real-time comp...
Semantics of Time-Varying Information
INFORMATION SYSTEMS, 1996
Cited by 34 (19 self)
This paper provides a systematic and comprehensive study of the underlying semantics of temporal databases, summarizing the results of an intensive collaboration between the two authors over the last five years. We first examine how facts may be associated with time, most prominently with one or more dimensions of valid time and transaction time. One common case is that of a bitemporal relation, in which facts are associated with exactly one valid time and one transaction time. These two times may be related in various ways, yielding temporal specialization. Multiple transaction times arise when a fact is stored in one database, then later replicated or transferred to another database. By retaining the transaction times, termed temporal generalization, the original relation can be effectively queried by referencing only the final relation. We attempt to capture the essence of time-varying information via a very simple data model, the bitemporal conceptual data model. Emphasis is placed...
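To make the notion of a bitemporal relation concrete, here is a minimal Python sketch in which each fact carries one valid-time interval and one transaction-time interval. The class `BitemporalFact`, the function `as_of`, the sentinel `FOREVER`, and the sample data are all hypothetical; this is not the bitemporal conceptual data model defined in the paper.

```python
from dataclasses import dataclass
from typing import Iterable, List

FOREVER = 10**9  # hypothetical stand-in for "until changed"


@dataclass(frozen=True)
class BitemporalFact:
    fact: str
    valid_from: int  # valid time: when the fact holds in the modeled reality
    valid_to: int    # half-open interval [valid_from, valid_to)
    tx_from: int     # transaction time: when the fact is current in the database
    tx_to: int       # half-open interval [tx_from, tx_to)


def as_of(facts: Iterable[BitemporalFact], valid_at: int, tx_at: int) -> List[str]:
    """Facts valid at `valid_at`, as the database recorded them at `tx_at`."""
    return [
        f.fact
        for f in facts
        if f.valid_from <= valid_at < f.valid_to and f.tx_from <= tx_at < f.tx_to
    ]


# A value stored at transaction time 10 and corrected at transaction time 20.
facts = [
    BitemporalFact("salary=50k", 0, 100, 10, 20),
    BitemporalFact("salary=60k", 0, 100, 20, FOREVER),
]
```

Querying the same valid time at two different transaction times returns the value before and after the correction, which is the distinction the two time dimensions are meant to capture.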
Maintaining Temporal Views Over Non-Temporal Information Sources For Data Warehousing
In Proc. of the 1998 Intl. Conf. on Extending Database Technology, 1998
Cited by 28 (5 self)
An important use of data warehousing is to provide temporal views over the history of source data that may itself be non-temporal. While recent work in view maintenance is applicable to data warehousing, only non-temporal views have been considered. In this paper, we introduce a framework for maintaining temporal views over non-temporal information sources in a data warehousing environment. We describe an architecture for the temporal data warehouse that automatically maintains temporal views over non-temporal source relations, and allows users to ask temporal queries using these views. Because of the dimension of time, a materialized temporal view may need to be updated not only when source relations change, but also as time advances. We present incremental techniques to maintain temporal views for both cases, and outline the implementation of our approach in the WHIPS warehousing prototype at Stanford. 1 Introduction A data warehouse is a repository for efficient querying ...
Efficient Complex Query Support for Multiversion XML Documents
In EDBT, 2002
Cited by 27 (6 self)
Managing multiple versions of XML documents represents a critical requirement for many applications. Also, there has been much recent interest in supporting complex queries on XML data (e.g., regular path expressions, structural projections, DIFF queries). In this paper, we examine the problem of efficiently supporting complex queries on multiversioned XML documents. Our approach relies on a scheme based on durable node numbers (DNNs) that preserve the order among the XML tree nodes and are invariant with respect to updates. Using the document's DNNs, various complex queries are reduced to combinations of partial version retrieval queries. We examine three indexing schemes to efficiently evaluate partial version retrieval queries in this environment. A thorough performance analysis is then presented to reveal the advantages of each scheme.

