Results 1 - 10 of 198
Aurora: a new model and architecture for data stream management
, 2003
Cited by 401 (31 self)
This paper describes the basic processing model and architecture of Aurora, a new system to manage data streams for monitoring applications. Monitoring applications differ substantially from conventional business data processing. The fact that a software system must process and react to continual inputs from many sources (e.g., sensors) rather than from human operators requires one to rethink the fundamental architecture of a DBMS for this application area. In this paper, we present Aurora, a new DBMS currently under construction at Brandeis University, Brown University, and M.I.T. We first provide an overview of the basic Aurora model and architecture and then describe in detail a stream-oriented set of operators.
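The boxes-and-arrows processing model the Aurora abstract describes can be sketched with ordinary Python generators chained into a pipeline; the operator names, the sensor data, and the two-tuple window below are illustrative assumptions, not Aurora's actual API:

```python
from collections import deque

def stream_filter(pred, stream):
    # Drop tuples that fail the predicate (an Aurora-style "filter" box).
    for t in stream:
        if pred(t):
            yield t

def stream_map(fn, stream):
    # Transform each tuple as it passes through (a "map" box).
    for t in stream:
        yield fn(t)

def sliding_avg(stream, size):
    # Emit the running average of the last `size` values
    # (a windowed aggregate box over a sliding window).
    window = deque(maxlen=size)
    for v in stream:
        window.append(v)
        yield sum(window) / len(window)

# Hypothetical sensor readings: (sensor_id, temperature)
readings = [("s1", 20.0), ("s2", 35.0), ("s1", 22.0), ("s1", 24.0)]
s1 = stream_filter(lambda t: t[0] == "s1", readings)
temps = stream_map(lambda t: t[1], s1)
print(list(sliding_avg(temps, 2)))  # -> [20.0, 21.0, 23.0]
```

Because each stage is a generator, tuples flow through the pipeline one at a time as they arrive, which is the key contrast with a store-then-query DBMS.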
The CQL Continuous Query Language: Semantic Foundations and Query Execution
- VLDB Journal
, 2003
Cited by 354 (4 self)
CQL, a Continuous Query Language, is supported by the STREAM prototype Data Stream Management System at Stanford. CQL is an expressive SQL-based declarative language for registering continuous queries against streams and updatable relations. We begin by presenting an abstract semantics that relies only on "black box" mappings among streams and relations.
Modular Event-Based Systems
- THE KNOWLEDGE ENGINEERING REVIEW
, 2006
Cited by 149 (11 self)
Event-based systems are developed and used to integrate components in loosely coupled systems. Research and product development have so far focused on efficiency issues but neglected methodological support for building such systems. In this article, the modular design and implementation of an event system is presented which supports scopes and event mappings, two new and powerful structuring methods that facilitate engineering and coordination of components in event-based systems.
Large-scale Incremental Processing Using Distributed Transactions and Notifications
- 9th USENIX Symposium on Operating Systems Design and Implementation
, 2010
Cited by 120 (0 self)
Updating an index of the web as documents are crawled requires continuously transforming a large repository of existing documents as new documents arrive. This task is one example of a class of data processing tasks that transform a large repository of data via small, independent mutations. These tasks lie in a gap between the capabilities of existing infrastructure. Databases do not meet the storage or throughput requirements of these tasks: Google’s indexing system stores tens of petabytes of data and processes billions of updates per day on thousands of machines. MapReduce and other batch-processing systems cannot process small updates individually as they rely on creating large batches for efficiency. We have built Percolator, a system for incrementally processing updates to a large data set, and deployed it to create the Google web search index. By replacing a batch-based indexing system with an indexing system based on incremental processing using Percolator, we process the same number of documents per day, while reducing the average age of documents in Google search results by 50%.
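The notification side of the incremental-processing idea in the Percolator abstract can be sketched in a few lines: a write to a table enqueues a notification, and registered observers run once per changed row. This is a single-process toy under invented names (`Table`, `write`, `run_observers`); the real system layers distributed transactions over Bigtable, which this sketch omits entirely:

```python
from collections import deque

class Table:
    def __init__(self):
        self.rows = {}
        self.dirty = deque()     # notification queue of changed row keys
        self.observers = []      # callbacks triggered by changed rows

    def write(self, key, value):
        # A small, independent mutation to the repository.
        self.rows[key] = value
        self.dirty.append(key)

    def run_observers(self):
        # Process only the rows that changed, not the whole repository.
        while self.dirty:
            key = self.dirty.popleft()
            for obs in self.observers:
                obs(self, key, self.rows[key])

index = {}
docs = Table()
# Observer: keep an inverted index current as documents change.
def update_index(tbl, key, text):
    for word in text.split():
        index.setdefault(word, set()).add(key)

docs.observers.append(update_index)
docs.write("doc1", "stream processing")
docs.write("doc2", "incremental processing")
docs.run_observers()
print(sorted(index["processing"]))  # -> ['doc1', 'doc2']
```

The contrast with MapReduce is visible in `run_observers`: work is proportional to the mutations since the last run, not to the size of the repository.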
Extending Document Management Systems with User-Specific Active Properties
- ACM Transactions on Information Systems
, 1999
Cited by 118 (10 self)
Document properties are a compelling infrastructure on which to develop document management applications. A property-based approach avoids many of the problems of traditional hierarchical storage mechanisms, reflects document organizations meaningful to user tasks, provides a means to integrate the perspectives of multiple individuals and groups, and does all this within a uniform interaction framework. Document properties can reflect not only categorizations of documents and document use, but also expressions of desired system activity, such as sharing criteria, replication management and versioning. Augmenting property-based document management systems with active properties that carry executable code enables the provision of document-based services on a property infrastructure. The combination of document properties as a uniform mechanism for document management, and active properties as a way of delivering document services, represents a new paradigm for document management infrastructure.
The Active Database Management System Manifesto: A Rulebase of ADBMS Features
, 1995
Cited by 71 (2 self)
Active database systems have been a hot research topic for quite some years now. However, while "active functionality" has been claimed for many systems, and notions such as "active objects" or "events" are used in many research areas (even beyond database technology), it is not yet clear which functionality a database management system must support in order to be legitimately considered an active system. In this paper, we attempt to clarify the notion of "active database management system" as well as the functionality it has to support. We thereby distinguish mandatory features that are needed to qualify as an active database system from desired features which are nice to have. Finally, we classify applications of active database systems and identify the requirements an active database management system must meet in order to be applicable in these application areas.
Processing flows of information: from data stream to complex event processing
- ACM COMPUTING SURVEYS
, 2011
Cited by 67 (11 self)
A large number of distributed applications require continuous and timely processing of information as it flows from the periphery to the center of the system. Examples include intrusion detection systems which analyze network traffic in real-time to identify possible attacks; environmental monitoring applications which process raw data coming from sensor networks to identify critical situations; or applications performing online analysis of stock prices to identify trends and forecast future values. Traditional DBMSs, which need to store and index data before processing it, can hardly fulfill the requirements of timeliness coming from such domains. Accordingly, during the last decade, different research communities developed a number of tools, which we collectively call Information flow processing (IFP) systems, to support these scenarios. They differ in their system architecture, data model, rule model, and rule language. In this article, we survey these systems to help researchers, who often come from different backgrounds, in understanding how the various approaches they adopt may complement each other. In particular, we propose a general, unifying model to capture the different aspects of an IFP system and use it to provide a complete and precise classification of the systems and mechanisms proposed so far.
Workflow data patterns
, 2004
Cited by 63 (9 self)
Workflow systems seek to provide an implementation vehicle for complex, recurring business processes. Notwithstanding this common objective, there are a variety of distinct features offered by commercial workflow management systems. These differences result in significant variations in the ability of distinct tools to represent and implement the plethora of requirements that may arise in contemporary business processes. Many of these requirements recur quite frequently during the requirements analysis activity for workflow systems and abstractions of these requirements serve as a useful means of identifying the key components of workflow languages. Previous work has identified a number of workflow control patterns which characterise the range of control flow constructs that might be encountered when modelling and analysing workflow. In this paper, we describe a series of workflow data patterns that aim to capture the various ways in which data is represented and utilised in workflows. By delineating these patterns in a form that is independent of specific workflow technologies and modelling languages, we are able to provide a comprehensive treatment of the workflow data perspective and we subsequently use these patterns as the basis for a detailed comparison of a number of commercially available workflow management systems and business process modelling languages.
An Abstract Semantics and Concrete Language for Continuous Queries over Streams and Relations
Cited by 56 (5 self)
Despite the recent surge of research in query processing over data streams, little attention has been devoted to defining precise semantics for continuous queries over streams. We first present an abstract semantics based on several building blocks: formal definitions for streams and relations, mappings among them, and any relational query language. From these basics we define a precise interpretation for continuous queries over streams and relations. We then propose a concrete language, CQL (for Continuous Query Language), which instantiates the abstract semantics using SQL as the relational query language and window specifications derived from SQL-99 to map from streams to relations. We identify some equivalences that can be used to rewrite CQL queries for optimization, and we discuss some additional implementation issues arising from the language and its semantics. We are implementing CQL as part of a general-purpose Data Stream Management System at Stanford.
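The "black box" mappings in the abstract semantics above can be made concrete with two tiny operators: a stream-to-relation mapping (a sliding tuple-based window, in the spirit of CQL's ROWS window) and a relation-to-stream mapping (in the spirit of CQL's ISTREAM, which streams tuples newly inserted at each instant). The function names and two-tuple window are illustrative only, not CQL syntax:

```python
def rows_window(stream, n):
    # S2R mapping: at each instant, the relation instance holds
    # the last n tuples of the stream.
    window = []
    for tup in stream:
        window.append(tup)
        window = window[-n:]
        yield set(window)            # relation instance at this instant

def istream(relations):
    # R2S mapping: emit tuples present in the relation now
    # but absent at the previous instant.
    prev = set()
    for rel in relations:
        for tup in sorted(rel - prev):
            yield tup
        prev = rel

stream = [("a", 1), ("b", 2), ("c", 3)]
rels = list(rows_window(iter(stream), 2))
print(rels[-1])                      # relation holds ("b", 2) and ("c", 3)
print(list(istream(iter(rels))))     # -> [('a', 1), ('b', 2), ('c', 3)]
```

Composing the two mappings back-to-back round-trips the stream, which is the kind of equivalence the paper exploits for query rewriting.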
Design Issues for General-Purpose Adaptive Hypermedia Systems
- Proceedings of the ACM Conference on Hypertext and Hypermedia
, 2001
Cited by 41 (3 self)
A hypermedia application offers its users much freedom to navigate through a large hyperspace. For authors finding a good compromise between offering navigational freedom and offering guidance is difficult, especially in applications that target a broad audience. Adaptive hypermedia (AH) offers (automatically generated) personalized content and navigation support, so the choice between freedom and guidance can be made on an individual basis. Many adaptive hypermedia systems (AHS) are tightly integrated with one specific application. In this paper we study design issues for general-purpose adaptive hypermedia systems, built according to an application-independent architecture. We use the Dexter-based AHAM reference model for adaptive hypermedia [7] to describe the functionality of such systems at the conceptual level. We concentrate on the architecture and behavior of a general-purpose adaptive engine. Such an engine performs adaptation and updates the user model according to a set of adaptation rules specified in an adaptation model. In our study of the behavior of such a system we concentrate on the issues of termination and confluence, which are important to detect potential problems in an adaptive hypermedia application. We draw parallels with static rule analysis in active database systems [1,2]. By using common properties of AHS we are able to obtain more precise (less conservative) results for AHS than for active databases in general, especially for the problem of termination.