Results 1 - 10
of
514
Tinydb: An acquisitional query processing system for sensor networks
- ACM Trans. Database Syst
, 2005
"... We discuss the design of an acquisitional query processor for data collection in sensor networks. Acquisitional issues are those that pertain to where, when, and how often data is physically acquired (sampled) and delivered to query processing operators. By focusing on the locations and costs of acq ..."
Abstract
-
Cited by 626 (8 self)
- Add to MetaCart
We discuss the design of an acquisitional query processor for data collection in sensor networks. Acquisitional issues are those that pertain to where, when, and how often data is physically acquired (sampled) and delivered to query processing operators. By focusing on the locations and costs of acquiring data, we are able to significantly reduce power consumption over traditional passive systems that assume the a priori existence of data. We discuss simple extensions to SQL for controlling data acquisition, and show how acquisitional issues influence query optimization, dissemination, and execution. We evaluate these issues in the context of TinyDB, a distributed query processor for smart sensor devices, and show how acquisitional techniques can provide significant reductions in power consumption on our sensor devices. Categories and Subject Descriptors: H.2.3 [Database Management]: Languages—Query languages; H.2.4 [Database Management]: Systems—Distributed databases; query processing
The CQL Continuous Query Language: Semantic Foundations and Query Execution
- VLDB Journal
, 2003
"... CQL, a Continuous Query Language, is supported by the STREAM prototype Data Stream Management System at Stanford. CQL is an expressive SQL-based declarative language for registering continuous queries against streams and updatable relations. We begin by presenting an abstract semantics that relie ..."
Abstract
-
Cited by 354 (4 self)
- Add to MetaCart
CQL, a Continuous Query Language, is supported by the STREAM prototype Data Stream Management System at Stanford. CQL is an expressive SQL-based declarative language for registering continuous queries against streams and updatable relations. We begin by presenting an abstract semantics that relies only on "black box" mappings among streams and relations.
Cartel: a distributed mobile sensor computing system
- In 4th ACM SenSys
, 2006
"... CarTel is a mobile sensor computing system designed to collect, process, deliver, and visualize data from sensors located on mobile units such as automobiles. A CarTel node is a mobile embedded computer coupled to a set of sensors. Each node gathers and processes sensor readings locally before deliv ..."
Abstract
-
Cited by 327 (30 self)
- Add to MetaCart
(Show Context)
CarTel is a mobile sensor computing system designed to collect, process, deliver, and visualize data from sensors located on mobile units such as automobiles. A CarTel node is a mobile embedded computer coupled to a set of sensors. Each node gathers and processes sensor readings locally before delivering them to a central portal, where the data is stored in a database for further analysis and visualization. In the automotive context, a variety of on-board and external sensors collect data as users drive. CarTel provides a simple query-oriented programming interface, handles large amounts of heterogeneous data from sensors, and handles intermittent and variable network connectivity. CarTel nodes rely primarily on opportunistic wireless (e.g., Wi-Fi, Bluetooth) connectivity—to the Internet, or to “data mules ” such as other CarTel nodes, mobile phone flash memories, or USB keys—to communicate with the portal. CarTel applications run on the portal, using a delaytolerant continuous query processor, ICEDB, to specify how the mobile nodes should summarize, filter, and dynamically prioritize data. The portal and the mobile nodes use a delaytolerant network stack, CafNet, to communicate. CarTel has been deployed on six cars, running on a small scale in Boston and Seattle for over a year. It has been used to analyze commute times, analyze metropolitan Wi-Fi deployments, and for automotive diagnostics.
The design of the borealis stream processing engine
- In CIDR
, 2005
"... Borealis is a second-generation distributed stream processing engine that is being developed at Brandeis University, Brown University, and MIT. Borealis inherits core stream processing functionality from Aurora [14] and distribution functionality from Medusa [51]. Borealis modifies and extends both ..."
Abstract
-
Cited by 250 (10 self)
- Add to MetaCart
(Show Context)
Borealis is a second-generation distributed stream processing engine that is being developed at Brandeis University, Brown University, and MIT. Borealis inherits core stream processing functionality from Aurora [14] and distribution functionality from Medusa [51]. Borealis modifies and extends both systems in non-trivial and critical ways to provide advanced capabilities that are commonly required by newly-emerging stream processing applications. In this paper, we outline the basic design and functionality of Borealis. Through sample real-world applications, we motivate the need for dynamically revising query results and modifying query specifications. We then describe how Borealis addresses these challenges through an innovative set of features, including revision records, time travel, and control lines. Finally, we present a highly flexible and scalable QoS-based optimization model that operates across server and sensor networks and a new fault-tolerance model with flexible consistency-availability trade-offs.
Mapreduce online
, 2009
"... MapReduce is a popular framework for data-intensive distributed computing of batch jobs. To simplify fault tolerance, many implementations of MapReduce materialize the entire output of each map and reduce task before it can be consumed. In this paper, we propose a modified MapReduce architecture tha ..."
Abstract
-
Cited by 181 (3 self)
- Add to MetaCart
(Show Context)
MapReduce is a popular framework for data-intensive distributed computing of batch jobs. To simplify fault tolerance, many implementations of MapReduce materialize the entire output of each map and reduce task before it can be consumed. In this paper, we propose a modified MapReduce architecture that allows data to be pipelined between operators. This extends the MapReduce programming model beyond batch processing, and can reduce completion times and improve system utilization for batch jobs as well. We present a modified version of the Hadoop MapReduce framework that supports online aggregation, which allows users to see “early returns” from a job as it is being computed. Our Hadoop Online Prototype (HOP) also supports continuous queries, which enable MapReduce programs to be written for applications such as event monitoring and stream processing. HOP retains the fault tolerance properties of Hadoop and can run unmodified user-defined MapReduce programs. 1
Issues in Data Stream Management
, 2003
"... Traditional databases store sets of relatively static records with no pre-defined notion of time, unless timestamp attributes are explicitly added. While this model adequately represents commercial catalogues or repositories of personal information, many current and emerging applications require sup ..."
Abstract
-
Cited by 164 (6 self)
- Add to MetaCart
Traditional databases store sets of relatively static records with no pre-defined notion of time, unless timestamp attributes are explicitly added. While this model adequately represents commercial catalogues or repositories of personal information, many current and emerging applications require support for online analysis of rapidly changing data streams. Limitations of traditional DBMSs in supporting streaming applications have been recognized, prompting research to augment existing technologies and build new systems to manage streaming data. The purpose of this paper is to review recent work in data stream management systems, with an emphasis on application requirements, data models, continuous query languages, and query evaluation.
High-performance complex event processing over streams
- In SIGMOD
, 2006
"... In this paper, we present the design, implementation, and evaluation of a system that executes complex event queries over real-time streams of RFID readings encoded as events. These complex event queries filter and correlate events to match specific patterns, and transform the relevant events into n ..."
Abstract
-
Cited by 161 (7 self)
- Add to MetaCart
(Show Context)
In this paper, we present the design, implementation, and evaluation of a system that executes complex event queries over real-time streams of RFID readings encoded as events. These complex event queries filter and correlate events to match specific patterns, and transform the relevant events into new composite events for the use of external monitoring applications. Stream-based execution of these queries enables time-critical actions to be taken in environments such as supply chain management, surveillance and facility management, healthcare, etc. We first propose a complex event language that significantly extends existing event languages to meet the needs of a range of RFID-enabled monitoring applications. We then describe a query plan-based approach to efficiently implementing this language. Our approach uses native operators to efficiently handle query-defined sequences, which are a key component of complex event processing, and pipelines such sequences to subsequent operators that are built by leveraging relational techniques. We also develop a large suite of optimization techniques to address challenges such as large sliding windows and intermediate result sizes. We demonstrate the effectiveness of our approach through a detailed performance analysis of our prototype implementation as well as through a comparison to a state-of-the-art stream processor. 1
Processing sliding window multi-joins in continuous queries over data streams
- Proceedings of the 29th international
, 2003
"... We study sliding window multi-join processing in continuous queries over data streams. Several algorithms are reported for performing continuous, incremental joins, under the assumption that all the sliding windows fit in main memory. The algorithms include multiway incremental nested loop joins (NL ..."
Abstract
-
Cited by 117 (9 self)
- Add to MetaCart
(Show Context)
We study sliding window multi-join processing in continuous queries over data streams. Several algorithms are reported for performing continuous, incremental joins, under the assumption that all the sliding windows fit in main memory. The algorithms include multiway incremental nested loop joins (NLJs) and multi-way incremental hash joins. We also propose join ordering heuristics to minimize the processing cost per unit time. We test a possible implementation of these algorithms and show that, as expected, hash joins are faster than NLJs for performing equi-joins, and that the overall processing cost is influenced by the strategies used to remove expired tuples from the sliding windows. 1
Streaming Pattern Discovery in Multiple Time-Series
- In VLDB
, 2005
"... In this paper, we introduce SPIRIT (Streaming Pattern dIscoveRy in multIple Timeseries) . Given n numerical data streams, all of whose values we observe at each time tick t, SPIRIT can incrementally find correlations and hidden variables, which summarise the key trends in the entire stream col ..."
Abstract
-
Cited by 106 (19 self)
- Add to MetaCart
(Show Context)
In this paper, we introduce SPIRIT (Streaming Pattern dIscoveRy in multIple Timeseries) . Given n numerical data streams, all of whose values we observe at each time tick t, SPIRIT can incrementally find correlations and hidden variables, which summarise the key trends in the entire stream collection.
Adaptive cleaning for rfid data streams
, 2006
"... ABSTRACT To compensate for the inherent unreliability of RFID data streams, most RFID middleware systems employ a "smoothing filter", a sliding-window aggregate that interpolates for lost readings. In this paper, we propose SMURF, the first declarative, adaptive smoothing filter for RFID ..."
Abstract
-
Cited by 101 (0 self)
- Add to MetaCart
(Show Context)
ABSTRACT To compensate for the inherent unreliability of RFID data streams, most RFID middleware systems employ a "smoothing filter", a sliding-window aggregate that interpolates for lost readings. In this paper, we propose SMURF, the first declarative, adaptive smoothing filter for RFID data cleaning. SMURF models the unreliability of RFID readings by viewing RFID streams as a statistical sample of tags in the physical world, and exploits techniques grounded in sampling theory to drive its cleaning processes. Through the use of tools such as binomial sampling and π-estimators, SMURF continuously adapts the smoothing window size in a principled manner to provide accurate RFID data to applications.