Adaptive cleaning for rfid data streams
, 2006
Cited by 101 (0 self)
To compensate for the inherent unreliability of RFID data streams, most RFID middleware systems employ a "smoothing filter", a sliding-window aggregate that interpolates for lost readings. In this paper, we propose SMURF, the first declarative, adaptive smoothing filter for RFID data cleaning. SMURF models the unreliability of RFID readings by viewing RFID streams as a statistical sample of tags in the physical world, and exploits techniques grounded in sampling theory to drive its cleaning processes. Through the use of tools such as binomial sampling and π-estimators, SMURF continuously adapts the smoothing window size in a principled manner to provide accurate RFID data to applications.
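The binomial-sampling idea in this abstract can be illustrated with a small sketch. It is not SMURF's actual algorithm, which the abstract does not spell out; it assumes a simple model in which a tag that is present is read independently in each epoch with probability `read_rate`, and it asks how many epochs a window needs so that the tag is missed in every epoch with probability at most `delta`. The function name and both parameters are hypothetical.

```python
import math


def min_window_epochs(read_rate: float, delta: float = 0.05) -> int:
    """Smallest window (in epochs) that misses a present tag with prob <= delta.

    Assumed model (not taken from the abstract): each epoch reads the tag
    independently with probability read_rate, so the chance of missing it
    in all w epochs is (1 - read_rate) ** w.
    """
    if not 0.0 < read_rate <= 1.0:
        raise ValueError("read_rate must be in (0, 1]")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    if read_rate == 1.0:
        return 1  # a perfectly reliable reader needs only one epoch
    # (1 - p)^w <= delta  <=>  w >= ln(delta) / ln(1 - p)
    w = math.ceil(math.log(delta) / math.log(1.0 - read_rate))
    return max(1, w)
```

Under this model a less reliable reader needs a longer window, which is the intuition behind adapting the window size to the observed read rate rather than fixing it in advance.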
No pane, no gain: efficient evaluation of sliding-window aggregates over data streams
- SIGMOD Record
, 2005
Cited by 34 (0 self)
Window queries are proving essential to data-stream processing. In this paper, we present an approach for evaluating sliding-window aggregate queries that reduces both space and computation time for query execution. Our approach divides overlapping windows into disjoint panes, computes sub-aggregates over each pane, and “rolls up” the pane-aggregates to compute window-aggregates. Our experimental study shows that using panes has significant performance benefits.
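A minimal sketch of the pane idea for a count-based sliding SUM follows. It assumes the pane length is the greatest common divisor of the window length and the slide, so that every window is an exact union of whole panes; the function name `sliding_sums` and its parameters are illustrative and do not come from the paper.

```python
from math import gcd


def sliding_sums(values, window, slide):
    """Sum of each full window of `window` items, advancing by `slide` items."""
    pane = gcd(window, slide)  # assumed pane length: divides window and slide
    # Step 1: one sub-aggregate per disjoint, fully filled pane.
    pane_sums = [
        sum(values[i:i + pane])
        for i in range(0, len(values) - pane + 1, pane)
    ]
    panes_per_window = window // pane
    panes_per_slide = slide // pane
    # Step 2: roll up consecutive pane sums into window sums.
    return [
        sum(pane_sums[s:s + panes_per_window])
        for s in range(0, len(pane_sums) - panes_per_window + 1, panes_per_slide)
    ]
```

Each input item is added once when its pane is built, and each window then combines only `window // pane` sub-aggregates instead of re-reading all of its items. The same shape works for other aggregates that can be combined from partial results, such as COUNT, MIN, or MAX.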
Agent-based virtual organisations for the grid
- International Journal of Multi-Agent and Grid Systems
, 2005
Cited by 27 (10 self)
The ability to create reliable and scalable virtual organisations (VOs) on demand in a dynamic, open and competitive environment is one of the major challenges that underlie Grid computing. In response, in the CONOISE-G project, we are developing an infrastructure to support robust and resilient virtual organisation formation and operation. Specifically, CONOISE-G provides mechanisms to assure effective operation of agent-based VOs in the face of disruptive and potentially malicious entities in dynamic, open and competitive environments. In this paper, we describe the CONOISE-G system, outline its use in the context of VO formation and perturbation, and review current efforts to progress the work to deal with unreliable information sources.
SNIF: Sensor Network Inspection Framework
, 2006
Cited by 11 (4 self)
Recent experience with the deployment of sensor networks demonstrates that it is far from trivial to set up a working larger-scale sensor network in the field. Even though simulations and experiments with lab testbeds confirmed a working system, subtle real-world influences lead to frequent failures in the field. Identifying and fixing these problems in the field is currently a difficult and cumbersome task due to the lack of appropriate concepts and tools. In this paper we address this issue by, firstly, classifying common problems that have been encountered during deployment. We then show that many of these problems can be detected by overhearing and analyzing sensor network traffic without need for an instrumentation of sensor nodes. Based on this observation, we develop a tool to inspect a deployed sensor network, consisting of a distributed network sniffer and a data-stream-based framework for online traffic analysis. We demonstrate and evaluate how this tool can be used to debug a typical data gathering application.
Logical Foundations of Continuous Query Languages for Data Streams
Cited by 8 (3 self)
Data Stream Management Systems (DSMS) have attracted much interest from the database community, and extensions of relational database languages were proposed for expressing continuous queries on data streams. However, while relational databases were built on the solid bedrock of logic, the same cannot be said for DSMS. Thus, a logic-based reconstruction of DSMS languages and their unique computational model is long overdue. Indeed, the banning of blocking queries and the fact that stream data are ordered by their arrival timestamps represent major new aspects that have yet to be characterized by simple theories. In this paper, we show that these new requirements can be modeled using the familiar deductive database concepts of closed-world assumption and explicit local stratification. Besides its obvious theoretical interest, this approach leads to the design of a powerful version of Datalog for data streams. This language is called Streamlog and takes the query and application languages of DSMS to new levels of expressive power, by removing the unnecessary limitations that severely impair current commercial systems and research prototypes.
Declarative support for . . .
Cited by 4 (0 self)
Pervasive applications rely on data captured from the physical world through sensor devices. Data provided by these devices, however, tend to be unreliable. The data must, therefore, be cleaned before an application can make use of them, leading to additional complexity for application development and deployment. Here we present Extensible Sensor stream Processing (ESP), a framework for building sensor data cleaning infrastructures for use in pervasive applications. ESP is designed as a pipeline using declarative cleaning mechanisms based on spatial and temporal characteristics of sensor data. We demonstrate ESP’s effectiveness and ease of use through three real-world scenarios.
Join of Multiple Data Streams in Sensor Networks
Cited by 4 (2 self)
Sensor networks are multi-hop wireless networks of resource-constrained sensor nodes used to realize high-level collaborative sensing tasks. To query or access data generated by the sensor nodes, the sensor network can be viewed as a distributed database. In this article, we develop algorithms for communication-efficient implementation of join of multiple (two or more) data streams in a sensor network. The distributed implementation of join in sensor networks is particularly challenging due to unique characteristics of the sensor networks such as limited memory and battery energy on individual nodes, arbitrary and dynamic network topology, multihop communication, and unreliable infrastructure. One of our proposed approaches, viz., the Perpendicular Approach (PA), is load-balanced, and in fact, incurs near-optimal communication cost for the special case of binary joins in grid networks. We compare the performance of our designed approaches through extensive simulations on the ns2 simulator, and show that PA results in substantially prolonging the network lifetime compared to other approaches, especially for joins involving spatial constraints.
On-demand bound computation for best-first constraint optimization
- In Proc. International Conference on Principles and Practice of Constraint Programming
, 2004
Cited by 4 (3 self)
An important class of algorithms for constraint optimization searches for solutions guided by a heuristic evaluation function (bound). When only a few best solutions are required, significant effort can be wasted pre-computing bounds that are not used during the search. We introduce a method that generates, based on lazy, best-first variants of constraint projection and combination operators, only those bounds that are specifically required in order to generate a next best solution.
CGSV: An Adaptable Stream-Integrated Grid Monitoring System
- Proceedings of the International Conference on Network and Parallel Computing (NPC
, 2005
Cited by 2 (0 self)
Grid monitoring is essential for the grid management and efficiency improvement. ChinaGrid Super Vision (CGSV) is proposed for ChinaGrid to collect status information of each entity (such as resources, services, users, jobs, Network), and provide corresponding information data query and mining services. In this paper, CGSV architecture and its components are discussed. CGSV is featured by data stream integration and adaptability to cope with dynamic measurement data and multiform query requirements. Measurement data can be accessed quickly and easily through WSRF-compliant services in CGSV. Transfer and control protocols are brought forward to facilitate data stream querying and runtime producer configuration in CGSV.
Realtime Analysis of Information Diffusion in Social Media Io Taxidou
Cited by 1 (0 self)
The goal of this thesis is to investigate real-time analysis methods on social media with a focus on information diffusion. From a conceptual point of view, we are interested both in the structural, sociological and temporal aspects of information diffusion in social media with a twist on the real time factor of what is happening right now. From a technical side, the sheer size of current social media services (100’s of millions of users) and the large amount of data produced by these users renders conventional approaches for these costly analyses impossible. For that, we need to go beyond the state-of-the-art infrastructure for data-intensive computation. Our high level goal is to investigate how information diffuses in real time on the underlying social network and the role of different users in the propagation process. We plan to implement these analyses with full and partially missing datasets and compare the cost and quality of both approaches.