dQUOB: Managing large data flows using dynamic embedded queries (2000) [31 citations, 8 self]

by Beth Plale, Karsten Schwan
In IEEE International High Performance Distributed Computing (HPDC
http://www.cc.gatech.edu/systems/papers/PlaleTR00.ps

Abstract:

The dQUOB system serves clients that need specific information from high-volume data streams. Such streams arise in large-scale visualizations, in video streaming to large numbers of distributed users, and in high-volume business transactions. dQUOB introduces the idea of conceptualizing a data stream as a set of relational database tables. Within this model, a scientist can request information with an SQL-like query. Transformation or computation that often must be performed on the data before it reaches a client can be conceptualized as computation performed on consecutive views of the data, with computation associated with each view. Additionally, the dQUOB system moves the query code into the data stream as a quoblet, a piece of efficient compiled code. The relational data model has the significant advantage of presenting opportunities for efficient reoptimization of queries and sets of queries. Using examples from global atmospheric modeling, we illustrate the usefulness of the dQUOB system. We carry the examples through to the experiments and establish the viability of the approach for high-performance computing with a baseline benchmark. We define a cost metric of end-to-end latency that can be used to determine realistic cases where optimization should be applied. Finally, we show that end-to-end latency can be controlled through the probability, assigned to a query, that the query will evaluate to true.
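
To make the relational view of a stream concrete, here is a minimal Python sketch of the general idea, not dQUOB's actual interface or its quoblet code. It treats each stream event as a row, applies an SQL-like selection and projection before forwarding, and measures the fraction of rows for which the query evaluated to true. The function name `select`, the row fields (`lat`, `lon`, `level`, `ozone`), and the sample values are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, List

# A stream event viewed as a row of a relational table (hypothetical schema).
Row = Dict[str, float]


def select(stream: Iterable[Row],
           where: Callable[[Row], bool],
           columns: List[str]) -> Iterator[Row]:
    """Yield projected rows for which the predicate holds.

    Roughly analogous to: SELECT columns FROM stream WHERE predicate.
    """
    for row in stream:
        if where(row):
            yield {c: row[c] for c in columns}


# Hypothetical atmospheric-model events; the values are made up.
events = [
    {"lat": 10.0, "lon": 20.0, "level": 1.0, "ozone": 0.30},
    {"lat": 45.0, "lon": 20.0, "level": 1.0, "ozone": 0.55},
    {"lat": 50.0, "lon": 25.0, "level": 2.0, "ozone": 0.70},
    {"lat": -5.0, "lon": 30.0, "level": 1.0, "ozone": 0.20},
]

# Forward only northern rows with high ozone, keeping two columns.
result = list(select(events,
                     lambda r: r["lat"] > 40.0 and r["ozone"] > 0.5,
                     ["lat", "ozone"]))

# Fraction of rows for which the query evaluated to true.
selectivity = len(result) / len(events)
```

In this toy setting, a lower fraction of matching rows means less data is forwarded to the client, which is one intuition for how the probability that a query evaluates to true could relate to end-to-end latency.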
