Results 1 - 4 of 4
Design and Evaluation of Smart Disk Architecture for DSS Commercial Workloads
- in Proceedings of the 2000 International Conference on Parallel Processing
, 2000
Abstract
-
Cited by 11 (1 self)
The requirements for storage space and computational power of large-scale applications are increasing rapidly. Clusters seem to be the most attractive architecture for such applications, due to their low cost and high scalability. On the other hand, smart disk systems, with their large storage capacities and growing computational power, are becoming increasingly popular. In this work, we compare the performance of these architectures with a single host-based system using representative queries from Decision Support System (DSS) databases. We show how to implement individual database operations in the smart disk system and how to optimize the execution of a whole query by bundling frequently occurring operations together and executing the bundle in a single invocation. Besides decreasing the overall execution time, operation bundling also offers an easy-to-program and easy-to-use interface for accessing the data on smart disks. We also present a protocol for minimizing the communication time in the smart disk based system. To measure response times, we have developed DBsim, an accurate simulator that can simulate database operations for the single host-based, cluster-based and smart disk based systems. Using this simulator, we illustrate that the smart disk architecture offers substantial benefits in terms of overall query execution times for the TPC-D benchmark suite. In particular, the average response time of the smart disk architecture for the representative queries from the TPC-D benchmark in our base configuration is 71% smaller than the response time on the single host-based system and 4.2% smaller than the response time on the fastest cluster architecture. We also demonstrate the effectiveness of operation bundling.
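The abstract does not say how operation bundling is implemented, so the following is only a minimal sketch of the general idea, assuming a hypothetical `SmartDisk` class whose `run` method stands for one host-to-disk invocation. The names `SmartDisk`, `run`, `select_cheap` and `sum_qty` are illustrative and do not come from the paper.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, float]


class SmartDisk:
    """Hypothetical stand-in for one smart disk holding a table partition."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows
        self.invocations = 0  # counts host-to-disk calls

    def run(self, ops: List[Callable[[Any], Any]], data: Optional[Any] = None) -> Any:
        """Apply a list of operations on the disk in ONE invocation."""
        self.invocations += 1
        result = self.rows if data is None else data
        for op in ops:
            result = op(result)
        return result


def select_cheap(rows: List[Row]) -> List[Row]:
    # Selection: keep rows whose price is below an arbitrary threshold.
    return [r for r in rows if r["price"] < 100.0]


def sum_qty(rows: List[Row]) -> float:
    # Aggregation: total quantity over the rows it is given.
    return sum(r["qty"] for r in rows)
```

Calling `disk.run([select_cheap])` and then `disk.run([sum_qty], data=...)` costs two invocations and ships the intermediate rows back and forth, while `disk.run([select_cheap, sum_qty])` produces the same total in a single invocation.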
Design and Evaluation of a Smart Disk Cluster for DSS Commercial Workloads
, 2001
Abstract
-
Cited by 4 (1 self)
In this paper, we present a detailed quantitative evaluation of a smart disk based architecture. To achieve this, we compare the performance of a smart disk system, two types of cluster systems and a single host system for whole database queries. The main contributions of this paper are as follows:
- We present how a whole database query can be executed on a smart disk system.
- We present and evaluate a method called operation bundling for reducing the execution time of the database queries in smart disk architecture
Punctuated Data Streams
, 2005
Abstract
-
Cited by 4 (0 self)
As most current query processing architectures are already pipelined, it seems logical to apply them to data streams. However, two classes of query operators are impractical for processing long or unbounded data streams. Unbounded stateful operators maintain state with no upper bound on its size, and so eventually run out of memory. Blocking operators read the entire input before emitting a single output, and so might never produce a result. We believe that a priori semantic knowledge of a data stream can permit the use of such operators in some cases. We explore a kind of stream semantics called punctuated streams. Punctuations in a stream mark the end of substreams, allowing us to view a non-terminating stream as a mixture of terminating streams. We introduce three kinds of invariants to specify the proper behavior of query operators in the presence of punctuation. Pass invariants unblock blocking operators by defining when such an operator can pass results on. Keep invariants define what must be kept in local state to continue successful operation. Propagation invariants define when an operator can pass punctuation on. We then present a strategy for proving that implementations of these invariants are faithful to their finite table counterparts.
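To make the three invariants concrete, here is a minimal sketch of a grouped maximum over an auction bid stream. It is an illustration under stated assumptions, not the implementation from the thesis: the `Bid` and `Punctuation` classes and the `max_bid_per_item` function are hypothetical, and a punctuation is assumed to mean "no further bids for this item will arrive".

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Tuple, Union


@dataclass(frozen=True)
class Bid:
    item_id: int
    amount: float


@dataclass(frozen=True)
class Punctuation:
    # Assumed meaning: no more bids for item_id will appear in the stream.
    item_id: int


Output = Union[Tuple[int, float], Punctuation]


def max_bid_per_item(stream: Iterable[Union[Bid, Punctuation]]) -> Iterator[Output]:
    """Grouped max that emits a group's result as soon as it is punctuated."""
    best: Dict[int, float] = {}  # state held only for still-open groups
    for element in stream:
        if isinstance(element, Punctuation):
            if element.item_id in best:
                # Pass: the group is complete, so its result can be emitted.
                # Keep: popping the entry discards state no longer needed.
                yield (element.item_id, best.pop(element.item_id))
            # Propagation: no later output can concern this item, so the
            # punctuation itself is forwarded downstream.
            yield element
        else:
            current = best.get(element.item_id)
            if current is None or element.amount > current:
                best[element.item_id] = element.amount
```

Without punctuations, this operator would be both blocking (a maximum is only final at end of input) and unbounded in state (one entry per item ever seen). With them, results appear incrementally and the dictionary holds only items that are still open.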
In practice, it is important to answer the following question: "How much additional overhead is required when using punctuations?" We use the scenario of a monitoring system for an online auction. Streams of bids, new items, and new users are sent to an online auction system. There are many interesting queries that can be posed over these auction streams. We define queries for this scenario, and execute them with different kinds and amounts of punctuations embedded in the input streams. We show that, for a reasonable ratio of punctuations to data items, the overhead is minimal. Additionally, we compare the behavior of a query using punctuations with the behavior of the same query using slack over data streams with disorder.
Clearly, not all punctuations are useful to a particular query, and it would be useful to determine when they are. That is, we would like to answer the question “Can stream query Q benefit from a particular set of punctuations?” To that end, we first define punctuation schemes to specify the collection of punctuations that will be presented to a query on a particular data stream. We show how both punctuations and query operators induce groupings over the items in the domain of the input(s). We show that a query benefits from an input punctuation scheme (in terms of being able to produce a given output scheme) if each set in the groupings induced by the operators of the query is covered by a finite number of punctuations in the scheme, a kind of compactness.
We conclude with a discussion of possible future directions of research related to punctuations and data streams. These directions focus on a variety of questions, ranging from issues in query optimization to other possible semantics that can be expressed using punctuations.
An Experimental Evaluation of Smart Disk Architectures Using DSS Commercial Workloads
, 1999
Abstract
-
Cited by 3 (3 self)
Smart disk systems with large storage capacities and growing computational power are becoming increasingly attractive. The idea is to perform parallel and filtering-type of data intensive computations on disks, close to data, thereby offloading the host processor and increasing the aggregate system power. In this

