35 citations found.
Sarawagi, S.: Query Processing in Tertiary Memory Databases, in: 21st International Conference on Very Large Data Bases (VLDB'95, Zurich, Switzerland, Sept. 11-15, 1995), pp. 585-596

This paper is cited in the following contexts:

A High Performance Application Development - Environment For Large-Scale

....calls while the naive approach completes the whole access with a single I/O call. We plan to eliminate these problems by developing techniques that help to select optimal subfile shapes given a set of potential access patterns. Our initial observation is that the techniques proposed by Sarawagi [31] might be quite useful for this problem. 4 Design of the Integrated Java Graphical User Interface As it is distributed in nature, our application development environment involves multiple resources across distant sites. For example, let us consider our current working environment that consists ....

S. Sarawagi. Query processing in tertiary memory databases. In Proc. the 21st VLDB Conference, 1995.


A Novel Application Development Environment for.. - Shen, Liao.. (2000)

....for the access patterns and the subfiling caused extra file seek operations. We plan to eliminate these problems by developing techniques that help to select optimal subfile shapes given a set of potential access patterns. Our initial observation is that the techniques proposed by Sarawagi [32] might be quite useful for this problem. 4 Design of the Integrated Java Graphical User Interface As it is distributed in nature, our application development environment involves multiple resources across distant sites. For example, let us consider our current working environment that consists ....

S. Sarawagi. Query processing in tertiary memory databases. In Proc. the 21st VLDB Conference, Zurich, Switzerland, 1995.


Performance Analysis of Storage Systems - Shriver, Hillyer, Silberschatz

....in block numbers. Striped tape organizations are modeled in [33, 34, 35, 36]. Database algorithms for systems that incorporate both tape and disk are studied in [37] and issues of caching, query optimization, and mount scheduling for relational databases using tape jukeboxes are studied in [38, 39]. The problem of optimal data placement for different tape library technologies is studied in [40]. Nemoto and colleagues [41] consider the migration of tapes between robotic libraries for load balancing purposes. Robotic tape libraries are also modeled in [42, 43]. The performance ....

S. Sarawagi, "Query processing in tertiary memory databases," in Proceedings of the 21st International Conference on Very Large Databases, (Zurich, Switzerland), pp. 585-596, Morgan Kaufmann, San Francisco, Sept. 1995.


Query Pre-Execution and Batching in Paradise: A Two-Pronged.. - Yu, DeWitt   (6 citations)

....on tertiary storage. This abstraction is used during query pre-execution to assist in the generation of reference strings for tape blocks. Integrated Approach The most comprehensive system-level approach for integrating tertiary storage into a general database management system is proposed in [Sar95]. A novel technique of breaking relations on tertiary storage into smaller segments (which are the units of migration from tertiary to secondary storage) is used to allow the migration of these segments to be scheduled optimally. A query involving relations on tertiary storage is decomposed into ....

S. Sarawagi. "Query Processing in Tertiary Memory databases," Proc. of the 19th VLDB Conference, Switzerland, September, 1995.


Determining the Optimal File Size on Tertiary.. - Bernardo.. (1998)   (1 citation)

.... or having to rewind tapes for repositioning, there has been some research done to find the optimal way to break the data across tapes [1, 2]. Other research efforts were directed at the problem of reordering the reading of query results from the tape to avoid excessive rewindings and tape mounts [6, 7, 8, 9, 3]. One important problem associated with reading data from tapes is determining the file size to store data on each tape. Some MSSs permit only the reading of entire files, and therefore what is on a file relative to the query requesting information from the file is critical. Ideally, one would ....

S. Sarawagi, Query Processing in Tertiary Memory Databases, VLDB 1995, 585-596.


Storage Optimization for Large Multidimensional Datasets - More, Choudhary (1999)

....environment. 2 Related work Various issues in tertiary storage management have been addressed by the database community. [1] evaluates issues in extending database technology for storing/accessing data on tertiary storage. [18] proposes a database architecture that uses hierarchical storage. [13, 14, 12, 16] examines issues in query processing when data resides on tertiary storage. Data striping on tertiary storage has been evaluated in [3, 4]. Tertiary storage space organization issues are addressed in [2, 5]. [2] discusses organization of multi-dimensional data on a hierarchical storage system. ....

....Reading k blocks after a reverse seek 1:77k seconds Ejecting a tape 19 seconds Fetching a new tape from library 20 seconds Loading a tape 42 seconds Table 1: Analytical model for EXB 8505XL tape drive and EXB 210 tape library from [5] 5. 5 Analytical tape model Most of the literature ([14, 16, 12, 3]) uses a linear approximation of the locate time for tape drives. [6] found that such linear approximation is inaccurate. We use the analytical models of Exabyte's EXB 8505XL tape drive and EXB 210 tape library described in [5]. The model uses a logical blocks size of 1MB and is described in table ....

S. Sarawagi. Query processing in tertiary memory databases. In Proceedings of 21st International Conference on Very Large Data Bases, Zurich, Switzerland, pages 585--596. Morgan Kaufmann, 1995.


Scheduling and Data Replication to Improve Tape Jukebox.. - Bruce Hillyer Rajeev (1999)   (7 citations)

....to replicas at the ends of the tapes can be recaptured by overwriting the replicas with base data. 5 Related Work Several recent articles describing tape technology and schemes to improve tape performance have appeared in the literature [HS96a, HS96b, GMW95, KAOP91, YD96, Chi95, Che94, ML95, SS95, SS96]. None of them, however, explore replication as a means to increase tape throughput, and only [SS95, SS96] address the problem of scheduling requests among multiple tapes in a jukebox. In [YD96] query pre-execution is used to determine the set of tape blocks to be retrieved from a single ....

.... Related Work Several recent articles describing tape technology and schemes to improve tape performance have appeared in the literature [HS96a, HS96b, GMW95, KAOP91, YD96, Chi95, Che94, ML95, SS95, SS96]. None of them, however, explore replication as a means to increase tape throughput, and only [SS95, SS96] address the problem of scheduling requests among multiple tapes in a jukebox. In [YD96] query pre-execution is used to determine the set of tape blocks to be retrieved from a single tape so that scheduling can minimize random tape accesses while returning data in the order of request. In ....

[Article contains additional citation context not shown here]

Sunita Sarawagi and Michael Stonebraker. Query processing in tertiary memory databases. In Proceedings of the 21st International Conference on Very Large Databases, pages 585--596, Zurich, Switzerland, September 11--15 1995. Morgan Kaufmann, San Francisco.


Scheduling Queries on Tape-resident Data - Sachin More Alok (2000)   (2 citations)

....are accessed. Various issues in tertiary storage management have been addressed by the database community. [CHL93] evaluates issues in extending database technology for storing/accessing data on tertiary storage. [Sto91] proposes a database architecture that uses hierarchical storage. [Sar95a, Sar95b, ML95, SS96] examines issues in query processing when data resides on tertiary storage. Data striping on tertiary storage has been evaluated in [DK93, GMW95]. Tertiary storage space organization issues are addressed in [CDK 95, HRS99]. This paper investigates issues in optimizing I/O time for ....

....experimental results later in this section. Next we evaluate various scheduling algorithms using synthetic workloads. We then describe the Sequoia 2000 benchmark and evaluate the scheduling algorithms using the benchmark. 7.1 Analytical model for the tape library Most of the literature ([Sar95b, SS96, ML95, DK93]) uses a linear approximation of the locate time for tape drives. [HS96a] found that such linear approximation is inaccurate. We use the analytical models of Exabyte's EXB 8505XL tape drive and EXB 210 tape library described in [HRS99]. The model uses a logical blocks size of ....

Sunita Sarawagi. Query processing in tertiary memory databases. In Proceedings of 21st International Conference on Very Large Data Bases, Zurich, Switzerland, pages 585--596. Morgan Kaufmann, 1995.


Efficiently Sequencing Tape-Resident Jobs - More, Muthukrishnan, Shriver (1999)   (1 citation)

....data (and even more on synthetic data) which is a significant gain over the total time these operations take. Remark 1. The question of managing tape-resident data as a tertiary memory database and supporting various relational query operations such as joins, has been addressed before (e.g. [23, 21, 22, 26]); in particular, EOSDIS has been studied intensely within the Database Community (e.g. Paradise project [26] and alternative EOS database architecture [3]). Here we focus on the limited task of efficiently scheduling a batch of tape-resident jobs, but this is nevertheless very common in ....

Sarawagi, S. Query processing in tertiary memory databases. In Proceedings for VLDB '95 (Sept. 1995), pp. 585--596.


A Novel Application Development Environment for.. - Shen, Liao.. (2000)

....for the access patterns and the subfiling caused extra file seek operations. We plan to eliminate these problems by developing techniques that help to select optimal subfile shapes given a set of potential access patterns. Our initial observation is that the techniques proposed by Sarawagi [30] might be quite useful for this problem. 4. DESIGN OF THE INTEGRATED JAVA GRAPHICAL USER INTERFACE As it is distributed in nature, our application development environment involves multiple resources across distant sites. For example, let us consider our current working environment that consists ....

S. Sarawagi. Query processing in tertiary memory databases. In Proc. the 21st VLDB Conference, 1995.


Efficient I/O Scheduling in Tertiary Libraries - Prabhakar, Agrawal, Abbadi..

....Ford and Myllymaki have proposed a log-structured file system for tertiary media. The benefits of striping in tape-based systems has been studied by Drapeau and Katz [DK93] and also by Golubchik and Muntz [GM95]. In [GM95] a general open system model for striped libraries is developed. In [Sar95a, Sar95b] modifications to the architecture of database management systems are proposed for efficiently accessing tertiary storage directly. Other studies have looked at the problem of reorganization of data that is stored on tertiary media in order to improve retrieval performance [CDK 95, SS94] ....

S. Sarawagi. Query processing in tertiary memory databases. In Proc. of the 21st Int. Conf. on Very Large Data Bases, pages 585--596, San Francisco, California, 1995. Morgan Kaufmann.


Towards Mass Storage Systems with Object Granularity - Holtman, van der Stok (2000)

....in a mass storage system with both disk and tape. Our work [6] explores remapping in the disk cache, but not keeping a set of smaller files on tape to achieve some kind of object granularity in staging operations. Our architecture builds on experience from existing mass storage systems [7] [9] [10], especially with respect to cache replacement and staging policies. Many systems cache data at a finer granularity when it moves upwards in the storage hierarchy, see for example [9]. At least one existing mass storage product [11] structures data into small units (atoms, like our objects) and ....

S. Sarawagi, Query Processing in Tertiary Memory Databases, Proc. of 21st VLDB Conference, Zurich, Switzerland, 1995, p. 585--596.


Scheduling and Data Replication to Improve Tape Jukebox Performance - Hillyer (1999)   (7 citations)

....as good as a vertical layout with full replication. Finally, the space given to replicas at the ends of the tapes can be recaptured by overwriting the replicas with base data. 5 Related Work Several recent articles describing tape technology and schemes to improve tape performance have appeared [10, 11, 9, 7, 8, 5, 12, 16, 2, 1, 13, 14, 15], but none of them study replication to increase tape performance, and only [14, 15] address the scheduling of requests over multiple tapes in a jukebox. In [16] query pre-execution is used to determine the set of tape blocks to be retrieved from a single tape so that scheduling can minimize ....

....recaptured by overwriting the replicas with base data. 5 Related Work Several recent articles describing tape technology and schemes to improve tape performance have appeared [10, 11, 9, 7, 8, 5, 12, 16, 2, 1, 13, 14, 15], but none of them study replication to increase tape performance, and only [14, 15] address the scheduling of requests over multiple tapes in a jukebox. In [16] query pre-execution is used to determine the set of tape blocks to be retrieved from a single tape so that scheduling can minimize random tape accesses while returning data in the order of request. In [13] algorithms ....

[Article contains additional citation context not shown here]

S. Sarawagi and M. Stonebraker. Query processing in tertiary memory databases. In Proceedings of the 21st International Conference on Very Large Databases, pages 585--596, Zurich, Switzerland, Sept. 11--15 1995. Morgan Kaufmann, San Francisco.


Scheduling Queries for Tape-resident Data - Sachin More And (2000)   (2 citations)

....is because read times dominate seek times for the workloads we considered. We use data volume estimates for all scheduling algorithms since it has lower computing requirements. We use a tape library simulator to execute the schedules created by various scheduling algorithms. Most of the literature [3, 17, 21, 22] uses a linear approximation of the locate time for tape drives. [7] found that such linear approximation is inaccurate. We use the analytical models of Exabyte's EXB 8505XL tape drive and EXB 210 tape library described in [6] in our tape library simulator. We use the SORT algorithm described in ....

Sarawagi, S. Query processing in tertiary memory databases. In Proceedings of 21st International Conference on Very Large Data Bases (Zurich, Switzerland, 1995), Morgan Kaufmann, pp. 585--596.


Data Engineering - September Vol No

....design of the scheduling unit. Section 3 presents the design of the new executor which extracts the list of subqueries and executes them in arbitrary order. Section 4 presents related work and finally Section 5 makes concluding remarks. This article is an abridged version of the work presented in [Sar96, SS96, Sar95]. 2 Architecture Overview Figure 1 sketches the architecture of our tertiary memory database system introduced in [Sar95, Sar96]. We assume a process-per-user architecture where each user session has a separate process serving its queries. An arriving query is first compiled by the user process. ....

....in arbitrary order. Section 4 presents related work and finally Section 5 makes concluding remarks. This article is an abridged version of the work presented in [Sar96, SS96, Sar95]. 2 Architecture Overview Figure 1 sketches the architecture of our tertiary memory database system introduced in [Sar95, Sar96]. We assume a process-per-user architecture where each user session has a separate process serving its queries. An arriving query is first compiled by the user process. The user process then extracts the list of subqueries from the query and submits the list to the scheduling unit. The scheduling ....

[Article contains additional citation context not shown here]

S. Sarawagi. Query processing in tertiary memory databases. PhD thesis, University of California at Berkeley, Dec 1996.


Tribeca: A System for Managing Large Databases of Network.. - Sullivan, Heybey (1998)   (21 citations)

....performed even better. The small cost is far outweighed by the flexibility and convenience of changing small simple queries rather than rewriting C code to perform different analyses. 4 Related Work The difficulties in using relational databases stored on tape are overviewed in [3]. Sarawagi [11] modifies a relational query optimizer to consider large tape archives in its cost formula and caches tape data on faster storage. Video-on-demand systems [4] might use tape storage, but in these workloads many users randomly access independent large objects instead of sequences of small ones. A ....

S. Sarawagi. Query processing in tertiary memory databases. In Proc. Conference on Very Large Data Bases, 1995.


A Brief Survey Of Tertiary Storage Systems And Research - Prabhakar, Agrawal.. (1997)

....I/O in parallel have been investigated. The authors observe that the operations of disk and tape access can be overlapped to reduce the total execution time. Sarawagi and Stonebraker have also investigated the architecture of database management systems that incorporate tertiary devices directly [10]. The authors argue in favor of a central Scheduler that has knowledge of the currently pending queries, the contents (and semantics) of the disk cache and the state of the tertiary memory. Applications with very large data storage requirements need to use tertiary storage to hold active data. ....

S. Sarawagi. Query processing in tertiary memory databases. In Proc. of the 21st Int. Conf. on Very Large Data Bases, pages 585--596, San Francisco, California, 1995. Morgan Kaufmann.


Improving the Access Time Performance of Serpentine Tape Drives - Sandstå, Midtstraum (1999)

....long (twelve hours of processing) that it significantly reduces the practical value of their work. Prabhakar et al. [7] have studied the problem of scheduling I/O requests for robotic libraries with tape drives, but made no efforts to schedule the processing of requests for a given tape. Sarawagi [11] has studied query processing in tertiary memory databases. She proposed to use average seek cost to model the performance of serpentine tape drives and hence missed opportunities to optimize the random I/O performance. Taking an alternative approach to improve the performance of tape drives, ....

S. Sarawagi. Query processing in tertiary memory databases. In Proceedings of the 21st VLDB Conference, pages 585--596, Zurich, Switzerland, September 1995.


Query Pre-Execution and Batching in Paradise: A Two-Pronged.. - Yu, DeWitt   (6 citations)

....certain applications, when only a portion of each image is needed, it wastes tape bandwidth and staging disk capacity transferring entire images. An alternative to the use of an HSM is to incorporate tertiary storage directly into the database system. This approach is being pursued by the Postgres [8,9] and Paradise [1,2] projects, which extend tertiary storage beyond its normal role as an archive mechanism. With an integrated approach, the database query optimizer and execution engine can optimize accesses to tape so that complicated ad hoc requests for data on tertiary storage can be served ....

....and summary information) to form an abstract that is stored on disk, the majority of queries can be satisfied using only the abstracts. Integrated Approach The most comprehensive system-level approach for integrating tertiary storage into a general database management system is proposed in [9]. A novel technique of breaking relations on tertiary storage into smaller segments (which are the units of migration from tertiary to secondary storage) is used to allow the migration of these segments to be scheduled optimally. A query involving relations on tertiary storage is decomposed into ....

S. Sarawagi. "Query Processing in Tertiary Memory databases," Proc. of the 19th VLDB Conference, Switzerland, September, 1995.


Efficient I/O for Very Large Multimedia Applications - Prabhakar, Agrawal, Abbadi

....have studied efficient scheduling policies for linear tapes. Both these studies do not consider scheduling requests for more than one medium. Work has also been done on optimizing the performance of relational database management systems that incorporate tertiary storage [SS93, ML95, Sar95a, Sar95b]. Myllymaki and Livny have investigated the benefits of executing tape and disk I/O in parallel [ML96]. In [FM96] a log-structured file system for tertiary media has been proposed. The feasibility of striping in tape-based systems has been studied by Drapeau and Katz [DK93] and also by Golubchik ....

S. Sarawagi. Query processing in tertiary memory databases. In Proc. of the 21st Int. Conf. on Very Large Data Bases, pages 585--596, San Francisco, California, 1995. Morgan Kaufmann.


On the Modeling and Performance Characteristics of a.. - Bruce Hillyer (1996)   (33 citations)

....[6] discuss the role of hierarchical storage management in video-on-demand systems. [11] presents a log-structured filesystem that spans disk and tape storage. Striped tape organizations are studied by [4, 7]. [14] describes join algorithms for databases stored partly on tape and partly on disk, and [16, 17] deals with issues of caching, query optimization, and mount scheduling for relational database use of tertiary storage jukeboxes. To our knowledge, no prior publications present a realistic model of locate time for helical scan or serpentine tape drives, or use such a model to schedule random I/O ....

Sarawagi, S. Query processing in tertiary memory databases. In Proceedings of the 21st International Conference on Very Large Databases (Zurich, Switzerland, Sept. 11--15 1995), pp. 585--596.


Integrated Document Caching and Prefetching in Storage.. - Kraiss, Weikum (1998)   (10 citations)

.... and request scheduling [HS96, NKT97]; this includes work with special considerations on the real-time requirements of video data [GMW94, LLW95]. Motivated by the large data volume in data warehouses, tertiary storage management has also received attention in the context of relational DBMS queries [Sto91, ML97, Sa95, Jo98]. Prefetching in database systems has been studied mostly for applications where future access patterns are largely predictable due to specific structures of the underlying databases and the programs accessing them, especially in object-oriented database systems [CK89, CH91, GK94] but also in ....

Sarawagi S (1995) Query Processing in Tertiary Memory Databases. In: VLDB Conf., 1995, Zurich, Switzerland, pp 585--596


On the Design and Implementation of the Multidimensional - Cubestore Storage Manager (1998)

No context found.

Sarawagi, S.: Query Processing in Tertiary Memory Databases, in: 21st International Conference on Very Large Data Bases (VLDB'95, Zurich, Switzerland, Sept. 11-15, 1995), pp. 585-596


An Architecture for Using Tertiary Storage in a Data Warehouse - Johnson (1998)   (1 citation)

No context found.

S. Sarawagi. Query processing in tertiary memory databases. In Proc. 21st Very Large Database Conference, pages 585 -- 596, 1995.


Real-Time Scheduling of Tertiary Storage - Lijding (2003)

No context found.

Sunita Sarawagi. Query processing in tertiary memory databases. In Very Large Databases (VLDB) Journal, pages 585--596, 1995.

CiteSeer.IST - Copyright Penn State and NEC