We address the problem of assigning non-partitioned files in a parallel I/O system where the file accesses exhibit Poisson arrival rates and fixed service times. We present two new file assignment algorithms based on open queueing networks which aim to minimize simultaneously the load imbalance across all disks and the variance of the service time at each disk. We first present an off-line algorithm, Sort Partition, which assigns to each disk files with similar access times. Next, we show that, assuming a perfectly balanced file assignment can be found for a given set of files, Sort Partition will find the file assignment with minimal mean response time. We then present an on-line algorithm, Hybrid Partition, that assigns groups of files with similar service times in successive intervals while guaranteeing that the load imbalance at any point does not exceed a certain threshold. We report on synthetic experiments which exhibit skew in file accesses and sizes, and we compare the performance of our new algorithms with the vanilla greedy file allocation algorithm.
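The off-line idea described above can be illustrated with a minimal sketch. This is not the paper's exact procedure: the abstract gives no splitting rule, so the rule used here (fill each disk with consecutive files, sorted by service time, until its share of the total load is reached) is an assumption. The names `sort_partition` and `mean_response_time` are hypothetical, and each disk is assumed to behave as an independent M/G/1 queue, for which the Pollaczek-Khinchine formula gives the mean response time.

```python
from typing import List, Tuple

# A file is (name, access rate in requests/sec, fixed service time in sec).
File = Tuple[str, float, float]


def sort_partition(files: List[File], num_disks: int) -> List[List[File]]:
    """Assign contiguous runs of files, sorted by service time, to disks.

    Assumed splitting rule (not taken from the source): move on to the next
    disk once adding a file would push the current disk past total_load / num_disks.
    """
    total_load = sum(rate * service for _, rate, service in files)
    target = total_load / num_disks
    ordered = sorted(files, key=lambda f: f[2])  # similar service times end up adjacent
    disks: List[List[File]] = [[] for _ in range(num_disks)]
    current, load = 0, 0.0
    for f in ordered:
        file_load = f[1] * f[2]
        # Small relative tolerance so floating-point rounding does not split a balanced disk.
        over_target = load + file_load > target * (1 + 1e-9)
        if current < num_disks - 1 and disks[current] and over_target:
            current, load = current + 1, 0.0
        disks[current].append(f)
        load += file_load
    return disks


def mean_response_time(disk: List[File]) -> float:
    """Mean response time of one disk modeled as an M/G/1 queue (Pollaczek-Khinchine)."""
    lam = sum(rate for _, rate, _ in disk)                    # aggregate arrival rate
    es = sum(rate * s for _, rate, s in disk) / lam           # E[S]
    es2 = sum(rate * s * s for _, rate, s in disk) / lam      # E[S^2]
    rho = lam * es                                            # utilization, must be < 1
    return es + lam * es2 / (2 * (1 - rho))


files = [("c", 0.5, 0.2), ("a", 1.0, 0.1), ("d", 0.5, 0.2), ("b", 1.0, 0.1)]
assignment = sort_partition(files, num_disks=2)
names = [[name for name, _, _ in disk] for disk in assignment]
```

In this toy input every file contributes the same load, so both disks end up equally loaded while each holds files with identical service times, which is the situation in which the abstract claims Sort Partition is optimal. The E[S^2] term in `mean_response_time` is what makes the per-disk service-time variance matter once the load is balanced.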