Sivan Toledo and Fred G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations. In Proceedings of the 4th Annual Workshop on I/O in Parallel and Distributed Systems, pages 28--40, Philadelphia, May 1996.
.... libraries have been developed to support applications wishing to encode 8 algorithms from the parallel disk model, particularly TPIE [Ven96]. Some are designed to support a compiler, such as ViC [CC94, CH97] or PASSION [TCB 96]. Others are oriented toward scientific applications in general [TG96, SW96]. Still others are designed for specific application domains, such as computational chemistry [NFK98]. Although few attempt to be standards, except MPI-2 [MPI97], these libraries allow the application programmer to take advantage of carefully tuned algorithms and proven techniques. 2.5 ....
Sivan Toledo and Fred G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations. In Proceedings of the Fourth Workshop on Input/Output in Parallel and Distributed Systems, pages 28--40, Philadelphia, May 1996. ACM Press.
.... in C) a ( BLOCK) storage pattern corresponds to column-major storage layout (as in Fortran) and a (BLOCK,BLOCK) storage pattern, on the other hand, corresponds to blocked storage layout which might be useful for large-scale linear algebra applications whose data sets are amenable to blocking [30]. As an example, consider the following scenario. An I/O-intensive application executes in three steps manipulating five two-dimensional data sets (arrays) P, Q1, Q2, R1 and R2 whose default disk layouts are assumed to be row major (BLOCK, Step (1) a single processor reads the data set P ....
S. Toledo and F. G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations, In Proc. Fourth Annual Workshop on I/O in Parallel and Distributed Systems, May 1996.
....capacity with the increase of applications requirements. Therefore, the storage capacity required by large-scale data-intensive applications could be a problem for these systems. Another body of work includes run-time systems such as MPI-IO [35 37] PASSION [8,33,34] PANDA [7,25] and others [4,27,39]. These systems provide high-level structured interfaces on top of low-level native parallel file systems [20] and try to match the application's data structure, which is usually a multidimensional array. They also provide optimizations such as collective I/O and data sieving to solve the problems ....
S. Toledo and F.G. Gustavson, The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations, in: Proc. of 4th Annual Workshop on I/O in Parallel and Distributed Systems (1996).
....from one machine to another a very difficult task. Third, the file system policies and optimization parameters are in general hard coded within the file system and, consequently, work for only a small set of access patterns. While runtime systems and libraries like MPI-IO [10, 35] and others [38, 3, 8] present users with higher level, more structured interfaces, the excessive number of calls to select from, each with several parameters, make the user's job very difficult. Also, the usability of these libraries depends largely on how well user's access patterns and library calls' functionality ....
.... storage layout (as in C) a ( Block) storage pattern corresponds to column-major storage layout (as in Fortran) and a (Block,Block) storage pattern corresponds to blocked storage layout which might be very useful for large-scale linear algebra applications whose datasets are amenable to blocking [38]. Our experience with large-scale, I/O-intensive codes indicates that, usually, the users know how their datasets will be used by parallel processors; that is, they have sufficient information to specify suitable access patterns for the datasets in their applications. Note that conveying an access ....
S. Toledo and F. G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations, In Proc. Fourth Annual Workshop on I/O in Parallel and Distributed Systems, May 1996.
....to access data located on tape via a convenient interface expressed in terms of arrays and array portions (regions) rather than files and offsets. In this sense the library can be considered as a natural extension of state-of-the-art runtime libraries that manipulate disk-resident datasets (e.g. [2, 18]). The library implements a data storage model on tapes that enables users to access portions of multi-dimensional data in a fast and simple way. In order to eliminate most of the latency in accessing tape-resident data, we employ a sub-filing strategy in which a large multi-dimensional ....
S. Toledo and F. G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations, In Proc. Workshop on I/O in Paral. and Distr. Sys., May 1996.
.... the ScaLAPACK routines, including general linear solvers via LU factorization, positive definite linear solvers via Cholesky factorization, and linear least squares solvers via QR factorization [6]. A more serious effort to add out-of-core capabilities to LAPACK and ScaLAPACK is provided by SOLAR [15], a portable library for scalable out-of-core linear algebra computations. This library uses ScaLAPACK routines for in-core computation, but provides an I/O layer that manages matrix input/output. SOLAR achieves better I/O rates by allowing a different storage scheme for matrices on disk than is ....
Sivan Toledo and Fred G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computation. In Proceedings of IOPADS '96, 1996.
....Now, we consider the situation where matrix A is too large to fit in main memory. We present the parallel out-of-core left-right looking LU factorization algorithm used by the ScaLAPACK routine pfdgetrf for parallel out-of-core LU factorization [5]. Similar algorithms are also described in [9, 6]. In the algorithm the matrix is divided in blocks of columns called superblocks. The width of the superblock is determined by the amount of physical available memory. Like the previous parallel algorithm, the matrix is logically block-cyclically distributed on the p × q grid of processors. But ....
Sivan Toledo and Fred G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations. In Proceedings of the Fourth Workshop on Input/Output in Parallel and Distributed Systems, pages 28--40, Philadelphia, May 1996. ACM Press.
....to access data located on tape via a convenient interface expressed in terms of arrays and array portions (regions) rather than files and offsets. In this sense the library can be considered as a natural extension of state-of-the-art run-time libraries that manipulate disk-resident datasets (e.g. [4, 22]). The library implements a data storage model on tapes that enables users to access portions of multi-dimensional data in a fast and simple way. In order to eliminate most of the latency in accessing tape-resident data, we employ a sub-filing strategy in which a large multi-dimensional ....
S. Toledo and F. G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations, In Proc. Workshop on I/O in Paral. and Distr. Sys., May 1996.
....Data movement becomes a primary concern of the program, with computation organized around the available data. Such out-of-core codes are difficult to write and modify [AUB 96] and remain heavily dependent on the computing environment. Out-of-core programming libraries [Ven94, TBC 94, SW95, TG96] help by offering high-level, portable interfaces, but they lack the convenience of in-core programming. ViC (Virtual Memory C ) is our approach to out-of-core programming, based on a high-level language. ViC includes an out-of-core I/O library [CH97] but ViC 2 programs do not make explicit ....
Sivan Toledo and Fred G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations. In Proceedings of the Fourth Annual Workshop on I/O in Parallel and Distributed Systems (IOPADS), pages 28--40, May 1996.
....by making a distinction between out-of-core computations, where I/O operations are explicitly controlled, and persistent storage, where they are not. In out-of-core computations, each processor can use a local disk to store those parts of its data structures that do not fit into primary memory [23, 5, 24]. This is sufficient because out-of-core computations typically do not employ varied and unpredictable access patterns. Only when data is stored in a persistent file system is it important to support unknown and arbitrary access patterns. 3.2 Data Partitioning An important concept that has emerged ....
....as with non-SGML data in SGML. In addition to predefined basic types, it should be possible to define composite types. This is based on empty tags that represent the basic elements being used: This can then be used when describing real data: vector type= struct1 size= 5 5 times (int, int, char[24]) vector Things become more interesting when data is stored in another fragment. This is done using a link tag, e.g. indicates that the contents of the fragment identified by id123 should be inserted at this point in the file. Using separate fragments allows the data to be stored in a way that ....
S. Toledo and F. G. Gustavson, "The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations". In 4th Workshop I/O Parallel & Distributed Syst., pp. 28-40, May 1996.
.... in C) a ( BLOCK) storage pattern corresponds to column-major storage layout (as in Fortran) and a (BLOCK,BLOCK) storage pattern, on the other hand, corresponds to blocked storage layout which might be useful for large-scale linear algebra applications whose data sets are amenable to blocking [30]. As an example, consider the following scenario. An I/O-intensive application executes in three steps manipulating five two-dimensional data sets (arrays) P, Q1, Q2, R1 and R2 whose default disk layouts are assumed to be row major (BLOCK, Step (1) a single processor reads the data set P and ....
S. Toledo and F. G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations, In Proc. Fourth Annual Workshop on I/O in Parallel and Distributed Systems, May 1996.
....from one machine to another a very difficult task. Third, the file system policies and optimization parameters are in general hard coded within the file system and, consequently, work for only a small set of access patterns. While runtime systems and libraries like MPI-IO [9, 33] and others [35, 3, 7] present users with higher level, more structured interfaces, the excessive number of calls to select from, each with several parameters, make the user's job very difficult. Also, the usability of these libraries depends largely on how well user's access patterns and library calls' functionality ....
.... storage layout (as in C) a ( Block) storage pattern corresponds to column-major storage layout (as in Fortran) and a (Block,Block) storage pattern corresponds to blocked storage layout which might be very useful for large-scale linear algebra applications whose datasets are amenable to blocking [35]. Our experience with large-scale, I/O-intensive codes indicates that, usually, the users know how their datasets will be used by parallel processors; that is, they have sufficient information to specify suitable access patterns for the datasets in their applications. Note that conveying an access ....
S. Toledo and F. G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations. In Proc. Fourth Annual Workshop on I/O in Parallel and Distributed Systems, 1996.
....to access data located on tape via a convenient interface expressed in terms of arrays and array portions (regions) rather than files and offsets. In this sense the library can be considered as a natural extension of state-of-the-art runtime libraries that manipulate disk-resident datasets (e.g. [2, 18]). The library implements a data storage model on tapes that enables users to access portions of multi-dimensional data in a fast and simple way. In order to eliminate most of the latency in accessing tape-resident data, we employ a sub-filing strategy in which a large multi-dimensional ....
S. Toledo and F. G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations, In Proc. Workshop on I/O in Paral. and Distr. Sys., May 1996.
No context found.
Sivan Toledo and Fred G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations. In Proceedings of the 4th Annual Workshop on I/O in Parallel and Distributed Systems, pages 28--40, Philadelphia, May 1996.
....solvers via Cholesky factorization, and linear least squares solvers via QR factorization [6]. However, this implementation does not readily allow for a full out-of-core extension. 1.2.2 SOLAR A more serious effort to add out-of-core capabilities to LAPACK and ScaLAPACK is provided by SOLAR [18], a portable library for scalable out-of-core linear algebra computations. This library uses ScaLAPACK routines for in-core computation, but provides an I/O layer that manages matrix input/output. SOLAR achieves better I/O rates by allowing a different storage scheme for matrices on disk than is ....
Sivan Toledo and Fred G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computation. In Proceedings of IOPADS '96, 1996.
....in-core ScaLAPACK factorization routines for LU, QR and Cholesky factorization, use a right-looking variant for good load balancing [1]. Other work has shown [2, 3] that for an out-of-core factorization, a left-looking variant generates less I/O volume compared to the right-looking variant. Toledo [5] shows that the recursively partitioned algorithm (k # n#2) may be more efficient than the left-looking variant when a very large matrix is factored with minimal in-core storage. 3.2. LU Factorization The out-of-core LU factorization PFxGETRF involves the following operations: 1. If no ....
S. TOLEDO AND F. GUSTAVSON, The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations, in IOPADS Fourth Annual Workshop on Parallel and Distributed I/O, ACM Press, 1996, pp. 28--40.
....either the right- or left-looking algorithms. It is always faster than the right-looking algorithm and it is faster than the left-looking algorithm when factoring large matrices on machines with a small main memory. A steady stream of recent implementations of dense out-of-core factorization codes [7, 16, 22, 26, 30, 34, 49, 50, 52, 54, 61] serves as a testimony to the need for out-of-core dense solvers; most of these papers describe parallel out-of-core algorithms and implementations. The first systematic comparison of conventional and partitioned schedules was published by McKellar and Coffman in 1969 [39] in the context of virtual ....
Sivan Toledo and Fred G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations. In Proceedings of the 4th Annual Workshop on I/O in Parallel and Distributed Systems, pages 28--40, Philadelphia, May 1996.
....of a rank-revealing QR factorization are likely to increase the amount of I/O in an out-of-core factorization, but the savings in floating point arithmetic over the SVD are insignificant when the matrix is thin and tall. We implemented the new out-of-core QR and SVD algorithms as part of SOLAR [8], a library of out-of-core linear algebra subroutines. Before we started the current project, SOLAR already included sequential and parallel out-of-core codes for matrix multiplication, solution of triangular linear systems, Cholesky factorizations, and LU factorizations with partial pivoting. ....
....matrix multiplication routine (TRMM). Consequently, we had to use instead the more general GEMM routine, which causes the code to perform more floating point operations than necessary. This overhead is relatively small. We have also improved the I/O layer of SOLAR over the one described in [8]. The changes allow SOLAR to perform non-blocking I/O without relying on operating system support (which sometimes performs poorly); they allow SOLAR to perform I/O in distributed memory environments without a data redistribution phase, and they allow SOLAR to perform I/O on large buffers without ....
Sivan Toledo and Fred G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations. In Proceedings of the 4th Annual Workshop on I/O in Parallel and Distributed Systems, pages 28--40, Philadelphia, May 1996.
....are factored, the panel of columns that fits in core becomes more narrow, inherently affecting both the performance of the in-core kernels as well as the ratio of computation to I/O operations. A more serious effort to add out-of-core capabilities to LAPACK and ScaLAPACK is provided by SOLAR [20], a portable library for scalable out-of-core linear algebra computations. This library uses ScaLAPACK routines for in-core computation, but provides an I/O layer that manages matrix input/output. SOLAR achieves better I/O rates by allowing a different storage scheme for matrices on disk than is ....
Sivan Toledo and Fred G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computation. In Proceedings of IOPADS '96, 1996.
....The Panda library, developed at the University of Illinois, also supports high-performance array access [48]. It uses server-directed collective I/O and chunked storage as the main optimizations. SOLAR is a library for out-of-core linear algebra operations, developed at IBM Watson Research Center [65]. The ChemIO library, developed at Pacific Northwest National Laboratory, provides I/O support for computational chemistry applications [34]. HDF [63], netCDF [33] and DMF [47] are libraries designed to provide even higher level of I/O support to applications. For example, they can directly ....
Sivan Toledo and Fred G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations. In Proceedings of Fourth Workshop on Input/Output in Parallel and Distributed Systems, pages 28--40. ACM Press, May 1996.
No context found.
S. Toledo and F.G. Gustavson, The Design and Implementation of SOLAR, a Portable Library for Scalable Out-of-Core Linear Algebra Computations, Proc. Fourth Workshop Input/Output in Parallel and Distributed Systems, pp. 28--40, May 1996.
No context found.
S. Toledo and F. G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations, In Proc. Fourth Annual Workshop on I/O in Parallel and Distributed Systems, May 1996.
No context found.
S. Toledo and F. G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations, In Proc. Workshop on I/O in Paral. and Distr. Sys., May 1996.
No context found.
Toledo, S. and F. Gustavson. The Design and Implementation of SOLAR, a Portable Library for Scalable Out-of-Core Linear Algebra Computations. In Proceedings of the Fourth Workshop on Input/Output in Parallel and Distributed Systems, pages 28--40, Philadelphia, May 1996. ACM Press.
No context found.
S. Toledo and F. G. Gustavson. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations, In Proc. Fourth Annual Workshop on I/O in Parallel and Distributed Systems, May 1996.