Results 1 - 10 of 196
PVFS: A Parallel File System For Linux Clusters
- IN PROCEEDINGS OF THE 4TH ANNUAL LINUX SHOWCASE AND CONFERENCE
, 2000
Cited by 425 (34 self)
As Linux clusters have matured as platforms for low-cost, high-performance parallel computing, software packages to provide many key services have emerged, especially in areas such as message passing and networking. One area devoid of support, however, has been parallel file systems, which are critical for high-performance I/O on such clusters. We have developed a parallel file system for Linux clusters, called the Parallel Virtual File System (PVFS). PVFS is intended both as a high-performance parallel file system that anyone can download and use and as a tool for pursuing further research in parallel I/O and parallel file systems for Linux clusters. In this paper, we describe the design and implementation of PVFS and present performance results on the Chiba City cluster at Argonne. We provide performance results for a workload of concurrent reads and writes for various numbers of compute nodes, I/O nodes, and I/O request sizes. We also present performance results for MPI-IO on PVFS, both for a concurrent read/write workload and for the BTIO benchmark. We compare the I/O performance when using a Myrinet network versus a fast-ethernet network for I/O-related communication in PVFS. We obtained read and write bandwidths as high as 700 Mbytes/sec with Myrinet and 225 Mbytes/sec with fast ethernet.
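PVFS stripes file data round-robin across the I/O nodes the abstract mentions. As a rough illustration of that layout (a sketch only, not PVFS's actual code; the stripe size and node count are made-up parameters), the mapping from a logical file offset to an I/O node and a local offset might look like:

```python
def stripe_location(offset, stripe_size=65536, num_io_nodes=8):
    """Map a logical file offset to (I/O node index, offset within
    that node's local file) under simple round-robin striping."""
    stripe_index = offset // stripe_size          # which stripe overall
    node = stripe_index % num_io_nodes            # round-robin node choice
    local_stripe = stripe_index // num_io_nodes   # nth stripe on that node
    local_offset = local_stripe * stripe_size + offset % stripe_size
    return node, local_offset

# The first byte of stripe 8 wraps back to node 0, one stripe deep.
print(stripe_location(8 * 65536))  # → (0, 65536)
```

Because the mapping is a pure function of the offset and the striping parameters, any client can locate data without consulting a central server on each access.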
Parallel netCDF: A high-performance scientific I/O interface
- In Proceedings of Supercomputing
, 2003
Cited by 103 (23 self)
Dataset storage, exchange, and access play a critical role in scientific applications. For such purposes netCDF serves as a portable, efficient file format and programming interface, which is popular in numerous scientific application domains. However, the original interface does not provide an efficient mechanism for parallel data storage and access. In this work, we present a new parallel interface for writing and reading netCDF datasets. This interface is derived with minimal changes from the serial netCDF interface but defines semantics for parallel access and is tailored for high performance. The underlying parallel I/O is achieved through MPI-IO, allowing for substantial performance gains through the use of collective I/O optimizations. We compare the implementation strategies and performance with HDF5. Our tests indicate programming convenience and significant I/O performance improvement with this parallel netCDF (PnetCDF) interface.
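In a parallel netCDF-style interface, each process writes its own subarray of a shared variable by supplying a start index and an element count to a collective call. A minimal sketch of the 1-D block decomposition each rank might compute (the function name and even-split policy here are illustrative, not part of the PnetCDF API):

```python
def block_partition(total_len, nprocs, rank):
    """Compute the (start, count) a given rank would pass to a
    collective vara-style write for an even 1-D block decomposition.
    Remainder elements go to the lowest-numbered ranks."""
    base, extra = divmod(total_len, nprocs)
    count = base + (1 if rank < extra else 0)
    start = rank * base + min(rank, extra)
    return start, count

# 10 elements over 4 ranks: counts 3,3,2,2 tile the array exactly.
print([block_partition(10, 4, r) for r in range(4)])
# → [(0, 3), (3, 3), (6, 2), (8, 2)]
```

Because every rank participates with a disjoint (start, count) pair, the MPI-IO layer underneath can merge the accesses into large collective operations.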
An Overview of the Parallel Virtual File System
- Proceedings of the 1999 Extreme Linux Workshop
, 1999
Cited by 45 (2 self)
As the PC cluster has grown in popularity as a parallel computing platform, the demand for system software for this platform has grown as well. One common piece of system software available for many commercial parallel machines is the parallel file system. Parallel file systems offer higher I/O performance than single-disk or RAID systems, provide users with a convenient and consistent name space across the parallel machine, support physical distribution of data across multiple disks and network entities (I/O nodes), and typically include additional I/O interfaces to support larger files and control of file parameters. The Parallel Virtual File System (PVFS) Project is an effort to provide a parallel file system for PC clusters. As a parallel file system, PVFS provides a global name space, striping of data across multiple I/O nodes, and multiple user interfaces. The system is implemented at the user level, so no kernel modifications are necessary to install or run the system. All communication...
Scalable I/O Forwarding Framework for High-Performance Computing Systems
Cited by 40 (8 self)
Current leadership-class machines suffer from a significant imbalance between their computational power and their I/O bandwidth. While Moore's law ensures that the computational power of high-performance computing systems increases with every generation, the same is not true for their I/O subsystems. The scalability challenges faced by existing parallel file systems with respect to the increasing number of clients, coupled with the minimalistic compute node kernels running on these machines, call for a new I/O paradigm to meet the requirements of data-intensive scientific applications. I/O forwarding is a technique that attempts to bridge the increasing performance and scalability gap between the compute and I/O components of leadership-class machines by shipping I/O calls from compute nodes to dedicated I/O nodes. The I/O nodes perform operations on behalf of the compute nodes and can reduce file system traffic by aggregating, rescheduling, and caching I/O requests. This paper presents an open, scalable I/O forwarding framework for high-performance computing systems. We describe an I/O protocol and API for shipping function calls from compute nodes to I/O nodes, and we present a quantitative analysis of the overhead associated with I/O forwarding.
Keywords: I/O forwarding; parallel file systems; leadership-class machines
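The aggregation step described above can be illustrated with a toy coalescer (a sketch, not the paper's actual protocol): an I/O node collects (offset, length) write requests forwarded from compute nodes and merges adjacent or overlapping byte ranges before issuing them to the file system.

```python
def coalesce(requests):
    """Merge forwarded (offset, length) requests whose byte ranges
    touch or overlap, so fewer operations reach the file system."""
    merged = []
    for off, length in sorted(requests):
        if merged and off <= merged[-1][0] + merged[-1][1]:
            prev_off, prev_len = merged[-1]
            # Extend the previous range to cover this request too.
            merged[-1] = (prev_off, max(prev_len, off + length - prev_off))
        else:
            merged.append((off, length))
    return merged

# Four small requests from different compute nodes collapse to two I/Os.
print(coalesce([(0, 4), (4, 4), (16, 4), (8, 2)]))  # → [(0, 10), (16, 4)]
```

Rescheduling and caching would layer on the same queue; the key point is that the I/O node sees many clients' requests together, which no single compute node can.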
Efficient Data Access for Parallel BLAST
- In International Parallel and Distributed Processing Symposium
, 2005
Cited by 33 (10 self)
Searching biological sequence databases is one of the most routine tasks in computational biology. This task is significantly hampered by the exponential growth in sequence database sizes. Recent advances in parallelization of biological sequence search applications have enabled bioinformatics researchers to utilize high-performance computing platforms and, as a result, greatly reduce the execution time of their sequence database searches. However, existing parallel sequence search tools have focused mostly on parallelizing the sequence alignment engine. While the computation-intensive alignment tasks become cheaper with larger machines, the data-intensive initial preparation and result-merging tasks become more expensive. Inefficient handling of input and output data can easily create performance bottlenecks even on supercomputers, and it causes considerable data-management overhead. In this paper, we present a set of techniques for efficient and flexible data handling in parallel sequence search applications. We demonstrate our optimizations by improving mpiBLAST, an open-source parallel BLAST tool rapidly gaining popularity. These optimization techniques aim at enabling flexible database partitioning, reducing I/O by caching small auxiliary files and results, enabling parallel I/O on shared files, and providing scalable result-processing protocols. As a result, we reduce mpiBLAST users' operational overhead by removing the requirement of prepartitioning databases. Meanwhile, our experiments show that these techniques can bring an order-of-magnitude improvement to both the overall performance and scalability of mpiBLAST.
PVFS over InfiniBand: Design and Performance Evaluation
- IN THE 2003 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP 03
, 2003
Cited by 30 (13 self)
I/O is quickly emerging as the main bottleneck limiting performance in modern-day clusters. The need for scalable parallel I/O and file systems is becoming increasingly urgent. In this paper, we examine the feasibility of leveraging InfiniBand technology to improve I/O performance and scalability of cluster file systems. We use the Parallel Virtual File System (PVFS) as a basis for exploring these features. In this
Efficient Structured Data Access in Parallel File Systems
- In Proceedings of the IEEE International Conference on Cluster Computing
, 2003
Cited by 29 (13 self)
Parallel scientific applications store and retrieve very large, structured datasets. Directly supporting these structured accesses is an important step in providing high-performance I/O solutions for these applications. High-level interfaces such as HDF5 and Parallel netCDF provide convenient APIs for accessing structured datasets, and the MPI-IO interface also supports efficient access to structured data. However, parallel file systems do not traditionally support such access. In this work we present an implementation...
Noncontiguous I/O Through PVFS
- In Proceedings of the IEEE International Conference on Cluster Computing
, 2002
Cited by 27 (8 self)
With the tremendous advances in processor and memory technology, I/O has risen to become the bottleneck in high-performance computing for many applications. The development of parallel file systems has helped to ease the performance gap, but I/O still remains an area needing significant performance improvement. Research has found that noncontiguous I/O access patterns in scientific applications, combined with current file system methods to perform these accesses, lead to unacceptable performance for large data sets. To enhance the performance of noncontiguous I/O, we have created list I/O, a native noncontiguous I/O interface. We have used the Parallel Virtual File System (PVFS) to implement our ideas. Our research and experimentation show that list I/O outperforms current noncontiguous I/O access methods in most I/O situations and can substantially enhance the performance of real-world scientific applications.
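The access pattern list I/O targets can be sketched in a few lines (a simplification: real list I/O ships the whole region list to the file system in a single request, rather than looping over seeks in the client as this toy helper does):

```python
import io

def list_read(f, regions):
    """Read a list of noncontiguous (offset, length) regions from an
    open binary file, returning one contiguous buffer -- the pattern
    list I/O describes to the file system in a single request."""
    out = bytearray()
    for offset, length in regions:
        f.seek(offset)
        out += f.read(length)
    return bytes(out)

f = io.BytesIO(b"abcdefghij")
# Gather two noncontiguous pieces into one contiguous result.
print(list_read(f, [(0, 3), (6, 2)]))  # → b'abcgh'
```

Expressing the whole list at once is what lets the file system schedule the accesses together instead of servicing one small request per round trip.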
Implementing Fast and Reusable Datatype Processing
- In EuroPVM/MPI
, 2003
Cited by 26 (4 self)
Many middleware packages now provide mechanisms for building datatypes, descriptions of structured data, and using these types in other operations, such as passing messages, performing remote memory access (RMA), and I/O. These systems typically allow regularity of structured data to be described, leading to concise descriptions of sometimes complicated layouts. The problem with many implementations of these systems is that they perform poorly [9]. Because of this, application programmers are forced to avoid the systems altogether and instead perform this processing manually in the application code. A common instance of this is manually packing structured data (placing noncontiguous data into a contiguous region for efficient sending in a message) and then manually copying the data back into structured form on the other side.
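The manual packing described above, copying noncontiguous (here, regularly strided) data into a contiguous buffer before sending and scattering it back on the other side, looks like this in outline (a generic sketch, not tied to any particular datatype engine; the function names are made up):

```python
def pack_strided(buf, offset, blocklen, stride, count):
    """Copy `count` blocks of `blocklen` bytes, spaced `stride` bytes
    apart starting at `offset`, into one contiguous bytes object."""
    return b"".join(
        buf[offset + i * stride : offset + i * stride + blocklen]
        for i in range(count)
    )

def unpack_strided(dst, packed, offset, blocklen, stride, count):
    """Scatter a contiguous buffer back into strided form in `dst`."""
    for i in range(count):
        start = offset + i * stride
        dst[start : start + blocklen] = packed[i * blocklen : (i + 1) * blocklen]

data = b"A.B.C.D."                       # every other byte is payload
packed = pack_strided(data, 0, 1, 2, 4)  # gather the payload bytes
print(packed)                            # → b'ABCD'
out = bytearray(8)
unpack_strided(out, packed, 0, 1, 2, 4)  # scatter back on "the other side"
```

A fast datatype-processing engine automates exactly this gather/scatter from a concise type description, which is what makes hand-written packing code like the above unnecessary.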
Sockets Direct Protocol over InfiniBand in Clusters: Is it Beneficial?
, 2003
Cited by 25 (10 self)
InfiniBand has recently been standardized by the industry to design next-generation high-end clusters for both datacenter and high-performance computing domains. Though InfiniBand supports low latency and high bandwidth, traditional sockets-based applications have not been able to take advantage of this; the shortfall is mainly attributed to the multiple copies and kernel context switches associated with the traditional TCP/IP protocol stack. The Sockets Direct Protocol (SDP) has recently been proposed to enable sockets-based applications to take advantage of the enhanced features provided by the InfiniBand Architecture. In this