D. S. Greenberg, R. Brightwell, L. A. Fisk, A. B. Maccabe, and R. Riesen, "A system software architecture for high-end computing," in Proceedings of SC97: High Performance Networking and Computing. San Jose, California: ACM Press, Nov. 1997, pp. 1--15. [Online]. Available: http://doi.acm.org/10.1145/509593.509646
....Initiative [32] the Department of Energy Laboratories are purchasing a sequence of increasingly powerful custom supercomputers. In a parallel effort to increase the scalability of commodity based supercomputers, Sandia National Laboratories is also developing the Computational Plant or Cplant [46, 24, 9, 42, 7, 8]. This paper describes resource allocation algorithms to optimize processor locality in Cplant and other supercomputers. Although Sandia maintains a diverse set of computing resources, the tools for managing these resources commonly rely on scheduling queueing software such as NQS [15] or PBS ....
D. S. Greenberg, R. Brightwell, L. A. Fisk, A. Maccabe, and R. Riesen. A system software architecture for high-end computing. In Proc. High Performance Networking and Computing (SC), 1997.
....rapid turnaround on this large-scale design study. 5 ASCI Red Supercomputer 5.1 Architecture Review For this optimization study, substantial compute resources were required. Within Sandia National Laboratories, one of the primary production computing platforms is the ASCI Red supercomputer [9, 10]. ASCI Red is a massively parallel, message-passing, multiple instruction, multiple data (MIMD) computer. It achieves multiple TeraFLOPS (trillion floating point operations per second) peak performance. It is designed so that file I/O, memory, disk capacity, and communication are scalable. Standard ....
Greenberg, D.S., Brightwell, R., Fisk, L.A., Maccabe, A.B., and Riesen, R.E., "A System Software Architecture for High-End Computing," Proceedings of Supercomputing 97, San Jose, CA, 1997.
....connectivity to the local LAN. The I/O node provides disk service for its own SU and can be used to implement a parallel filesystem across the entire machine. The SUs are combined into a single machine using utilities at a higher level which dynamically map tasks onto available physical machines [1]. As an example, the first piece of CPlant is based on nodes which are DEC Miatas, each of which contains a 433 MHz EV5 Alpha processor, 196 MB of SDRAM, a 2 GB IDE drive, on-board 100 Mb/s Ethernet, and a Myrinet card. The Myrinet switches are wired to form a cube of eight switches, attached to ....
D. S. Greenberg, R. Brightwell, L. A. Fisk, A. B. Maccabe, and R. Riesen. A system software architecture for high-end computing. In Proceedings of SC97. ACM/IEEE, 1997.
....to the extent of their budgets. Despite the enormous advances achieved, the requirement for ever increasing computational power remains. For example, the US Department of Energy's ASCI (Accelerated Strategic Computing Initiative) has targeted a benchmark of 100 TeraFlops per second by the year 2005 [GRE97]. Indeed, many of the national agencies in the US (DOE, NASA, DARPA, NSA) have targeted Petaflop machines by 2010. These incredibly fast machines would be a thousand times faster than today's fastest machines. 9 Cluster Taxonomy A taxonomy as given by Thomas Sterling of the NASA JPL ....
D. Greenberg et al. (1997). "A System Software Architecture for High-End Computing" In Proceedings of SuperComputing '97. November 15 -- 21, 1997. San Jose, USA.
Greenberg DS, Brightwell R, Fisk LA, Maccabe AB, Riesen R. A system software architecture for high-end computing. SC'97: High Performance Networking and Computing, San Jose, CA, 1997.
David S. Greenberg, Ron Brightwell, Lee Ann Fisk, Arthur B. Maccabe, and Rolf Riesen. A system software architecture for high-end computing. In Proceedings of SC97: High Performance Networking and Computing, pages 1--15, San Jose, California, November 1997. ACM Press.
....nodes to provide the compute cycles required by Sandia's critical applications. Because of this scalability requirement, Cplant has been designed to address scalability in every aspect of the hardware and software architectures. machines employ the partition model of resource provision [8] that was initially developed by Intel on their early parallel platforms. This model divides the machine into several different partitions that provide specialized functionality. The main partitions are service, compute, and I/O. The service partition provides a full-featured UNIX environment ....
D. S. Greenberg, R. B. Brightwell, L. A. Fisk, A. B. Maccabe, and R. E. Riesen. A System Software Architecture for High-End Computing. In Proceedings of SC'97, 1997.
....as much as possible from the design of TFLOPS. We designed a support and diagnostic infrastructure for Cplant analogous to the Reliability, Availability, and Supportability (RAS) system that Intel developed. We followed the partition model (service, compute, I/O, etc.) of resource provision [11] that Intel developed. We decided to leverage the code development that we had done for Puma to create a scalable runtime environment. Our experience with the poor performance and scalability of full-featured UNIX kernels on MPPs, such as OSF on the Paragon, motivated much of the research that ....
D. S. Greenberg, R. B. Brightwell, L. A. Fisk, A. B. Maccabe, and R. E. Riesen. A System Software Architecture for High-End Computing. In Proceedings of SC'97, 1997.
....identify components of the machine that have failed and allows these components to be replaced and integrated back into the machine. This type of subsystem is vital for large-scale systems. In addition to the high-performance data movement software, the TFLOPS employs a partition model of resources [9]. This model divides the resources of the machine into partitions based on functionality. Each partition provides access to a specialized resource. The minimal set of partitions in the model are the service and compute partitions. The compute partition is the largest portion of the machine and is ....
....units and a support system. The definition of a scalable unit is intended to be as nonspecific as possible in order to allow a variety of vendors to supply components that meet the criteria. Scalable units provide resources to the Cplant. The use of the partition model of resource provision [9] allows scalable units to provide a variety of types of resources, such as service, compute, I/O, and network. Most of the Cplant resources will be compute resources that can service distributed memory programs and, at a minimum, run an MPI process. At least one scalable unit must provide a ....
D. S. Greenberg, R. Brightwell, L. A. Fisk, A. B. Maccabe, and R. Riesen. A System Software Architecture for High-End Computing. In Proceedings of SC'97, 1997.
....a little insight into how Cplant works and how it is built. Then, we will discuss some of the design decisions that we have made and explain why we made them. Finally, we outline some of the future work we have planned. 2 Cplant Architecture Logically, a Cplant consists of several partitions [2]. A partition is a collection of nodes that together perform one of the functions of the total computational resource. These partitions are defined in configuration files, and the boundaries can be moved by reconfiguring or rebooting the nodes in the affected partitions. When users log on to a Cplant, ....
David S. Greenberg, Ron Brightwell, Lee Ann Fisk, Arthur B. Maccabe, and Rolf Riesen. A system software architecture for high-end computing. In Proceedings of Supercomputing'97, San Jose, CA, November 1997.
....bank so that the processor can proceed without waiting for a response. Software will allow maps to be set up and account for situations which are beyond the capabilities of the hardware. The system software for ASCI Red partitions the nodes into three logical groups: service, I/O, and compute [13]. Since the service partition is relatively small (on the order of ten nodes) it runs a variant of OSF/1 AD, which supplies a standard workstation image to users. In order to integrate the service partition into the machine, Sandia has added an interface module, yod, which allows parallel tasks to ....
....and decrease congestion. Puma is designed to avoid many of the costly functions and daemons of standard operating systems which provide convenience to interactive users but increase overhead for parallel computations. We are currently integrating it with Linux through a partition model [13] in which Puma can supply performance on most nodes and Linux can supply convenient features on a few nodes. Load balancing tools The Chaco software package [18], developed at Sandia, minimizes communication requirements for statically load-balanced settings. When solving a set of PDEs using ....
D.S. Greenberg, R. Brightwell, L. Fisk, A.B. Maccabe, and R. Riesen, "A system software architecture for high-end computing", Proceedings of SC97, November 1997, to appear.
D. Greenberg, R. Brightwell, L. A. Fisk, A. Maccabe, R. Riesen, "A System Software Architecture for High-End Computing". In Proceedings of Supercomputing '97. Nov 1997.
D. Greenberg, R. Brightwell, L. Fisk, A. Maccabe, and R. Riesen. A system software architecture for high-end computing. In Proceedings of Supercomputing 97, San Jose, CA, 1997.