Empirical Evaluation of Global Memory Support on the CRAY-T3D and CRAY-T3E
by Arvind Krishnamurthy, David E. Culler, Katherine Yelick
http://www.cs.berkeley.edu/projects/titanium/papers/krishnamurthy-etal-tr98.ps
Abstract:
Performance prediction on parallel machines is notoriously difficult, especially using designer-supplied machine parameters for features like clock speed, network latency, and network bandwidth. The performance observed by an application programmer is a complicated function of the local memory hierarchy on each node, software
Citations
| 926 | Active Messages: A mechanism for integrated communication and computation – von Eicken, Culler, et al. - 1992 |
| 434 | LogP: Towards a Realistic Model of Parallel Computation – Culler, et al. - 1993 |
| 115 | Synchronization and communication in the T3E multiprocessor – Scott - 1996 |
| 102 | Compositional parallel programming – Chandy, Kesselman - 1992 |
| 65 | CRAY T3D: A new dimension for Cray Research – Kessler, Schwarzmeier - 1993 |
| 52 | Empirical evaluation of the CRAY-T3D: a compiler perspective – Arpaci, Culler, et al. - 1995 |
| 47 | CPU performance evaluation and execution time prediction using narrow spectrum benchmarking – Saavedra-Barrera - 1992 |
| 45 | Divergence preserving discrete surface integral methods for Maxwell's curl equations using non-orthogonal unstructured grids – Madsen - 1992 |
| 34 | Micro Benchmark Analysis of the KSR1 – Saavedra, Gaines, et al. - 1993 |
| 17 | Distributed data access in AC – Carlson, Draper - 1995 |
| 7 | A Shared-Memory MPP from Cray Research – Koeninger, Furtney, et al. - 1994 |

