by Weirong Zhu, Juan Del Cuvillo, Guang R. Gao
http://www.capsl.udel.edu/~weirong/paper/iwomp2006.pdf
Abstract:
Emerging many-core-on-a-chip architectures offer massive on-chip parallelism through hardware support for multithreading. Developing parallel applications quickly that exploit this intra-chip parallelism and achieve high sustained performance requires suitable programming models. OpenMP, the de facto industry standard for writing parallel programs on shared-memory systems, could be a reasonable candidate. To improve our understanding of the behavior and performance characteristics of OpenMP programs on many-core-on-a-chip architectures, this paper presents a performance study of basic OpenMP language constructs on the IBM Cyclops-64 (C64) architecture, which has 160 hardware thread units on a single chip. Compared with previous work on conventional SMP systems [1], the overhead of OpenMP language constructs on the C64 many-core architecture is at least one order of magnitude lower.
Citations
375 | Algorithms for Scalable Synchronization on Shared-memory Multiprocessors – Mellor-Crummey, Scott – 1991
264 | A single-chip multiprocessor – Hammond, Nayfeh, et al. – 1997
14 | Measuring Synchronization And Scheduling Overheads in OpenMP – Bull – 1999
10 | A Microbenchmark Suite for OpenMP 2.0 – Bull, O’Neill – 2001
8 | Fast: A functionally accurate simulation toolset for the Cyclops-64 cellular architecture – Cuvillo, Zhu, et al. – 2005
7 | Toward a software infrastructure for the Cyclops-64 cellular architecture – Cuvillo, Zhu, et al. – 2006
6 | Performance evaluation of the Omni OpenMP compiler – Kusano, Satoh, et al. – 2000
6 | Tiny threads: A thread virtual machine for the Cyclops64 cellular architecture – Cuvillo, Zhu, et al. – 2005
6 | Performance Characteristics for OpenMP Constructs on Different Parallel Computer Architectures – Berrendorf, Nieken – 1999
5 | Performance characteristics of OpenMP constructs, and application benchmarks on a large symmetric multiprocessor – Fredrickson, Afsahi, et al. – 2003
4 | Landing OpenMP on Cyclops-64: An efficient mapping of OpenMP to a many-core system-on-a-chip – Cuvillo, Zhu, et al. – 2006
4 | Performance comparisons of basic OpenMP constructs – Prabhakar, Getov, et al. – 2002
4 | Evaluating OpenMP on chip multithreading platforms – Liao, Liu, et al. – 2005
2 | H.S.: 64-bit Cyclops principles of operation part I – Denneau, Warren – 2005
2 | H.S.: 64-bit Cyclops principles of operation part II: Memory organization, the A-switch, and SPRs – Denneau, Warren – 2005
1 | José Castanos – Almasi, Ayguadé, et al.
1 | Optimizing NANOS OpenMP for the IBM Cyclops multithreaded architecture – Ródenas, Martorell, et al. – 2005