  Universal mechanisms for data-parallel architectures (2003) [5 citations — 1 self]

by Karthikeyan Sankaralingam, Stephen W. Keckler, William R. Mark, Doug Burger
In Proceedings of the 36th Annual IEEE/ACM International Symposium on Microarchitecture
http://www.microarch.org/micro36/html/./pdf/sankaralingam-UniversalMechanisms.pdf

Abstract:

Data-parallel programs are both growing in importance and increasing in diversity, resulting in specialized processors targeted at specific classes of these programs. This paper presents a classification scheme for data-parallel program attributes, and proposes micro-architectural mechanisms to support applications with diverse behavior using a single reconfigurable architecture. We focus on the following four broad kinds of data-parallel programs: DSP/multimedia, scientific, networking, and real-time graphics workloads. While all of these programs exhibit high computational intensity, coarse-grain regular control behavior, and some regular memory access behavior, they show wide variance in the computation requirements, fine-grain control behavior, and the frequency of other types of memory accesses. Based on this study of application attributes, this paper proposes a set of general micro-architectural mechanisms that enable a baseline architecture to be dynamically tailored to the demands of a particular application. These mechanisms provide efficient execution across a spectrum of data-parallel applications and can be applied to diverse architectures ranging from vector cores to conventional superscalar cores. Our results using a baseline TRIPS processor show that the configurability of the architecture to the application demands provides harmonic mean performance improvement of 5%–55% over scalable yet less flexible architectures, and performs competitively against specialized architectures.

Citations

3148 Computer architecture: a quantitative approach, 3rd ed – Hennessy, Patterson, et al. - 2003
128 iWarp, an integrated solution to high-speed parallel computing – Borkar, Cohn, et al. - 1988
122 Cg: A system for programming graphics hardware in a C-like language – Mark, Glanville, et al. - 2003
121 The MasPar MP-1 Architecture – Blank - 1990
110 Smart memories: A modular reconfigurable architecture – Mai, Paaske, et al. - 2002
105 The Cray-1 Computer System – Russell - 1978
97 The raw microprocessor: A computational fabric for software circuits and general purpose programs – Taylor, Kim, et al. - 2002
88 A bandwidth-efficient architecture for media processing – Rixner, Dally, et al. - 1998
77 POWER4 system microarchitecture – Tendler, Dodson, et al. - 2001
57 The T0 Vector Microprocessor – Asanović, Beck, et al. - 1995
49 Reality Engine Graphics – Akeley - 1993
47 The design and analysis of a cache architecture for texture mapping – Hakura, Gupta - 1997
41 Custom-fit processors: Letting applications define architectures – Fisher, Faraboschi, et al. - 1996
40 Cheops: A reconfigurable data-flow system for video processing – Bove, Watlington - 1995
32 Cryptomaniac: A Fast Flexible Architecture for Secure Communication – Wu, Weaver, et al. - 2001
30 The Cg Tutorial – Fernando, Kilgard - 2003
29 Efficient conditional operations for data-parallel architectures – Kapasi, Dally, et al. - 2000
24 Stream scheduling – Kapasi, Mattson, et al. - 2001
18 Tarantula: a vector extension to the alpha architecture – Espasa, Ardanaz, et al. - 2002
14 A programmable baseband processor design for software defined radios – Rajagopal, Rixner, et al. - 2002
13 Vector instruction set support for conditional operations – Smith, Faanes, et al. - 2000
10 Overcoming the limitations of conventional vector processors – Kozyrakis, Patterson - 2003
9 Data parallel computation on graphics hardware (unpublished report) – Buck, Hanrahan - 2003
6 Vector IRAM: A Media-oriented Vector Processor with Embedded DRAM – Kozyrakis, Gebis, et al. - 2000
6 Vector unit architecture for emotion synthesis – Kunimatsu - 2000
4 Out-of-order Vector Architectures – Espasa, Valero, et al. - 1996
3 Advanced Processing Techniques Using the Intrinsity FastMATH – Olson - 2002
3 Design and Implementation of a Physically-Based Rendering System – Pharr, Humphreys - 2003
2 Radeon 9700 Shading (ATI Technologies white paper) – Mitchell - 2002
2 Concepts of the System/370 Architecture – Moore, Padegs, et al. - 1987
2 A Design Space Exploration of Grid Processor Architectures – Nagarajan, Sankaralingam, et al. - 2001
2 NV fragment program – Corp
2 NV vertex program2 – Corp - 2002