  Interaction Between Parallel Compilation And Data Transfer and Storage Cost Minimization for Multimedia Applications

by Chidamber Kulkarni, Koen Danckaert
http://www.imec.be/design/dtse/pdf/Kul01b.pdf

Abstract:

Real-time multimedia applications need large processing power, yet they require a low-power implementation in an embedded context. For programmable parallel processors, this poses new challenges in optimizing a given application for both high performance and low power. In this paper, we present a case study that applies our low-power-oriented data transfer and storage exploration methodology and couples it with a state-of-the-art performance-optimizing and parallelizing compiler. Experiments on two real-life applications show that this combined approach heavily reduces the number of memory accesses and the bus loading, and hence the power consumption. At the same time, a significant reduction in total execution time is obtained. Decomposing the detailed parallelization and the data transfer and storage exploration issues into two separate stages is required to obtain the important benefits of both stages without exploding the complexity of solving all the issues simultaneously. This is demonstrated by the experimental results.

Key-Words: Program transformations, parallelization, data transfer and storage, low power, multimedia applications.

1 Introduction and Related Work

Parallel machines were mainly, if not exclusively, used in scientific communities until recently. Lately, the rapid

Citations

3148 Computer Architecture: A Quantitative Approach – Hennessy, Patterson - 1996
657 Advanced Compiler Design and Implementation – Muchnick - 1997
152 Unifying data and control transformations for distributed shared memory machines – Cierniak, Li - 1995
123 Automatic Array Privatization – Tu, Padua - 1993
107 Instruction level power analysis and optimization of software – Tiwari, Malik, et al. - 1996
106 The SUIF compiler for scalable parallel machines – Amarasinghe, Anderson, et al. - 1995
52 Automatic Partitioning of Parallel Loops and Data Arrays for Distributed Shared-Memory Multiprocessors – Agarwal, Krantz, et al. - 1995
50 Formalized methodology for data reuse exploration in hierarchical memory mappings – Diguet, Wuytack, et al. - 1997
38 Low-overhead scheduling of nested parallelism – Hummel, Schonberg - 1991
31 Communication-Free Data Allocation Techniques for Parallelizing Compilers on Multicomputers – Chen, Sheu - 1994
29 Memory size reduction through storage order optimization for embedded parallel multimedia applications – Greef, Catthoor, et al. - 1997
25 Custom Memory Management Methodology – Exploration of Memory Organisation for Embedded – Catthoor, De Greef, Balasa, et al. - 1998
18 QSDPCM – A New Technique in Scene Adaptive Coding – Strobach - 1988
17 System-level memory management for weakly parallel image processing – Danckaert, Man - 1996
15 High-level address optimisation and synthesis techniques for data-transfer intensive applications – Miranda, Janssen, et al. - 1998
13 A Strategy for Array Management – Eisenbeis, Jalby, et al. - 1991
6 Transformation of nested loops with modulo indexing to affine recurrences – Balasa, Franssen, et al. - 1994
6 Automatic program parallelisation – Banerjee, Nicolau, Padua - 1993
5 Program transformation strategies for reduced power and memory size in pseudo-regular multimedia applications – Greef, Man - 1998
5 Optimizing Supercompilers for Supercomputers, Research Monographs in Parallel and Distributed Computing – Wolfe - 1989
3 Automatic Segmentation of Cardiac MR – Bister, Cornelis - 1989
3 System level energy-delay exploration for multimedia applications on embedded cores with hardware caches – Kulkarni, Moolenaar, et al. - 1999
3 Fast and extensive system-level memory exploration for ATM applications – Slock, Catthoor, de Jong - 1997
2 network computer and its future – Broderson - 1997
2 Cache optimization for multimedia compilation on embedded processors for low power – Kulkarni, Catthoor, et al. - 1998