Compiler Optimizations For Parallel Loops With Fine-Grained Synchronization (1994)  (5 citations)
Ding-Kai Chen

View or download:
uiuc.edu/reports/1374.ps.gz
uiuc.edu/pub/CSRD_Reports/...1374.ps.gz

From:  uiuc.edu/tech_reports
From:  uiuc.edu/report...ports.html.save

Abstract: In this paper, we presented and evaluated a new runtime algorithm to parallelize these loops. Our scheme handles any type of data dependence pattern without requiring any special architectural support. Furthermore, compared to an older scheme with the same generality, it speeds up execution in three ways: it allows the inspector phase to be reused across loop invocations, it allows dependent iterations to overlap partially, and it optimizes the inspector for high locality and low communication. We have evaluated...

Similar documents based on text:
0.1:   Fine-Grained Dynamic Instrumentation of Commodity Operating.. - Tamches, Miller (1999)
0.1:   One Sense per Collocation and Genre/Topic Variations - Martinez, Agirre (2000)
0.1:   The Flask Security Architecture: System Support.. - Spencer, Smalley, .. (1998)

Related documents from co-citation:
3:   Compiler algorithm for Synchronization - Midkiff, Padua - 1987
2:   Loop transformations for restructuring compilers: the foundations - Banerjee - 1993
2:   Automatic synchronization elimination in synchronous FORALLs - Philippsen, Heinz - 1995

BibTeX entry:

D. Chen. Compiler Optimizations for Parallel Loops with Fine-grained Synchronization. PhD thesis, Dept. of Computer Science, University of Illinois at Urbana-Champaign, 1994. http://citeseer.ist.psu.edu/chen94compiler.html

@techreport{chen94compiler,
    author = "Ding-Kai Chen",
    title = "Compiler Optimizations for Parallel Loops with Fine-grained Synchronization",
    institution = "Dept. of Computer Science, University of Illinois at Urbana-Champaign",
    number = "UIUCDCS-R-94-1863",
    year = "1994",
    url = "citeseer.ist.psu.edu/chen94compiler.html"
}

Citations (may not include all citations):
4212   Computers and Intractability: A Guide to the Theory of NPCom.. - Garey, Johnson - 1979
1254   Computational Geometry: An Introduction - Preparata, Shamos - 1985
405   Depth first search and linear graph algorithms - Tarjan - 1972
357   The directory-based cache coherence protocol for the DASH mul.. - Lenoski, Laudon et al. - 1990
353   Software pipelining: An effective scheduling technique for v.. - Lam - 1988
299   Dependence Analysis for Supercomputing - Banerjee - 1988
294   A loop transformation theory and an algorithm to maximize pa.. - Wolf, Lam - 1991
277   Advanced compiler optimization for supercomputers - Padua, Wolfe - 1986
260   Validity of the single processor approach to achieving large.. - Amdahl - 1967
234   Multilisp: A language for concurrent symbolic computation - Halstead Jr - 1985
217   The PERFECT club benchmarks: Effective performance evaluatio.. - Berry, Chen - 1989
175   Matrix eigensystem routines - EISPACK guide - Smith, Boyle et al. - 1976
168   The parallel execution of DO loops - Lamport - 1974
165   SPARSKIT: A Basic Tool Kit for Sparse Matrix Computation - Saad - 1990
159   The NYU Ultracomputer -- designing an MIMD shared memory par.. - Gottlieb, Grishman et al. - 1983
146   Unimodular transformation of double loops - Banerjee - 1990
104   The Structure of Computers and Computations - Kuck - 1978
94   Run-time parallelization and scheduling of loops - Saltz, Mirchandaney et al. - 1991
92   Performance evaluation of memory consistency models for shar.. - Gharachorloo, Gupta et al. - 1991
90   The IBM research parallel processor prototype - Pfister, Brantley et al. - 1985
82   Some computer organizations and their effectiveness - Flynn - 1972
78   Compiler algorithms for synchronization - Midkiff, Padua - 1987
69   Runtime compilation techniques for data partitioning and com.. - Ponnusamy, Saltz et al. - 1993
46   Analysis of event synchronization in a parallel programming .. - Callahan, Kennedy et al. - 1990
45   An empirical study of Fortran programs for parallelizing com.. - Shen, Li et al. - 1990
44   Optimizing Compilers for Supercomputers - Wolfe - 1982
42   Improving the performance of runtime parallelization - Leung, Zahorjan - 1993
31   A pipelined, shared resource MIMD computer - Smith - 1978
31   Execution-driven tools for parallel simulation of parallel a.. - Poulsen, Yew - 1993
30   The Cedar system and an initial performance study - Kuck - 1993
29   Series Architecture Manual - System, FX - 1986
25   Multiprocessors: Discussion of Some Theoretical and Practica.. - Padua - 1979
23   An approach to synchronization of parallel computing - Krothapalli, Sadayappan - 1988
21   Dependence uniformization: A loop parallelization technique - Tzen, Ni - 1993
21   Dependence uniformization: A loop parallelization technique - Tzen, Ni - 1991
20   Compiler generated synchronization for DO loops - Midkiff, Padua - 1986
19   On reducing data synchronization in multiprocessed loops - Li, Abu-Sufah - 1987
19   The preprocessed doacross loop - Saltz, Mirchandaney - 1991
16   On uniformization of affine dependence algorithms - Chen, Shang - 1992
14   Removal of redundant dependences in doacross loops with cons.. - Krothapalli, Sadayappan - 1991
14   A special purpose architecture for finite element analysis - Jordan - 1978
14   Removal of redundant dependences in doacross loops with cons.. - Krothapalli, Sadayappan - 1992
13   Minimization of interprocessor synchronization in multiproce.. - Shaffer - 1989
9   Butterfly products overview - Computers - 1987
8   Cedar architecture and its software - Emrath, Padua et al. - 1989
8   MaxPar: An execution driven simulator for studying parallel .. - Chen - 1989
8   Automatic generation of synchronization instructions for par.. - Midkiff - 1986
8   A synchronization scheme and its application for large multi.. - Zhu, Yew - 1984
7   Compile-time Scheduling and Optimization for Asynchronous Mac.. - Cytron - 1984
7   Automatic transformation of Fortran program to vector form - Allen, Kennedy - 1987
7   On data synchronization for multiprocessors - Su, Yew - 1989
6   A comparison of four synchronization optimization techniques - Midkiff, Padua - 1991
6   Prentice-Hall Inc - Reingold, Nievergelt et al. - 1977
6   Scheduling loops on processors: Algorithms and complexity - Simons, Munshi - 1990
6   PCF Fortran: Language definition - Computing - 1988
5   Advanced loop optimizations for parallel computers - Polychronopoulos - 1987
5   A technique for reducing synchronization overhead in large s.. - Li, Abu-Sufah - 1985
4   Efficient doacross execution for distributed shared memory m.. - Su, Yew - 1991
4   of Illinois at Urbana-Champaign - Muroaka, Exploitation et al. - 1971
3   A scheduling problem arising from loop parallelization on MI.. - Anderson, Munshi et al. - 1988

Documents on the same site (http://www.csrd.uiuc.edu/tech_reports.html):
Automatic Detection Of Nondeterminacy, And Scalar Optimizations In .. - Ghosh (1992)
Run-time Visualization of Program Data - Tuchman, Jablonowski, Cybenko (1991)
PTOPP - A Practical Toolset for the Optimization of Parallel.. - McClaughry (1992)

CiteSeer.IST - Copyright Penn State and NEC