Portable High-Performance Programs (1999)  (1 citation)
Matteo Frigo



View or download:
mit.edu/pub/cilk/f...ophdthesis.ps.gz
mit.edu/publicatio...TLCSTR785.ps.gz

From:  mit.edu/cilk/papers/

Abstract: This dissertation discusses how to write computer programs that attain both high performance and portability, despite the fact that current computer systems have different degrees of parallelism, deep memory hierarchies, and diverse processor architectures. To cope with parallelism portably in high-performance programs, we present the Cilk multithreaded programming system. In the Cilk-5 system, parallel programs scale up to run efficiently on multiple processors, but unlike existing...

Context of citations to this paper:

...The result of automatic regulation of GC parallelism, in BH O2K. costs than on irregular programs. The Cilk performance model [1, 8] estimates the parallel running time of both regular and irregular programs that are executed in LTC strategy. We utilize this model to...

Cited by:
Predicting Scalability of Parallel Garbage Collectors on.. - Endo, Taura, Yonezawa (2001)

Similar documents (at the sentence level):
15.1%:   Cilk: Efficient Multithreaded Computing - Randall (1998)
11.7%:   The Implementation of the Cilk-5 Multithreaded Language - Frigo, Leiserson, Randall (1998)
8.3%:   Cache-Oblivious Algorithms (Extended Abstract) - Frigo, Leiserson, Prokop..

Active bibliography (related documents):
12.2:   Portable High-Performance Programs - Frigo (1992)
1.0:   The Fastest Fourier Transform in the West - Frigo, Johnson (1997)
0.6:   Dag-Consistent Distributed Shared Memory - Blumofe, Frigo, Joerg.. (1996)

Similar documents based on text:
1.2:   FFTW for version 3.0 - Frigo, Johnson (2003)
0.8:   Executing Multithreaded Programs Efficiently - Blumofe (1995)
0.4:   Code Compaction and Parallelization for VLIW/DSP Chip.. - Tsvetomir Petrov..

Related documents from co-citation:
2:   A Scalable Mark-Sweep Garbage Collector on Large-Scale Shared-Memory Machines - Endo, Taura et al. - 1997

BibTeX entry:

Matteo Frigo. Portable High-Performance Programs. PhD thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, June 1999. http://citeseer.ist.psu.edu/frigo99portable.html

@techreport{ frigo99portable,
    author = "M. Frigo",
    title = "Portable High-Performance Programs",
    number = "MIT/LCS/TR-785",
    pages = "169",
    year = "1999",
    url = "citeseer.ist.psu.edu/frigo99portable.html" }

Documents on the same site (http://supertech.lcs.mit.edu/cilk/papers/):
Portable Fault-Tolerant File I/O - Lyubashevskiy
Debugging Multithreaded Programs that Incorporate User-Level Locking - Stark (1998)
Scheduling Adaptively Parallel Jobs - Song (1998)


CiteSeer.IST - Copyright Penn State and NEC