|
|
Compact Data Structure and Scalable Algorithms for the Sparse Grid Technique
– Josef Weidendorfer, Gerrit Buse, Daniel Butnaru, Dirk Pflüger, Technische Universität München
|
|
|
CrystalGPU: Transparent and Efficient Utilization of GPU Power
– Abdullah Gharaibeh, Samer Al-Kiswany, Matei Ripeanu
|
|
|
Author manuscript, published in "Computers and mathematics with applications (2010)" DOI: 10.1016/j.camwa.2010.01.054
– unknown authors
- 2011
|
|
|
Data Layout Transformation for Structured-Grid Codes on GPU
– I-jui Sung, Wen-mei Hwu
|
|
7
|
Inter-Block GPU Communication via Fast Barrier Synchronization
– Shucai Xiao, Wu-chun Feng
|
|
2
|
Data Layout Transformation Exploiting Memory-Level Parallelism in Structured Grid Many-Core Applications
– I-jui Sung, John A. Stratton, Wen-mei W. Hwu
|
|
1
|
Streamlining GPU Applications On the Fly: Thread Divergence Elimination through Runtime Thread-Data Remapping
– Eddy Z. Zhang, Yunlian Jiang, Ziyu Guo, Xipeng Shen
|
|
|
Architecture-Aware Mapping and Optimization on a 1600-Core GPU
– Mayank Daga, Thomas Scogland, Wu-chun Feng
|
|
3
|
Correctly Treating Synchronizations in Compiling Fine-Grained SPMD-Threaded Programs for CPU
– Ziyu Guo, Eddy Z. Zhang, Xipeng Shen
- 2011
|
|
|
Jit4OpenCL: A Compiler from Python to OpenCL
– Xunhao Li, José Nelson Amaral, Duane Szafron, Computing Science, University of Alberta
|
|
19
|
Efficient sparse matrix-vector multiplication on CUDA
– Nathan Bell, Michael Garland
- 2008
|
|
29
|
Implementing sparse matrix-vector multiplication on throughput-oriented processors
– Nathan Bell, Michael Garland
- 2009
|
|
|
Kernels for Multi-Core CPUs
– John A. Stratton, Sam S. Stone, Wen-mei W. Hwu
|
|
|
Efficient Mapping of Multiresolution Image Filtering Algorithms on Graphics Processors
– Richard Membarth, Frank Hannig, Hritam Dutta, Jürgen Teich
|
|
|
Acceleration of Multiresolution Imaging Algorithms: A Comparative Study
– Richard Membarth, Hritam Dutta, Frank Hannig, Jürgen Teich
|
|
|
University of Illinois at Urbana-Champaign Center for Reliable and High-Performance Computing
– John A. Stratton, Sam S. Stone, Wen-mei W. Hwu
- 2008
|
|
|
Parallel Hyperbolic PDE Simulation on Clusters: Cell versus GPU
– Scott Rostrup, Hans De Sterck
|
|
|
Distributed Stream Processing with DUP
– Kai Christian Bader, Tilo Eißler, Nathan Evans, Chris Gauthierdickey, Christian Grothoff, Krista Grothoff, Jeff Keene, Harald Meier, Craig Ritzdorf, Matthew J. Rutherford, Technische Universität München
|
|
1
|
Gaussian Elimination Based Algorithms on the GPU
– Aydın Buluç, John R. Gilbert, Ceren Budak
- 2008
|