Parallel Hierarchical Radiosity On Cache-Coherent Multiprocessors
Abstract:
Radiosity computation is one of the most expensive problems in computer graphics. Recent hierarchical methods have greatly sped up the computation, first of diffuse and now also of specular radiosity. We present a parallel algorithm that computes diffuse and specular radiosity together, and discuss the techniques we used to improve its performance. The algorithm is both irregular and highly unpredictable. Despite this, by carefully designing a parallel algorithm that minimizes synchronization and memory access overhead, and by identifying and correcting several unanticipated synchronization bottlenecks, we obtained speedups of 26.3 on a 32-processor machine with distributed memory and 14.2 on a 16-processor machine with centralized memory. We demonstrate how lock wait data can be used to significantly improve the performance of complex, irregular parallel applications.
Citations
| 363 | The Stanford Dash Multiprocessor – Lenoski, Laudon, et al. - 1992 |
| 329 | A rapid hierarchical radiosity algorithm – Hanrahan, Salzman, et al. - 1991 |
| 87 | MemSpy: Analyzing memory system bottlenecks in programs – Martonosi, Gupta, et al. - 1992 |
| 81 | Parallel visualization algorithms: performance and architectural implications – Singh, Gupta, et al. - 1994 |
| 62 | Simulation of Multiprocessors: Accuracy and Performance – Goldschmidt - 1993 |
| 1 | A Hierarchical Illumination Algorithm for Surfaces with Glossy Reflection – Aupperle, Hanrahan - 1994 |

