Abstract:
Despite a decade of research into their use for computing applications, FPGA-based custom computing machines still accelerate only a limited range of applications. Recognizing that recent advances in network technology provide an opportunity for a more general-purpose application of custom computing machines, we develop the idea of an intelligent network adapter for cluster-based parallel computing, calling the resulting architecture an Adaptable Computing Cluster. The results presented suggest that placing the FPGAs in the data path to the network dramatically improves the performance and scalability of target applications. This is especially noteworthy because the target applications have historically not performed well on either technology. This paper discusses how FPGAs can be used to provide network functionality while increasing compute power. The focus is on a specific application, the 2D Fast Fourier Transform, with additional insights into the implications for parallel computing on a cluster.