Results 1 - 6 of 6
A Profiler for a Heterogeneous Multi-Core Multi-FPGA System by
Cited by 8 (1 self)
A thesis submitted in conformity with the requirements
MPI as an Abstraction for Software-Hardware Interaction for HPRCs
Cited by 7 (2 self)
High-Performance Reconfigurable Computers (HPRCs) consist of one or more standard microprocessors tightly coupled with one or more reconfigurable FPGAs. HPRCs have been shown to provide good speedups and good cost/performance ratios, but not necessarily ease of use, leading to a slow acceptance of this technology. HPRCs introduce new design challenges, such as the lack of portability across platforms, incompatibilities with legacy code, users' reluctance to change their code base, a prolonged learning curve, and the need for a system-level hardware/software co-design development flow. This paper presents the evolution and current work on TMD-MPI, which started as an MPI-based programming model for Multiprocessor Systems-on-Chip implemented in FPGAs, and has now evolved to include multiple x86 processors. TMD-MPI is shown to address current design challenges in HPRC usage, suggesting that the MPI standard has enough syntax and semantics to program these new types of parallel architectures. Also presented is the TMD-MPI Ecosystem, which consists of research projects and tools that are developed around TMD-MPI to further improve HPRC usability.
A Parallel Programming Model for a Multi-FPGA Multiprocessor Machine
Recent research has shown that FPGAs can execute certain applications significantly faster than state-of-the-art processors. The penalty is the loss of generality, but the reconfigurability of FPGAs allows them to be reprogrammed for other applications. Therefore, an efficient programming model and a flexible design flow are paramount for FPGA technology to be more widely accepted. In this thesis, a lightweight subset implementation of the MPI standard, called TMD-MPI, is presented. TMD-MPI provides a programming model capable of using multiple FPGAs and embedded processors while hiding hardware complexities from the programmer, facilitating the development of parallel code and promoting code portability. A message-passing engine (TMD-MPE) is also developed to encapsulate the TMD-MPI functionality in hardware. TMD-MPE enables communication between hardware engines and embedded processors. In addition, a Network-on-Chip is designed to enable intra-FPGA and inter-FPGA communication. Together, TMD-MPI, TMD-MPE and the network provide a flexible design flow for Multiprocessor System-on-Chip design.
UNIVERSITY OF TORONTO
Recently there have been initiatives from both industry and academia to explore the use of FPGA-based application-specific hardware acceleration in high-performance computing platforms, as traditional supercomputers based on clusters of generic CPUs fail to scale to meet the growing demand of computation-intensive applications due to limitations in power consumption and cost. Research has shown that a heterogeneous system built exclusively on FPGAs, combining different types of computing nodes including embedded processors and application-specific hardware accelerators, is a scalable way to use FPGAs for high-performance computing. An example of such a system is the TMD [11], which also uses a message-passing network to connect the computing nodes. However, the difficulty of designing high-speed hardware modules efficiently from software descriptions is preventing FPGA-based systems from being widely adopted by software developers. In this project, an automated tool flow is proposed to fill this gap. The AUTO flow is developed to automatically generate, from a C program, a hardware computing node that can be used directly in the TMD system. As an example application, a Jacobi heat-equation solver
Coherent Shared Memories for FPGAs
To build a shared-memory programming model for FPGAs, a fast and highly parallel method of accessing shared memory is required. This thesis presents a first look at how to implement a coherent caching system in an FPGA. The coherent caching system consists of multiple distributed caches that implement the write-once coherence protocol, allowing efficient access to system memory while simplifying the user programming model. Several test applications are used to verify functionality and assess the performance of the current system. Results show that with a processor-based system, some applications could benefit from improvements to the coherence system, but for many applications, the current system is sufficient. However, the current coherent caching system is not sufficient for most hardware-core-based systems, because the faster memory accesses quickly saturate shared system resources. As well, the performance of distributed-memory systems currently surpasses that of the coherent caching system. Performance results are promising, and given the potential for improvements, future work on this system is warranted.
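The write-once coherence protocol named in this abstract can be illustrated with a small software model. The sketch below is a textbook-style approximation, not the thesis's FPGA implementation: it assumes a single shared memory word, a snooping bus, and the four conventional states (Invalid, Valid, Reserved, Dirty). The class `WriteOnceSystem` and its methods are hypothetical names introduced only for illustration.

```python
from enum import Enum


class State(Enum):
    INVALID = "I"
    VALID = "V"
    RESERVED = "R"   # written exactly once; memory is still up to date
    DIRTY = "D"      # written more than once; memory is stale


class WriteOnceSystem:
    """Toy model (an assumption, not the thesis design):
    N caches share one memory word over a snooping bus."""

    def __init__(self, num_caches, initial=0):
        self.memory = initial
        self.state = [State.INVALID] * num_caches
        self.data = [None] * num_caches
        self.bus_transactions = 0

    def _flush_dirty_copy(self, requester):
        # A Dirty copy is the only up-to-date one, so write it back first.
        for i, s in enumerate(self.state):
            if i != requester and s is State.DIRTY:
                self.memory = self.data[i]

    def _invalidate_others(self, cache):
        for i in range(len(self.state)):
            if i != cache:
                self.state[i] = State.INVALID
                self.data[i] = None

    def read(self, cache):
        if self.state[cache] is State.INVALID:  # read miss goes to the bus
            self.bus_transactions += 1
            self._flush_dirty_copy(cache)
            for i in range(len(self.state)):
                # Snooped read: Reserved and Dirty copies fall back to Valid.
                if i != cache and self.state[i] in (State.RESERVED, State.DIRTY):
                    self.state[i] = State.VALID
            self.data[cache] = self.memory
            self.state[cache] = State.VALID
        return self.data[cache]

    def write(self, cache, value):
        s = self.state[cache]
        if s is State.INVALID:      # write miss: take exclusive ownership
            self.bus_transactions += 1
            self._flush_dirty_copy(cache)
            self._invalidate_others(cache)
            self.state[cache] = State.DIRTY
        elif s is State.VALID:      # first write: write through once
            self.bus_transactions += 1
            self.memory = value
            self._invalidate_others(cache)
            self.state[cache] = State.RESERVED
        else:                       # Reserved or Dirty: local write, no bus use
            self.state[cache] = State.DIRTY
        self.data[cache] = value


system = WriteOnceSystem(num_caches=2)
system.read(0)        # miss: cache 0 becomes Valid
system.write(0, 5)    # first write goes through to memory: Reserved
system.write(0, 6)    # second write stays local: Dirty
print(system.state[0], system.memory, system.bus_transactions)
```

In this model only the first write to a Valid block uses the bus, and later writes stay local. That is the property that gives the protocol its name. Every miss still costs a bus transaction, which is one plausible reading of the abstract's remark that faster requesters saturate shared resources.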
unknown title
High-Performance Reconfigurable Computers (HPRCs) consist of one or more standard microprocessors tightly coupled with one or more reconfigurable FPGAs. HPRCs have been shown to provide good speedups and good cost/performance ratios, but not necessarily ease of use, leading to a slow acceptance of this technology. HPRCs introduce new design challenges, such as the lack of portability across platforms, incompatibilities with legacy code, users' reluctance to change their code base, a prolonged learning curve, and the need for a system-level hardware/software co-design development flow. This paper presents the evolution and current work on TMD-MPI, which started as an MPI-based programming model for Multiprocessor Systems-on-Chip implemented in FPGAs, and has now evolved to include multiple x86 processors. TMD-MPI is shown to address current design challenges in HPRC usage, suggesting that the MPI standard has enough syntax and semantics to program these new types of parallel architectures. Also presented is the TMD-MPI Ecosystem, which consists of research projects and tools that are developed around TMD-MPI to further improve HPRC usability.