Results 1 - 10 of 12
Outstanding Research Problems in NoC Design: Circuit-, Microarchitecture-, and System-Level Perspectives
Cited by 52 (1 self)
Abstract: Networks-on-Chip (NoCs) have recently been proposed to replace global interconnects in order to alleviate complex communication problems. While several research problems concerning NoC design have already been addressed in the literature, many others remain to be solved. In this work, we first provide a general description of NoC architectures and applications. Then, we enumerate several related research problems organized under five main categories: application characterization, communication paradigm, communication infrastructure, analysis, and solution evaluation. Motivation, problem formulation, proposed approaches, and open issues are discussed for each problem enumerated in the paper from circuit, micro-architecture, and system-level perspectives. Finally, we address the interactions among these research problems and put the NoC design process into perspective. Index terms: on-chip communication architecture, networks-on-chip, multiprocessor system-on-chip, CMP.
A Statically Scheduled Time-Division-Multiplexed Network-on-Chip for Real-Time Systems
Cited by 18 (8 self)
Abstract: This paper explores the design of a circuit-switched network-on-chip (NoC) based on time-division multiplexing (TDM) for use in hard real-time systems. Previous work has primarily considered application-specific systems. The work presented here targets general-purpose hardware platforms. We consider a system with IP-cores, where the TDM-NoC must provide directed virtual circuits, all with the same bandwidth, between all nodes. This may not be a frequent scenario, but a general platform should provide this capability, and it is an interesting point in the design space to study. The paper presents an FPGA-friendly hardware design, which is simple, fast, and consumes minimal resources. Furthermore, an algorithm to find minimum-period schedules for all-to-all virtual circuits on top of typical physical NoC topologies like 2D-mesh, torus, bidirectional torus, tree, and fat-tree is presented. The static schedule makes the NoC time-predictable and enables worst-case execution time analysis of communicating real-time tasks. Keywords: real-time systems; network-on-chip.
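To make the idea of a static all-to-all TDM schedule concrete, the following is a minimal sketch, not the algorithm from the paper. It assumes a non-wrapping n x n mesh, X-then-Y routing, one hop per slot, and models only router-to-router links (local core ports are ignored). The function names `xy_route`, `try_period`, and `greedy_tdm_schedule` are hypothetical. A greedy first-fit search is used, so the period it returns is conflict-free but not guaranteed to be minimal.

```python
from itertools import product

def xy_route(src, dst):
    """Directed links visited by X-then-Y routing on a mesh (assumed routing)."""
    (x, y), (tx, ty) = src, dst
    links = []
    while x != tx:
        nx = x + (1 if tx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != ty:
        ny = y + (1 if ty > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links

def try_period(circuits, period):
    """First-fit: give every circuit a start slot, or return None on failure."""
    used = set()    # reserved (link, slot) pairs
    starts = {}
    for pair, links in circuits.items():
        for s in range(period):
            # The k-th link of the route is occupied in slot (s + k) mod period.
            slots = [(link, (s + k) % period) for k, link in enumerate(links)]
            if all(item not in used for item in slots):
                used.update(slots)
                starts[pair] = s
                break
        else:
            return None  # no conflict-free start slot for this circuit
    return starts

def greedy_tdm_schedule(n):
    """Smallest period for which the greedy first-fit search succeeds."""
    nodes = list(product(range(n), repeat=2))
    circuits = {(a, b): xy_route(a, b) for a in nodes for b in nodes if a != b}
    period = 1
    while True:
        starts = try_period(circuits, period)
        if starts is not None:
            return period, starts, circuits
        period += 1
```

Because every (link, slot) pair is reserved at most once, the resulting table can be repeated indefinitely without arbitration, which is the property that makes such a network time-predictable.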
Energy Efficient Application Mapping to NoC Processing Elements Operating at Multiple Voltage Levels
- in Proceedings of the 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip, 2009
Cited by 7 (0 self)
An efficient technique for mapping application tasks to heterogeneous processing elements (PEs) on a Network-on-Chip (NoC) platform, operating at multiple voltage levels, is presented in this paper. The goal of the mapping is to minimize energy consumption subject to the performance constraints. Such a mapping involves solving several subproblems. Most research in this area addresses these subproblems sequentially, or addresses only a subset of them. We take a unified approach to the problem without compromising the solution time and provide techniques for optimal and heuristic solutions. We prove that the voltage assignment component of the problem is itself NP-hard and is inapproximable within any constant factor. Our optimal solution utilizes a Mixed Integer Linear Program (MILP) formulation of the problem. The heuristic utilizes MILP relaxation and randomized rounding. Experimental results based on E3S benchmark applications and a few real applications show that our heuristic produces near-optimal solutions in a fraction of the time needed to find the optimal one.
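The rounding step of a relaxation-plus-randomized-rounding heuristic can be sketched as follows. This is an illustration under stated assumptions, not the paper's formulation: the fractional solution is assumed to be given as a dictionary mapping each task to per-PE values that sum to 1, no LP solver is invoked, and the names `randomized_rounding`, `best_of_rounds`, and the `cost` callback are hypothetical.

```python
import random

def randomized_rounding(fractional, rng):
    """Pick one PE per task with probability equal to its fractional value."""
    assignment = {}
    for task, weights in fractional.items():
        pes = list(weights)
        probs = [weights[pe] for pe in pes]
        assignment[task] = rng.choices(pes, weights=probs, k=1)[0]
    return assignment

def best_of_rounds(fractional, cost, rounds=20, seed=0):
    """Repeat the rounding and keep the cheapest assignment seen.

    `cost` is a caller-supplied function (an assumption of this sketch) that
    scores an assignment, e.g. by energy, and may penalize infeasible ones.
    """
    rng = random.Random(seed)
    candidates = [randomized_rounding(fractional, rng) for _ in range(rounds)]
    return min(candidates, key=cost)
```

Values that are already integral in the relaxed solution are reproduced exactly, so randomness only affects tasks that the relaxation splits across several PEs.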
Static Routing in Symmetric Real-Time Network-on-Chips
Cited by 5 (2 self)
With the rising number of cores on a single chip, the question of how to organize the communication among those cores becomes more and more relevant. A common solution is to use a network-on-chip (NoC) that provides communication bandwidth, routing, and arbitration among the cores. The use of NoCs in real-time systems is problematic, since the shared network and all cores connected to it have to be analyzed to derive time bounds of real-time tasks. We propose to use a statically scheduled, time-division-multiplexed NoC design that allows a decoupled analysis of individual real-time tasks. Our network provides virtual circuits between all cores. These virtual circuits are implemented by delivering messages periodically on a static, fixed routing schedule. Since the routing does not change, it can be pre-computed offline. This work focuses on the computation of routing schedules for symmetric NoC topologies, e.g., torus and hypercube. Due to the symmetry, the all-to-all communication can be modeled via simplified communication patterns that are concurrently processed by all routers. The scheduling problem is solved by a heuristic that tries to maximize the overlap of active patterns. Our experiments show that, for larger networks, our heuristic yields schedule lengths that are only 15% to 20% longer than theoretical lower bounds. Categories and Subject Descriptors: C.3 [Special-Purpose and Application-Based Systems]: Real-time and embedded systems.
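The abstract does not say which lower bounds are used. One simple candidate is a link-capacity bound, sketched here for a hypercube under the assumptions that each directed link carries at most one message per slot and that every message follows a shortest path. The function name `hypercube_link_bound` is hypothetical.

```python
def hypercube_link_bound(dim):
    """Link-capacity lower bound on the all-to-all schedule length.

    Total hops over all ordered node pairs divided by the number of directed
    links, rounded up. The hop distance in a hypercube is the Hamming distance.
    """
    n = 1 << dim
    total_hops = sum(bin(a ^ b).count("1") for a in range(n) for b in range(n))
    directed_links = n * dim      # each node has `dim` outgoing links
    return -(-total_hops // directed_links)   # ceiling division

print(hypercube_link_bound(3))  # prints 4
```

For a 3-dimensional hypercube the 8 nodes generate 96 hops in total over 24 directed links, so no schedule in this model can be shorter than 4 slots.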
Multi-path routing in time-division-multiplexed networks on chip
- in VLSI-SoC, 2009
Run-time Mapping of Applications on FPGA-based Reconfigurable Systems
Cited by 1 (1 self)
... FPGAs) in System-on-Chip (SoC) design considerably increased in the last few years. Their established importance is due to the large amount of hardware resources they offer, to their increasing performance, and to their support for reconfigurability. Even though FPGAs seem to have reached maturity, there is still a lack of Computer-Aided Design (CAD) tools able to deal with dynamic reconfiguration. Existing algorithms aim at optimizing the performance of a set of applications, basing the computation on classic metrics (such as communication overhead), while reconfiguration-related issues are not taken into consideration. This work proposes a design methodology to map several applications on the FPGA area at run-time. Starting from a basic solution found at design-time for the initial set of applications, the proposed algorithm makes it possible to map a new application (not known at design-time), both minimizing the number of synthesis processes and optimizing the on-chip performance of the new application. Experimental results show that the proposed approach is able to achieve up to an 18% reduction in the number of reconfigurations with respect to an off-line static-mapping approach, while generally preserving the performance of the executed applications on the FPGA.
A Mapping Flow for Dynamically Reconfigurable Multi-Core System-on-Chip Design
Abstract: Nowadays, multi-core systems-on-chip (SoCs) are typically required to execute multiple complex applications, which demand a large set of heterogeneous hardware cores with different sizes. In this context, the popularity of dynamically reconfigurable platforms is growing, as they increase the ability of the initial design to adapt to future modifications. This paper presents a design flow to efficiently map multiple multi-core applications on a dynamically reconfigurable SoC. The proposed methodology is tailored for a reconfigurable hardware architecture based on a flexible communication infrastructure, and exploits application similarities to obtain an effective mapping. We also introduce a run-time mapper that is able to add new applications that were not known at design-time, preserving the mapping of the original system. We apply our design flow to a real-world multimedia case study and to a set of synthetic benchmarks, showing that it is able to extract similarities among the applications, as it achieves an average improvement of 29% in terms of reconfiguration latency with respect to a communication-oriented approach, while preserving the same communication performance. Index Terms: field programmable gate arrays, platform-based design, reconfigurable architectures, run-time adaptability.
Run-Time Mapping for Dynamically-Added Applications in Reconfigurable Embedded Systems
- in 2009 International Conference on Microelectronics
Abstract: The increasing popularity of multi-core System-on-Chip platforms introduces new challenges, both in terms of hardware platforms and design methodologies. Dynamic reconfiguration can be exploited to increase the flexibility of the system and to implement multiple applications, since it is possible to easily switch between them by reconfiguring part of the device at run-time. Additionally, new applications may be included in the system after the design-time synthesis has been completed. This paper addresses the problem of mapping new applications on the device area at run-time, by reusing existing components of the system. We propose a heuristic technique that is able to determine in a short time how the new application should be mapped and, thanks to the reuse policy, to immediately deploy the solution on the device. The proposed algorithm also takes into consideration two conflicting performance metrics, in order to generate a good-quality result.
On the Capacity of Bufferless Networks-on-Chip
Abstract: Networks-on-Chip (NoCs) form an emerging paradigm for communications within chips. In particular, bufferless NoCs require significantly less area and power consumption, but also pose novel major scheduling problems to achieve full capacity. In this paper, we provide first insights on the capacity of bufferless NoCs. In particular, we present optimal periodic schedules for several bufferless NoCs with a complete-exchange traffic pattern. These schedules particularly fit distributed-programming models and network congestion-control mechanisms. Finally, we analytically evaluate the performance of our scheduling algorithms.
3D Network-on-Chip Architectures Using Homogeneous
- doi:10.1155/2010/603059, 2010
We propose new 3D 2-layer and 3-layer NoC architectures that utilize homogeneous regular mesh networks on a separate layer and one or two heterogeneous floorplanning layers. These architectures combine the benefits of compact heterogeneous floorplans and of regular mesh networks. To demonstrate these benefits, a design methodology that integrates floorplanning, router assignment, and cycle-accurate NoC simulation is proposed. The implementation of the NoC on a separate layer offers additional area that may be utilized to improve the network performance by increasing the number of virtual channels, the buffer size, or the mesh size. Experimental results show that increasing the number of virtual channels rather than the buffer size has a higher impact on network performance. Increasing the mesh size can significantly improve the network performance under the assumption that the clock frequency is given by the length of the physical links. In addition, the 3-layer architecture can offer significantly better network performance compared to the 2-layer architecture.