Results 1 - 10
of
24
Outstanding Research Problems in NoC Design: Circuit-, Microarchitecture-, and System-Level Perspectives
"... Abstract—Networks-on-Chip (NoCs) have been recently proposed to replace global interconnects in order to alleviate complex communication problems. While several research problems concerning NoC design have been already addressed in the literature, many others remain to be solved. In this work, we fi ..."
Abstract
-
Cited by 52 (1 self)
- Add to MetaCart
(Show Context)
Abstract—Networks-on-Chip (NoCs) have been recently proposed to replace global interconnects in order to alleviate complex communication problems. While several research problems concerning NoC design have been already addressed in the literature, many others remain to be solved. In this work, we first provide a general description of NoC architectures and applications. Then, we enumerate several related research problems organized under five main categories: Application characterization, communication paradigm, communication infrastructure, analysis and solution evaluation. Motivation, problem formulation, proposed approaches and open issues are discussed for each problem enumerated in the paper from circuit, micro-architecture and systemlevel perspectives. Finally, we address the interactions among these research problems and put the NoC design process into perspective. Index terms — On-chip communication architecture, networks-onchip, multiprocessor system-on-chip, CMP. I.
A Unified Approach to Mapping and Routing on a Network-on-Chip for Both Best-Effort and Guaranteed Service Traffic
, 2007
"... One of the key steps in Network-on-Chip-based design is spatial mapping of cores and routing of the communication between those cores. Known solutions to the mapping and routing problems first map cores onto a topology and then route communication, using separate and possibly conflicting objective f ..."
Abstract
-
Cited by 12 (2 self)
- Add to MetaCart
One of the key steps in Network-on-Chip-based design is spatial mapping of cores and routing of the communication between those cores. Known solutions to the mapping and routing problems first map cores onto a topology and then route communication, using separate and possibly conflicting objective functions. In this paper, we present a unified single-objective algorithm, called Unified MApping, Routing, and Slot allocation (UMARS+). As the main contribution, we show how to couple path selection, mapping of cores, and channel time-slot allocation to minimize the network required to meet the constraints of the application. The time-complexity of UMARS+ is low and experimental results indicate a run-time only 20 % higher than that of path selection alone. We apply the algorithm to an MPEG decoder System-on-Chip, reducing area by 33%, power dissipation by 35%, and worst-case latency by a factor four over a traditional waterfall approach.
NoCOUT: NoC Topology Generation with Mixed Packet-switched and Point-to-point Networks
- Proc. ASPDAC
, 2008
"... Networks-on-Chip (NoC) have been widely proposed as the future communication paradigm for use in next-generation System-on-Chip. In this paper, we present NoCOUT, a methodology for generating an energy optimized application specific NoC topology which supports both point-to-point and packet-switched ..."
Abstract
-
Cited by 9 (0 self)
- Add to MetaCart
(Show Context)
Networks-on-Chip (NoC) have been widely proposed as the future communication paradigm for use in next-generation System-on-Chip. In this paper, we present NoCOUT, a methodology for generating an energy optimized application specific NoC topology which supports both point-to-point and packet-switched networks. The algorithm uses a prohibitive greedy iterative improvement strategy to explore the de-sign space efficiently. A system-level floorplanner is used to evaluate the iterative design improvements and provide feedback on the effects of the topology on wire length. The algorithm is integrated within a NoC synthesis framework with characterized NoC power and area models to allow accurate explo-ration for a NoC router library. We apply the topology generation al-gorithm to several test cases including real-world and synthetic com-munication graphs with both regular and irregular traffic patterns, and varying core sizes. Since the method is iterative, it is possible to start with a known design to search for improvements. Experimental re-sults show that many different applications benefit from a mix of “on chip networks ” and “point-to-point networks”. With such a hybrid network, we achieve approximately 25 % lower energy consumption (with a maximum of 37%) than a state of the art min-cut partition based topology generator for a variety of benchmarks. In addition, the average hop count is reduced by 0.75 hops, which would significantly reduce the network latency. 1.
Elastic Flow in an Application Specific Network-on-Chip
- in: Third International Workshop on Formal Methods in Globally Asynchronous Locally Synchronous Design (FMGALS 07), Elsevier Electronic Notes in Theoretical Computer Scinece
, 2007
"... A Network-on-Chip (NoC) is increasingly needed to interconnect the large number and variety of Intellectual Property (IP) cells that make up a System-on-Chip (SoC). The network must be able to communicate between cells in different clock domains, and do so with minimal space, power, and latency over ..."
Abstract
-
Cited by 8 (6 self)
- Add to MetaCart
(Show Context)
A Network-on-Chip (NoC) is increasingly needed to interconnect the large number and variety of Intellectual Property (IP) cells that make up a System-on-Chip (SoC). The network must be able to communicate between cells in different clock domains, and do so with minimal space, power, and latency overhead. In this paper, we describe an asynchronous NoC using an elastic-flow protocol, and methods of automatically generating a topology and router placement. We use the communication profile of the SoC design to drive the binary-tree topology creation and the physical placement of routers, and a force-directed approach to determine router locations. The nature of elastic-flow removes the need for large router buffers, and thus we gain a significant power and space advantage compared to traditional NoCs. Additionally, our network is deadlock-free, and paths have bounded worst-case communication latencies. Keywords: VLSI, GALS, Network-on-chip, Asynchronous 1
VISION: a framework for voltage island aware synthesis of interconnection networks-on-chip
- GLSVLSI
"... High power dissipation has today become one of the major challenges in chip multiprocessor (CMP) design. Designers in recent years have proposed several techniques to alleviate the power challenge, one of which is the use of voltage islands (VIs) that can help reduce both switching and standby compo ..."
Abstract
-
Cited by 8 (8 self)
- Add to MetaCart
High power dissipation has today become one of the major challenges in chip multiprocessor (CMP) design. Designers in recent years have proposed several techniques to alleviate the power challenge, one of which is the use of voltage islands (VIs) that can help reduce both switching and standby components of power. The use of VIs allows groups of cores to be powered by the same supply source and permits operating different portions of the chip at different voltage levels in order to optimize the overall chip power consumption. However, the problems of VI creation, core to VI mapping, and VI-aware network on chip (NoC) design to satisfy application performance constraints are non-trivial and will only get harder as the number of cores in CMPs increase into the hundreds. In this paper, we propose a novel framework (VISION) for automating the synthesis of regular networks on chip (NoC) with VIs, to satisfy application performance while minimizing chip power dissipation. Our proposed framework uses a set of novel algorithms and heuristics to generate solutions that reduce network traffic by up to 60 % and power dissipation by up to 11%, compared to the best known prior work that also solves the same problem.
Application specific network-on-chip design with guaranteed quality approximation algorithms
- in Proc. ASPDAC, 2007
"... Abstract—Network-on-Chip (NoC) architectures with opti-mized topologies have been shown to be superior to regular architectures (such as mesh) for application specific multi-processor System-on-Chip (MPSoC) devices. The application spe-cific NoC design problem takes as input the system-level floorpl ..."
Abstract
-
Cited by 6 (1 self)
- Add to MetaCart
(Show Context)
Abstract—Network-on-Chip (NoC) architectures with opti-mized topologies have been shown to be superior to regular architectures (such as mesh) for application specific multi-processor System-on-Chip (MPSoC) devices. The application spe-cific NoC design problem takes as input the system-level floorplan of the computation architecture, characterized library of NoC components, and the communication performance requirements. The objective is to generate an optimized NoC topology, and routes for the communication traces on the architecture such that the performance requirements are satisfied and power consumption is minimized. The paper discusses a two stage automated approach consisting of i) core to router mapping, and ii) topology and route generation for design of custom NoC architectures. In particular it presents an optimal technique for core to router mapping (stage i), and a factor 2 approximation algorithm for custom topology generation (stage ii). The superior quality of the techniques is established by experimentation with benchmark applications, and comparisons with integer linear programming (ILP) formulations, and heuristic techniques. I.
Automated Techniques for Synthesis of Application-Specific Network-on-Chip Architectures
"... Abstract—This paper addresses the automated synthesis of a custom network-on-chip architecture whose topology is optimized for the specific communication requirements of the target de-vice. The optimization objectives include power consumption and resource usage. This paper presents a two-stage synt ..."
Abstract
-
Cited by 6 (0 self)
- Add to MetaCart
(Show Context)
Abstract—This paper addresses the automated synthesis of a custom network-on-chip architecture whose topology is optimized for the specific communication requirements of the target de-vice. The optimization objectives include power consumption and resource usage. This paper presents a two-stage synthesis ap-proach consisting of the following: 1) core to router mapping and 2) custom topology and route generation. In particular, it presents an optimal technique for core to router mapping [stage 1)] and a factor-2 approximation algorithm for custom topology generation [stage 2)]. The superior quality of the techniques is established by experimentation with benchmark applications and by compar-isons with existing approaches. Index Terms—Application specific integrated circuit (ASIC), approximation methods, design automation, network-on-chip. I.
A Method to Remove Deadlocks in Networks-on-Chips with Wormhole Flow Control
"... Networks-on-Chip (NoCs) are a promising interconnect paradigm to address the communication bottleneck of Systems-on-Chip (SoCs). Wormhole flow control is widely used as the transmission protocol in NoCs, as it offers high throughput and low latency. To match the application characteristics, customiz ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
(Show Context)
Networks-on-Chip (NoCs) are a promising interconnect paradigm to address the communication bottleneck of Systems-on-Chip (SoCs). Wormhole flow control is widely used as the transmission protocol in NoCs, as it offers high throughput and low latency. To match the application characteristics, customized irregular topologies and routing functions are used. With wormhole flow control and custom irregular NoC topologies, deadlocks can occur during system operation. Ensuring a deadlock free operation of custom NoCs is a major challenge. In this paper, we address this important issue and present a method to remove deadlocks in application-specific NoCs. Our method can be applied to any NoC topology and routing function, and the potential deadlocks are removed by adding minimal number of virtual or physical channels. Experiments on a variety of realistic benchmarks show that our method results in a large reduction in the number of resources needed (88 % on average) and NoC power consumption, area reduction (66 % area savings on average) when compared to the state-of-the-art deadlock removal methods.
Systematic Customization of On-Chip Crossbar Interconnects
"... Abstract. In this paper, we present a systematic design and implementation of reconfigurable interconnects on demand. The proposed on-chip interconnection network provides identical physical topologies to logical topologies for given applications. The network has been implemented with parameterized ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
(Show Context)
Abstract. In this paper, we present a systematic design and implementation of reconfigurable interconnects on demand. The proposed on-chip interconnection network provides identical physical topologies to logical topologies for given applications. The network has been implemented with parameterized switches, dynamically multiplexed by a traffic controller. Considering practical media applications, a multiprocessor system combined with the presented network has been integrated and prototyped in Virtex-II Pro FPGA using the ESPAM design environment. The experiment shows that the network realizes on-demand traffic patterns, occupies on average 59 % less area, and maintains performance comparable with a conventional crossbar. 1
2B-3 Communication Architecture Synthesis of Cascaded Bus Matrix
"... Abstract – For high frequency on-chip communication architecture design, we propose cascaded bus matrix-based solutions. Due to the huge design space in cascaded bus matrix design, it is crucial to perform an efficient design space exploration. In our work, we present a simulated annealing-based des ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
(Show Context)
Abstract – For high frequency on-chip communication architecture design, we propose cascaded bus matrix-based solutions. Due to the huge design space in cascaded bus matrix design, it is crucial to perform an efficient design space exploration. In our work, we present a simulated annealing-based design space exploration method. For an efficient representation of bus topology, we propose an encoding method called traffic group encoding and apply it to AMBA3 AXI-based bus system design. In addition, we propose a method of two-step simulated annealing to improve the quality of results. Experimental results show that the proposed methods allow designing complex communication architectures (ones with up to 31 masters and 71 slaves) with high frequency constraints to which existing methods could not give solutions. I.