Outstanding Research Problems in NoC Design: Circuit-, Microarchitecture-, and System-Level Perspectives
Cited by 52 (1 self)
Abstract: Networks-on-Chip (NoCs) have recently been proposed to replace global interconnects in order to alleviate complex communication problems. While several research problems concerning NoC design have already been addressed in the literature, many others remain to be solved. In this work, we first provide a general description of NoC architectures and applications. Then, we enumerate several related research problems organized under five main categories: application characterization, communication paradigm, communication infrastructure, analysis, and solution evaluation. Motivation, problem formulation, proposed approaches, and open issues are discussed for each problem enumerated in the paper from circuit, micro-architecture, and system-level perspectives. Finally, we address the interactions among these research problems and put the NoC design process into perspective. Index terms: On-chip communication architecture, networks-on-chip, multiprocessor system-on-chip, CMP.
Simulation and analysis of network on chip architectures: ring, spidergon and 2d mesh
- in Design, Automation and Test in Europe (DATE’06)
Cited by 21 (2 self)
NoC architectures can be adopted to support general communications among multiple IPs over multi-processor Systems on Chip (SoCs). In this work we illustrate the modeling and simulation-based analysis of some recent architectures for Network on Chip (NoC). Specifically, the Ring, Spidergon and 2D Mesh NoC topologies have been compared, both under uniform load and under more realistic load assumptions in the SoC domain. The main performance indexes considered are NoC throughput and latency, as a function of variable data-injection rates, source and destination distributions, and variable number of nodes. Results show that the Spidergon topology offers a good trade-off between the performance and scalability of the most efficient architectures inherited from parallel computing systems design and the SoC constraints of simple management and small energy and area requirements.
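As a rough illustration of why the three topologies behave differently, the sketch below computes the average shortest-path hop count of each by breadth-first search. It is not the simulation model used in the paper: the function names are hypothetical, Spidergon is modeled as a ring with one extra "across" link per node, and hop count is only a crude proxy for latency under light load.

```python
from collections import deque

def ring(n):
    """Bidirectional ring: each node links to its two neighbours."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

def spidergon(n):
    """Ring plus one 'across' link to the diametrically opposite node (n even)."""
    if n % 2:
        raise ValueError("Spidergon needs an even number of nodes")
    return {i: [(i - 1) % n, (i + 1) % n, (i + n // 2) % n] for i in range(n)}

def mesh(rows, cols):
    """2D mesh without wrap-around links; node id = r * cols + c."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            nbrs = []
            if r > 0:
                nbrs.append((r - 1) * cols + c)
            if r < rows - 1:
                nbrs.append((r + 1) * cols + c)
            if c > 0:
                nbrs.append(r * cols + c - 1)
            if c < cols - 1:
                nbrs.append(r * cols + c + 1)
            adj[r * cols + c] = nbrs
    return adj

def average_hops(adj):
    """Mean shortest-path length over all ordered pairs of distinct nodes."""
    total = 0
    pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

if __name__ == "__main__":
    for name, adj in [("ring", ring(16)), ("spidergon", spidergon(16)), ("mesh 4x4", mesh(4, 4))]:
        print(name, round(average_hops(adj), 3))
```

Throughput and latency under load depend on routing, buffering and traffic patterns, which this sketch deliberately leaves out.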
A Unified Approach to Mapping and Routing on a Network-on-Chip for Both Best-Effort and Guaranteed Service Traffic
, 2007
Cited by 12 (2 self)
One of the key steps in Network-on-Chip-based design is spatial mapping of cores and routing of the communication between those cores. Known solutions to the mapping and routing problems first map cores onto a topology and then route communication, using separate and possibly conflicting objective functions. In this paper, we present a unified single-objective algorithm, called Unified MApping, Routing, and Slot allocation (UMARS+). As the main contribution, we show how to couple path selection, mapping of cores, and channel time-slot allocation to minimize the network required to meet the constraints of the application. The time-complexity of UMARS+ is low and experimental results indicate a run-time only 20% higher than that of path selection alone. We apply the algorithm to an MPEG decoder System-on-Chip, reducing area by 33%, power dissipation by 35%, and worst-case latency by a factor of four over a traditional waterfall approach.
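The coupling between path selection and time-slot allocation can be illustrated with a small sketch. It is not the UMARS+ algorithm: it only shows a first-fit slot search along an already chosen path, and it assumes (this is not stated in the abstract) a pipelined TDM scheme in which data that uses slot s on the first link uses slot s + k on the k-th link. All names are hypothetical.

```python
def find_slot(path, tables, num_slots):
    """Return the first start slot s for which link k of the path is free
    in slot (s + k) % num_slots, or None if no such slot exists."""
    for s in range(num_slots):
        if all(tables[link][(s + k) % num_slots] is None
               for k, link in enumerate(path)):
            return s
    return None

def reserve(path, tables, num_slots, channel):
    """Reserve one slot per link along the path for `channel`.
    Returns the start slot, or None if the path cannot host the channel."""
    s = find_slot(path, tables, num_slots)
    if s is None:
        return None
    for k, link in enumerate(path):
        tables[link][(s + k) % num_slots] = channel
    return s

if __name__ == "__main__":
    num_slots = 4
    # One slot table per link; None marks a free slot.
    tables = {"L0": [None] * num_slots, "L1": [None] * num_slots}
    print(reserve(["L0", "L1"], tables, num_slots, "c0"))
    print(reserve(["L0", "L1"], tables, num_slots, "c1"))
    print(tables)
```

A path on which no start slot is free would force a different path or a different mapping, which is one reason the three decisions are hard to separate cleanly.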
A Power-Aware Mapping Approach to Map IP Cores onto NoCs under Bandwidth and Latency Constraints
, 2010
Cited by 10 (4 self)
In this article, we investigate the Intellectual Property (IP) mapping problem that maps a given set of IP cores onto the tiles of a mesh-based Network-on-Chip (NoC) architecture such that the power consumption due to intercore communications is minimized. This IP mapping problem is considered under both bandwidth and latency constraints as imposed by the applications and the on-chip network infrastructure. By examining various applications' communication characteristics extracted from their respective communication trace graphs, two distinguishable connectivity templates are identified: graphs with tightly coupled vertices and graphs with distributed vertices. These two templates are formally defined in this article, and different mapping heuristics are subsequently developed to map them. In general, tightly coupled vertices are mapped onto tiles that are physically close to each other, while the distributed vertices are mapped following a graph partition scheme. Experimental results on both random and multimedia benchmarks have confirmed that the proposed template-based mapping algorithm achieves an average of 15% power savings as compared with MOCA, a fast greedy-based mapping algorithm. Compared with a branch-and-bound-based mapping algorithm, which produces near optimal results but incurs an extremely high computation cost, the proposed algorithm, due to its polynomial runtime complexity, can generate
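The intuition that tightly coupled cores belong on nearby tiles can be made concrete with a small sketch. It is not the template-based heuristic from the article: it assumes a simple cost proxy (bandwidth multiplied by Manhattan hop distance, summed over all communicating pairs), ignores the bandwidth and latency constraints, and uses an exhaustive search that is only feasible for a handful of cores. All names and numbers are hypothetical.

```python
from itertools import permutations

def manhattan(a, b):
    """Hop distance between two mesh tiles given as (row, col)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def communication_cost(traffic, placement):
    """Sum of bandwidth * hop distance over all communicating core pairs."""
    return sum(bw * manhattan(placement[src], placement[dst])
               for (src, dst), bw in traffic.items())

def best_placement(traffic, cores, tiles):
    """Exhaustive search over all placements; returns (cost, placement)."""
    best = None
    for perm in permutations(tiles, len(cores)):
        placement = dict(zip(cores, perm))
        cost = communication_cost(traffic, placement)
        if best is None or cost < best[0]:
            best = (cost, placement)
    return best

if __name__ == "__main__":
    # Illustrative traffic volumes between hypothetical cores.
    traffic = {("cpu", "mem"): 100, ("cpu", "dsp"): 10, ("dsp", "io"): 5}
    tiles = [(0, 0), (0, 1), (1, 0), (1, 1)]
    spread = {"cpu": (0, 0), "mem": (1, 1), "dsp": (0, 1), "io": (1, 0)}
    print(communication_cost(traffic, spread))
    print(best_placement(traffic, ["cpu", "mem", "dsp", "io"], tiles))
```

The exhaustive search grows factorially with the number of cores, which is why heuristics with polynomial runtime are of interest for realistic designs.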
On network-on-chip comparison
Cited by 10 (1 self)
Abstract: This paper presents the state of the art in the field of network-on-chip (NoC) benchmarking and comparison. The study identifies the mainstream approaches and how NoCs are currently evaluated, and shows which aspects have been covered and which need more research effort. No single article can cover all the aspects; therefore, the possibility to compare results from various sources must be ensured by proper scientific reporting. Basic guidelines for achieving that are given.
Computation and communication refinement for multiprocessor SoC design: A system-level perspective
- In Proc. of DAC
, 2004
Cited by 10 (0 self)
Continuous advancements in semiconductor technology enable the design of complex systems-on-chips (SoCs) composed of tens or hundreds of IP cores. At the same time, the applications that need to run on such platforms have become increasingly complex and have tight power and performance requirements. Achieving a satisfactory design quality under these circumstances is only possible when both computation and communication refinement are performed efficiently, in an automated and synergistic manner. Consequently, formal and disciplined system-level design methodologies are in great demand for future multiprocessor design. This article provides a broad overview of some fundamental research issues and state-of-the-art solutions concerning both computation and communication aspects of system-level design. The methodology we advocate consists of developing abstract application and platform models, followed by application mapping onto the target platform, and then optimizing the overall system via performance analysis. In addition, a communication refinement step is critical for optimizing the communication infrastructure in this multiprocessor setup. Finally, simulation and prototyping can be used for accurate performance evaluation purposes.
A low-overhead asynchronous interconnection network for GALS chip multiprocessor. TCAD p. 494-507, special Issue for NOCS, Apr
- 2010
Cited by 8 (2 self)
A new asynchronous interconnection network is introduced for globally-asynchronous locally-synchronous (GALS) chip multiprocessors. The network eliminates the need for global clock distribution, and can interface multiple synchronous timing domains operating at unrelated clock rates. In particular, two new highly-concurrent asynchronous components are introduced which provide simple routing and arbitration/merge functions. Post-layout simulations in identical commercial 90nm technology indicate that comparable recent synchronous router nodes have 5.6-10.7x more energy per packet and 2.8-6.4x greater area than the new asynchronous nodes. Under random traffic, the network provides significantly lower latency and competitive throughput over the entire operating range of the 800 MHz network and through mid-range traffic rates for the 1.36 GHz network, but with degradation at higher traffic rates. Preliminary evaluations are also presented for a mixed-timing (GALS) network in a shared-memory parallel architecture, running both random traffic and parallel benchmark kernels, as well as directions for further improvement.
Exploration of Alternative Topologies for Application-Specific 3D Networks-on-Chip
Cited by 4 (0 self)
Three dimensional (3D) Network-on-Chip (NoC) architectures combine the benefits of new integration technologies with NoC-style interconnection of a large number of IP cores in a single package. In this work, we propose a fully software-supported exploration methodology capable of defining pattern-based, alternative interconnection topologies for application-specific multi-layered 3D NoC architectures. The focus of our exploration is on the number of vertical interconnects (or through silicon vias) connecting grids of different layers, considering the mesh and torus architectures. Existing 3D NoCs assume that every router of a grid can communicate directly with the neighboring routers of the same grid and with the ones of the adjacent layers. We show that this full vertical connectivity is not needed. The exploration methodology is able to evaluate pattern-based 3D topologies and propose the ones that meet the design constraints best. We evaluate the exploration employing and extending the Worm Sim NoC simulator and feeding it with various types of traffic. In this way, we achieve a decrease in the number of 3D routers and in the number of vertical vias, resulting in a decrease in the area occupied by the switch blocks, reducing energy dissipation and paying a negligible penalty in the latency of the 3D NoC.
Long-Range Dependence and On-chip Processor Traffic
, 2009
Cited by 3 (0 self)
Long-range dependence is a property of stochastic processes that has an important impact on network performance, especially on the buffer usage in routers. We analyze the presence of long-range dependence in on-chip processor traffic and we study the impact of long-range dependence on networks-on-chip. We propose to investigate the presence of long-range dependence in communication traces of processor IPs at the cycle-accurate level. We also study the impact of long-range dependence on a real network-on-chip using the SocLib simulation environment and traffic generators of our own. Our experiments show that long-range dependence is not a ubiquitous property of on-chip processor traffic and that its impact on the network-on-chip is highly correlated with the low-level communication protocol used. Key words: Network on Chip, System on Chip, embedded software, long range dependence, network traffic
Link-load balance aware mapping and routing for NoC
- WSEAS Transactions on Circuits and Systems
, 2007
Cited by 3 (0 self)
Abstract: The paper presents a novel mapping and routing technique for the mesh-based NoC design problem with the objective of minimizing the energy consumption and the normalized worst link-load. The proposed algorithm, PLBMR, is a two-phase process based on particle swarm optimization (PSO): the first phase maps cores onto the NoC to minimize the NoC communication energy consumption, and the second allocates routing paths to keep the link-load balanced. The paper also presents the detailed implementation of the PLBMR algorithm. Experimental results show that the proposed technique can reduce the normalized worst link-load by 20% on average while guaranteeing a low energy consumption.
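To show what a worst link-load objective measures, the sketch below accumulates the bandwidth carried by each directed mesh link and reports the maximum. It is not PLBMR and contains no PSO: it assumes deterministic XY routing (route along the first coordinate, then the second) purely to have fixed paths to evaluate, and it omits the normalization, which the abstract does not define. All names and numbers are hypothetical.

```python
from collections import defaultdict

def xy_route(src, dst):
    """Directed links visited when moving along the first coordinate, then the second."""
    x, y = src
    links = []
    while x != dst[0]:
        nx = x + (1 if dst[0] > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dst[1]:
        ny = y + (1 if dst[1] > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links

def link_loads(traffic, placement):
    """Total bandwidth routed over each directed link."""
    loads = defaultdict(float)
    for (src, dst), bw in traffic.items():
        for link in xy_route(placement[src], placement[dst]):
            loads[link] += bw
    return dict(loads)

def worst_link_load(traffic, placement):
    """Largest load on any single link (0 if nothing is routed)."""
    loads = link_loads(traffic, placement)
    return max(loads.values()) if loads else 0.0

if __name__ == "__main__":
    # Illustrative traffic between hypothetical cores placed on one mesh row.
    traffic = {("a", "b"): 4, ("c", "b"): 6}
    placement = {"a": (0, 0), "c": (1, 0), "b": (2, 0)}
    print(link_loads(traffic, placement))
    print(worst_link_load(traffic, placement))
```

A routing phase that is free to choose among several minimal paths can lower this maximum by steering flows away from the most heavily used link.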