Near-optimal oblivious routing on three-dimensional mesh networks
- International Conference on Computer Design, 2008
Cited by 3 (3 self)
Abstract: The increasing viability of three-dimensional (3D) silicon integration technology has opened new opportunities for chip architecture innovations. One direction is the extension of two-dimensional (2D) mesh-based tiled chip-multiprocessor architectures into three dimensions. In this paper, we focus on efficient routing algorithms for such 3D mesh networks. As in the case of 2D mesh networks, throughput and latency are important design metrics for routing algorithms. Existing routing algorithms suffer from either poor worst-case throughput (DOR [1], ROMM [3]) or poor latency (VAL [2]). Although the minimal routing algorithm O1TURN proposed in [4] already achieves near-optimal worst-case throughput for the 2D case, the optimality result does not extend to higher dimensions. For 3D and higher-dimensional meshes, the worst-case throughput of O1TURN degrades tremendously. The main contribution of this paper is the design of a new oblivious routing algorithm for 3D mesh networks called Randomized Partially-Minimal (RPM) routing. RPM provably achieves optimal worst-case throughput for 3D meshes when the network radix k is even, and comes within a factor of 1/k^2 of optimal worst-case throughput when k is odd. RPM also outperforms VAL, DOR, ROMM, and O1TURN in average-case throughput by 33.3%, 111%, 47%, and 30%, respectively, when averaged over one million random traffic patterns on an 8 × 8 × 8 topology. Finally, whereas VAL achieves optimal worst-case throughput at a penalty factor of 2 in average latency over DOR, RPM achieves (near) optimal worst-case throughput with a much smaller factor of 1.33. In practice, the average latency of RPM is expected to be closer to that of minimal routing because 3D mesh networks are not expected to be symmetric in 3D chip designs: the number of available device layers is expected to be much smaller than the number of processor tiles that can be placed along an edge of a device layer. For practical asymmetric 3D mesh configurations, the average latency of RPM reduces to just a factor of 1.11 of DOR.
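The latency factors quoted in the abstract above can be sanity-checked with a simple hop-count model. The sketch below is not taken from the paper. It assumes uniform random source and destination tiles, assumes DOR takes a minimal path in every dimension, and assumes RPM routes vertically to an intermediate layer chosen uniformly at random, routes minimally within that layer, and then routes vertically to the destination layer. All function names are hypothetical.

```python
from itertools import product


def mean_abs_diff(k):
    """Mean |a - b| for a and b drawn uniformly and independently from 0..k-1."""
    total = sum(abs(a - b) for a, b in product(range(k), repeat=2))
    return total / (k * k)


def dor_mean_hops(kx, ky, kz):
    """Average hop count of minimal (dimension-order) routing, uniform traffic."""
    return mean_abs_diff(kx) + mean_abs_diff(ky) + mean_abs_diff(kz)


def rpm_mean_hops(kx, ky, kz):
    """Average hop count under the assumed RPM model.

    Assumption: two vertical phases (source layer -> random layer,
    random layer -> destination layer), each averaging mean_abs_diff(kz),
    plus minimal routing in x and y on the intermediate layer.
    """
    return mean_abs_diff(kx) + mean_abs_diff(ky) + 2 * mean_abs_diff(kz)


def latency_factor(kx, ky, kz):
    """Ratio of RPM to DOR average hop count under the assumptions above."""
    return rpm_mean_hops(kx, ky, kz) / dor_mean_hops(kx, ky, kz)


if __name__ == "__main__":
    print(latency_factor(8, 8, 8))
    print(latency_factor(16, 16, 4))
```

Under this model a symmetric 8 × 8 × 8 mesh gives a factor of 4/3, in line with the quoted 1.33. A 16 × 16 × 4 mesh, chosen here purely for illustration, gives roughly 1.11; the abstract does not say which asymmetric configuration its 1.11 figure refers to.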
A Layer-Multiplexed 3D On-Chip Network Architecture
- IEEE Embedded Systems Letters, accepted for publication
Abstract: Programmable many-core processors are poised to become a major design option for many embedded applications. In the design of power-efficient embedded many-core processors, the architecture of the on-chip network plays a central role. Many designs have relied on a 2D mesh architecture as the underlying communication fabric. With the emergence of 3D technology, new on-chip network architectures are possible. In this paper, we propose a novel layer-multiplexed (LM) 3D network architecture that takes advantage of the short inter-layer wiring delays enabled in 3D technology. In particular, the LM architecture replaces the one-layer-per-hop routing in a conventional 3D mesh with simpler vertical demultiplexing and multiplexing structures. When combined with a layer-multiplexing oblivious routing algorithm, it can achieve the same worst-case throughput as the best known oblivious routing algorithm on a conventional 3D mesh. However, in comparison to a conventional 3D mesh, the LM architecture consumes 27% less power, attains 14.5% higher average throughput, and achieves 33% lower worst-case hop count.
A Novel 3D Layer-Multiplexed On-Chip Network
Recently, a near-optimal oblivious routing algorithm for 3D mesh networks called Randomized Partially-Minimal (RPM) routing was proposed [12], which works by load-balancing traffic across vertical layers and routing minimally on each horizontal layer. It achieves optimal worst-case throughput when the network radix k is even and comes within a factor of 1/k^2 of optimal when k is odd, and it achieves significantly lower latencies than Valiant routing [18], the best previously known optimal worst-case throughput algorithm. This paper presents a novel layer-multiplexed (LM) architecture for 3D on-chip networks that exploits the optimality of RPM together with the short inter-layer wiring delays enabled in 3D technology. The LM architecture replaces the one-layer-per-hop routing in a 3D mesh with simpler vertical demultiplexing and multiplexing structures. By adapting RPM routing to it, the proposed LM architecture can achieve the same worst-case throughput as a 3D mesh. However, the LM architecture consumes 27% less power, occupies 27% less area, attains 14.5% higher average throughput, and achieves 33% lower worst-case hop count for a symmetric 4 × 4 × 4 mesh topology. On an asymmetric 8 × 8 × 4 mesh, the LM architecture achieves average-case throughput comparable to that of a 3D mesh, but consumes 26% less power, takes up 27% less area, and attains 20% lower worst-case hop count.
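The worst-case hop-count reductions quoted above (33% on 4 × 4 × 4, 20% on 8 × 8 × 4) are consistent with a simple counting model, sketched below. The model is an assumption made for illustration, not something stated in the abstract: the conventional mesh is taken to run RPM with two full vertical traversals in the worst case, and the LM architecture is taken to replace each vertical traversal with a single hop through its demultiplexing or multiplexing structure. The function names are hypothetical.

```python
def mesh_rpm_worst_hops(kx, ky, kz):
    """Worst-case hops on a conventional 3D mesh under the assumed RPM model.

    Assumption: corner-to-corner in x and y, plus two vertical phases that
    each cross all kz layers one layer per hop.
    """
    return (kx - 1) + (ky - 1) + 2 * (kz - 1)


def lm_worst_hops(kx, ky, kz):
    """Worst-case hops on the LM architecture under the assumed model.

    Assumption: each vertical phase costs exactly one hop regardless of kz
    (meaningful only for kz >= 2).
    """
    return (kx - 1) + (ky - 1) + 2


def worst_hop_reduction(kx, ky, kz):
    """Fractional reduction in worst-case hop count, LM versus mesh."""
    mesh = mesh_rpm_worst_hops(kx, ky, kz)
    lm = lm_worst_hops(kx, ky, kz)
    return (mesh - lm) / mesh


if __name__ == "__main__":
    for dims in [(4, 4, 4), (8, 8, 4)]:
        print(dims, round(100 * worst_hop_reduction(*dims), 1), "% fewer hops")
```

With these assumptions the 4 × 4 × 4 case drops from 12 hops to 8 and the 8 × 8 × 4 case from 20 to 16, matching the quoted percentages. That agreement supports the model but does not confirm that the authors count hops this way.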
A Layer-Multiplexed 3D On-Chip Network Architecture
Abstract: Programmable many-core processors are poised to become a major design option for many embedded applications. In the design of power-efficient embedded many-core processors, the architecture of the on-chip network plays a central role. Many designs have relied on a 2D mesh architecture as the underlying communication fabric. With the emergence of 3D technology, new on-chip network architectures are possible. In this paper, we propose a novel layer-multiplexed (LM) 3D network architecture that takes advantage of the short inter-layer wiring delays enabled in 3D technology. In particular, the LM architecture replaces the one-layer-per-hop routing in a conventional 3D mesh with simpler vertical demultiplexing and multiplexing structures. When combined with a layer load-balanced oblivious routing algorithm, it can achieve the same worst-case throughput as the best known oblivious routing algorithm on a conventional 3D mesh. However, in comparison to a conventional 3D mesh, the LM architecture consumes 27% less power, attains 14.5% higher average throughput, and achieves 33% lower worst-case hop count on a 4 × 4 × 4 topology. Index Terms: 3D, 3D mesh, oblivious routing, integrated circuits (ICs).

