Results 1 - 10 of 13
Fastpass: A Centralized “Zero-Queue” Datacenter Network
"... An ideal datacenter network should provide several properties, in-cluding low median and tail latency, high utilization (throughput), fair allocation of network resources between users or applications, deadline-aware scheduling, and congestion (loss) avoidance. Current datacenter networks inherit th ..."
Abstract
-
Cited by 19 (0 self)
- Add to MetaCart
(Show Context)
An ideal datacenter network should provide several properties, including low median and tail latency, high utilization (throughput), fair allocation of network resources between users or applications, deadline-aware scheduling, and congestion (loss) avoidance. Current datacenter networks inherit the principles that went into the design of the Internet, where packet transmission and path selection decisions are distributed among the endpoints and routers. Instead, we propose that each sender should delegate control to a centralized arbiter, which decides when each packet should be transmitted and what path it should follow. This paper describes Fastpass, a datacenter network architecture built using this principle. Fastpass incorporates two fast algorithms: the first determines the time at which each packet should be transmitted, while the second determines the path to use for that packet. In addition, Fastpass uses an efficient protocol between the endpoints and the arbiter and an arbiter replication strategy for fault-tolerant failover. We deployed and evaluated Fastpass in a portion of Facebook’s datacenter network. Our results show that Fastpass achieves throughput comparable to current networks with a 240× reduction in queue lengths (4.35 Mbytes reduced to 18 Kbytes), achieves much fairer and more consistent flow throughputs than baseline TCP (a 5200× reduction in the standard deviation of per-flow throughput with five concurrent connections), scales from 1 to 8 cores in the arbiter implementation and can schedule 2.21 Terabits/s of traffic in software on eight cores, and achieves a 2.5× reduction in the number of TCP retransmissions in a latency-sensitive service at Facebook.
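As a rough illustration of the timeslot-allocation idea described above (not Fastpass’s actual pipelined algorithm), the sketch below greedily assigns each packet request the earliest timeslot in which both its source and destination are otherwise idle; the function name, horizon parameter, and request format are invented for this example.

```python
# Toy sketch of centralized timeslot allocation in the spirit of Fastpass:
# each timeslot admits at most one packet per source and per destination,
# so the arbiter greedily places each request in the earliest slot where
# both endpoints are free. Illustrative only; not the paper's algorithm.
from collections import defaultdict

def allocate_timeslots(requests, horizon=64):
    """requests: list of (src, dst) packet requests, in arrival order.
    Returns a mapping request_index -> timeslot (or None if the horizon is full)."""
    busy_src = defaultdict(set)   # timeslot -> sources already sending
    busy_dst = defaultdict(set)   # timeslot -> destinations already receiving
    schedule = {}
    for i, (src, dst) in enumerate(requests):
        for t in range(horizon):
            if src not in busy_src[t] and dst not in busy_dst[t]:
                busy_src[t].add(src)
                busy_dst[t].add(dst)
                schedule[i] = t
                break
        else:
            schedule[i] = None    # no free slot within the horizon
    return schedule

if __name__ == "__main__":
    demo = [("A", "B"), ("A", "C"), ("D", "B"), ("A", "B")]
    print(allocate_timeslots(demo))   # {0: 0, 1: 1, 2: 1, 3: 2}
```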
Integrating Microsecond Circuit Switching into the Data Center
"... Recent proposals have employed optical circuit switching (OCS) to reduce the cost of data center networks. However, the relatively slow switching times (10–100 ms) assumed by these approaches, and the accompanying latencies of their control planes, has limited its use to only the largest data center ..."
Abstract
-
Cited by 17 (3 self)
- Add to MetaCart
(Show Context)
Recent proposals have employed optical circuit switching (OCS) to reduce the cost of data center networks. However, the relatively slow switching times (10–100 ms) assumed by these approaches, and the accompanying latencies of their control planes, have limited OCS to only the largest data center networks with highly aggregated and constrained workloads. As faster switch technologies become available, designing a control plane capable of supporting them becomes a key challenge. In this paper, we design and implement an OCS prototype capable of switching in 11.5 µs, and we use this prototype to expose a set of challenges that arise when supporting switching at microsecond time scales. In response, we propose a microsecond-latency control plane based on a circuit scheduling approach we call Traffic Matrix Scheduling (TMS) that proactively communicates circuit assignments to communicating entities so that circuit bandwidth can be used efficiently.
Hunting Mice with Microsecond Circuit Switches
"... Recently, there have been proposals for constructing hybrid data center networks combining electronic packet switching with either wireless or optical circuit switching, which are ideally suited for supporting bulk traffic. Previous work has relied on a technique called hotspot scheduling, in which ..."
Abstract
-
Cited by 10 (5 self)
- Add to MetaCart
(Show Context)
Recently, there have been proposals for constructing hybrid data center networks combining electronic packet switching with either wireless or optical circuit switching, which are ideally suited for supporting bulk traffic. Previous work has relied on a technique called hotspot scheduling, in which the traffic matrix is measured, hotspots identified, and circuits established to automatically offload traffic from the packet-switched network. While this hybrid approach does reduce CAPEX and OPEX, it still relies on having a well-provisioned packet-switched network to carry the remaining traffic. In this paper, we describe a generalization of hotspot scheduling, called traffic matrix scheduling, where most or even all bulk traffic is routed over circuits. In other words, we don’t just hunt elephants, we also hunt mice. Traffic matrix scheduling rapidly time-shares circuits across many destinations at microsecond time scales. The traffic matrix scheduling algorithm can route arbitrary traffic patterns and runs in polynomial time. We briefly describe a working implementation of traffic matrix scheduling using a custom-built data center optical circuit switch with a 2.8 microsecond switching time.
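The sketch below conveys the flavor of traffic matrix scheduling: decomposing a demand matrix into a sequence of circuit configurations (one-to-one port matchings), each held long enough to drain part of the demand it covers. It uses a simple greedy heuristic rather than the paper’s polynomial-time, Birkhoff-von-Neumann-style decomposition, and all names and the demo matrix are illustrative assumptions.

```python
# Greedy decomposition of a demand matrix into circuit configurations.
# Each configuration is a partial one-to-one matching of source ports to
# destination ports, held until the smallest matched demand drains.
import numpy as np

def greedy_schedule(demand):
    """demand: n x n array of bytes to move between ports.
    Returns a list of (matching, duration) pairs covering all demand."""
    remaining = demand.astype(float).copy()
    schedule = []
    while remaining.sum() > 0:
        matching, used_dst = {}, set()
        # Pair each source with its largest remaining destination not yet taken.
        for src in np.argsort(-remaining.max(axis=1)):
            for dst in np.argsort(-remaining[src]):
                if remaining[src, dst] > 0 and dst not in used_dst:
                    matching[int(src)] = int(dst)
                    used_dst.add(int(dst))
                    break
        # Hold the configuration until the smallest matched demand drains.
        duration = min(remaining[s, d] for s, d in matching.items())
        for s, d in matching.items():
            remaining[s, d] -= duration
        schedule.append((matching, duration))
    return schedule

if __name__ == "__main__":
    D = np.array([[0, 3, 1],
                  [2, 0, 2],
                  [1, 1, 0]])
    for matching, dur in greedy_schedule(D):
        print(matching, dur)
```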
Bullet Trains: A study of NIC burst behavior at microsecond timescales
"... While numerous studies have examined the macro-level behavior of traffic in data center networks—overall flow sizes, destination variability, and TCP burstiness—little information is available on the behavior of data center traffic at packet-level timescales. Whereas one might assume that flows from ..."
Abstract
-
Cited by 7 (2 self)
- Add to MetaCart
(Show Context)
While numerous studies have examined the macro-level behavior of traffic in data center networks (overall flow sizes, destination variability, and TCP burstiness), little information is available on the behavior of data center traffic at packet-level timescales. Whereas one might assume that flows from different applications fairly share available link bandwidth, and that packets within a single flow are uniformly paced, the reality is more complex. To meet increasingly high link rates of 10 Gbps and beyond, batching is typically introduced across the network stack: at the application, middleware, OS, transport, and NIC layers. This batching results in short-term packet bursts, which have implications for the design and performance requirements of packet processing devices along the path, including middleboxes, SDN-enabled switches, and virtual machine hypervisors. In this paper, we study the burst behavior of traffic emanating from a 10-Gbps end host across a variety of data center applications. We find that at 10–100 microsecond timescales, the traffic exhibits large bursts (i.e., 10s of packets in length). We further find that the level of this burstiness is largely outside of application control and independent of the behavior of higher-level applications.
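To make microsecond-timescale burstiness concrete, the sketch below groups a flow’s packet timestamps into bursts whenever consecutive packets arrive within a small gap. The gap threshold and the trace format are assumptions for illustration, not the paper’s measurement methodology.

```python
# Group packet arrival times (in microseconds) into bursts: consecutive
# packets spaced closer than the threshold belong to the same burst.
def burst_lengths(timestamps_us, gap_threshold_us=10.0):
    """timestamps_us: sorted arrival times for one flow, in microseconds.
    Returns the list of burst sizes, in packets."""
    if not timestamps_us:
        return []
    bursts = [1]
    for prev, cur in zip(timestamps_us, timestamps_us[1:]):
        if cur - prev <= gap_threshold_us:
            bursts[-1] += 1        # same burst: packets back to back
        else:
            bursts.append(1)       # gap exceeded: a new burst starts
    return bursts

print(burst_lengths([0.0, 1.2, 2.0, 50.0, 51.1, 200.0]))  # [3, 2, 1]
```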
Network Support for Resource Disaggregation in Next-Generation Datacenters
"... Datacenters have traditionally been architected as a collection of servers wherein each server aggregates a fixed amount of computing, memory, storage, and communication resources. In this paper, we advocate an alternative construction in which the resources within a server are disaggregated and the ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
(Show Context)
Datacenters have traditionally been architected as a collection of servers wherein each server aggregates a fixed amount of computing, memory, storage, and communication resources. In this paper, we advocate an alternative construction in which the resources within a server are disaggregated and the datacenter is instead architected as a collection of standalone resources. Disaggregation brings greater modularity to datacenter infrastructure, allowing operators to optimize their deployments for improved efficiency and performance. However, the key enabling or blocking factor for disaggregation will be the network since communication that was previously contained within a single server now traverses the datacenter fabric. This paper thus explores the question of whether we can build networks that enable disaggregation at datacenter scales.
DIBS: Just-in-time Congestion Mitigation for Data Centers
"... Data centers must support a range of workloads with dif-fering demands. Although existing approaches handle rou-tine traffic smoothly, intense hotspots–even if ephemeral– cause excessive packet loss and severely degrade perfor-mance. This loss occurs even though congestion is typi-cally highly local ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
Data centers must support a range of workloads with differing demands. Although existing approaches handle routine traffic smoothly, intense hotspots, even if ephemeral, cause excessive packet loss and severely degrade performance. This loss occurs even though congestion is typically highly localized, with spare buffer capacity at nearby switches. In this paper, we argue that switches should share buffer capacity to effectively handle this spot congestion without the monetary hit of deploying large buffers at individual switches. Specifically, we present detour-induced buffer sharing (DIBS), a mechanism that achieves a near-lossless network without requiring additional buffers at individual switches. Using DIBS, a congested switch detours packets randomly to neighboring switches to avoid dropping the packets. We implement DIBS in hardware, on software routers in a testbed, and in simulation, and we demonstrate that it reduces the 99th percentile of delay-sensitive query completion time by up to 85%, with very little impact on other traffic.
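A minimal sketch of the detouring decision, assuming a toy switch model: when the intended output queue is full, the packet is bounced to a randomly chosen neighbor instead of being dropped, with a detour cap standing in for the mechanism’s safeguards. The class name, queue sizes, and cap are invented for the example and are not the paper’s implementation.

```python
# Toy model of detour-induced buffer sharing: a full output queue triggers a
# random detour to a neighboring switch rather than a drop.
import random

class Switch:
    def __init__(self, name, queue_capacity=8):
        self.name = name
        self.queue_capacity = queue_capacity
        self.queues = {}          # next_hop -> list of queued packets
        self.neighbors = []       # adjacent Switch objects

    def enqueue(self, packet, next_hop):
        q = self.queues.setdefault(next_hop, [])
        if len(q) < self.queue_capacity:
            q.append(packet)
            return True
        # Output queue full: detour to a random neighbor instead of dropping,
        # as long as the packet has not already been detoured too many times.
        if self.neighbors and packet.get("detours", 0) < 16:
            packet["detours"] = packet.get("detours", 0) + 1
            return random.choice(self.neighbors).enqueue(packet, next_hop)
        return False              # fall back to dropping

# Usage: two switches share buffer space under spot congestion.
a, b = Switch("A", queue_capacity=1), Switch("B")
a.neighbors, b.neighbors = [b], [a]
a.enqueue({"id": 1}, "hostX")         # fills A's queue
print(a.enqueue({"id": 2}, "hostX"))  # detoured to B instead of dropped: True
```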
Queues don’t matter when you can JUMP them!
"... QJUMP is a simple and immediately deployable ap-proach to controlling network interference in datacenter networks. Network interference occurs when congestion from throughput-intensive applications causes queueing that delays traffic from latency-sensitive applications. To mitigate network interfere ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
(Show Context)
QJUMP is a simple and immediately deployable approach to controlling network interference in datacenter networks. Network interference occurs when congestion from throughput-intensive applications causes queueing that delays traffic from latency-sensitive applications. To mitigate network interference, QJUMP applies Internet QoS-inspired techniques to datacenter applications. Each application is assigned to a latency sensitivity level (or class). Packets from higher levels are rate-limited in the end host, but once allowed into the network can “jump-the-queue” over packets from lower levels. In settings with known node counts and link speeds, QJUMP can support service levels ranging from strictly bounded latency (but with low rate) through to line-rate throughput (but with high latency variance). We have implemented QJUMP as a Linux Traffic Control module. We show that QJUMP achieves bounded latency and reduces in-network interference by up to
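The sketch below illustrates the end-host side of such a scheme, assuming a token-bucket rate limiter per latency-sensitivity level with tighter rates at higher (more latency-sensitive) levels and a priority tag that switches could use for strict-priority forwarding. The specific rates, burst sizes, and level numbers are made-up assumptions, not QJUMP’s published parameters.

```python
# Per-level token-bucket rate limiting at the end host: latency-sensitive
# levels get tighter rate limits but (conceptually) higher network priority.
import time

class LevelLimiter:
    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def try_send(self, size_bytes):
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= size_bytes:
            self.tokens -= size_bytes
            return True            # admitted into the network at this level
        return False               # over the level's rate limit: hold the packet

# Higher level number = more latency sensitive = tighter rate limit (example values).
levels = {
    0: LevelLimiter(rate_bytes_per_s=1.25e9, burst_bytes=1_000_000),  # bulk traffic
    7: LevelLimiter(rate_bytes_per_s=1.0e6,  burst_bytes=9_000),      # latency-critical
}

def send(packet_size, level):
    # A real deployment would also tag the packet so switches forward
    # level 7 ahead of level 0; here we only model admission.
    return ("sent", level) if levels[level].try_send(packet_size) else ("deferred", level)
```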
A multiport microsecond optical circuit switch for data center networking
- IEEE Photonics Technology Letters
"... Abstract — We experimentally evaluate the network-level switching time of a functional 23-host prototype hybrid optical circuit-switched/electrical packet-switched network for datacenters called Mordia (Microsecond Optical Research Datacenter Interconnect Architecture). This hybrid network uses a st ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
(Show Context)
We experimentally evaluate the network-level switching time of a functional 23-host prototype hybrid optical circuit-switched/electrical packet-switched network for datacenters called Mordia (Microsecond Optical Research Datacenter Interconnect Architecture). This hybrid network uses a standard electrical packet switch and an optical circuit-switched architecture based on a wavelength-selective switch that has a measured mean port-to-port network reconfiguration time of 11.5 µs, including the signal acquisition by the network interface card. Using multiple parallel rings, we show that this architecture can scale to support the large bisection bandwidth required for future datacenters. Index Terms: Networks, optical switching.
Silo: predictable message latency in the cloud
- In Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, 2015
"... ABSTRACT Many cloud applications can benefit from guaranteed latency for their network messages, however providing such predictability is hard, especially in multi-tenant datacenters. We identify three key requirements for such predictability: guaranteed network bandwidth, guaranteed packet delay a ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
Many cloud applications can benefit from guaranteed latency for their network messages; however, providing such predictability is hard, especially in multi-tenant datacenters. We identify three key requirements for such predictability: guaranteed network bandwidth, guaranteed packet delay, and guaranteed burst allowance. We present Silo, a system that offers these guarantees in multi-tenant datacenters. Silo leverages the tight coupling between bandwidth and delay: controlling tenant bandwidth leads to deterministic bounds on network queuing delay. Silo builds upon network calculus to place tenant VMs with competing requirements such that they can coexist. A novel hypervisor-based policing mechanism achieves packet pacing at sub-microsecond granularity, ensuring tenants do not exceed their allowances. We have implemented a Silo prototype comprising a VM placement manager and a Windows filter driver. Silo does not require any changes to applications, guest OSes, or network switches. We show that Silo can ensure predictable message latency for cloud applications while imposing low overhead.
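As a worked example of the bandwidth-delay coupling mentioned above, standard network calculus gives a simple bound: if each tenant’s traffic entering a link of capacity C is shaped to a token bucket with burst sigma_i and rate rho_i, and the rates sum to at most C, then the worst-case queueing delay at that link is (sum of sigma_i) / C. The sketch below computes this bound; the tenant numbers are illustrative and this is not Silo’s full placement or pacing machinery.

```python
# Worst-case queueing delay at a link fed by token-bucket-shaped tenants:
# with aggregate burst sigma and aggregate rate rho <= link capacity C,
# the classic network-calculus bound is sigma / C.
def max_queueing_delay(tenants, link_capacity_bps):
    """tenants: list of (burst_bytes, rate_bps) token-bucket parameters.
    Returns the worst-case queueing delay in seconds, or None if the link
    is over-subscribed and no such bound applies."""
    total_rate = sum(rate for _, rate in tenants)
    if total_rate > link_capacity_bps:
        return None
    total_burst_bits = sum(burst * 8 for burst, _ in tenants)
    return total_burst_bits / link_capacity_bps

# Two tenants, each allowed a 15 KB burst at 2 Gbps, sharing a 10 Gbps link:
print(max_queueing_delay([(15_000, 2e9), (15_000, 2e9)], 10e9))  # 2.4e-05 s = 24 us
```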