VL2: A Scalable and Flexible Data Center Network
- ACM SIGCOMM Computer Communication Review
, 2009
"... Abstract To be agile and cost e ective, data centers should allow dynamic resource allocation across large server pools. In particular, the data center network should enable any server to be assigned to any service. To meet these goals, we present VL, a practical network architecture that scales t ..."
Abstract
-
Cited by 461 (12 self)
- Add to MetaCart
(Show Context)
To be agile and cost effective, data centers should allow dynamic resource allocation across large server pools. In particular, the data center network should enable any server to be assigned to any service. To meet these goals, we present VL2, a practical network architecture that scales to support huge data centers with uniform high capacity between servers, performance isolation between services, and Ethernet layer-2 semantics. VL2 uses (1) flat addressing to allow service instances to be placed anywhere in the network, (2) Valiant Load Balancing to spread traffic uniformly across network paths, and (3) end-system based address resolution to scale to large server pools, without introducing complexity to the network control plane. VL2's design is driven by detailed measurements of traffic and fault data from a large operational cloud service provider. VL2's implementation leverages proven network technologies, already available at low cost in high-speed hardware implementations, to build a scalable and reliable network architecture. As a result, VL2 networks can be deployed today, and we have built a working prototype. We evaluate the merits of the VL2 design using measurement, analysis, and experiments. Our VL2 prototype shuffles 2.7 TB of data among 75 servers in 395 seconds, sustaining a rate that is 94% of the maximum possible.
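The load-spreading idea is simple to sketch: forward each flow via an intermediate switch chosen by hashing the flow, so traffic spreads uniformly across paths without reordering packets within a flow. A minimal illustration in Python (the switch names and hash choice here are ours, not the paper's):

    import hashlib

    INTERMEDIATE_SWITCHES = ["int-1", "int-2", "int-3", "int-4"]  # hypothetical core layer

    def vlb_intermediate(flow_five_tuple):
        """Pick an intermediate switch by hashing the flow's 5-tuple.
        Hashing (rather than per-packet randomness) keeps a flow on one
        path, avoiding TCP reordering, while spreading flows uniformly."""
        digest = hashlib.md5(repr(flow_five_tuple).encode()).hexdigest()
        return INTERMEDIATE_SWITCHES[int(digest, 16) % len(INTERMEDIATE_SWITCHES)]

    print(vlb_intermediate(("10.0.0.1", 80, "10.0.1.2", 5432, "tcp")))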
BCube: A High Performance, Server-centric Network Architecture for Modular Data Centers
- In SIGCOMM
, 2009
"... This paper presents BCube, a new network architecture specifically designed for shipping-container based, modular data centers. At the core of the BCube architecture is its server-centric network structure, where servers with multiple network ports connect to multiple layers of COTS (commodity off-t ..."
Abstract
-
Cited by 248 (31 self)
- Add to MetaCart
(Show Context)
This paper presents BCube, a new network architecture specifically designed for shipping-container based, modular data centers. At the core of the BCube architecture is its server-centric network structure, where servers with multiple network ports connect to multiple layers of COTS (commodity off-the-shelf) mini-switches. Servers act as not only end hosts, but also relay nodes for each other. BCube supports various bandwidth-intensive applications by speeding up one-to-one, one-to-several, and one-to-all traffic patterns, and by providing high network capacity for all-to-all traffic. BCube exhibits graceful performance degradation as the server and/or switch failure rate increases. This property is of special importance for shipping-container data centers, since once the container is sealed and operational, it becomes very difficult to repair or replace its components. Our implementation experiences show that BCube can be seamlessly integrated with the TCP/IP protocol stack and BCube packet forwarding can be efficiently implemented in both hardware and software. Experiments in our testbed demonstrate that BCube is fault tolerant and load balancing, and that it significantly accelerates representative bandwidth-intensive applications.
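The server-centric routing the abstract describes can be sketched directly: servers carry multi-digit addresses, and each hop corrects one digit of the current address. A toy version of this digit-correcting single-path routing (a simplification of the paper's BCubeRouting; the example topology is ours):

    def bcube_path(src, dst):
        """Hop-by-hop server addresses from src to dst in a BCube network.
        Each hop corrects one differing address digit, i.e., crosses one
        level-i switch (a simplification of the paper's BCubeRouting)."""
        path, cur = [tuple(src)], list(src)
        for i in range(len(src)):            # correct digits left to right
            if cur[i] != dst[i]:
                cur[i] = dst[i]
                path.append(tuple(cur))
        return path

    # Example in BCube(4, 1): two-digit base-4 addresses.
    print(bcube_path((0, 1), (2, 3)))        # [(0, 1), (2, 1), (2, 3)]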
Hedera: Dynamic flow scheduling for data center networks
- In Proc. of Networked Systems Design and Implementation (NSDI) Symposium
, 2010
"... Today’s data centers offer tremendous aggregate bandwidth to clusters of tens of thousands of machines. However, because of limited port densities in even the highest-end switches, data center topologies typically consist of multi-rooted trees with many equal-cost paths between any given pair of hos ..."
Abstract
-
Cited by 223 (7 self)
- Add to MetaCart
(Show Context)
Today’s data centers offer tremendous aggregate bandwidth to clusters of tens of thousands of machines. However, because of limited port densities in even the highest-end switches, data center topologies typically consist of multi-rooted trees with many equal-cost paths between any given pair of hosts. Existing IP multipathing protocols usually rely on per-flow static hashing and can cause substantial bandwidth losses due to long-term collisions. In this paper, we present Hedera, a scalable, dynamic flow scheduling system that adaptively schedules a multi-stage switching fabric to efficiently utilize aggregate network resources. We describe our implementation using commodity switches and unmodified hosts, and show that for a simulated 8,192 host data center, Hedera delivers bisection bandwidth that is 96% of optimal and up to 113% better than static load-balancing methods.
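The contrast with static hashing can be made concrete with a toy placement loop in the spirit of the paper's Global First-Fit heuristic (the data structures here are illustrative, not Hedera's actual interfaces):

    def global_first_fit(flows, candidate_paths, residual):
        """Place each large flow on the first path with enough spare capacity.
        flows: [(flow_id, estimated_demand)]; candidate_paths: flow_id -> paths;
        residual: path -> spare capacity (all structures illustrative)."""
        placement = {}
        for flow_id, demand in flows:
            for path in candidate_paths[flow_id]:
                if residual[path] >= demand:
                    residual[path] -= demand
                    placement[flow_id] = path
                    break
        return placement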
FAWN: A Fast Array of Wimpy Nodes
, 2008
"... This paper introduces the FAWN—Fast Array of Wimpy Nodes—cluster architecture for providing fast, scalable, and power-efficient key-value storage. A FAWN links together a large number of tiny nodes built using embedded processors and small amounts (2–16GB) of flash memory into an ensemble capable of ..."
Abstract
-
Cited by 212 (26 self)
- Add to MetaCart
(Show Context)
This paper introduces the FAWN—Fast Array of Wimpy Nodes—cluster architecture for providing fast, scalable, and power-efficient key-value storage. A FAWN links together a large number of tiny nodes built using embedded processors and small amounts (2–16GB) of flash memory into an ensemble capable of handling 700 queries per second per node, while consuming fewer than 6 watts of power per node. We have designed and implemented a clustered key-value storage system, FAWN-DHT, that runs atop these nodes. Nodes in FAWN-DHT use a specialized log-like back-end hash-based database to ensure that the system can absorb the large write workload imposed by frequent node arrivals and departures. FAWN uses a two-level cache hierarchy to ensure that imbalanced workloads cannot create hot-spots on one or a few wimpy nodes that impair the system’s ability to service queries at its guaranteed rate. Our evaluation of a small-scale FAWN cluster and several candidate FAWN node systems suggests that FAWN can be a practical approach to building large-scale storage for seek-intensive workloads. Our further analysis indicates that a FAWN cluster is cost-competitive with other approaches (e.g., DRAM, multitudes of magnetic disks, solid-state disk) to providing high query rates, while consuming 3-10x less power.
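The log-structured, index-in-memory pattern the abstract alludes to can be sketched in a few lines (a generic log store illustrating the idea, not the actual FAWN-DS layout):

    class LogStore:
        """Append-only value log plus an in-memory index, the pattern the
        abstract describes (a generic sketch, not the real FAWN-DS layout).
        Writes are sequential appends, which flash handles well; reads cost
        one hash lookup plus one random read of the log."""

        def __init__(self, path):
            self.log = open(path, "ab+")
            self.index = {}                      # key -> (offset, length)

        def put(self, key, value):
            offset = self.log.seek(0, 2)         # append at the log's tail
            self.log.write(value)
            self.log.flush()
            self.index[key] = (offset, len(value))

        def get(self, key):
            offset, length = self.index[key]
            self.log.seek(offset)
            return self.log.read(length)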
The Cost of a Cloud: Research Problems in Data Center Networks
"... This article is an editorial note submitted to CCR. It has NOT been peer reviewed. The author takes full responsibility for this article’s technical content. Comments can be posted through CCR Online. The data centers used to create cloud services represent a significant investment in capital outlay ..."
Abstract
-
Cited by 205 (1 self)
- Add to MetaCart
(Show Context)
This article is an editorial note submitted to CCR. It has NOT been peer reviewed. The author takes full responsibility for this article’s technical content. Comments can be posted through CCR Online. The data centers used to create cloud services represent a significant investment in capital outlay and ongoing costs. Accordingly, we first examine the costs of cloud service data centers today. The cost breakdown reveals the importance of optimizing work completed per dollar invested. Unfortunately, the resources inside the data centers often operate at low utilization due to resource stranding and fragmentation. To attack this first problem, we propose (1) increasing network agility, and (2) providing appropriate incentives to shape resource consumption. Second, we note that cloud service providers are building out geo-distributed networks of data centers. Geo-diversity lowers latency to users and increases reliability in the presence of an outage taking out an entire site. However, without appropriate design and management, these geo-diverse data center networks can raise the cost of providing service. Moreover, leveraging geo-diversity requires services be designed to benefit from it. To attack this problem, we propose (1) joint optimization of network and data center resources, and (2) new systems and mechanisms for geo-distributing state.
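To make "work completed per dollar invested" concrete, capital outlays are typically amortized into a level monthly cost before comparing categories; a sketch using the standard annuity formula, under the article's approximate assumptions as we read them (servers amortized over roughly 3 years, infrastructure over roughly 15, at about a 5% cost of money):

    def amortized_monthly(capex, annual_rate, years):
        """Level monthly payment for a capital outlay (standard annuity
        formula), so one-time costs can be compared with recurring ones."""
        r = annual_rate / 12.0
        n = years * 12
        return capex * r / (1 - (1 + r) ** -n)

    # e.g., a hypothetical $200M server fleet over 3 years at 5%:
    print(amortized_monthly(200e6, 0.05, 3))   # roughly $6.0M/month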
RouteBricks: Exploiting Parallelism to Scale Software Routers
- In Proceedings of the 22nd ACM Symposium on Operating Systems Principles
, 2009
"... We revisit the problem of scaling software routers, motivated by recent advances in server technology that enable highspeed parallel processing—a feature router workloads appear ideally suited to exploit. We propose a software router architecture that parallelizes router functionality both across mu ..."
Abstract
-
Cited by 173 (15 self)
- Add to MetaCart
(Show Context)
We revisit the problem of scaling software routers, motivated by recent advances in server technology that enable high-speed parallel processing—a feature router workloads appear ideally suited to exploit. We propose a software router architecture that parallelizes router functionality both across multiple servers and across multiple cores within a single server. By carefully exploiting parallelism at every opportunity, we demonstrate a 35Gbps parallel router prototype; this router capacity can be linearly scaled through the use of additional servers. Our prototype router is fully programmable using the familiar Click/Linux environment and is built entirely from off-the-shelf, general-purpose server hardware.
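The per-core parallelism rests on spreading incoming packets across cores by flow, as multi-queue NICs do in hardware; a toy dispatcher showing the idea (field names are illustrative):

    import hashlib

    NUM_CORES = 4   # one forwarding worker per core (illustrative)

    def dispatch(packets):
        """Shard packets across per-core queues by flow hash, the multi-queue
        NIC (RSS-style) split the design relies on: one flow maps to one
        core, so cores share no per-flow state and scale nearly linearly."""
        queues = [[] for _ in range(NUM_CORES)]
        for pkt in packets:
            h = int(hashlib.md5(repr(pkt["five_tuple"]).encode()).hexdigest(), 16)
            queues[h % NUM_CORES].append(pkt)
        return queues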
Secondnet: a data center network virtualization architecture with bandwidth guarantees
- In ACM CoNEXT
, 2010
"... In this paper, we propose virtual data center (VDC) as the unit of resource allocation for multiple tenants in the cloud. VDCs are more desirable than physical data centers because the resources allocated to VDCs can be rapidly adjusted as tenants ’ needs change. To enable the VDC abstraction, we de ..."
Abstract
-
Cited by 123 (7 self)
- Add to MetaCart
(Show Context)
In this paper, we propose virtual data center (VDC) as the unit of resource allocation for multiple tenants in the cloud. VDCs are more desirable than physical data centers because the resources allocated to VDCs can be rapidly adjusted as tenants' needs change. To enable the VDC abstraction, we designed a data center network virtualization architecture called SecondNet. SecondNet is scalable by distributing all the virtual-to-physical mapping, routing, and bandwidth reservation state in server hypervisors. Its port-switching based source routing (PSSR) further makes SecondNet applicable to arbitrary network topologies using commodity servers and switches. SecondNet introduces a centralized VDC allocation algorithm for virtual to physical mapping with bandwidth guarantee. Simulations demonstrated that our VDC allocation achieves high network utilization and low time complexity. Our implementation and experiments on our testbed demonstrate that we can build SecondNet on top of various network topologies, and SecondNet provides bandwidth guarantee and elasticity, as designed.
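Port-switching source routing is easy to sketch: the sender writes the whole path into the header as a stack of output-port numbers, so intermediate switches keep no per-VDC routing state (field names below are ours):

    def pssr_next_link(packet, output_links):
        """Forward by popping the next output-port number from the header.
        The sender encodes the entire path as a port stack, so each switch
        just indexes its ports rather than consulting a routing table."""
        port = packet["port_stack"].pop(0)
        return output_links[port]

    packet = {"port_stack": [2, 5, 1], "payload": b"..."}  # three-hop path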
Towards Predictable Datacenter Networks
, 2011
"... The shared nature of the network in today’s multi-tenant datacenters implies that network performance for tenants can vary significantly. This applies to both production datacenters and cloud environments. Network performance variability hurts application performance which makes tenant costs unpredi ..."
Abstract
-
Cited by 119 (13 self)
- Add to MetaCart
(Show Context)
The shared nature of the network in today’s multi-tenant datacenters implies that network performance for tenants can vary significantly. This applies to both production datacenters and cloud environments. Network performance variability hurts application performance, which makes tenant costs unpredictable and causes provider revenue loss. Motivated by these factors, this paper makes the case for extending the tenant-provider interface to explicitly account for the network. We argue this can be achieved by providing tenants with a virtual network connecting their compute instances. To this effect, the key contribution of this paper is the design of virtual network abstractions that capture the trade-off between the performance guarantees offered to tenants, their costs and the provider revenue. To illustrate the feasibility of virtual networks, we develop Oktopus, a system that implements the proposed abstractions. Using realistic, large-scale simulations and an Oktopus deployment on a 25-node two-tier testbed, we demonstrate that the use of virtual networks yields significantly better and more predictable tenant performance. Further, using a simple pricing model, we find that our abstractions can reduce tenant costs by up to 74% while maintaining provider revenue neutrality.
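The paper's virtual cluster abstraction <N, B> has a crisp bandwidth rule worth spelling out: a subtree holding m of the N VMs needs at most min(m, N - m) * B on its uplink, since traffic crossing the link is bounded by the smaller side. A one-function sketch:

    def uplink_bandwidth(m, n_total, b_per_vm):
        """Bandwidth an <N, B> virtual cluster needs on a subtree's uplink
        when m of its N VMs are placed inside the subtree: traffic crossing
        the link is bounded by the smaller side, so min(m, N - m) * B."""
        return min(m, n_total - m) * b_per_vm

    # 30 VMs of a <100, 100 Mbps> cluster in one rack -> 3000 Mbps uplink:
    print(uplink_bandwidth(30, 100, 100.0))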
Design, implementation and evaluation of congestion control for multipath TCP
- In Proc. Networked Systems Design and Implementation, March/April
, 2011
"... Multipath TCP, as proposed by the IETF working group mptcp, allows a single data stream to be split across multiple paths. This has obvious benefits for reliability, and it can also lead to more efficient use of networked resources. We describe the design of a multipath congestion control algorithm, ..."
Abstract
-
Cited by 114 (9 self)
- Add to MetaCart
(Show Context)
Multipath TCP, as proposed by the IETF working group mptcp, allows a single data stream to be split across multiple paths. This has obvious benefits for reliability, and it can also lead to more efficient use of networked resources. We describe the design of a multipath congestion control algorithm, we implement it in Linux, and we evaluate it for multihomed servers, data centers and mobile clients. We show that some ‘obvious’ solutions for multipath congestion control can be harmful, but that our algorithm improves throughput and fairness compared to single-path TCP. Our algorithm is a drop-in replacement for TCP, and we believe it is safe to deploy.
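The coupled controller can be summarized by its per-ACK increase rule; a simplified sketch of the 'linked increases' idea (window counted in packets rather than bytes, and alpha left as a parameter):

    def lia_increase(cwnds, subflow, alpha):
        """Per-ACK window growth of the coupled 'linked increases' control:
        grow subflow r by min(alpha / total_cwnd, 1 / cwnd_r). The cap keeps
        the aggregate no more aggressive than one TCP, while the coupled
        term shifts traffic toward less-congested paths."""
        total = sum(cwnds.values())
        cwnds[subflow] += min(alpha / total, 1.0 / cwnds[subflow])

    cwnds = {"path_a": 10.0, "path_b": 2.0}
    lia_increase(cwnds, "path_a", alpha=1.0)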
B4: Experience with a Globally-Deployed Software Defined WAN
"... We present the design, implementation, and evaluation of B4, a private WAN connecting Google’s data centers across the planet. B4 has a number of unique characteristics: i) massive bandwidth requirements deployed to a modest number of sites, ii) elastic traffic demand that seeks to maximize average ..."
Abstract
-
Cited by 111 (1 self)
- Add to MetaCart
(Show Context)
We present the design, implementation, and evaluation of B4, a private WAN connecting Google’s data centers across the planet. B4 has a number of unique characteristics: i) massive bandwidth requirements deployed to a modest number of sites, ii) elastic traffic demand that seeks to maximize average bandwidth, and iii) full control over the edge servers and network, which enables rate limiting and demand measurement at the edge. These characteristics led to a Software Defined Networking architecture using OpenFlow to control relatively simple switches built from merchant silicon. B4’s centralized traffic engineering service drives links to near 100% utilization, while splitting application flows among multiple paths to balance capacity against application priority/demands. We describe experience with three years of B4 production deployment, lessons learned, and areas for future work.
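The centralized allocation step can be caricatured as a greedy fill of candidate tunnels in priority order (a toy stand-in for B4's bandwidth-allocation machinery, with illustrative data structures):

    def allocate(tunnel_capacity, app_demands):
        """Greedy caricature of centralized TE: walk applications in priority
        order and split each demand across its candidate tunnels, consuming
        residual capacity as it goes."""
        residual = dict(tunnel_capacity)
        split = {}
        for app, (paths, demand) in app_demands:   # list kept in priority order
            shares = {}
            for path in paths:
                take = min(residual[path], demand)
                if take > 0:
                    residual[path] -= take
                    demand -= take
                    shares[path] = take
            split[app] = shares
        return split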