The Role of Network Processors in Active Networks
- In IWAN 2003
, 2002
Cited by 7 (0 self)
Network processors (NPs) implement a balance between hardware and software that addresses the demand for performance and programmability in active networks (ANs). We argue that this makes them an important player in the implementation and deployment of ANs. Besides a general introduction to the relationship between NPs and ANs, we describe the power of this combination in a framework for secure and safe capsule-based active code. We also describe the advantages of offloading AN control point functionality into the NP and how to execute active code in the data path efficiently. Furthermore, the paper reports on experiences with implementing active networking concepts on the IBM PowerNP network processor.
A cluster-based active router architecture supporting video/audio stream transcoding services
- Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS’03)
, 2003
Cited by 5 (0 self)
Active routers allow computation to be performed within the network by processing packets when they pass through the routers. We design and implement a cluster-based active router system that provides multimedia stream transcoding service. The performance of the system is evaluated with three different load balancing schemes. We evaluate the out-of-order phenomenon and analyze the tradeoff between this phenomenon and the processing speed. We present a stream-based round robin algorithm for the transcoding service offered in the router and demonstrate its superiority over the conventional round-robin scheme. The main design criteria are to minimize the total transcoding time and maintain the order of media units for each outgoing stream.
Load balancing for parallel forwarding
- IEEE/ACM TRANSACTIONS ON NETWORKING
, 2005
Cited by 5 (0 self)
Workload distribution is critical to the performance of network processor based parallel forwarding systems. Scheduling schemes that operate at the packet level, e.g., round-robin, cannot preserve packet-ordering within individual TCP connections. Moreover, these schemes create duplicate information in processor caches and therefore are inefficient in resource utilization. Hashing operates at the flow level and is naturally able to maintain per-connection packet ordering; besides, it does not pollute caches. A pure hash-based system, however, cannot balance processor load in the face of highly skewed flow-size distributions in the Internet; usually, adaptive methods are needed. In this paper, based on measurements of Internet traffic, we examine the sources of load imbalance in hash-based scheduling schemes. We prove that under certain Zipf-like flow-size distributions, hashing alone is not able to balance workload. We introduce a new metric to quantify the effects of adaptive load balancing on overall forwarding performance. To achieve both load balancing and efficient system resource utilization, we propose a scheduling scheme that classifies Internet flows into two categories: the aggressive and the normal, and applies different scheduling policies to the two classes of flows. Compared with most state-of-the-art parallel forwarding schemes, our work exploits flow-level Internet traffic characteristics.
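The contrast drawn in this abstract, that hashing keeps each flow on one processor while skewed flow sizes can leave the processors unevenly loaded, can be illustrated with a minimal Python sketch. It is not the paper's scheduler: the CRC32 hash, the helper names (`flow_key`, `hash_dispatch`, `processor_loads`), the synthetic addresses, and the 1/rank flow sizes are all assumptions chosen for illustration.

```python
import zlib


def flow_key(src, dst, sport, dport, proto):
    """Build a byte key from a 5-tuple (hypothetical helper, not from the paper)."""
    return f"{src}|{dst}|{sport}|{dport}|{proto}".encode()


def hash_dispatch(key, n_proc):
    """Map a flow key to a processor index; every packet of a flow gets the same index."""
    return zlib.crc32(key) % n_proc


def processor_loads(flow_sizes, n_proc):
    """Sum the packet counts of all flows that hash to each processor."""
    loads = [0] * n_proc
    for key, size in flow_sizes.items():
        loads[hash_dispatch(key, n_proc)] += size
    return loads


# Assumed synthetic workload: 200 flows whose sizes fall off as 1/rank (Zipf-like).
flows = {
    flow_key("10.0.0.1", "10.0.1.1", 1024 + rank, 80, "tcp"): 100_000 // rank
    for rank in range(1, 201)
}

loads = processor_loads(flows, n_proc=4)
# Ratio of the busiest processor's load to the mean load; 1.0 would be perfect balance.
imbalance = max(loads) / (sum(loads) / len(loads))
print(loads, round(imbalance, 3))
```

Because the mapping depends only on the flow key, per-connection packet order is preserved, but a few very large flows can dominate whichever processor they hash to, which is the situation the adaptive schemes in the abstract are meant to address.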
An Active Traffic Splitter Architecture for Intrusion Detection
- In Proceedings of the ACM Symposium on Modeling, Analysis and Simulation of Computer Telecommunications Systems
, 2003
Cited by 4 (1 self)
Scaling network intrusion detection to high network speeds can be achieved using multiple sensors operating in parallel coupled with a suitable load balancing traffic splitter. This paper examines a splitter architecture that incorporates two methods for improving system performance. The first is the use of early filtering, where a portion of the rule-set is processed on the splitter instead of the sensors. The second is the use of locality buffering, where the splitter reorders packets in a way that improves memory access locality on the sensors. Our experiments suggest that early filtering reduces the number of total packets to be processed by 32%, giving an 8% increase in total performance, while locality buffers improve sensor performance by about 10%. Combined, the two methods result in an overall improvement of 20%, while the performance of the slowest sensor is improved by 14%.
Experimental Evaluation of Load Balancers in Packet Processing Systems
- In Proc. of Workshop on Building Block Engine Architectures for Computers and Networks
, 2004
Cited by 3 (1 self)
The load balancer is a fundamental building block for implementing high-throughput applications on multi-core architectures (e.g., network processors). In this paper, we consider two canonical load balancing schemes in the context of packet processing systems: (1) packet-level load balancing that determines the mapping of a packet to a processor independently for each packet; and (2) flow-level load balancing that maps a flow to a processor and directs all subsequent packets of that flow to the mapped processor. By identifying the application, system, and trace characteristics that affect their relative performance, we address the fundamental question: under what operating conditions should one choose packet-level load balancing over flow-level load balancing, and vice versa?
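The two canonical schemes named in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names are hypothetical, packet-level balancing is modeled as plain round-robin, and flow-level balancing pins a new flow to the processor that has received the fewest packets so far; the paper may use different policies for both.

```python
class PacketLevelBalancer:
    """Chooses a processor independently for each packet (assumed policy: round-robin)."""

    def __init__(self, n_proc):
        self.n_proc = n_proc
        self._next = 0

    def assign(self, flow_id):
        # The flow identifier is ignored: consecutive packets rotate over processors.
        proc = self._next
        self._next = (self._next + 1) % self.n_proc
        return proc


class FlowLevelBalancer:
    """Maps a flow to a processor on its first packet and keeps it there."""

    def __init__(self, n_proc):
        self.load = [0] * n_proc   # packets sent to each processor so far
        self.table = {}            # flow_id -> processor index

    def assign(self, flow_id):
        if flow_id not in self.table:
            # Assumed policy: a new flow goes to the least-loaded processor
            # (lowest index wins ties).
            self.table[flow_id] = self.load.index(min(self.load))
        proc = self.table[flow_id]
        self.load[proc] += 1
        return proc


packets = ["a", "a", "b", "a", "b", "c"]   # flow id of each arriving packet
pkt_lb = PacketLevelBalancer(2)
flow_lb = FlowLevelBalancer(2)
packet_level = [pkt_lb.assign(f) for f in packets]
flow_level = [flow_lb.assign(f) for f in packets]
print(packet_level)  # packets of flow "a" are spread over both processors
print(flow_level)    # every packet of a given flow lands on one processor
```

The packet-level variant spreads work evenly but can split one flow across processors, while the flow-level variant keeps per-flow state and ordering on one processor at the cost of a flow table and possible imbalance.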
Run-time System for Scalable Network Services
Cited by 2 (0 self)
Sophisticated middlebox services (such as network monitoring and intrusion detection, DDoS mitigation, worm scanning, XML parsing and protocol transformation) are becoming increasingly popular in today’s Internet. To support high throughput, these services are often deployed on Distributed Memory, Multi-processor (DM-MP) hardware platforms such as a cluster of network processors. Scaling the throughput of such platforms, however, is challenging because of the difficulties and overheads of accessing persistent, shared state maintained by the services. In this paper, we describe the design and implementation of Oboe, a run-time system for DM-MP platforms that addresses the above challenge through two foundations: (1) category-specific management of shared state, and (2) adaptive flow-level load distribution for addressing persistent processor overload. Our simulations demonstrate that Oboe can achieve performance within 0-5% of an ideal adaptive system. Our prototype implementation of Oboe on a cluster of IXP2400 network processors demonstrates the scalability achieved with increasing number of processors, number of flows and state size.
Design and Implementation of a High-Performance Network Intrusion Prevention System
- in SEC, 2005
, 2004
Cited by 2 (0 self)
Network intrusion prevention systems provide proactive defense against security threats by detecting and blocking attack-related traffic. This task can be highly complex, and therefore, software-based network intrusion prevention systems have difficulty in handling high speed links. This paper describes the design and implementation of a high-performance network intrusion prevention system that combines the use of software-based network intrusion prevention sensors and a network processor board. The network processor acts as a customized load balancing splitter that cooperates with a set of modified content-based network intrusion detection sensors in processing network traffic. We show that the components of such a system, if co-designed, can achieve high performance, while minimizing redundant processing and communication. We have implemented the system using low-cost, off-the-shelf technology: an IXP1200 network processor evaluation board and commodity PCs. Our evaluation shows that our enhancements can reduce the processing load of the sensors by at least 45%, resulting in a system that can handle a fully-loaded Gigabit Ethernet link using at most four commodity PCs.
Sequence-Preserving Adaptive Load Balancers
Cited by 1 (0 self)
Load balancing in packet-switched networks is a task of ever-growing importance. Network traffic properties, such as the Zipf-like flow length distribution and bursty transmission patterns, and requirements on packet ordering or stable flow mapping, make it a particularly difficult and complex task, needing adaptive heuristic solutions. In this paper, we present two main contributions. Firstly, we evaluate and compare two recently proposed algorithmic heuristics that attempt to adaptively balance load among the destination units. The evaluation on real-life traces confirms the previously conjectured impact of the Zipf-like flow length distribution and traffic burstiness. Furthermore, we identify the distinction between the goals of preserving either the sequence order of packets or the flow-to-destination mapping, showing different strengths of each algorithm. Secondly, we demonstrate a novel hybrid scheme that combines the best of the flow-based and burst-based load balancing techniques and excels in both of the key metrics of flow remapping and packet reordering.
Performance Scalability of a Multi-Core Web Server
Today’s large multi-core Internet servers support thousands of concurrent connections or flows. The computation ability of future server platforms will depend on increasing numbers of cores. The key to ensuring that performance scales with cores is to ensure that systems software and hardware are designed to fully exploit the parallelism that is inherent in independent network flows. This paper identifies the major bottlenecks to scalability for a reference server workload on a commercial server platform. However, performance scaling on commercial web servers has proven elusive. We determined that on a web server running a modified SPECweb2005 Support workload, throughput scales only 4.8× on eight cores. Our results show that the operating system, TCP/IP stack, and application exploited flow-level parallelism well with few exceptions, and that load imbalance and shared cache affected performance little. Having eliminated these potential bottlenecks, we determined that performance scaling was limited by the capacity of the address bus, which became saturated on all eight cores. If this key obstacle is addressed, commercial web server and systems software are well-positioned to scale to a large number of cores.

