Argos: an emulator for fingerprinting zero-day attacks
- in Proc. ACM SIGOPS EUROSYS’2006, 2006
Cited by 108 (22 self)
for advertised honeypots with automatic signature generation
High Performance and Scalable I/O Virtualization via Self-Virtualized Devices
- in Proc. of HPDC, 2007
Cited by 70 (7 self)
While industry is making rapid advances in system virtualization, for server consolidation and for improving system maintenance and management, it has not yet become clear how virtualization can contribute to the performance of high-end systems. In this context, this paper addresses a key issue in system virtualization: how to efficiently virtualize I/O subsystems and peripheral devices. We have developed a novel approach to I/O virtualization, termed self-virtualized devices, which improves I/O performance by offloading select virtualization functionality onto the device. This permits guest virtual machines to interact with the virtualized device more efficiently (i.e., with less overhead and reduced latency). The concrete instance of such a device developed and evaluated in this paper is a self-virtualized network interface (SV-NIC), targeting the high-end NICs used in the high-performance domain. The SV-NIC (1) provides virtual interfaces (VIFs) to guest virtual machines for an underlying physical device, the network interface, (2) manages the way in which the device’s physical resources are used by guest operating systems, and (3) provides high-performance, low-overhead network access to guest domains. Experimental results are obtained in a prototyping environment using an IXP2400-based Ethernet board as a programmable network device. The SV-NIC scales to large numbers of VIFs and guests, and offers VIFs with ∼77% higher throughput and ∼53% lower latency compared to the current standard virtualized device implementations on hypervisor-based platforms.
Melange: Creating a “functional” internet
- In EuroSys, 2007
Cited by 25 (7 self)
Most implementations of critical Internet protocols are written in type-unsafe languages such as C or C++ and are regularly vulnerable to serious security and reliability problems. Type-safe languages eliminate many errors but are not used due to the perceived performance overheads. We combine two techniques to eliminate this performance penalty in a practical fashion: strong static typing and generative metaprogramming. Static typing eliminates run-time type information by checking safety at compile-time and minimises dynamic checks. Meta-programming uses a single specification to abstract the low-level code required to transmit and receive packets. Our domain-specific language, MPL, describes Internet packet protocols and compiles into fast, zero-copy code for both parsing and creating these packets. MPL is designed for implementing quirky Internet protocols ranging from the low-level: Ethernet, IPv4, ICMP and TCP; to the complex application-level: SSH, DNS and BGP; and even file-system protocols such as 9P. We report on fully-featured SSH and DNS servers constructed using MPL and our OCaml framework MELANGE, and measure greater throughput, lower latency, better flexibility and more succinct source code than their C equivalents OpenSSH and BIND. Our quantitative analysis shows that the benefits of MPL-generated code outweigh the additional overheads of automatic garbage collection and dynamic bounds checking. Qualitatively, the flexibility of our approach shows that dramatic optimisations are easily possible.
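As a rough illustration of what "zero-copy" parsing means here, the Python sketch below reads IPv4 header fields in place from a buffer and returns the payload as a view rather than a copy. It is not MPL and not code generated by MPL (the abstract does not show either); the function name `parse_ipv4_header` and the returned field names are hypothetical, chosen only for this example.

```python
import struct

def parse_ipv4_header(buf):
    """Read selected IPv4 header fields in place (hypothetical helper, not MPL output)."""
    view = memoryview(buf)
    if len(view) < 20:
        raise ValueError("buffer shorter than the minimum IPv4 header")
    # Fixed 20-byte IPv4 header in network byte order.
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack_from("!BBHHHBBH4s4s", view, 0)
    version = ver_ihl >> 4
    header_len = (ver_ihl & 0x0F) * 4  # IHL counts 32-bit words
    if version != 4 or header_len < 20 or header_len > len(view):
        raise ValueError("not a well-formed IPv4 header")
    return {
        "header_len": header_len,
        "total_len": total_len,
        "ttl": ttl,
        "protocol": proto,
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),
        # Slicing a memoryview does not copy the payload bytes.
        "payload": view[header_len:],
    }
```

The bounds checks are written by hand in this sketch; the abstract's point is that a single protocol specification can generate such parsing code and its checks automatically.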
SafeCard: a gigabit IPS on the network card
- in Proceedings of 9th International Symposium on Recent Advances in Intrusion Detection (RAID’06), 2006
Cited by 25 (7 self)
Current intrusion detection systems have a narrow scope. They target flow aggregates, reconstructed TCP streams, individual packets or application-level data fields, but no existing solution is capable of handling all of the above. Moreover, most systems that perform payload inspection on entire TCP streams are unable to handle gigabit link rates. We argue that network-based intrusion detection systems should consider all levels of abstraction in communication (packets, streams, layer-7 data units, and aggregates) if they are to handle gigabit link rates in the face of complex application-level attacks such as those that use evasion techniques or polymorphism. For this purpose, we developed a framework for network-based intrusion prevention at the network edge that is able to cope with all levels of abstraction and can be easily extended with new techniques. We validate our approach by making available a practical system, SafeCard, capable of reconstructing and scanning TCP streams at gigabit rates while preventing polymorphic buffer-overflow attacks, using (up to) layer-7 checks. Such performance makes it applicable in-line as an intrusion prevention system. SafeCard merges multiple solutions, some new and some known. We made specific contributions in the implementation of deep-packet inspection at high speeds and in detecting and filtering polymorphic buffer overflows.
Swift: A Fast Dynamic Packet Filter
Cited by 16 (1 self)
This paper presents Swift, a packet filter for high performance packet capture on commercial off-the-shelf hardware. The key features of Swift include (1) extremely low filter update latency for dynamic packet filtering, and (2) Gbps high-speed packet processing. Based on a complex instruction set computer (CISC) instruction set architecture (ISA), Swift achieves the former with an instruction set design that avoids the need for compilation and security checking, and the latter mainly by utilizing SIMD (single instruction, multiple data). We implement Swift in the Linux 2.6 kernel for both i386 and x86_64 architectures. The Swift userspace library supports two sets of application programming interfaces (APIs): a BPF-friendly API for backward compatibility and an object-oriented API for simplifying filter coding. We extensively evaluate the dynamic and static filtering performance of Swift on multiple machines with different hardware setups. We compare Swift with BPF (the BSD packet filter), the de facto standard for packet filtering in modern operating systems, and with hand-coded optimized C filters that are used for demonstrating possible performance gains. For dynamic filtering tasks, Swift is at least three orders of magnitude faster than BPF in terms of filter update latency. For static filtering tasks, Swift outperforms BPF by up to three times in terms of packet processing speed, and achieves performance much closer to that of the optimized C filters.
Keep net working - on a dependable and fast networking stack
- In Dependable Systems and Networks, 2012
Cited by 10 (6 self)
For many years, multiserver operating systems have been demonstrating, by their design, high dependability and reliability. However, the design has inherent performance implications which were not easy to overcome. Until now, context switching and kernel involvement in message passing were the performance bottleneck that kept such systems from gaining broader acceptance beyond niche domains. In contrast to other areas of software development where fitting the software to the parallelism is difficult, the new multicore hardware is a great match for multiserver systems. We can run individual servers on different cores. This opens more room for further decomposition of the existing servers, thus improving dependability and live-updatability. We discuss in general the implications for multiserver system design and cover in detail the implementation and evaluation of a more dependable networking stack. We split the single stack into multiple servers which run on dedicated cores and communicate without kernel involvement. We think that the performance problems that have dogged multiserver operating systems since their inception should be reconsidered: it is possible to make multiserver systems fast on multicores. Keywords: Operating systems; Reliability; Computer network reliability; System performance
FPL-3: towards language support for distributed packet processing
- In Proceedings of IFIP Networking, 2005
Cited by 9 (4 self)
The FPL-3 packet filtering language incorporates explicit support for distributed processing into the language. FPL-3 supports not only generic header-based filtering, but also more demanding tasks, such as payload scanning, packet replication and traffic splitting. By distributing FPL-3 based tasks across a possibly heterogeneous network of processing nodes, the NET-FFPF network monitoring architecture facilitates very high speed packet processing. Results show that NET-FFPF can perform complex processing at gigabit speeds. The proposed framework can be used to execute such diverse tasks as load balancing, traffic monitoring, firewalling and intrusion detection directly at the critical high-bandwidth links (e.g., in enterprise gateways). Key words: High-speed packet processing, traffic splitting, network monitoring
Enabling flexible packet filtering through dynamic code generation
- in Proceedings of ICC ’08, 2008
Cited by 8 (6 self)
Despite its efficiency, the general approach of hard-coding protocol format descriptions in packet processing applications suffers from many limitations. Among them are the lack of flexibility when the software must be extended to support new protocols, and the proliferation of modules with similar functionality across different applications, which decreases maintainability. The NetPDL language was defined to overcome such limitations by decoupling applications from knowledge of the format of protocol headers. The main criticism of NetPDL relates to its supposed performance penalties; this paper demonstrates that this language can be effectively used for the dynamic generation of optimized, i.e. efficient and fast, packet-processing code, and presents the architecture of a compiler implemented for this purpose.
Dandelion: a compiler and runtime for heterogeneous systems
- in Proc. of the Twenty-Fourth ACM Symp. on Operating Systems Principles. ACM
Cited by 8 (0 self)
Computer systems increasingly rely on heterogeneity to achieve greater performance, scalability and energy efficiency. Because heterogeneous systems typically comprise multiple execution contexts with different programming abstractions and runtimes, programming them remains extremely challenging. Dandelion is a system designed to address this programmability challenge for data-parallel applications. Dandelion provides a unified programming model for heterogeneous systems that span diverse execution contexts including CPUs, GPUs, FPGAs, and the cloud. It adopts the .NET LINQ (Language INtegrated Query) approach, integrating data-parallel operators into general-purpose programming languages such as C# and F#. It therefore provides an expressive data model and native language integration for user-defined functions, enabling programmers to write applications using standard high-level languages and development tools. Dandelion automatically and transparently distributes data-parallel portions of a program to available computing resources, including compute clusters for distributed execution and CPU and GPU cores of individual nodes for parallel execution. To enable automatic execution of .NET code on GPUs, Dandelion cross-compiles .NET code to CUDA kernels and uses the PTask runtime [85] to manage GPU execution. This paper discusses the design and implementation of Dandelion, focusing on the distributed CPU and GPU implementation. We evaluate the system using a diverse set of workloads.
NetSlices: scalable multi-core packet processing in user-space
- In Proc. ANCS, 2012
Cited by 7 (5 self)
Modern commodity operating systems do not provide developers with user-space abstractions for building high-speed packet processing applications. The conventional raw socket is inefficient and unable to take advantage of the emerging hardware, like multi-core processors and multi-queue network adapters. In this paper we present the NetSlice operating system abstraction. Unlike the conventional raw socket, NetSlice tightly couples the hardware and software packet processing resources, and provides the application with control over these resources. To reduce shared resource contention, NetSlice performs domain-specific, coarse-grained, spatial partitioning of CPU cores, memory, and NICs. Moreover, it provides a streamlined communication channel between NICs and user-space. Although backward compatible with the conventional socket API, the NetSlice API also provides batched (multi-)send/receive operations to amortize the cost of protection domain crossings. We show that complex user-space packet processors (like a protocol accelerator and an IPsec gateway) built from commodity components can scale linearly with the number of cores and operate at 10 Gbps network line speeds.
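The amortization argument behind batched send/receive can be made concrete with a simple cost model. The Python sketch below is an assumption-laden illustration, not the NetSlice API: the function name `per_packet_cost` and the cost parameters are hypothetical, and any numbers passed to it are placeholders rather than measurements from the paper.

```python
def per_packet_cost(crossing_cost, work_cost, batch_size):
    """Average cost per packet when one protection-domain crossing serves a whole batch.

    crossing_cost: fixed cost of one user/kernel crossing (arbitrary units, assumed)
    work_cost:     per-packet processing cost that batching does not change (assumed)
    batch_size:    number of packets moved per crossing
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # The fixed crossing cost is shared by every packet in the batch.
    return crossing_cost / batch_size + work_cost
```

With placeholder costs, `per_packet_cost(1000, 100, 1)` gives 1100.0 while `per_packet_cost(1000, 100, 10)` gives 200.0: as the batch grows, the per-packet cost approaches the processing cost alone.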