David Banks and Michael Prudence. "A High-Performance Network Architecture for a PA-RISC Workstation," IEEE Journal on Selected Areas in Communications, 11:2, February 1993, pp. 191 - 202.
....their relationship to architectural parameters or their scalability across different platforms. Previous work in high performance networking has characterized the bottlenecks of various systems and proposed solutions either in the form of hardware support or protocol stack optimizations [1][3][8][23]. Conclusions largely agree with results presented here, emphasizing the cost of interrupt handling and data copying. However, since NIDS systems are not actively participating in network communications, many proposed optimizations such as larger frame sizes are not applicable to NIDS ....
D. Banks and M. Prudence, "A High-performance Network Architecture for a PA-RISC Workstation," IEEE Journal on Selected Areas in Communications, vol. 11, no. 2, Feb. 1993, pp. 191-202.
....user memory, where it is received by the flow library. Finally, the flow library delivers a copy of the message to each listener queue. The goal of this work has not been to achieve best case message delivery within the local host. A large amount of research has already been done in this area [3, 8, 12], and the goal of this prototype has been to demonstrate the general functionality of flows. However, it is easy to imagine how an incorporation of the flow library into the kernel network stack could eliminate one of these copies, making flow message passing similar in local overhead to TCP. A ....
D. Banks and M. Prudence. A high-performance network architecture for a PA-RISC workstation. IEEE Journal on Selected Areas in Communications (Special Issue on High Speed Computer/Network Interfaces), 11(2):191--202, 1993.
....but what performance benefit, if any, do they provide? Does transmitfile() or send_file() show any improvement over the already available mmap() and writev() system calls per byte optimizations. It is well known that data touching operations, such as copying and checksumming, are expensive [5, 11, 13, 23]. BSD derived Unix operating systems [24] use different buffering mechanisms in the file system and the networking code, forcing data to be copied when it is moved from one subsystem to another. (The authors are with the IBM T.J. Watson Research Center.) How well can we approximate a zero copy ....
David Banks and Michael Prudence. A high-performance network architecture for a PA-RISC workstation. IEEE Journal on Selected Areas in Communications, 11(2):191-202, February 1993.
....of these bottlenecks, and identified mechanisms that are at least partially effective in overcoming some of the handicaps. This is evidenced by the considerable literature that has been published in recent years on network interface design, and on the structuring of protocols in operating systems [2,5,6,7,10,12,14,28,32,39,41]. We have attempted to integrate a number of these proven good mechanisms, along with some new ones of our own creation, in our attempt to build a state of the art high performance network interface. This document describes the design and implementation of this network interface chip (NIC) ....
....touches without requiring any changes to applications using the socket API; it has the disadvantage that processor cycles are spent copying the data, and it requires large memories resident on the NIC. There are two implementations of the WITLESS architecture in the literature. The HP Medusa [2] interface was targeted at connecting an HP PA-RISC Apollo series 700 workstation to an FDDI network. The second generation of this design, called the Afterburner [9], was designed to connect the same machine to a variety of network links (including HIPPI and ATM) at up to 1 Gbps. 3.2. Network ....
Banks, D., and Prudence, M., "A High Performance Network Architecture for a PA-RISC Workstation," IEEE JSAC, Vol.11 No. 2, Feb. 1993.
....ATOLL [17] uses the unused upper address bits of an uncached write transaction as an index into a routing table to initiate a DMA transfer. Avalanche places a request control structure in kernel memory and updates a queue counter in the NI using a single uncached write. The Medusa network adapter [7] combines a network packet start address and length into a 32 bit word that is written to a hardware transmit FIFO in a single bus transaction. Since a bus transaction is a natural unit of atomicity in most computer systems, it is tempting to use it to implement atomic message transfer setup. ....
D. Banks and M. Prudence, "A High-performance Network Architecture for a PA-RISC Workstation," IEEE Journal on Selected Areas in Communications, vol. 11, no. 2, Feb. 1993, pp. 191-202.
....appends headers to the data. Assuming a UDP transport protocol, an IP network protocol, and an ethernet link, the data will be appended with a UDP header, an IP header, and an ethernet header before being copied into the NIC buffers. In order to reduce the packet processing and movement overheads [4, 13, 14, 24, 27], many researchers have proposed various network interface architectures. Single copy schemes [4, 13, 14, 27] which move data directly from user space to the NIC buffers, attempt to reduce the multiple memory copies necessary as the data moves down the various protocol stacks. Other schemes move ....
....data will be appended with a UDP header, an IP header, and an ethernet header before being copied into the NIC buffers. In order to reduce the packet processing and movement overheads [4, 13, 14, 24, 27], many researchers have proposed various network interface architectures. Single copy schemes [4, 13, 14, 27], which move data directly from user space to the NIC buffers, attempt to reduce the multiple memory copies necessary as the data moves down the various protocol stacks. Other schemes move some or all of the packet processing onto the NIC [15, 24, 32]. Instead of adding to the NIC, other schemes ....
D. Banks and M. Prudence. High-performance network architecture for a PA-RISC workstation. IEEE Journal on Selected Areas in Communications, 11(2):191--202, February 1993.
....JAC93] reported a new software version of TCP/IP which is not layered, eliminates much of the operating system overhead, and only copies the packet data once between the application buffer and the network buffer of the device driver. A group of researchers at Hewlett-Packard's UK laboratories [BAN93, DAL93, WAT93] developed the Medusa and the Afterburner interface boards for modified buffer pool management to implement something similar to the new TCP/IP version proposed by Jacobson [JAC93]. The researchers reported that the Medusa card enabled their workstation (HP 720) to almost use the entire ....
Banks, D., and Prudence, M. "A High Performance Network Architecture for a PA-RISC Workstation," IEEE Journal on Selected Areas in Communications, Vol. 11, No. 2, February 1993.
....the operating system, the input/output unit, the memory system, and the processor. In general, the problem is that the network interface has been traditionally viewed as just an I/O peripheral that causes unpredictable and infrequent events, and so, not much optimization has been put into it [1, 2, 3, 4, 5]. We first have to differentiate between the concepts of latency and throughput, as they apply to the receive operation from the network interface. Latency is the time duration between a packet arrival at the NIC and its delivery to the application. The throughput, on the other hand, is the rate ....
D. Banks and M. Prudence, "A high-performance network architecture for a PA-RISC workstation," IEEE Journal on Selected Areas in Communications, vol. 11, pp. 191--202, Feb. 1993.
....Increased CPU clock frequency gives the same result as slower memory. 3 Related Work Van Jacobson proposed the WITLESS network adapter design with a large on-board buffer memory [48]. The WITLESS design was used by a group at HP Labs in Bristol in a series of network adapters including Medusa [18], Afterburner [28, 52] and Jetstream [38]. Two more high performance network adapters were developed at about the same time. Traw and Smith [74, 75, 76] implemented an adapter for the IBM RS/6000 and Davie [29, 30, 31] implemented one for DEC workstations. The performance of TCP implementations ....
....[60] Several researchers have reported on work with folding together the TCP or UDP checksum with the user-to-kernel copy in Unix. This design is often called "single copy". It was presented by Jacobson as part of the WITLESS design [48] and implemented in the software for the Afterburner [18, 28] family of network adapters. A single copy design was also used by Partridge and Pink [62] when optimizing the Berkeley UDP implementation in SunOS. Implementing protocols in the application address space is not without problems. Maeda and Bershad [54], Thekkath et al. [73], Edwards and Muir ....
David Banks and Michael Prudence. A high-performance network architecture for a PA-RISC workstation. IEEE Journal on Selected Areas in Communications, 11(2):191--202, February 1993.
....for what we call sender based memory management. 5.2 Copy avoidance Several projects have used page remapping and smart interface buffer allocation to accelerate processor-to-interface communication, including the fbufs work at the University of Arizona [Druschel93], the Medusa FDDI interface [Lumley92, Banks93], and the follow-on Afterburner project [Dalton93]. The Nectar system [Cooper90] allowed applications direct access to its communication interface memory in order to eliminate copies at the cost of all accesses being to memory in the I/O space. It achieved round trip RPC latencies of 500 µs across a ....
D. Banks and M. Prudence. A high performance network architecture for a PA-RISC workstation. IEEE Journal on Selected Areas in Communications 11(2), February 1993.
....but could not be directly compared with interrupts as the network interface was (by design) incapable of generating them. A second implementation of the hardware portion of the host interface architecture has been built as an OC-12c rate ATM Link Adapter for the HP Bristol Labs Afterburner [Banks 93] card. 2 The Afterburner ATM Link Adapter A second implementation of the UPenn SAR architecture has been developed for HP 9000/700 series workstations equipped with Afterburner generic interface cards. This chapter describes the architecture and implementation of the ATM Link Adapter. The ....
....between the general purpose Afterburner and a specific network technology. The link adapter interfaces to the Afterburner through the Link Adapter Control Interface, Tx Serial Port, and Rx Serial Port. Thus far, link adapters have been designed by HP and others for Jet Stream [Dalton 93], FDDI [Banks 93], and HiPPI network technologies. The remainder of this section discusses the architecture and implementation of an ATM link adapter based on the UPenn SAR architecture. Figure 1 shows an Afterburner and ATM Link Adapter. 2.2 Hardware The UPenn SAR architecture [Traw 93a] is the basis for the ....
D. Banks and M. Prudence, "A High-Performance Network Architecture for a PA-RISC Workstation," IEEE Journal on Selected Areas in Communications (Special Issue on High Speed Computer/Network Interfaces), 11(2), pp. 191-202 (February 1993).
....Our C cluster consists of eight HP 9000/720 Precision Architecture (PA-RISC) workstations running at 50 MHz. Each machine is equipped with 32 MB RAM, a Medusa FDDI and an Intel Ethernet controller. The Medusa board is an experimental FDDI adapter adhering to the Afterburner specification [6, 2]. The FDDI Medium Access (MAC) standard specifies a maximum frame length (MTU) of 4500 bytes [14]. However, the Medusa board is capable of dealing with frames up to 8 KB. As we were interested in the performance of our protocols and not in the performance of FDDI itself, we did not limit frames to ....
D. Banks and M. Prudence. A high performance network architecture for a PA-RISC workstation. IEEE Journal on Selected Areas in Communication, 11(2):191--202, Feb. 1993.
....dependent information to the SDL specification. A standard implementation of TCP/IP copies data from the program buffer first to the kernel buffer and then from the kernel buffer to the network interface, see figure 7. The transport layer reads the data from the kernel buffer to compute checksum and header [1], [2]. Some processors are able to calculate the checksum while copying the data. This is done without reducing the copy rate. Using a new network interface makes it possible to move the socket buffer to the network interface and to eliminate the kernel buffer. The data are copied directly from application to ....
....application to network interface where it is held until an acknowledgement has been received. The function of this buffer is logically identical to the socket buffer, but we can eliminate one copy function and reduce the number of system memory accesses. This is called a single copy protocol stack [1]. FIGURE 7. Data paths in a conventional protocol stack, modified from [1]. ....
[Article contains additional citation context not shown here]
D. Banks, M. Prudence. A High-Performance Network Architecture for a PA-RISC Workstation. IEEE Journal on Selected Areas in Communication, Vol 11, No 2, February 1993.
....of pipelines, load delay slots, and scoreboarding to minimize memory access delays; [39] describes an algorithm for checksumming with minimal loss of performance on architectures lacking carry bits. An alternate strategy is to implement the Internet checksum on the network adapter [4][38][57], but that requires extra adapter hardware or extra muscle from an onboard CPU. For example, DEC's FDDI adapter, which does not support onboard Internet checksum processing, is based on the relatively simple 16-bit MC68000 processor [55][65]. SGI's FDDI adapters, ....
....is sufficient to result in the effective elimination of checksum overhead [35], but it does not work as well on other machines, such as machines based on MIPS and Alpha processors [41][57][16]. 2.2.2 Copying From User to Kernel The HP Medusa and Afterburner adapters [4][24] introduce a design which largely removes the performance impact of a copy between adapter and kernel by placing the kernel buffers on the adapter. These adapters each include several megabytes of VRAM from which the kernel allocates network buffers. Data is moved between the host ....
[Article contains additional citation context not shown here]
D. Banks, M. Prudence, "A High-Performance Network Architecture for a PA-RISC Workstation," IEEE Journal on Selected Areas in Communications, pp. 191-202, February 1993.
....n) s Gamma 1) l n=w) l. In our performance model, we will use the maximum latency, L(s; n), instead of the average latency because Panda performance is determined by the slowest node. The raw network bandwidth w for FDDI is 100 Mb/s, and the raw network latency l is extracted from [Banks93] as 10 µs. With the adjusted latency, the time for S senders to each send a k-byte message is T1 = o + L(s; k) + G(k − 1). For S processes each to send n k-byte messages, it takes time Tn = o + L(s; nk) + nG(k − 1) + (n − 1) max(g; o). In a more general form, the time for a process to send n1 ....
D. Banks and M. Prudence, A High-Performance Network Architecture for a PA-RISC Workstation, IEEE Journal on Selected Areas in Communications, Vol. 11, No. 2, pages 191-202, February 1993.
....Interfaces, VI, have been implemented to probe the concept of VI as well as probe the reduction in latency and bandwidth. They have used different NICs such as Myrinet and even Ethernet [Dunn98] and [Eick98]. However, this work was previously started with PA-RISC network interface architectures [Banks93] and virtual protocols for Myrinet as stated in [Rosu95]; moreover, several researchers have tried to localize the bottlenecks and performance improvements in NICs, like the work done by [Davi93] and [Rama93], in which they have stated the general memory management concepts as well as I/O handling ....
Banks, D., Prudence M. "A high-performance Network Architecture for a PA-RISC Workstation", IEEE Journal on Selected Areas in Communications, vol. 11, No. 2, February 1993, pp 191-202.
.... incurred by APIs using ATM networks [13]. Clark analyzed the TCP protocol processing time and found the protocol processing time was not very significant [8]. Banks and Prudence presented an improvement for higher level protocols by reducing the number of data copies required across the system bus [6]. Zitterbart proposed a functional based communication model that allows applications to request individually tailored services from the network subsystem [19]. Other researchers have studied the performance of the network interface. Berenbaum et al. designed a programmable ATM host interface ....
D. Banks and M. Prudence. A High-Performance Network Architecture for a PA-RISC Workstation. IEEE Journal on Selected Areas in Communications, 11(2):191--202, February 1993.
....the length of the instruction path, minimum possible memory moves (so called single copy architectures), low overhead process structures (i.e. modification of the protocol syntax), minimum number of calls to the operating system, etc. must be addressed as well [DDK 90] [LaPS91] [MS92] [BP93] [DAP 93] [DWB 93]. Furthermore, to compensate for the increased ratio of propagation delay to cell/packet transmission time, some form of structural and/or functional parallelism (or pipelining) must be used in processing the different protocols and/or data structures involved in a ....
Banks, D. and Prudence, M., "A High-Performance Network Architecture for a PA-RISC Workstation," IEEE Journal on Selected Areas in Communications, Vol. 11, No. 2, pp. 191 - 202, February 1993.
No context found.
David Banks and Michael Prudence. "A High-Performance Network Architecture for a PA-RISC Workstation," IEEE Journal on Selected Areas in Communications, 11:2, February 1993, pp. 191 - 202.
No context found.
D. Banks and M. Prudence. A High-Performance Network Architecture for a PA-RISC Workstation. Journal of Selected Areas in Communications, pages 191--202, February 1993.
No context found.
D. Banks and M. Prudence, "A High Performance Network Architecture for a PA-RISC Workstation", IEEE Journal on Selected Areas in Communications, vol. 11, no. 2, Feb. 1993.
No context found.
D. Banks and M. Prudence. A high-performance network architecture for a PA-RISC workstation. IEEE Journal on Selected Areas in Communication, 11(2):191--202, February 1993.
No context found.
Banks, D., Prudence, M.:"A High-Performance Network Architecture for a PA-RISC Workstation", IEEE Journal on Selected Areas in Communications, Vol. 11, No. 2, February 1993, pp. 191 - 202
No context found.
D. Banks and M. Prudence. A High-Performance Network Architecture for a PA-RISC Workstation. Journal of Selected Areas in Communications, pages 191--202, February 1993.
No context found.
David Banks and Michael Prudence, "A High-Performance Network Architecture for a PA-RISC Workstation", IEEE J. Select. Areas Commun., vol. 11, no. 2, pp. 191-202, Feb. 1993