L.-F. Cabrera, E. Hunter, M. J. Karels, D. A. Mosher, "User-Process Communication Performance in Networks of Computers," IEEE Transactions on Software Engineering, 14(1), 38-53, Jan. 1988.
....a reply. We can conclude that one module will have to be devoted to this duty. How the request is received, and the data is returned, is illustrated in Figure 3. 3: msg get (msg p, s s , name specification, environment specification) Read the contents of the cache ; reply (msg p, s s F[12] , cache[i] datestamp, cache[i] timestamp,cache[i] channel) It is, however, convenient to assign yet another responsibility to the Collection Layer. This is based on the recognition of a user's demand for information concerning the loggers. It should be possible for a user to retrieve ....
....by the TCP protocol can be high, and the difference between TCP and UDP in average response time can be significant. [58] suggests that the choice between these two protocols should depend on the reliability of the network environment. A performance analysis project conducted by Cabrera et al. [12] has examined the variability of throughput for the TCP and UDP protocols. In a system with unloaded hosts and network, like the performance analysis projects of this thesis, they have found that the throughput varies with the size of the messages. They found the UDP protocol to be twice as fast ....
Cabrera, Luis-Felipe, et al.: "User-process communication performance in networks of computers", IEEE Transactions on Software Engineering, Vol. 14 (1), 1988, pp. 38-53.
....and SunOS 5.5 (Solaris) interconnected by a 10 Megabit/sec Ethernet LAN, and comment on the results obtained. Related Work The user-level network performance in networks of UNIX workstations has been the subject of study in several previous papers. Very good work on UNIX internals is found in [2] and [3]. While the former paper is perhaps the more comprehensive study (it also considers UDP), the latter investigates a more up-to-date technology and focuses mainly on TCP. In [4] the communication performance of various CORBA (Common Object Request Broker Architecture) and RPC (remote ....
....possibly interfere with the measurement process, and thus produce an outlier value. This requirement is even more stringent in a non-isolated environment, such as a local network connected to the Internet, where interferences from the outside, like an electronic mail arrival, can also occur. In [2] an isolated network is considered, and a repetition count of 4000 is found to be necessary for achieving an adequate degree of confidence for the throughput measurements; for the assessment of the protocol implementations a repetition count of 10,000 is used. We also use a repetition count of ....
[Article contains additional citation context not shown here]
L. F. Cabrera, E. Hunter, M. J. Karels, and D. A. Mosher. User-Process Communication Performance in Networks of Computers. IEEE Transactions on Software Engineering, 14(1), 38--53, 1988.
....and by operation. By layer, we measure the individual processing times of the socket, UDP, IP, link, and device driver software. By operation, we measure the processing times for various copying, checksumming, and malloc/free operations. Other studies have shown these operations to be expensive [3-5, 7]. Cabrera et al. discuss a number of network software bottlenecks in the 4.2 release of Berkeley Unix [3]. Clark notes that checksumming and to a lesser extent copying were the dominant costs in the Multics TCP/IP implementation [4]. Clark et al. discuss copy and checksum costs versus other ....
....driver software. By operation, we measure the processing times for various copying, checksumming, and malloc/free operations. Other studies have shown these operations to be expensive [3-5, 7]. Cabrera et al. discuss a number of network software bottlenecks in the 4.2 release of Berkeley Unix [3]. Clark notes that checksumming and to a lesser extent copying were the dominant costs in the Multics TCP/IP implementation [4]. Clark et al. discuss copy and checksum costs versus other protocol costs in the context of a precursor to the 4.3 Reno release of Berkeley Unix and an experiment that . ....
L.-F. Cabrera, E. Hunter, M. J. Karels, D. A. Mosher, "User-Process Communication Performance in Networks of Computers," IEEE Transactions on Software Engineering, 14(1), 38-53, Jan. 1988.
....network software, in particular, the TCP/IP and UDP/IP protocol stacks. In the past, significant focus has been placed on maximizing throughput, noting that data-touching operations such as computing checksums and data copying are responsible for the primary bottlenecks in throughput performance [3, 5-8]. Maximal throughput is typically achieved by sending large messages. However, most TCP/IP packets on local area networks are smaller than 200 bytes [9]. Furthermore, almost all IP packets sent across wide area networks are no more than 576 bytes long [4] because of the suggested default TCP ....
.... data movement (DataMove), data structure manipulations (DataStruct), error checking (ErrorChk), network buffer management (Mbuf), operating system functions (OpSys), and protocol-specific processing (ProtSpec). Other studies have shown some of these overheads to be expensive [3, 5-8, 11, 22]. Checksum: Computing checksums is accomplished by a single procedure, the Internet checksum routine [1], which is performed on data in TCP and UDP, and on the header in IP. DataMove: Data movement includes work involved in the moving of data from one place to another. These operations are ....
[Article contains additional citation context not shown here]
L.-F. Cabrera, E. Hunter, M. J. Karels, D. A. Mosher, "User-Process Communication Performance in Networks of Computers," IEEE Transactions on Software Engineering, 14(1), 38-53, January 1988.
....latency is most in need of improvement. Processing time breakdowns for control packets 64-128 bytes long fundamentally differ from those of 8 kbyte long NFS data packets. The processing time of large packets is dominated by data-touching operations such as copying and checksumming BibRef[11]BibRef[19]BibRef[21]BibRef[22]BibRef[67] because these operations must be applied to each byte. However, small packets have few bytes, and thus their processing time is dominated by non-data-touching operations. This property is useful for drawing a distinction between large and small packets: ....
....debate. The additional memory allocation overheads resulting from the more complex mbufs are the subject of still more debate; most variants of Berkeley Unix use somewhat different network buffer allocators. BibRef[43] explains the Berkeley Unix network buffer descriptors, called mbufs. BibRef[11] details how the mbuf allocation policy, tuned towards memory conservation in early versions of Berkeley Unix intended to run on 4 MB VAXes, produces counterintuitive performance behavior; this behavior is also observable in the Ultrix kernel measured for this study. BibRef[32] describes the ....
[Article contains additional citation context not shown here]
L.-F. Cabrera, E. Hunter, M. J. Karels, D. A. Mosher, "User-Process Communication Performance in Networks of Computers," IEEE Transactions on Software Engineering, v14n1, pp. 38-53, Jan. 1988.
....and analytical studies Several studies have attempted to study the effects of multiprogramming on communication performance experimentally as well as analytically. An in-depth study of the effect of background load on communication performance for the 4.2 BSD TCP/IP protocol stack is reported in [31]. The delivered end-to-end delay and throughput for TCP and UDP traffic is measured against background load for a range of message sizes. The effect of background Ethernet load on TCP and UDP performance is studied in [21]. More recently, the effect of background load on the performance of SunOS ....
L. F. Cabrera, E. Hunter, M. J. Karels, and D. A. Mosher, "User-process communication performance in networks of computers," IEEE Trans. Software Engineering, vol. 14, no. 1, pp. 38--53, January 1988.
.... also show that protocol performance is influenced by several additional factors, including the size of each protocol's packet, the flow control algorithm used by the protocol, the underlying buffer management scheme, the overhead involved in parsing headers, and various hardware limitations [36, 16, 5]. In addition to suggesting guidelines for designing new protocols and proposing hardware designs that support efficient implementations, these studies also make the point that providing the right primitives in the operating system plays a major role in being able to implement protocols ....
L.-F. Cabrera, E. Hunter, M. Karels, and D. Mosher. User-process communication performance in networks of computers. IEEE Transactions on Software Engineering, SE-14(1):38--53, Jan. 1988.
....gettimeofday() to measure the time to receive the message. The two hosts exchanged initial synchronization messages using the three-way handshake protocol to ensure the connection was established before timing operations began. This technique has been widely used in network throughput measurement [2, 5, 8, 13]. We used TCP because it can provide reliable transmission. We first chose two end workstations (e.g. grunt01 and grunt02, or grunt01 and grunt05) and found that the maximum TCP throughput could be up to 94 Mbps. This is because the bottleneck in the communication path grunt01page01 grunt02 (or ....
....swinging, at any given time the computing speed may outpace that of the communications infrastructure, or vice versa. For example, an early paper which measured the Ethernet performance showed that using VAX780 or Sun II the maximum throughput was only around 750 Kbits/sec, much less than 10 Mbits/sec [2]. We also used four pairs of stations to send/receive packets simultaneously and then calculated the total throughput by summing up the result from each pair. We found that the aggregate throughput was 770 Mbps. If each station ran both sender and receiver programs, the throughput could be up to ....
L. Cabrera, E. Hunter, M. J. Karels, and D. A. Mosher. User-Process Communication Performance in Networks of Computers. IEEE Trans. on Software Engineering, 14(1):38--53, Jan. 1988.
....5.1: Unix Internet Domain Datagram Socket Timings (Send Only) The timings in figure 5.1 were taken on a machine with a relatively light load, and are therefore the ideal case. A more loaded machine will not show local sockets to have such a clear performance gain over internet domain. See [9] for detailed discussion and further references to the effect of load on UDP/IP performance. Taking all of the above into consideration, it was decided that all IKM socket communications should be carried out using internet domain sockets. 5.3 Memory Copy Scatter Gather Write 5.3.1 Contiguous ....
CABRERA, L.F., HUNTER, E., KARELS, M.J. and MOSHER, D.A. (1988). User-Process Communication Performance in Networks of Computers. IEEE Transactions on Software Engineering, Vol. 14, 1, 38.
....This waiting time is denoted by $t_{wn}$. The end-to-end communication delay $T$ experienced by a message sent from one process to another is $T = t_S + t_{wS} + t_n + t_{wn} + t_R + t_{wR}$. While $t_S$ and $t_R$ are constants, $t_{wS}$, $t_{wn}$ and $t_{wR}$ are variable delays depending on the network and station loads [1]. This model is similar to those presented in [13] and [9]. For an Ethernet network, $t_p$ can be reasonably assumed constant (as in [13]) since it is determined by the length of the bus. For FDDI, $t_p$ is not a constant; we should count the latencies of the stations located between the sending and ....
L. F. Cabrera, E. Hunter, M. Karels, and D. Mosher. User-Process Communication Performance in Networks of Computers. IEEE Transactions on Software Engineering, 14(1), 1988.
....except those generated by the replication protocol under study. • We consider a message loss rate of $10^{-6}$ (it includes message collisions and transmission errors). In-house observations and studies in Ethernet networks show that the actual loss of messages is about one in two million [17] whenever the network traffic is far below the achievable throughput [18]. This assumption is fulfilled in the context of our simulations, where the generated load is under 15% of the available bandwidth (this can be verified by computing the number of messages generated during the simulation) ....
L. F. Cabrera, E. Hunter, M. Karels, and D. Mosher, "User-process communication performance in networks of computers," IEEE Trans. Software Engineering, vol. 14, Jan. 1988.
....security tools, (4) network monitoring tools, and (5) network benchmarks. Of course, the number of papers related to specific computer network performance issues is very large. One major focus has been to evaluate and analyze techniques for improving the response time of communications mechanisms [3, 9, 10, 12, 2]. These studies are valuable for identifying specific pieces of the network communications mechanism which significantly contribute to overall performance. However, these research efforts are not intended to provide general tools for evaluating and comparing the performance of specific network ....
Cabrera, L., E. Hunter, M.J. Karels, and D.A. Mosher, User-Process Communication Performance in Networks of Computers. IEEE TSE, Vol 14, No. 1, pp. 38-53, 1988.
....time is denoted by $t_{wn}$. The end-to-end communication delay $T$ experienced by a message exchanged between two distant processes is $T = t_S + t_{wS} + t_n + t_{wn} + t_R + t_{wR}$. While $t_S$ and $t_R$ are constants, $t_{wS}$, $t_{wn}$ and $t_{wR}$ are variable delays depending on the network and station loads [1]. This model is similar to those presented in [11] and [8]. For an Ethernet network, $t_p$ can be reasonably assumed constant [11] since it is determined by the length of the bus. For FDDI, $t_p$ is not a constant; we should count the latencies of the stations located between the sending and the ....
L. F. Cabrera, E. Hunter, M. Karels, and D. Mosher. User-Process Communication Performance in Networks of Computers. IEEE Transactions on Software Engineering, 14(1), 1988.
....category of overhead includes all the operations which are too small to measure. Its time was computed by taking the difference between the total processing time and the sum of the times of all the other categories listed above. Other studies have shown some of these overheads to be expensive [CHKM88][CJRS89][WM87]. We measured the total amount of execution time spent in the TCP/IP and UDP/IP protocol stacks as implemented in the DEC Ultrix 4.2a kernel, to send and receive (IP) packets of a wide range of sizes, broken down according to the categories listed above. All measurements were taken ....
L-F. Cabrera, E. Hunter, M. Karels, D. Mosher, "User-Process Communication Performance in Networks of Computers," IEEE Trans. on Software Engineering, Vol. 14, No. 1, January 1988, pp. 38-53.
....interfaces that are more appropriate for network based multicomputer applications will be developed in parallel or on top of sockets. 3 The Host Network Interface Architecture Many papers have been published that report measurements of the overheads associated with communicating over networks [12, 4, 13, 14, 15, 16]. Even though it is difficult to compare these results because the measurements are made for different architectures, protocols, communication interfaces, and benchmarks, there is a common pattern: there is no single source of overhead. The time spent on sending and receiving data is distributed ....
L.-F. Cabrera, E. Hunter, M. J. Karels, and D. A. Mosher, "User-Process Communication Performance in Networks of Computers," IEEE Transactions on Software Engineering, vol. 14, pp. 38--53, January
....of end-to-end behavior in the current environment would be of great benefit. To perform such a study we have undertaken a series of experiments aimed at replicating the previous NetDyn experiments in the current environment. Other studies of network behavior have appeared in the literature [11, 5, 4, 9]. In [11], the study focuses on building analytical models for network experiments. While we use some of the same analysis techniques, our focus is on characterizing network traffic. The problem of inadvertent synchronization is discussed in [5], which observed network behavior using ping. Ping ....
....which observed network behavior using ping. Ping uses ICMP packets which are treated differently than user-level packets by some gateways, and so would be unsuitable for a study of user-level behavior. The experiments discussed in [9] rely on kernel modifications to observe network behavior. In [4], user-level performance was investigated. In this study, all of the hosts for the experiment were in one location (UC Berkeley). Our experiments differ from these studies in that we wanted to observe user-level performance of end-to-end paths over a wide area with minimal interference. 2 ....
L.F. Cabrera, E. Hunter, M. Karels, and D.A. Mosher. User-process communication performance in networks of computers. IEEE Trans. on Software Eng., 14(1):38--53, 1988.
....except those generated by the replication protocol under study. • We consider a message loss rate of $10^{-6}$ (it includes message collisions and transmission errors). In-house observations and studies in Ethernet networks show that the actual loss of messages is about one in two million [5] whenever the network traffic is far below the achievable throughput [19]. This assumption is fulfilled in the context of our simulations, where the generated load is under 15% of the available bandwidth (this can be verified by computing the number of messages generated during the simulation) ....
L. F. Cabrera, E. Hunter, M. Karels, and D. Mosher. User-process communication performance in networks of computers. IEEE Transactions on Software Engineering, 14(1), January 1988.
....a general-purpose transport protocol can be effective in a wide range of distributed applications. This idea was also followed in our approach. The third area, performance-oriented issues in the design and implementation of communication protocols, is covered by a wealth of papers, such as [Cabrera, Hunter, Karels, Mosher, 1988] or [Son, Chang, 1990], to mention only a few. They study the impact of different processors, network hardware interfaces across machines and also the effect of the loading of hosts and communication media to catch the dynamic behavior of the communication facilities. 3. The Logical Protocol Unit ....
Cabrera, L.-F., Hunter, E., Karels, M.J., and Mosher, D.A. User-Process Communication Performance in Networks of Computers, IEEE Transactions on Software Engineering, Vol. 14, No. 1, pp. 38--53, 1988
....and implementation alternatives using their model. Measurement of network software delays was attempted by Bhargava, et al. [3], who measured delay at the UDP level on a Sun and compared it to an experimental replacement they had designed. Delay for TCP and UDP in 4.2BSD was measured in [4], on both Sun IIs and on VAX-11/780s and VAX-11/750s. They measured timings for thousands of packets on both unloaded and loaded Ethernets, on both 10MBps and 3MBps Ethernet. They also used a profiler to break down time spent in the various routines that make up the BSD 4.2 TCP/IP and UDP/IP ....
....rcp had the best performance and NFS had the worst. However, since NFS is the easiest to use, the added convenience of NFS may offset its performance penalty for many applications. We plan to continue to investigate file copy performance under varying network load. A number of recent papers [4, 5] have reported on the causes of delay in the TCP, UDP and IP layers on a Unix workstation. The relative contribution of RPC and disk accesses would complete the picture and allow for a complete accounting of file transfer times. We also intend to explore the effect of varying the number of ....
L. Cabrera, E. Hunter, M. J. Karels, and D. A. Mosher, "User-Process Communication Performance in Networks of Computers," IEEE Trans. on Software Eng., vol. 14, no. 1, January 1988, pp. 38-53.
.... independent uniform random variables, and no attempt was made to order requests to schedule the disk arm, pessimistic assumptions when advanced layout policies are used [12]. The data transfer processing costs were taken into account by assuming that protocol processing required 1500 instructions [13] plus 1 instruction per byte in the packet. The load that could be carried depended both on the number of disks used and the block size. The delay was dominated by the disk, with an average seek time of 16 milliseconds, an average rotational delay of 8.3 milliseconds and a transfer rate of 2.5 ....
L.-F. Cabrera, E. Hunter, M. J. Karels, and D. A. Mosher, "User-process communication performance in networks of computers," IEEE Transactions on Software Engineering, vol. 14, pp. 38--53, Jan. 1988.
....was almost as significant as the number of disks. The per message network data transfer processing costs are also an important factor in the effect of the transfer unit. For example, it was assumed that protocol processing required 1500 instructions plus 1 instruction per byte in the packet [14]. As the size of the packet increases, the protocol cost decreases proportionally to the packet size. The cost of 1 instruction per byte in the packet is for the most part unavoidable, since it reflects necessary data copying. In figure 6 we see that the demands on the processor are significant ....
L.-F. Cabrera, E. Hunter, M. J. Karels, and D. A. Mosher, "User-process communication performance in networks of computers," IEEE Transactions on Software Engineering, vol. 14, pp. 38--53, Jan. 1988.
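The cost model quoted in the excerpt above (1500 instructions per packet plus 1 instruction per byte) can be illustrated with a minimal sketch. The function names and the sample packet sizes are hypothetical and chosen only for illustration; the two constants are the figures stated in the excerpt. The sketch reads "the protocol cost decreases" as the cost per byte, which is an interpretation rather than something the excerpt states explicitly.

```python
# Illustrative sketch of the quoted cost model; names are hypothetical.
FIXED_INSTRUCTIONS = 1500      # per-packet protocol processing (from the excerpt)
INSTRUCTIONS_PER_BYTE = 1      # per-byte cost, e.g. data copying (from the excerpt)


def protocol_cost(packet_bytes: int) -> int:
    """Total instructions to process one packet under the quoted model."""
    return FIXED_INSTRUCTIONS + INSTRUCTIONS_PER_BYTE * packet_bytes


def cost_per_byte(packet_bytes: int) -> float:
    """Instructions per payload byte; the fixed part is amortized over the packet."""
    return protocol_cost(packet_bytes) / packet_bytes


if __name__ == "__main__":
    # Sample sizes are assumptions, not values taken from the cited papers.
    for size in (64, 512, 1500, 8192):
        print(f"{size:5d} bytes: {protocol_cost(size):5d} instructions, "
              f"{cost_per_byte(size):.2f} per byte")
```

Under this reading, the per-byte cost approaches the unavoidable 1 instruction per byte as packets grow, while small packets are dominated by the fixed 1500-instruction term.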
....performance. As it is, our model provides a lower bound on the data rates that could be achieved. Transmitting a message on the network requires protocol processing, time to acquire the token, and transmission time. The protocol cost for all packets has been estimated at 1,500 instructions [18] plus one instruction per byte in the packet. The time to transmit the packet is based on the network transfer rate. 5.2 Simulation Results The simulator gave us the ability to determine what data rates were possible given a configuration of processors, interconnection medium and storage ....
L.-F. Cabrera, E. Hunter, M. J. Karels, and D. A. Mosher, "User-process communication performance in networks of computers," IEEE Transactions on Software Engineering, vol. 14, pp. 38--53, Jan. 1988.
....improve performance. As it is, our model provides a lower bound on the data rates that could be achieved. Transmitting a message on the network requires protocol processing, time to acquire the token, and transmission time. The protocol cost for all packets has been estimated at 1,500 instructions [22] plus one instruction per byte in the packet. The time to transmit the packet is based on the network transfer rate. These estimated costs remain adequate as newer hardware technology has yet to decrease the total software overhead of accessing the network. 5.2 Simulation Results The simulator ....
L.-F. Cabrera, E. Hunter, M. J. Karels, and D. A. Mosher, "User-process communication performance in networks of computers," IEEE Transactions on Software Engineering, vol. 14, pp. 38--53, Jan. 1988.
No context found.
L.-F. Cabrera, E. Hunter, M. J. Karels, D. A. Mosher, "User-Process Communication Performance in Networks of Computers, " IEEE Transactions on Software Engineering, 14(1), 38-53, Jan. 1988.
No context found.
L. F. Cabrera, E. Hunter, M. Karels and D. Mosher, `User-process communication performance in networks of computers', IEEE Trans. Software Eng., SE-14, 38--53 (1988).
No context found.
L. Cabrera, E. Hunter, M. Karels, and D. Mosher. User-process communication performance in networks of computers. IEEE Transactions on Software Engineering, 14(1):38--53, 1988.