Sunil Saxena, J. Kent Peacock, Fred Yang, Vijaya Verma, and Mohan Krishnan. Pitfalls in multithreading SVR4 STREAMS and other weightless processes. In Proceedings of the Winter 1993 USENIX Conference, pages 85--96, San Diego, CA, January 1993.
....Computer Science, University of Massachusetts, Amherst, MA 01003. 1 Introduction. In this proposal, we explore scheduling issues in parallelized network protocol processing on shared memory multiprocessors. The use of parallelism in network processing has recently become an area of active research [6, 20, 31, 34, 39, 42, 49, 64, 65, 66, 80] and significant applied commercial interest [20, 31, 55, 64]. The problem area is motivated in part by emerging high speed networks, such as ATM, capable of delivering gigabit range bandwidth to individual host endpoints. Emerging large scale server applications, such as digital information ....
....proposal, we explore scheduling issues in parallelized network protocol processing on shared memory multiprocessors. The use of parallelism in network processing has recently become an area of active research [6, 20, 31, 34, 39, 42, 49, 64, 65, 66, 80] and significant applied commercial interest [20, 31, 55, 64]. The problem area is motivated in part by emerging high speed networks, such as ATM, capable of delivering gigabit range bandwidth to individual host endpoints. Emerging large scale server applications, such as digital information repositories and image archives, will require application level ....
Sunil Saxena, J. Kent Peacock, Fred Yang, Vijaya Verma, and Mohan Krishnan. Pitfalls in multithreading SVR4 STREAMS and other weightless processes. In Proceedings of the Winter 1993 USENIX Conference, pages 85--96, San Diego, CA, January 1993.
....evaluate its potential as a paradigm for communication on shared memory multiprocessor servers. We focus on three issues in this evaluation: • Scalability in processors and connections. Connection level parallelism is an approach advocated for communication on shared memory multiprocessors [21, 65, 67]. We will provide an implementation of CLP, and experimentally demonstrate how throughput scales with the number of processors, and how throughput changes as the number of connections increases. • Fairness behavior. In evaluating the scalability of connection level parallelism, one measure of ....
....connections and within a single connection. The disadvantage is that it requires locking shared state, most notably the protocol state at each layer. Systems using this approach include [5, 24, 29, 31, 50]. A set of connections forms the unit of concurrency in connection level parallelism [21, 60, 65, 67]. Speedup is achieved using multiple connections, which can potentially be processed in parallel. The advantage of this approach is that it exploits the natural concurrency between connections. For each set of connections which form a unit of concurrency, all or part of the protocol stack is ....
Saxena, S., Peacock, J. K., Yang, F., Verma, V., and Krishnan, M. Pitfalls in multithreading SVR4 STREAMS and other weightless processes. In Proceedings of the Winter 1993 USENIX Conference, pages 85--96, San Diego, CA, Jan. 1993.
....Parallelism gains can be achieved mainly through pipelining effects. An example is found in [10]. Connection level parallelism associates connections with a single processor or thread, achieving speedup with multiple connections. Multiprocessor STREAMS most closely matches this model [26, 27]. Functional parallelism decomposes functions within a single protocol and assigns them to processing elements. Examples include [19, 23, 25]. The relative merits of one approach over the others depend on many factors, including the host architecture, the number of connections, whether the ....
Sunil Saxena, J. Kent Peacock, Fred Yang, Vijaya Verma, and Mohan Krishnan. Pitfalls in multithreading SVR4 STREAMS and other weightless processes. In Winter 1993 USENIX Technical Conference, pages 85--96, San Diego, CA, January 1993.
....that processor's cache, thus avoiding accesses to the slower main memory and resulting in faster execution times. In this paper, we evaluate several different affinity based scheduling policies for parallelized protocol processing, an application which has recently generated considerable interest [4, 6, 8, 10, 13, 16, 17, 18, 19]. We consider protocol parallelization paradigms in which each message, during the course of its processing, visits a single processor and executes within the context of a single thread. This captures the parallelization found in several multiprocessor protocol implementations, including ....
.... found in several multiprocessor protocol implementations, including parallelizations of the x-kernel [2, 13], the STREAMS implementation in Plan 9 [17], and the ASX framework [19]. A related form of parallelism is found in the STREAMS implementations in many commercial operating systems [18, 4, 6]. We present two sets of results. First, we show that affinity scheduling can significantly reduce message delay associated with protocol processing, enabling the host to support a greater number of concurrent streams, to provide a higher maximum throughput to individual streams, and to decrease ....
Sunil Saxena, J. Kent Peacock, Fred Yang, Vijaya Verma, and Mohan Krishnan. Pitfalls in multithreading SVR4 STREAMS and other weightless processes. In Proceedings of the Winter 1993 USENIX Conference, pages 85--96, San Diego, CA, January 1993.
.... implementations of TCP and UDP transport protocols built within a multiprocessor version of the x-kernel; [16] examined performance issues in parallelizing TCP-based and UDP-based protocol stacks using a thread-per-request strategy in a different multiprocessor version of the x-kernel; and [21] measured the performance of the TCP/IP protocol stack using a thread-per-connection strategy in a multiprocessor version of System V STREAMS. The results presented in this paper extend existing research on protocol stack parallelism in several ways. First, we measure the performance of a variety ....
Sunil Saxena, J. Kent Peacock, Fred Yang, Vijaya Verma, and Mohan Krishnan. Pitfalls in Multithreading SVR4 STREAMS and other Weightless Processes. In Proceedings of the Winter USENIX Conference, pages 85--106, San Diego, CA, January 1993.
....in the host. The use of parallelism in network protocol processing has recently become an active area of research in both academia [5, 12, 20, 21, 22, 35, 46, 47, 48, 59, 62, 63, 64, 69, 70, 71, 72, 74, 75, 76, 77, 87, 88, 89, 95, 100, 106, 107, 108, 109, 111, 112, 116, 117, 124, 125] and industry [18, 37, 42, 45, 49, 68, 90, 94, 110, 120]. Many approaches to parallelism in network protocols have been proposed. We provide a brief taxonomy of parallelism in protocols here; more detailed surveys can be found in [12, 48]. In general, we attempt to classify approaches by the unit of concurrency, or what it is that processing elements ....
....be acquired, namely, that for the appropriate connection. The disadvantage with connection level parallelism is that no concurrency within a single connection can be achieved. This may be a problem if traffic exhibits locality [60, 73, 81, 93], i.e., is bursty. Systems using this approach include [100, 110, 111, 123]. In packet level parallelism, packets are the unit of concurrency. Sometimes referred to as thread per packet or processor per message, packet level parallelism assigns each packet or message to a single processing element. The advantage of this approach is that packets are processed regardless ....
Saxena, S., Peacock, J. K., Yang, F., Verma, V., and Krishnan, M. Pitfalls in multithreading SVR4 STREAMS and other weightless processes. In Winter 1993 USENIX Technical Conference, pages 85--96, San Diego, CA, Jan. 1993.
....which attempts to manage processors and threads in a manner that reduces cache misses and decreases execution times. In this paper, we explore affinity based scheduling of parallel network protocol processing, an area of research which has recently generated considerable interest (e.g., [3, 8, 19, 23, 28, 32, 33, 34, 40]). The use of parallelism .... [Footnote: An earlier version of this paper was presented at the IEEE Infocom '96 Conference. The paper was selected by the conference as one of its top papers and referred to the Transactions for possible publication after the Transactions' own independent review. This work was ....]
....both increase the bandwidth and decrease the latency of multiprocessor communication. In functional parallelism, an individual packet concurrently visits multiple processors (e.g., [16, 19]). In layer parallelism, packets visit multiple processors in a pipelined fashion (e.g., [33]). Packet-level [3, 8, 11, 13, 14, 23, 28, 32, 33, 34] and connection-level [8, 11, 28, 32, 34, 40] parallelisms enable concurrency at higher levels of granularity. In general, some form of network parallelism is necessary on multiprocessor machines, since the alternative would restrict aggregate network access to the bandwidth capacity of ....
Sunil Saxena, J. Kent Peacock, Fred Yang, Vijaya Verma, and Mohan Krishnan. Pitfalls in multithreading SVR4 STREAMS and other weightless processes. In Proc. Winter 1993 USENIX Conference, pages 85--96, San Diego, CA, January 1993.
....entity (in these cases, the process) is much larger than the time required to entirely reload the referenced memory locations into the cache. In this paper, we explore the benefits of affinity scheduling of parallel networking, an area of research which has recently generated considerable interest [3, 4, 6, 12, 13, 15, 17, 18, 20, 23, 25, 26, 27, 35] and one for which affinity scheduling has not yet been examined. Intuitively, parallel networking is a potential candidate for the technique since packets can be individually scheduled and the time to process a packet is relatively short. However, several aspects of the application domain ....
....of the x-kernel [3, 20], the STREAMS implementation in Plan 9 [23], and the ASX framework [26, 27]. A related form of parallelism is found in the .... [Footnotes: 1. Depending on the coherence state of the referenced cache line. 2. In functional parallelism, an individual packet concurrently visits multiple processors [4, 15, 17]. In layer parallelism, packets visit multiple processors in a pipelined fashion [7, 24, 26]. Packet level [3, 6, 9, 12, 13, 20, 23, 25, 26, 27] and connection level [6, 9, 23, 25, 27] parallelisms enable concurrency at higher levels of granularity. 3. We use the terms thread and process interchangeably.]
Sunil Saxena, J. Kent Peacock, Fred Yang, Vijaya Verma, and Mohan Krishnan. Pitfalls in multithreading SVR4 STREAMS and other weightless processes. In Proceedings of the Winter 1993 USENIX Conference, pages 85--96, San Diego, CA, January 1993.
....kept to a minimum along the fast path of data transfer. The disadvantage with connection level parallelism is that no concurrency within a single connection can be achieved, which may be a problem if traffic exhibits locality [16, 19, 22, 27], i.e., is bursty. Systems using this approach include [29, 33, 34, 38]. In packet level parallelism, packets are the unit of concurrency. Sometimes referred to as thread per packet or processor per message, packet level parallelism assigns each packet or message to a single processing element. The advantage of this approach is that packets are processed regardless ....
S. Saxena, J. K. Peacock, F. Yang, V. Verma, and M. Krishnan. Pitfalls in multithreading SVR4 STREAMS and other weightless processes. In Winter 1993 USENIX Technical Conference, pages 85--96, San Diego, CA, Jan. 1993.
....processing required by connections with individual processes or threads. On a shared memory multiprocessor, performance gains can be realized over multiple connections by executing these threads concurrently on different processors. Previous work on connection level parallelism can be found in [8, 25, 28, 29]. In particular, Schmidt and Suda [29] have shown good scalability of the receive side data path in connection level parallelism, using a thread for each connection, on a 20 processor Sun SPARCcenter 2000. In this paper, we experimentally evaluate connection level parallelism in a number of ....
....classifier [1, 19, 20, 32]. Given a set of connections, threads, and processors, the assignment or mapping between them can be done in a number of different ways. Choosing a mapping defines the granularity of a connection-level parallel implementation. Previous work on connection level parallelism [8, 25, 28, 29] has focused on relatively static assignments of connections to processes. One novel aspect of our implementation is that it allows us to vary the mapping between processors, connections, and threads. We introduce the abstraction of a virtual processor, which allows us to vary this assignment, ....
S. Saxena, J. K. Peacock, F. Yang, V. Verma, and M. Krishnan. Pitfalls in multithreading SVR4 STREAMS and other weightless processes. In Proceedings of the Winter 1993 USENIX Conference, pages 85--96, San Diego, CA, Jan. 1993.
....In contrast, the CI may make allocation decisions based on current communication behavior and on actual levels of resource utilization and availability. This auto configuration attribute of the CI distinguishes our research from other current and past work on configurable communication systems [10, 17, 22, 26, 27, 34, 16]. However, note that auto configuration is also used in recent work that deals with dynamic changes in video traffic mapped to the QoS guarantees offered by ATM networks [30] and that it has proven useful in past research on guaranteeing the predictable behavior of real time applications [3, 8] ....
S. Saxena, J. K. Peacock, et al. Pitfalls in multithreading SVR4 STREAMS and other weightless processes. Proc. USENIX, pages 85--96, Winter '93.
.... implementations of TCP and UDP transport protocols built within a multiprocessor version of the x-kernel; [16] examined performance issues in parallelizing TCP-based and UDP-based protocol stacks using a Thread-per-Request strategy in a different multiprocessor version of the x-kernel; and [21] measured the performance of the TCP/IP protocol stack using a thread-per-connection strategy in a multiprocessor version of System V STREAMS. The results presented in this paper extend existing research on protocol stack parallelism in several ways. First, we measure the performance of a variety ....
Sunil Saxena, J. Kent Peacock, Fred Yang, Vijaya Verma, and Mohan Krishnan. Pitfalls in Multithreading SVR4 STREAMS and other Weightless Processes. In Proceedings of the Winter USENIX Conference, pages 85--106, San Diego, CA, January 1993.
....number of connections with high throughput and predictable jitter and (2) distributed real time applications, where multiple streams of data are collected and processed subject to specific and possibly dynamic rates and timing requirements. While previous work in configurable communication systems [6, 8, 10, 13, 14, 18] addresses similar application domains, the maintenance of QoS requirements by on line auto configuration of protocols pursued in our research constitutes a novel contribution. The remainder of this paper first outlines the COMM adapt software architecture and its prototype (see Section 2) ....
S. Saxena, J. K. Peacock, et al. Pitfalls in multithreading SVR4 STREAMS and other weightless processes. Proc. USENIX, pages 85--96, Winter '93.
.... implementations of TCP and UDP transport protocols built within a multiprocessor version of the x-kernel; [12] examined performance issues in parallelizing TCP-based and UDP-based protocol stacks using a Thread-per-Request strategy in a different multiprocessor version of the x-kernel; and [15] measured the performance of the TCP/IP protocol stack using a thread-per-connection strategy in a multiprocessor version of System V STREAMS. The results presented in this paper extend existing research on protocol stack parallelism in several ways. First, we measure the performance of a variety ....
Sunil Saxena, J. Kent Peacock, Fred Yang, Vijaya Verma, and Mohan Krishnan. Pitfalls in Multithreading SVR4 STREAMS and other Weightless Processes. In Proceedings of the Winter USENIX Conference, pages 85--106, San Diego, CA, January 1993.
....generated memory references are likely to be found in that processor's cache, thus avoiding accesses to the slower main memory and resulting in faster execution times. We study affinity based scheduling of parallel network protocol processing, which has recently become an area of active research [3, 8, 12, 14, 17, 20, 24, 34, 35, 36, 41] and significant applied commercial interest [8, 12, 29, 34]. The use of parallelism in protocol processing is motivated by the development of high speed networks (such as ATM) capable of delivering gigabit range bandwidth to individual machines. Emerging large scale server applications, such as ....
....accesses to the slower main memory and resulting in faster execution times. We study affinity based scheduling of parallel network protocol processing, which has recently become an area of active research [3, 8, 12, 14, 17, 20, 24, 34, 35, 36, 41] and significant applied commercial interest [8, 12, 29, 34]. The use of parallelism in protocol processing is motivated by the development of high speed networks (such as ATM) capable of delivering gigabit range bandwidth to individual machines. Emerging large scale server applications, such as digital multimedia information repositories, require ....
Sunil Saxena, J. Kent Peacock, Fred Yang, Vijaya Verma, and Mohan Krishnan. Pitfalls in multithreading SVR4 STREAMS and other weightless processes. In Proceedings of the Winter 1993 USENIX Conference, pages 85--96, San Diego, CA, January 1993.