A. Garg. Parallel STREAMS: A multi-processor implementation. In Winter 1990.
....Computer Science University of Massachusetts Amherst, MA 01003 1 Introduction In this proposal, we explore scheduling issues in parallelized network protocol processing on shared memory multiprocessors. The use of parallelism in network processing has recently become an area of active research [6, 20, 31, 34, 39, 42, 49, 64, 65, 66, 80] and significant applied commercial interest [20, 31, 55, 64]. The problem area is motivated in part by emerging high speed networks, such as ATM, capable of delivering gigabit range bandwidth to individual host endpoints. Emerging large scale server applications, such as digital information ....
....proposal, we explore scheduling issues in parallelized network protocol processing on shared memory multiprocessors. The use of parallelism in network processing has recently become an area of active research [6, 20, 31, 34, 39, 42, 49, 64, 65, 66, 80] and significant applied commercial interest [20, 31, 55, 64]. The problem area is motivated in part by emerging high speed networks, such as ATM, capable of delivering gigabit range bandwidth to individual host endpoints. Emerging large scale server applications, such as digital information repositories and image archives, will require application level ....
[Article contains additional citation context not shown here]
Arun Garg. Parallel STREAMS: A multi-processor implementation. In Proceedings of the Winter 1990 USENIX Conference, pages 163--176, Washington, D.C., January 1990.
....evaluate its potential as a paradigm for communication on shared memory multiprocessor servers. We focus on three issues in this evaluation: • Scalability in processors and connections. Connection level parallelism is an approach advocated for communication on shared memory multiprocessors [21, 65, 67]. We will provide an implementation of CLP, and experimentally demonstrate how throughput scales with the number of processors, and how throughput changes as the number of connections increases. • Fairness behavior. In evaluating the scalability of connection level parallelism, one measure of ....
....connections and within a single connection. The disadvantage is that it requires locking shared state, most notably the protocol state at each layer. Systems using this approach include [5, 24, 29, 31, 50]. A set of connections forms the unit of concurrency in connection level parallelism [21, 60, 65, 67]. Speedup is achieved using multiple connections, which can potentially be processed in parallel. The advantage of this approach is that it exploits the natural concurrency between connections. For each set of connections which form a unit of concurrency, all or part of the protocol stack is ....
[Article contains additional citation context not shown here]
Garg, A. Parallel STREAMS: a multi-processor implementation. In Proceedings of the Winter 1990 USENIX Conference, Washington, D.C., Jan. 1990.
....this model, protocol processing is treated as work strictly local to each processor, resulting in an implicit sharing between the computation and communication subsystems. An alternative approach treats protocol processing as global work that can be scheduled uniformly on any available processor [10, 11]; this results in explicit sharing between the two subsystems. These approaches may not suffice for QoS sensitive protocol processing since they introduce unpredictability in the availability and allocation of processing resources, and complicate global coordination for network access. Our ....
A. Garg, "Parallel STREAMS: A multiprocessor implementation," in Winter 1990 USENIX Conference, pp. 163--176, January 1990.
....model, protocol processing is treated as work strictly local to each processor, resulting in an implicit sharing between the computation and communication subsystems. An alternative approach treats protocol processing as global work that can be scheduled uniformly on any available processor [67, 99]; this results in explicit sharing between the two subsystems. Implications for QoS sensitive protocol processing: These approaches to exploiting communication parallelism may not suffice for QoS sensitive protocol processing since they introduce unpredictability in the availability and ....
A. Garg, "Parallel STREAMS: A multiprocessor implementation," in Winter 1990 USENIX Conference, pp. 163--176, January 1990.
....that processor's cache, thus avoiding accesses to the slower main memory and resulting in faster execution times. In this paper, we evaluate several different affinity based scheduling policies for parallelized protocol processing, an application which has recently generated considerable interest [4, 6, 8, 10, 13, 16, 17, 18, 19]. We consider protocol parallelization paradigms in which each message, during the course of its processing, visits a single processor and executes within the context of a single thread. This captures the parallelization found in several multiprocessor protocol implementations, including ....
.... found in several multiprocessor protocol implementations, including parallelizations of the x-kernel [2, 13], the STREAMS implementation in Plan 9 [17], and the ASX framework [19]. A related form of parallelism is found in the STREAMS implementations in many commercial operating systems [18, 4, 6]. We present two sets of results. First, we show that affinity scheduling can significantly reduce message delay associated with protocol processing, enabling the host to support a greater number of concurrent streams, to provide a higher maximum throughput to individual streams, and to decrease ....
Arun Garg. Parallel STREAMS: A multi-processor implementation. In Proceedings of the Winter 1990 USENIX Conference, pages 163--176, Washington, D.C., January 1990.
....additional message must be sent upward to inform the Stream Head. This working mode is now automatically set by the XTI library. 3.2.2 Parallelization of the demultiplexed STREAMS stack. STREAMS has been extended to facilitate the development of components on a symmetric multiprocessor platform [Garg90, Kleiman92, Saxena93]. These extensions, which are proprietary, define several levels of parallelism (Figure 3) according to the span of the mutual exclusion section (e.g. the module level allows a single thread to execute in a driver or module). Non parallelized components will run with minimal changes if a module ....
A. Garg, "Parallel STREAMS: a multiprocessor implementation", USENIX, Vol 3, No 1, pp. 163-176, Winter 1990.
....in the host. The use of parallelism in network protocol processing has recently become an active area of research in both academia [5, 12, 20, 21, 22, 35, 46, 47, 48, 59, 62, 63, 64, 69, 70, 71, 72, 74, 75, 76, 77, 87, 88, 89, 95, 100, 106, 107, 108, 109, 111, 112, 116, 117, 124, 125] and industry [18, 37, 42, 45, 49, 68, 90, 94, 110, 120]. Many approaches to parallelism in network protocols have been proposed. We provide a brief taxonomy of parallelism in protocols here; more detailed surveys can be found in [12, 48]. In general, we attempt to classify approaches by the unit of concurrency, or what it is that processing elements ....
Garg, A. Parallel STREAMS: a multi-processor implementation. In Proceedings of the Winter 1990 USENIX Conference, pages 163--176, Washington, D.C., Jan. 1990.
....which attempts to manage processors and threads in a manner that reduces cache misses and decreases execution times. In this paper, we explore affinity based scheduling of parallel network protocol processing, an area of research which has recently generated considerable interest (e.g. [3, 8, 19, 23, 28, 32, 33, 34, 40]). The use of parallelism An earlier version of this paper was presented at the IEEE Infocom 96 Conference. The paper was selected by the conference as one of its top papers and referred to the Transactions for possible publication after the Transactions' own independent review. This work was ....
....both increase the bandwidth and decrease the latency of multiprocessor communication. In functional parallelism, an individual packet concurrently visits multiple processors (e.g. [16, 19]). In layer parallelism, packets visit multiple processors in a pipelined fashion (e.g. [33]). Packet level [3, 8, 11, 13, 14, 23, 28, 32, 33, 34] and connection level [8, 11, 28, 32, 34, 40] parallelisms enable concurrency at higher levels of granularity. In general, some form of network parallelism is generally necessary on multiprocessor machines, since the alternative would restrict aggregate network access to the bandwidth capacity of ....
[Article contains additional citation context not shown here]
Arun Garg. Parallel STREAMS: A multi-processor implementation. In Proc. Winter 1990 USENIX Conference, pages 163--176, Washington, D.C., January 1990.
....entity (in these cases, the process) is much larger than the time required to entirely reload the referenced memory locations into the cache. In this paper, we explore the benefits of affinity scheduling of parallel networking, an area of research which has recently generated considerable interest [3, 4, 6, 12, 13, 15, 17, 18, 20, 23, 25, 26, 27, 35] and one for which affinity scheduling has not yet been examined. Intuitively, parallel networking is a potential candidate for the technique since packets can be individually scheduled and the time to process a packet is relatively short. However, several aspects of the application domain ....
....of the x-kernel [3, 20], the STREAMS implementation in Plan 9 [23], and the ASX framework [26, 27]. A related form of parallelism is found in the .... (Footnotes: 1. Depending on the coherence state of the referenced cache line. 2. In functional parallelism, an individual packet concurrently visits multiple processors [4, 15, 17]. In layer parallelism, packets visit multiple processors in a pipelined fashion [7, 24, 26]. Packet level [3, 6, 9, 12, 13, 20, 23, 25, 26, 27] and connection level [6, 9, 23, 25, 27] parallelisms enable concurrency at higher levels of granularity. 3. We use the terms thread and process interchangeably.)
[Article contains additional citation context not shown here]
Arun Garg. Parallel STREAMS: A multi-processor implementation. In Proceedings of the Winter 1990 USENIX Conference, pages 163--176, Washington, D.C., January 1990.
....processing required by connections with individual processes or threads. On a shared memory multiprocessor, performance gains can be realized over multiple connections by executing these threads concurrently on different processors. Previous work on connection level parallelism can be found in [8, 25, 28, 29]. In particular, Schmidt and Suda [29] have shown good scalability of the receive side data path in connection level parallelism, using a thread for each connection, on a 20 processor Sun SPARCCenter 2000. In this paper, we experimentally evaluate connection level parallelism in a number of ....
....classifier [1, 19, 20, 32]. Given a set of connections, threads, and processors, the assignment or mapping between them can be done in a number of different ways. Choosing a mapping defines the granularity of a connection level parallel implementation. Previous work on connection level parallelism [8, 25, 28, 29] has focused on relatively static assignments of connections to processes. One novel aspect of our implementation is that it allows us to vary the mapping between processors, connections, and threads. We introduce the abstraction of a virtual processor, which allows us to vary this assignment, ....
[Article contains additional citation context not shown here]
A. Garg. Parallel STREAMS: a multi-processor implementation. In Proceedings of the Winter 1990 USENIX Conference, Washington, D.C., Jan. 1990.
....that execute the protocol related computations stacks can handle dynamic variations in the resource requirements, but no details are provided on the corresponding mechanisms. The ADAPTIVE system [27] enables only application level adaptation decisions. The parallel STREAMS implementation in [7] offers only user directed stream configuration, and the policy for runtime scheduling of communication related tasks ignores QoS requirements. The framework presented in [34] supports QoS requirements in the context of functionally decomposed protocol stacks by CI level connection time ....
A. Garg. Parallel streams: a multi-processor implementation. Proc. USENIX, pages 163--176, Winter '90.
....application level adaptation decisions. The framework presented in [18] supports QoS requirements by connection time configuration decided by the CI. [8] only comment on the necessity of runtime adjustments in the mapping of protocol modules to processors. The parallel STREAMS implementation in [4] offers only user directed stream configuration, and the policy for runtime scheduling of communication related tasks ignores QoS requirements. We consider that dynamic auto configurability can provide benefits to protocol architectures as HOPS [5], F-CSS [18], or micro protocols [1] different ....
A. Garg. Parallel streams: a multi-processor implementation. Proc. USENIX, pages 163--176, Winter'90.
....generated memory references are likely to be found in that processor's cache, thus avoiding accesses to the slower main memory and resulting in faster execution times. We study affinity based scheduling of parallel network protocol processing, which has recently become an area of active research [3, 8, 12, 14, 17, 20, 24, 34, 35, 36, 41] and significant applied commercial interest [8, 12, 29, 34]. The use of parallelism in protocol processing is motivated by the development of high speed networks (such as ATM) capable of delivering gigabit range bandwidth to individual machines. Emerging large scale server applications, such as ....
....accesses to the slower main memory and resulting in faster execution times. We study affinity based scheduling of parallel network protocol processing, which has recently become an area of active research [3, 8, 12, 14, 17, 20, 24, 34, 35, 36, 41] and significant applied commercial interest [8, 12, 29, 34]. The use of parallelism in protocol processing is motivated by the development of high speed networks (such as ATM) capable of delivering gigabit range bandwidth to individual machines. Emerging large scale server applications, such as digital multimedia information repositories, require ....
[Article contains additional citation context not shown here]
Arun Garg. Parallel STREAMS: A multi-processor implementation. In Proceedings of the Winter 1990 USENIX Conference, pages 163--176, Washington, D.C., January 1990.
No context found.
A. Garg. Parallel STREAMS: A multi-processor implementation. In Winter 1990.
No context found.
Arun Garg, "Parallel STREAMS: a Multi-Processor Implementation," USENIX (Winter 1990), pp. 163-176.
No context found.
A. Garg. Parallel STREAMS: a multiprocessor implementation. In Proceedings of the Winter 1990 USENIX Conference, pages 163--76, Washington, DC, January 1990.