42 citations found.
K. K. Parhi, "Algorithm transformation techniques for concurrent processors," Proc. IEEE, vol. 77, pp. 1879--1895, Dec. 1989.

This paper is cited in the following contexts:

Behavioral Optimization Using the Manipulation of Timing.. - Potkonjak, Srivastava (1998)   (4 citations)

....during an iteration: it consumes a single data sample from each of its incoming data edges and produces a single sample on each of its output edges. However, single rate CDFGs are not sufficient to represent the complexity of many present day DSP designs, which often require multirate CDFGs [28] [35]. In multirate systems, the computation loop may require different nodes to be executed a different number of times in a single computation iteration. While we do not develop the theoretical framework underlying rephasing of multirate CDFGs, in this section, we show that rephasing is indeed ....

....not develop the theoretical framework underlying rephasing of multirate CDFGs, in this section, we show that rephasing is indeed applicable to such CDFGs and carries similar benefits as in rephasing of single rate CDFGs. Fig. 9(a) shows an example of a multirate CDFG. Following the notation of [35], the numbers at the inputs of a node represent the number of samples consumed by it from that input on each invocation of the node. Similarly, the numbers at the outputs of a node represent the number of samples produced by it on that output on each invocation. In this example, node A fires twice ....

[Article contains additional citation context not shown here]

K. K. Parhi, "Algorithm transformation techniques for concurrent processors," Proc. IEEE, vol. 77, pp. 1879--1895, Dec. 1989.


A Simple Hardware Implementation of the Tabu Search.. - Traferro, Uncini (1999)

....of the edge is x at the nth iteration, then x at the (n-1)st iteration is in the output side of the arc: $x(n) \xrightarrow{D} x(n-1)$. The tabu set, T, is supposed to have a constant number of elements and it equals the tabu tenure, t. All the closed paths in figure 1 have a simple unit delay, so, as argued in [13], no unfolding process is necessary at this abstraction level to improve the parallel degree of the system. Surveying the data dependences, we can indicate the executive temporal sequence of the algorithm exploiting the precedence graph shown in figure 2. The generation of the CL is the heavier ....

K.K. Parhi, "Algorithm Transformation Techniques for Concurrent Processors", Proc. of IEEE, vol. 77, n.12, Dec. 1989, pp.1879-1895.


Algorithms Transformation Techniques for Low-Power Wireless VLSI .. - Shanbhag (1998)

....tools based on such a paradigm will be necessary to realize complex VLSI systems for signal processing and communications. The present design trend (see Fig. 1(b)) is to incorporate VLSI issues as constraints into the algorithm design phase. In particular, algorithm transformation techniques [8] were proposed as an intermediary step in the translation to VLSI hardware. These techniques were originally developed for high throughput applications. However, they have found applications in low power design as well [9]. Algorithm transformation techniques modify the algorithm structure ....

....it by removing arcs with non zero delays. Thus, the iteration period of the DFG in Fig. 3 is 30 time units. The critical path of a DFG is a path p such that d(p) = IP. The goal of most algorithm transformation techniques is to reduce the delay of the critical path. The iteration period bound (IPB) [8] for a DFG is defined as follows: $\mathrm{IPB} = \max_{L} \frac{\sum_{v \in L} d(v)}{\sum_{e \in L} w(e)}$ (7), where L is a loop in the DFG, where a loop is defined as a path p whose source and destination nodes are identical. Note that IP can be altered via the application of various algorithm transformation techniques. However, the IP will always ....

[Article contains additional citation context not shown here]

K. K. Parhi, Algorithm transformation techniques for concurrent processors, Proceedings of the IEEE, vol. 77, pp. 1879-1895, Dec. 1989.
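
The iteration period bound quoted in the Shanbhag (1998) context above can be illustrated with a minimal Python sketch. It assumes the usual reading of the definition: for each loop, the sum of node computation times divided by the sum of edge delay counts, maximized over all loops. The function name, the graph encoding, and the example graph are hypothetical and are not taken from any of the cited papers; the brute-force loop enumeration is only suitable for small graphs.

```python
from fractions import Fraction


def iteration_period_bound(node_delay, edges):
    """Max over simple loops of (sum of node delays) / (sum of edge delay counts).

    node_delay: dict mapping node name -> integer computation time (assumed encoding)
    edges: list of (source, destination, delay_count) tuples (assumed encoding)
    Returns a Fraction, or None if the graph has no loop.
    """
    adjacency = {}
    for src, dst, w in edges:
        adjacency.setdefault(src, []).append((dst, w))

    # Each loop is enumerated once, starting from its lowest-ranked node.
    rank = {name: i for i, name in enumerate(sorted(node_delay))}
    best = None

    def walk(start, node, visited, d_sum, w_sum):
        nonlocal best
        for nxt, w in adjacency.get(node, []):
            if nxt == start:
                total_w = w_sum + w
                if total_w == 0:
                    raise ValueError("loop without delays: graph is not computable")
                ratio = Fraction(d_sum, total_w)
                if best is None or ratio > best:
                    best = ratio
            elif nxt not in visited and rank[nxt] > rank[start]:
                walk(start, nxt, visited | {nxt}, d_sum + node_delay[nxt], w_sum + w)

    for start in node_delay:
        walk(start, start, {start}, node_delay[start], 0)
    return best


# Hypothetical three-node graph with two loops:
#   A -> B -> A       : delays 2 + 3 = 5 over 1 delay element  -> 5
#   A -> B -> C -> A  : delays 2 + 3 + 1 = 6 over 2 elements   -> 3
delays = {"A": 2, "B": 3, "C": 1}
graph = [("A", "B", 0), ("B", "A", 1), ("B", "C", 0), ("C", "A", 2)]
print(iteration_period_bound(delays, graph))  # 5
```

Exact fractions are used so that loops with non-integer ratios compare without rounding error.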


Analytical Estimation of Signal Transition Activity from .. - Ramprasad, Shanbhag.. (1997)   (12 citations)

....been proposed at all levels of the design hierarchy beginning with algorithms and architectures and ending with circuits and technological innovations. Existing techniques include those at the algorithmic level (such as reduced complexity algorithms [6]), architectural level (such as pipelining [12,25] and parallel processing), logic (logic minimization [31] and precomputation [1]), circuit (reduced voltage swing [21], adiabatic logic [3]), and technological level [8]. It is now well recognized that an astute algorithmic and architectural design can have a large impact on the final power ....

K. K. Parhi, "Algorithm transformation techniques for concurrent processors," Proceedings of the IEEE, vol. 77, pp. 1879-1895, December 1989.


Low-Power Adaptive Filter Architectures and their Application .. - Shanbhag, Goel (1997)   (8 citations)

....adaptive equalizer architectures. Traditionally, the focus in algorithm design has been to obtain performance in terms of better signal to noise ratios (SNR) and/or bit error rates (BER). The present trend is to trade off a small amount of performance via algorithm transformation techniques [31] for a much superior VLSI architecture. Algorithm transformation techniques [6,31] such as look ahead [32], relaxed look ahead [37], block processing [33], associativity [36], unfolding [15,34], folding [35], retiming [21] have all been employed to design high speed algorithms and architectures. ....

....has been to obtain performance in terms of better signal to noise ratios (SNR) and/or bit error rates (BER). The present trend is to trade off a small amount of performance via algorithm transformation techniques [31] for a much superior VLSI architecture. Algorithm transformation techniques [6,31] such as look ahead [32], relaxed look ahead [37], block processing [33], associativity [36], unfolding [15,34], folding [35], retiming [21] have all been employed to design high speed algorithms and architectures. Low power operation was then achieved by trading off excess speed with power. Of ....

K. K. Parhi, "Algorithm transformation techniques for concurrent processors," Proceedings of the IEEE, vol. 77, pp. 1879-1895, Dec. 1989.


VLSI Systems Design of 51.84 Mb/s Transceivers for ATM-LAN and.. - Shanbhag

....based on such a paradigm will be necessary to realize complex VLSI systems for signal processing and communications. One way to integrate algorithmic concerns (such as SNR) and implementation issues such as area, power dissipation and throughput is to employ algorithm transformation techniques [27] such as pipelining [25,28,31], parallel processing [28], unfolding [16], folding [29], retiming [22], etc. Employed traditionally for high speed applications, pipelined algorithms have found use in low power applications as well. Furthermore, by combining pipelining with folding, it is possible ....

K. K. Parhi, "Algorithm transformation techniques for concurrent processors," Proceedings of the IEEE, vol. 77, pp. 1879-1895, Dec. 1989.


Information-Theoretic Bounds on Average Signal Transition.. - Ramprasad, Shanbhag, Hajj (1999)   (3 citations)

....switching activity, achievable bounds, CMOS circuits, information theory, busses. I. INTRODUCTION Power dissipation has become a critical VLSI design concern in recent years [3] and a substantial amount of research is being conducted at the algorithmic [3], architectural (such as pipelining [13] and parallel processing), logic [9, 18] and circuit [4, 8] levels in order to develop power reduction techniques. Most of these efforts focus upon reducing the on chip dynamic power dissipation of CMOS circuits, which at a node is given by $P_D = \frac{1}{2} T C_L V_{dd}^2 f$ (1.1), where T is the transition ....

K. K. Parhi, "Algorithm transformation techniques for concurrent processors," Proceedings of the IEEE, vol. 77, pp. 1879-1895, December 1989.


Retiming Synchronous Data-Flow Graphs to Minimize Execution.. - O'Neil, Sha, Jonsson (2000)

....techniques is retiming [16, 17] where delays are redistributed among the edges so that the application's function remains the same while the execution time decreases. Despite its usefulness when applied to HDFGs, the application of retiming to SDFGs was explored only marginally prior to 1994 [11, 18] before being studied by Zivojnovic et al. primarily as a way to minimize the delay count of a SDFG [25,27]. In this section we intend to review the basics of retiming, explore some of the pitfalls which arise when studying retiming of SDFGs, demonstrate the effectiveness of retiming, and propose ....

....we have described here will prove just as valuable despite this logical gap. 5 Examples: In this section, we illustrate our methods further by applying them to various SDFGs found in the literature. 5.1 First Example: Consider the SDFG in Figure 15(a), a variation on the example from [18] with a BRV of q = 2 1 2 . In our example, nodes A and B take 1 time unit to execute and C takes 2; thus we will attempt to retime it to have an optimal clock period of 2. There are four edges in the SDFG, so the first condition of Theorem 4.2 gives us an initial set of four inequalities: ....

K.K. Parhi. Algorithm transformation techniques for concurrent processors. Proceedings of the IEEE, 77:1879--1895, 1989.


Quantization Strategies For Low-Power Communications - Gupta (2001)   (1 citation)

....in these handsets. Many digital hardware design strategies have been proposed for power reduction including: reduction of supply voltage, reduction of clock speed and data rate, parallelization and pipelining of operations, using sign magnitude arithmetic, and differential encoding of data [16, 51]. Another technique is the reduction of the number of bits (wordlength) used to represent the data and control variables in the digital circuit [52]. The wordlength reduction strategy is very highly leveraged since it reduces the power dissipation everywhere in the data and control flow paths. This ....

K. K. Parhi, "Algorithm Transformation Techniques for Concurrent Processors," IEEE Proceedings, vol. 77, pp. 1879-1895, Dec. 1989.


Parallelizing Synchronous Data-Flow Graphs via Retiming - O'Neil, Sha, Tongsima

....techniques is retiming [15, 16] where delays are redistributed among the edges so that the application's function remains the same while the execution time decreases. Despite its usefulness when applied to HDFGs, the application of retiming to SDFGs was explored only marginally prior to 1994 [10,17] before being studied by Zivojnovic et al. primarily as a way to minimize the delay count of a SDFG [24,26]. In this section we intend to review the basics of retiming, explore some of the pitfalls which arise when studying retiming of SDFGs, demonstrate the effectiveness of retiming, and propose ....

.... of algorithm; b) Its EHG [figure residue omitted] Figure 14: a) Figure 11(a) retimed; b) Its EHG. 5 A Simple Example: To illustrate our method further, consider the SDFG in Figure 15, a variation on the example from [17] with a BRV of T . In our example, nodes and take time unit to execute and takes . We will attempt to retime it to have a clock period of . Our algorithm requires three passes to complete. At the outset, we compute the longest path lengths and find that T , ....

K.K. Parhi. Algorithm transformation techniques for concurrent processors. Proceedings of the IEEE, 77:1879--1895, 1989.


A Platform Independent Parallelising Tool Based on Graph.. - Sinnen, Sousa (2000)

....analyse the structure of the DAG for scheduling. New approaches exist that take genetic algorithms into account [24, 25]. Apart from the DAG algorithms, algorithms based on the ITG are being implemented. For this graph model unfolding, retiming and software pipelining are popular techniques [26, 27, 11]. Some of these algorithms utilise again DAG scheduling algorithms for partially unfolded ITGs. To benefit from regular structures of graphs, especially from graphs derived from equations, techniques known from the VLSI processor design [20] are employed. These techniques use the regular structure ....

Keshab K. Parhi. Algorithm transformation techniques for concurrent processors. Proceedings of the IEEE, 77(12):1879-1895, December 1989.


A Low-Power, Reconfigurable Adaptive Equalizer Architecture - Tschanz, Shanbhag (1999)   (1 citation)

....in Section 2 the architecture of the reconfigurable equalizer is presented, while simulation results are shown in Section 3. 1.1 Dynamic algorithm transformations (DAT): Traditionally, signal processing systems have been designed for low power operation by applying certain algorithm transforms [2, 3] in order to optimize the architecture. For example, pipelining [4] may be used to reduce the critical path of a design, thereby allowing the supply voltage to be reduced. Once the algorithm is sufficiently optimized, custom circuits are designed which provide the necessary balance between power ....

K. K. Parhi, "Algorithm transformation techniques for concurrent processors," Proceedings of the IEEE, vol. 77, no. 12, pp. 1879--1895, Dec. 1989.


Low-Power Channel Coding Via Dynamic Reconfiguration - Manish Goel And (1999)   (5 citations)

....) variations from 7dB 10dB. On an average 55 energy savings are achieved. 1. INTRODUCTION Power reduction techniques have been proposed at all levels of VLSI design hierarchy ranging from the circuits to algorithms. Of particular interest in this paper are algorithm transformation techniques [1]. [Figure: transceiver block diagram with a fixed outer transceiver and a reconfigurable inner transceiver (reconfigurable RS encoder and RS decoder); plot of BER against E N (dB)] ....

K. K. Parhi, "Algorithm transformation techniques for concurrent processors," Proceedings of the IEEE, vol. 77, no. 12, pp. 1879--1895, Dec. 1989.


Optimal Scheduling of Iterative Data-Flow Programs onto.. - Piriyakumar, Levi

....in [ScBa 86] utilize exhaustive search to generate cyclo static schedules, which may reduce the iteration periods most of the times. To exploit the hidden parallelism available in the DFP, transformation techniques such as unfolding and retiming have been applied to the corresponding DFG [PaKK 89]. The retiming technique minimizes the critical path length of a DFG but does not guarantee a critical path time less than a specified iteration period. In fact, the DFG tasks need to be scheduled optimally to minimize the iteration period, which was not given adequate focus previously. Moreover, ....

K.K.Parhi, "Algorithm transformation techniques for concurrent processors", Proceedings of the IEEE, vol. 77, no. 12, Dec. 1989.


Power vs. Performance Tradeoffs for Reduced Resolution LMS.. - Gupta, Hero (1998)

....in these handsets. There have been many digital hardware design strategies proposed for power reduction including: reduction of supply voltage, reduction of clock speed and data rate, parallelization and pipelining of operations, using sign magnitude arithmetic, and differential encoding of data [2], [3]. Another technique, which is the springboard for this paper, is the reduction of the number of bits used to represent the data and control variables in the digital circuit [4]. The bit width reduction strategy is very highly leveraged since it reduces the power dissipation everywhere in the ....

K. K. Parhi, "Algorithm transformation techniques for concurrent processors," IEEE Proceedings, vol. 77, pp. 1879--1895, Dec. 1989.


On Retiming of Multirate DSP Algorithms - Zivojnovic, Schoenen (1996)   (1 citation)

....was introduced as a technique to optimize hardware circuits by redistributing registers without affecting functionality [1]. Retiming is also useful for DSP software design. It changes precedence constraints among instructions or tasks, and can improve single processor [2] and multiprocessor [3,4] schedules. In both cases, hardware and software design, a marked graph can be used as an appropriate model of computation, and retiming is a transformation changing the distribution of tokens on arcs. This paper extends retiming principles to non ordinary marked graphs, characterized by nodes ....

....token conservation theorem is not valid anymore, limiting the applicability of numerous useful results developed for the ordinary case. In the past retiming was treated mostly as ordinary (unitrate) retiming. Only marginal treatment of non ordinary (multirate) retiming can be found (e.g. in [3]). The focus of this paper is on reachability of non ordinary marked graphs. It continues along the work of Teruel et al. [8] and provides new reachability results useful for retiming of multirate DSP algorithms. After the introduction, we revise the background and introduce the notation. In ....

[Article contains additional citation context not shown here]

K. Parhi, "Algorithm transformation techniques for concurrent processors," Proceedings of the IEEE, vol. 77, pp. 1879--1895, Dec. 1989.


Determining the Minimum Iteration Period of an Algorithm - Ito, Parhi (1995)   (9 citations)  Self-citation (Parhi)

....propose a novel algorithm for faster determination of the iteration bound of the MRDFG. 1 Introduction: Digital signal processing algorithms are repetitive in nature. These algorithms are described by iterative data flow graphs (DFGs) where nodes represent tasks and edges represent communication [1, 2]. Execution of all nodes of the DFG once completes an iteration. Successive iterations of any node are executed with a time displacement referred .... [Footnote 1: This research was supported by the Advanced Research Projects Agency and monitored by Wright Patterson AFB under contract number F33615 93 C 1309.]

....Moreover, different nodes may be invoked for a different number of times in an iteration. In other words, one node is invoked at a different rate from another node in MRDFGs. The definition of the edge in MRDFGs also differs from that in SRDFGs. An MRDFG can be expanded into the equivalent SRDFG [1]. The equivalence means that the MRDFG and its expanded SRDFG express an identical signal processing algorithm. In this section we describe a method to expand an MRDFG into its equivalent SRDFG which is similar to unfolding an SRDFG [5]. 5.1 The number of invocations of node: In SRDFGs, it is ....

K. K. Parhi, "Algorithm Transformation Techniques for Concurrent Processors," Proc. of the IEEE, vol. 77, pp. 1879--1895, Dec. 1989.


IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI).. - Systematic Approach To

No context found.

K. K. Parhi, "Algorithm transformation techniques for concurrent processors," Proc. IEEE, vol. 77, pp. 1879--1895, Dec. 1989.


Hw/Sw Co-exploration at TLM Level for the Implementation of.. - Robelly, Fettweis (2003)

No context found.

K. K. Parhi, "Algorithm transformation techniques for concurrent processors," in Proc. IEEE, vol. 77, pp. 1879-1895, Dec. 1989.


Pipelined Adaptive Cdma Mobile Receivers - Ramin Baghaie And (1998)

No context found.

K. Parhi, "Algorithm transformation techniques for concurrent processors," Proceedings of the IEEE, vol. 77, pp. 1879-1895, December 1989.


Relaxed Look-Ahead Technique For - Pipelined Implementation Of

No context found.

K. Parhi, "Algorithm transformation techniques for concurrent processors," Proceedings of the IEEE, vol. 77, pp. 1879-1895, Dec. 1989.


Received November 7, 1997; revised June 15, 1998;.. - Communicated By.. (2000)

No context found.

K. K. Parhi, "Algorithm transformation techniques for concurrent processors," IEEE Proceedings, Vol. 77, 1989, pp. 1879-1895.

CiteSeer.IST - Copyright Penn State and NEC