## Self-Timed Carry-Lookahead Adders (2000)

### Download From

IEEE

### Download Links

- [www.cs.columbia.edu]
- [www.cs.columbia.edu]
- [www.eng.utah.edu]
- DBLP

### Other Repositories/Bibliography

Venue: IEEE Transactions on Computers

Citations: 11 (2 self)

### BibTeX

@ARTICLE{Cheng00self-timedcarry-lookahead,

author = {Fu-Chiung Cheng and Stephen H. Unger and Michael Theobald},

title = {Self-Timed Carry-Lookahead Adders},

journal = {IEEE Transactions on Computers},

year = {2000},

volume = {49},

pages = {659--672}

}

### Abstract

This paper proposes a self-timed carry-lookahead adder in which the logic complexity is a linear function of n, the number of inputs, and the average computation time is proportional to the logarithm of the logarithm of n. To the best of our knowledge, our adder has the best area-time efficiency, which is $\Theta(n \log \log n)$. An economic implementation of this adder in CMOS technology is also presented. SPICE simulation results show that, based on random inputs, our 32-bit self-timed carry-lookahead adder is 2.39 and 1.42 times faster than its synchronous counterpart and self-timed ripple-carry adder, respectively; and, based on statistical data gathered from a 32-bit ARM simulator, it is 1.99 and 1.83 times faster than its synchronous counterpart and self-timed ripple-carry adder, respectively.

Index Terms: Self-timed circuits, delay-insensitive circuits, carry-lookahead adders, tree it
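The log-log average quoted in the abstract can be made plausible with a small experiment. The citation contexts below state that carry propagation time in the tree adder is proportional to the logarithm of the longest carry chain, and that the average for carry-completion sensing adders grows like log n. The following Python sketch is not the paper's C++ or SPICE simulation; it assumes uniformly random operands, the function names are illustrative, and it counts a chain as one generate position followed by consecutive propagate positions.

```python
import math
import random


def longest_carry_chain(a: int, b: int, n: int) -> int:
    """Length of the longest carry chain when adding two n-bit operands.

    A chain starts at a 'generate' position (both bits 1) and is extended
    by each following 'propagate' position (exactly one bit 1).
    """
    longest = current = 0
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        if ai & bi:            # generate: a new carry starts here
            current = 1
        elif ai ^ bi:          # propagate: extends a chain only if one is active
            current = current + 1 if current else 0
        else:                  # kill: any active chain ends
            current = 0
        longest = max(longest, current)
    return longest


def average_longest_chain(n: int, trials: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of the mean longest carry chain for n-bit adds."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += longest_carry_chain(rng.getrandbits(n), rng.getrandbits(n), n)
    return total / trials


if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        avg = average_longest_chain(n)
        # log2(n) is the classical scale for the mean longest chain;
        # log2 of the chain length is the scale a tree adder would see.
        print(n, round(avg, 2), round(math.log2(n), 2), round(math.log2(max(avg, 1)), 2))
```

Under these assumptions the mean chain length should track log2 n, so taking its logarithm gives a quantity that grows like log log n, which is the intuition behind the abstract's claim.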

### Citations

4375 |
Computer Architecture: A Quantitative Approach
- Hennessy, Patterson
- 1996
Citation Context ...on to explicit arithmetic (such as addition, subtraction, multiplication, and division) performed in a program, additions are performed to increment program counters and calculate effective addresses [1]. Statistics presented in [1], [2] show that, in a prototypical RISC machine (DLX), 72 percent of the instructions perform additions (or subtractions) in the datapath. The statistics reported in ARM p... |

119 |
The limitations to delay-insensitivity in asynchronous circuits
- Martin
- 1990
Citation Context ... and connection wires. Thus, DI circuits are the most robust circuits in terms of the operating variations such as temperature, voltage, and processing. The class of pure DI circuits is quite limited [15]. However, extending pure DI circuits with isochronic forks is sufficient to construct any circuit of interest. (Such circuits are sometimes called quasi-DI.) For this paper, we assume isochronic fork... |

115 |
Computer Arithmetic Principles, Architecture and Design
- Hwang
- 1979
Citation Context ...t rates. A good example for this is that n-bit ripple-carry adders (which are synchronous), shown in Fig. 1, have worst case computation time $\Theta(n)$, whereas n-bit carry-completion sensing adders [4], [5], [6] (which are asynchronous), shown in Fig. 3, have average computation time $\Theta(\log n)$ [7]. This paper proposes a self-timed carry-lookahead adder in which the logic complexity is a linear function of... |

82 |
Preliminary discussion of the logical design of an electronic computing instrument
- Burks, Goldstine, et al.
- 1946
Citation Context ..., shown in Fig. 1, have worst case computation time $\Theta(n)$, whereas n-bit carry-completion sensing adders [4], [5], [6] (which are asynchronous), shown in Fig. 3, have average computation time $\Theta(\log n)$ [7]. This paper proposes a self-timed carry-lookahead adder in which the logic complexity is a linear function of n, $\Theta(n)$, and the average computation time is proportional to the logarithm of the logarith... |

51 |
A formal approach to designing delay-insensitive circuits
- Ebergen
- 1991
Citation Context ...$\Theta(n)$ and the average computation time for randomly distributed inputs is $\Theta(\log n)$ [7]. Note that the worst case computation time is $\Theta(n)$. 2.2.2 Delay-Insensitive Circuits. Delay-insensitive (DI) circuits [14] are a subclass of asynchronous circuits. The defining property of DI circuits is that their correctness is insensitive to delays in both gate elements and connection wires. Thus, DI circuits are the ... |

47 |
Translating Concurrent Communicating Programs into Asynchronous Circuits
- Brunvand
- 1991
Citation Context ... 2 n)-stage delays. However, the logic complexity of these adders goes up to $\Theta(n \log n)$. 2.2 Asynchronous Adders. 2.2.1 Carry-Completion Sensing Adders. A Carry-Completion Sensing Adder (CCSA) [4], [5], [13] may be regarded as an asynchronous version of an RCA. Instead of using clock pulses to synchronize adder operation, a CCSA uses some extra circuitry to implement the start and completion signals. Fig... |

29 |
Probability and Statistics for Engineering
- Scheaffer, McClave
- 1995
Citation Context ...IRCA and DICLASP. Comparing 10,000 simulated cases to the sample space, it is obvious that only a very tiny percentage ($5.42 \times 10^{-14}$ percent) of cases are simulated. Table 5 shows the confidence limits [25]: sample mean, standard deviation, confidence interval with 95 percent confidence level, and confidence interval with 99 percent confidence level for DIRCA and DICLASP. General distribution is assumed... |

12 |
A building block approach to unclocked systems
- Unger
- 1993
Citation Context ...on operation. The req, ack, and the input and output signals of the adder must be reset to zero before next addition starts. The registers used by the DI adder are dual-rail asynchronous registers in [17]. Martin [18] proposed a very good design of DIRCA adder by using CMOS technology. The transistor count per DIRCA cell is 42. Compared to the synchronous RCA cell which needs 40 transistors, it is cle... |

10 |
The Logic of Computer Arithmetic
- Flores
Citation Context ...ons on fan-in and fan-out, irregular structure, and many long wires [8], [1]. However, the carry-lookahead scheme may be built in the form of a tree-like circuit, which has a simple, regular structure [9], [10], [1], by reformulating (6) into $P_{i,k} = P_{i,j} P_{j-1,k}$ (block carry propagate) (7), $G_{i,k} = G_{i,j} + P_{i,j} G_{j-1,k}$ (block carry generate) (8), $C_j = G_{j-1,k} + P_{j-1,k} C_k$ (9), where $i \ge j > k$, $G_{i,i} = g_i$, and $P_{i,i} =$ ... |

8 |
Practical Design and Performance Evaluation of Completion Detection Circuits
- Cheng
- 1998

8 |
Skip techniques for high-speed carry-propagation in binary arithmetic units
- Lehman, Burla
- 1961
Citation Context ...xity and computation time for both synchronous and asynchronous adders. They are Ripple Carry Adder (RCA), Conditional-Sum Adder (CSA1) [11], Carry-Select Adder (CSA2) [21], [1], Carry-Skip Adder (CSA3) [22], [1], Carry-Completion Sensing Adder (CCSA) [4], Delay-Insensitive Ripple Carry Adder (DIRCA) [18], Conditional-Sum Completion-Sensing Adder (CSCSA) [23], Brent and Kung Carry-Lookahead Adder (BKA) [... |

5 |
area-time efficient carry-lookahead adders
- Ngai, Irwin, et al.
- 1986
Citation Context ...) The circuit of CLA. For large i, it is impractical to build a two-stage full carry-lookahead adder because of the practical limitations on fan-in and fan-out, irregular structure, and many long wires [8], [1]. However, the carry-lookahead scheme may be built in the form of a tree-like circuit, which has a simple, regular structure [9], [10], [1], by reformulating (6) into $P_{i,k} = P_{i,j} P_{j-1,k}$ (block car... |

3 |
Private communication
- Greenstreet
- 1995
Citation Context ...he sum bits. The computation time of an adder is sensitive to the numbers to be added. The upper and lower bound proofs of average computation time are an extension of proofs for CCSAs by Greenstreet [20]. Theorem 1. For any input configuration, the carry propagation time is proportional to the logarithm of the length of the longest carry chain. Proof. Consider a carry chain with length x in an input ... |

1 |
“Performance Comparison of Asynchronous Adders,”
- Franklin, Pan
- 1994
Citation Context ... addition, subtraction, multiplication, and division) performed in a program, additions are performed to increment program counters and calculate effective addresses [1]. Statistics presented in [1], [2] show that, in a prototypical RISC machine (DLX), 72 percent of the instructions perform additions (or subtractions) in the datapath. The statistics reported in ARM processors even reaches 80 percent ... |

1 |
“A CMOS VLSI Implementation of an Asynchronous ALU,” Asynchronous Design Methodologies
- Garside
- 1993
Citation Context ... show that, in a prototypical RISC machine (DLX), 72 percent of the instructions perform additions (or subtractions) in the datapath. The statistics reported in ARM processors even reaches 80 percent [3]. Thus, the performance of processors is significantly influenced by the speed of their adders. Circuits may be classified as synchronous or asynchronous. Synchronous circuits have a clock to synchron... |

1 |
“Fast Carry Logic for Digital Computers,”
- Gilchrist, Pomerene, et al.
- 1955
Citation Context ... worst rates. A good example for this is that n-bit ripple-carry adders (which are synchronous), shown in Fig. 1, have worst case computation time $\Theta(n)$, whereas n-bit carry-completion sensing adders [4], [5], [6] (which are asynchronous), shown in Fig. 3, have average computation time $\Theta(\log n)$ [7]. This paper proposes a self-timed carry-lookahead adder in which the logic complexity is a linear functi... |

1 |
“Some New Results on Average Worst Case Carry,”
- Briley
- 1973
Citation Context ...es. A good example for this is that n-bit ripple-carry adders (which are synchronous), shown in Fig. 1, have worst case computation time $\Theta(n)$, whereas n-bit carry-completion sensing adders [4], [5], [6] (which are asynchronous), shown in Fig. 3, have average computation time $\Theta(\log n)$ [7]. This paper proposes a self-timed carry-lookahead adder in which the logic complexity is a linear function of n, ... |

1 |
“Tree Realizations of Iterative Circuits,”
- Unger
- 1977
Citation Context ...n fan-in and fan-out, irregular structure, and many long wires [8], [1]. However, the carry-lookahead scheme may be built in the form of a tree-like circuit, which has a simple, regular structure [9], [10], [1], by reformulating (6) into $P_{i,k} = P_{i,j} P_{j-1,k}$ (block carry propagate) (7), $G_{i,k} = G_{i,j} + P_{i,j} G_{j-1,k}$ (block carry generate) (8), $C_j = G_{j-1,k} + P_{j-1,k} C_k$ (9), where $i \ge j > k$, $G_{i,i} = g_i$, and $P_{i,i} = p_i$. Th... |

1 |
“Conditional-Sum Addition Logic,” IRE Trans
- Sklansky
- 1960
Citation Context ... possible condition. The worst propagation delay of an n-bit CLA is two unit delays of the A-module plus $2 \log_2 n - 1$ unit delays of the B-module. 2.1.3 Other Tree-Like Adders. Conditional-Sum Adders [11], Type-2 Adders [10], and Brent and Kung Adders (BKA) [12] were proposed to further improve the worst case delay. The worst propagation delay of these adders is about $(\log_2 n)$-stage delays. However, ... |

1 |
“A Regular Layout for Parallel Adders,”
- Brent, Kung
- 1982
Citation Context ...it CLA is two unit delays of the A-module plus $2 \log_2 n - 1$ unit delays of the B-module. 2.1.3 Other Tree-Like Adders. Conditional-Sum Adders [11], Type-2 Adders [10], and Brent and Kung Adders (BKA) [12] were proposed to further improve the worst case delay. The worst propagation delay of these adders is about $(\log_2 n)$-stage delays. However, the logic complexity of these adders goes up to $\Theta(n \log n)$.... |

1 |
“Micropipelines,” Comm
- Sutherland
- 1989
Citation Context ...ny circuit of interest. (Such circuits are sometimes called quasi-DI.) For this paper, we assume isochronic forks. The CCSA, shown in Fig. 3, is not a DI circuit. It must meet the bundling constraint [16], [13]: The start signal cannot be asserted unless all the input data bits have arrived and the sum bits must arrive at the environment before the environment receives the finish signal. One way to me... |

1 |
“Asynchronous Datapaths and the Design of an Asynchronous Adder,” Formal Methods
- Martin
- 1992
Citation Context ... The req, ack, and the input and output signals of the adder must be reset to zero before next addition starts. The registers used by the DI adder are dual-rail asynchronous registers in [17]. Martin [18] proposed a very good design of DIRCA adder by using CMOS technology. The transistor count per DIRCA cell is 42. Compared to the synchronous RCA cell which needs 40 transistors, it is clear that the a... |

1 |
“Carry-Select Adder,” IRE Trans
- Bedrij
- 1962
Citation Context ...ection, we compare the logic complexity and computation time for both synchronous and asynchronous adders. They are Ripple Carry Adder (RCA), Conditional-Sum Adder (CSA1) [11], Carry-Select Adder (CSA2) [21], [1], Carry-Skip Adder (CSA3) [22], [1], Carry-Completion Sensing Adder (CCSA) [4], Delay-Insensitive Ripple Carry Adder (DIRCA) [18], Conditional-Sum Completion-Sensing Adder (CSCSA) [23], Brent and... |

1 |
“Conditional-Sum Early Completion Adder Logic,”
- Martin, Hufnagel
- 1980
Citation Context ...dder (CSA2) [21], [1], Carry-Skip Adder (CSA3) [22], [1], Carry-Completion Sensing Adder (CCSA) [4], Delay-Insensitive Ripple Carry Adder (DIRCA) [18], Conditional-Sum Completion-Sensing Adder (CSCSA) [23], Brent and Kung Carry-Lookahead Adder (BKA) [12], Type-2 Adder [10], and our DICLA, SPDICLA, and DICLASP. The logic (area) complexity and computation time of the above adders are listed in Table 1. ... |

1 |
“Synthesis of High Performance Self-Checking Delay-Insensitive Tree Circuits,”
- Cheng
- 1998
Citation Context ...hematical model for the DICLASP has been found yet. Thus, simulation was used to analyze the average computation delay. [Table 1: Logic and Time Complexity of Adders] For simulations of adders, see [2], [24]. The DIRCA, the DICLA, and the DICLASP adders have been simulated with C++ programs. The results are shown in Table 2. The results of 8-bit adders are produced by exhaustive enumeration of possible c... |

1 |
“An Evaluation of Asynchronous Addition,”
- Kinniment
- 1996
Citation Context ...emonstrate the superiority of asynchronous circuits in the domain of average case performance vs. worst case performance of synchronous circuits. The results also contradict the report from Kinniment [26]. The report concludes that “asynchronous adders only give a performance improvement over more conventional hardware in very limited conditions,” which is wrong. The major problem of Kinniment's repor... |

1 |
“Asynchronous Wrapper for Heterogeneous Systems,”
- Bormann, Cheung
- 1997
Citation Context ...I implementation because of their regular structure. We believe this work can be applied in the design of high speed asynchronous processors. With an interface of asynchronous and synchronous modules [27], DICLASP may be used to improve the performance of synchronous processors. [Page 672, IEEE Transactions on Computers, vol. 49, no. 7, July 2000] [Table 7: Performance Improvement Based on Real Data] ACKNOWLEDGM... |
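
The block carry propagate and block carry generate recurrences numbered (7) to (9) in the citation contexts above can be exercised with a short Python sketch. This is an illustration only, not the authors' circuit or their DICLASP design: the function names are hypothetical, the bit-level propagate signal is assumed to be the XOR of the operand bits, each block is split at its midpoint, and the block signals are recomputed for every bit rather than shared as a real tree adder would.

```python
def bit_pg(a: int, b: int, n: int):
    """Per-bit generate g_i = a_i AND b_i and propagate p_i = a_i XOR b_i."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    return g, p


def block_pg(g, p, i: int, k: int):
    """Return (G_{i,k}, P_{i,k}) for bits k..i using recurrences (7) and (8).

    The block is split at j with i >= j > k; here j is the midpoint.
    """
    if i == k:
        return g[i], p[i]                    # G_{i,i} = g_i, P_{i,i} = p_i
    j = (i + k + 1) // 2
    g_hi, p_hi = block_pg(g, p, i, j)        # upper block, bits j..i
    g_lo, p_lo = block_pg(g, p, j - 1, k)    # lower block, bits k..j-1
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def add_with_lookahead(a: int, b: int, n: int, c0: int = 0):
    """Add two n-bit numbers; each carry comes from recurrence (9) with k = 0."""
    g, p = bit_pg(a, b, n)
    total = 0
    for j in range(n):
        if j == 0:
            carry = c0
        else:
            big_g, big_p = block_pg(g, p, j - 1, 0)
            carry = big_g | (big_p & c0)     # C_j = G_{j-1,0} + P_{j-1,0} C_0
        total |= (p[j] ^ carry) << j         # sum bit = p_j XOR C_j
    big_g, big_p = block_pg(g, p, n - 1, 0)
    carry_out = big_g | (big_p & c0)
    return total, carry_out
```

For example, `add_with_lookahead(0b1011, 0b0110, 4)` returns the 4-bit sum together with the carry out of the top bit. A hardware tree would compute each block signal once and reuse it, which is what keeps the logic complexity linear in the design the paper describes.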