# Using Charge Self-compensation Domino Full-adder with Multiple Supply and Dual Threshold Voltage in 45nm Technology

Jinhui Wang, Wuchen Wu, Ligang Hou, Shuqin Geng, Wang Zhang, Xiaohong Peng
VLSI & System Lab, Beijing University of Technology, Beijing 100022, China
wangjinhui888@yahoo.com.cn

Na Gong
College of Electronic and Informational Engineering, Hebei University, Baoding 071002, China
gongna\_china@yahoo.com.cn

Abstract-A charge self-compensation technique, in which the dynamic node of the P-type logic charges the dynamic node of the N-type logic, is proposed in this paper. A novel Zipper CMOS domino full-adder is implemented using this technique together with the dual threshold voltage technique and the multiple supply technique for power reduction. A power distribution simulation indicates that the active power of the implemented full-adder can be reduced by up to 37%, 5% and 7%, and its leakage power by up to 41%, 20% and 43%, as compared to the standard, the dual threshold voltage, and the multiple supply Zipper CMOS domino full-adders with similar delay time, respectively. Finally, the influence of the idle state, determined by the combination of inputs and clock signal, on the leakage current is analyzed and the optimal idle state is obtained.

## I. INTRODUCTION

The full-adder is one of the core components of microprocessors and other complex chips, so its performance affects the system as a whole [1]. Owing to its superior speed and area characteristics compared to the static CMOS full-adder, the Zipper CMOS domino full-adder has been extensively applied in modern high performance microprocessors and cache designs. However, domino gates typically consume more power than static CMOS gates [2]-[5]. Moreover, clock frequencies above 1GHz and aggressive downscaling lead to a linear increase in the power of CMOS devices, which degrades the full-adder performance, especially in battery-powered portable applications such as cell phones and laptop computers. Therefore, implementing a low power, high performance Zipper CMOS domino full-adder is becoming a major challenge in current microprocessor design [6].

There are two major contributions to power consumption in CMOS circuits. One is the active power due to charging and discharging of the circuit capacitances during switching, and the other is the leakage power due to the leakage current [7]. As technology scales down, the supply voltage must be reduced to keep active power within acceptable levels. At the same time, the threshold voltage ($V_t$) and gate oxide thickness ($t_{ox}$) of the transistors must be reduced along with the supply voltage to meet the performance requirements. However, the sub-threshold leakage current ($I_{sub}$) and gate leakage current ($I_{gate}$) increase exponentially as $V_t$ and $t_{ox}$ are scaled down. Worse still, the leakage current persists during the sleep mode, when the circuits are not operating. It is predicted that leakage power may constitute as much as 50 percent of the total power consumption for the sub-65nm generation [8]. Hence, low power full-adder design considering both low active power and low leakage power is of continuing interest.

A number of approaches have been proposed to reduce power, such as the low swing technique [9], the multiple supply technique [10], the short pulse technique [11], and the dual $V_t$ technique [12]. These techniques help reduce the power consumption, but they may also degrade the speed and weaken the noise immunity of the circuits to some extent. In this paper, a charge self-compensation technique, based on the P-type logic dynamic node charging the N-type logic dynamic node, is presented to lower the active power without any speed loss. To reduce both the active power and the leakage power, a novel Zipper CMOS full-adder using the charge self-compensation technique, the dual $V_t$ technique and the multiple supply technique is proposed.

## II. PROPOSED ZIPPER CMOS DOMINO FULL-ADDER

High leakage power consumption has become an important issue affecting Zipper CMOS full-adder performance in 45nm technology. The dual V<sub>t</sub> technique has been proposed to suppress the leakage power efficiently [10]. The critical signal transitions that determine the domino circuit delay occur along the evaluation path. In a dual V<sub>t</sub> domino circuit, therefore, all of the transistors that are activated during the evaluation phase have a low V<sub>t</sub>. In contrast, the precharge/predischarge phase transitions are not critical for the performance of a domino circuit [12], so the transistors active in that phase have a high V<sub>t</sub>. The sub-threshold leakage current decreases as V<sub>t</sub> increases, and can be expressed as follows [13].

$$I_{sub} = \frac{W_{eff}}{L_{eff}} \mu \sqrt{\frac{q\epsilon_{si} N_{ch}}{2\Phi_s}} V_T^2 \exp\left(\frac{V_{gs} - V_t}{nV_T}\right) \left(1 - \exp\left(-\frac{V_{ds}}{V_T}\right)\right)$$
(1)

where  $L_{eff}$  and  $W_{eff}$  are the effective channel length and width respectively, and other parameters have their usual meanings.
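The exponential dependence of (1) on the threshold voltage can be made concrete with a minimal sketch. It compares two otherwise identical devices under the same bias, so every prefactor and the $V_{ds}$ term cancel and only the factor $exp(-V_t/(nV_T))$ remains. The threshold magnitudes 0.22V and 0.35V are the ones listed in Table II; the slope factor n = 1.5, the reading of $V_T$ as the thermal voltage kT/q, and the function names are assumptions made for illustration only.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def thermal_voltage(temp_c):
    """Thermal voltage V_T = kT/q in volts for a temperature in Celsius."""
    return K_B * (temp_c + 273.15) / Q_E


def isub_ratio(vt_low, vt_high, temp_c=25.0, n=1.5):
    """Ratio I_sub(vt_low) / I_sub(vt_high) implied by (1).

    Assumes identical geometry and bias, so only the exponential term differs.
    n = 1.5 is an assumed subthreshold slope factor, not a value from the paper.
    """
    return math.exp((vt_high - vt_low) / (n * thermal_voltage(temp_c)))


ratio = isub_ratio(0.22, 0.35)
print(f"I_sub(low-Vt) / I_sub(high-Vt) is roughly {ratio:.0f}x at 25 C")
```

The ratio is sensitive to the assumed n, so the printed number should be read as an order-of-magnitude indication rather than a reproduction of Table I.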

TABLE I Normalized Leakage Current of the Devices at 25°C

|                          | NMOS Low-$V_t$ | NMOS High-$V_t$ | PMOS Low-$V_t$ | PMOS High-$V_t$ |
|--------------------------|----------------|-----------------|----------------|-----------------|
| A: $I_{leak}$            | 126.2          | 60.4            | 56.3           | 4.4             |
| ($I_{sub}$, $I_{gate}$)  | (66.5, 59.6)   | (0.8, 59.6)     | (52.8, 3.4)    | (1, 3.4)        |
| B: $I_{gate}$            | 159.1          | 124.0           | 5.3            | 5.3             |



Fig.1 Standard Zipper CMOS domino full-adder



Fig.2 Zipper CMOS domino full-adder with the dual  $V_t$  technique, the multiple supply technique and the charge self-compensation technique

The normalized leakage current of low $V_t$ and high $V_t$ devices at 25°C is shown in Table I. The leakage current of a PMOS device is lower than that of an NMOS device because of the lower mobility of holes. In addition, the dual $V_t$ technique requires gating all the initial inputs to place the idle domino gates into a low sub-threshold leakage state, which is analyzed in Section III.

In the multiple supply technique, a lower supply ($V_{ddl}$) and a higher ground (Gndh) are applied to reduce the supply swing ($V_{swing}$) from $V_{dd}$-Gnd to $V_{ddl}$-Gnd and $V_{dd}$-Gndh, respectively. The power consumption of the low swing domino circuit is

$$P = P_{active} + P_{leak} = \alpha f C_L V_{dd} V_{swing} + I_{leak} V_{dd}$$
(2)

where $\alpha$ and f are the switching activity factor and clock frequency, respectively, and C<sub>L</sub> is the capacitive load at the keeper gate. Since the active power in (2) is proportional to $V_{swing}$, the multiple supply technique reduces the power consumption effectively.
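As a rough numerical illustration of (2), the sketch below evaluates the active term for a full swing ($V_{dd}$-Gnd) and for the reduced swing ($V_{ddl}$-Gnd). The supply values, the 8fF load and the 1GHz clock are the ones used in this paper; the switching activity and the leakage current are placeholder assumptions, and the function name is hypothetical.

```python
def domino_power(alpha, f_hz, c_load, v_dd, v_swing, i_leak):
    """Return (P_active, P_leak) in watts, following (2)."""
    p_active = alpha * f_hz * c_load * v_dd * v_swing
    p_leak = i_leak * v_dd
    return p_active, p_leak


ALPHA = 0.5       # assumed switching activity factor (not from the paper)
I_LEAK = 1e-6     # assumed leakage current in amperes (not from the paper)
F_CLK = 1e9       # 1GHz clock frequency
C_LOAD = 8e-15    # 8fF capacitive load
V_DD, V_DDL, GND = 0.8, 0.7, 0.0

full_active, _ = domino_power(ALPHA, F_CLK, C_LOAD, V_DD, V_DD - GND, I_LEAK)
low_active, _ = domino_power(ALPHA, F_CLK, C_LOAD, V_DD, V_DDL - GND, I_LEAK)

# The saving depends only on the swing ratio, not on the assumed alpha.
saving = 1.0 - low_active / full_active
print(f"active power saving from the reduced swing: {saving:.1%}")
```

Because the active term is linear in $V_{swing}$, the relative saving equals $1 - V_{ddl}/V_{dd}$ regardless of the assumed activity factor.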

The standard Zipper CMOS domino full-adder is composed of the N-type logic and the P-type logic, as shown in Fig.1. The pull-down network (PDN) in the N-type logic consists of N1, N2, N3, N4 and N5. The N-type logic operates as follows. In the precharge phase, the clock is low, Pc1 is turned on, and the dynamic node is charged high. The evaluation phase begins when the clock goes high and Pc1 is cut off. If the input combination required to discharge the dynamic node is applied, the circuit evaluates and the dynamic node is discharged to ground. Otherwise, the high state of the dynamic node is preserved until the following precharge phase. The pull-up network (PUN) in the P-type logic consists of P1, P2, P3, P4, P5, P6 and P7. The P-type logic operates in the complementary manner. The circuit is placed in the predischarge phase by a high clock signal. After the clock goes low, if the input combination required to charge the dynamic node is applied, the circuit evaluates and the dynamic node is charged to V<sub>dd</sub>. Otherwise, the low state of the dynamic node is kept until the next predischarge phase [12].
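The precharge/evaluate behaviour described above can be summarised in a small behavioural sketch. It models only the logic level of each dynamic node (1 for V<sub>dd</sub>, 0 for ground) and takes the clock as seen by that stage. Whether the PDN or PUN conducts is passed in as a boolean, because the exact Boolean functions implemented by N1-N5 and P1-P7 are not spelled out in the text; the function names are hypothetical.

```python
def n_type_node(clock, pdn_conducts, node):
    """Next level of the N-type dynamic node for one clock phase."""
    if clock == 0:
        return 1          # precharge: Pc1 is on, node is pulled high
    if pdn_conducts:
        return 0          # evaluation: the PDN discharges the node
    return node           # evaluation without a path: node holds its level


def p_type_node(clock, pun_conducts, node):
    """Next level of the P-type dynamic node for one clock phase."""
    if clock == 1:
        return 0          # predischarge: node is pulled to ground
    if pun_conducts:
        return 1          # evaluation: the PUN charges the node
    return node           # evaluation without a path: node holds its level


# One precharge/evaluate cycle of the N-type stage with a conducting PDN.
n_node = n_type_node(clock=0, pdn_conducts=False, node=0)
n_node = n_type_node(clock=1, pdn_conducts=True, node=n_node)

# One predischarge/evaluate cycle of the P-type stage with a conducting PUN.
p_node = p_type_node(clock=1, pun_conducts=False, node=1)
p_node = p_type_node(clock=0, pun_conducts=True, node=p_node)

print(n_node, p_node)
```

After this cycle the N-type node is low and the P-type node is high, which is exactly the situation in which both nodes must be restored in the next phase.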

After the N-type logic and the P-type logic evaluate, if the dynamic node of the N-type logic has changed from high to low and that of the P-type logic has changed in the opposite direction, from low to high, then both consume active power in the following phase: the N-type logic by charging its dynamic node and the P-type logic by discharging its dynamic node. To reduce this active power, a charge self-compensation technique is proposed in this paper. In this technique, the N-type logic dynamic node is charged by the P-type logic dynamic node through the charge self-compensation path, so the active power of the circuit is greatly decreased. The charge self-compensation path must satisfy two conditions: 1) the path is available only in the precharge (N-type logic) or predischarge (P-type logic) phase; 2) the path is available only when the N-type logic dynamic node needs to be charged and the P-type logic dynamic node needs to be discharged.

The charge self-compensation path is shown in Fig.2 and operates as follows. In the precharge/predischarge phase, the judgement transistor N<sub>P1</sub> is turned on. If the N-type logic dynamic node is low and the P-type logic dynamic node is high, the charge self-compensation path is available. Otherwise, the path does not work. When the path is available, the P-type logic dynamic node voltage (V<sub>p</sub>) and the N-type logic dynamic node voltage (V<sub>n</sub>) are initially at V<sub>dd</sub> and ground, respectively. Then V<sub>p</sub> charges V<sub>n</sub>, so that V<sub>p</sub> decreases and V<sub>n</sub> increases gradually. This charging process does not end until V<sub>p</sub>-V<sub>th</sub>=V<sub>n</sub>+|V<sub>tp</sub>| (V<sub>th</sub> and V<sub>tp</sub> are the threshold voltages of P<sub>p</sub> and N<sub>P2</sub>, respectively). After that, V<sub>p</sub> continues to discharge to ground through Nc2, while V<sub>n</sub> is charged to V<sub>dd</sub> [14][15].
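The stop condition above can be turned into a simple estimate of where the two node voltages end up. The sketch below is an idealised model, not the simulated circuit: it assumes that during the transfer the only charge reaching the N-type node comes from the P-type node, and it treats both nodes as fixed capacitors. The node capacitances (8fF each) and the threshold magnitudes (0.22V) are illustrative assumptions, and the function name is hypothetical.

```python
def charge_sharing_endpoint(v_dd, v_th, v_tp, c_p, c_n):
    """Return (v_p, v_n, q_transferred) when the self-compensation stops.

    Stop condition from the text:      v_p - v_th == v_n + |v_tp|
    Charge conservation (assumption):  c_p * (v_dd - v_p) == c_n * v_n
    """
    headroom = v_dd - (v_th + abs(v_tp))
    if headroom <= 0:
        # The thresholds already block the path; no charge is transferred.
        return v_dd, 0.0, 0.0
    v_n = headroom / (1.0 + c_n / c_p)
    v_p = v_dd - (c_n / c_p) * v_n
    return v_p, v_n, c_n * v_n


v_p, v_n, q = charge_sharing_endpoint(
    v_dd=0.8, v_th=0.22, v_tp=-0.22, c_p=8e-15, c_n=8e-15
)
print(f"V_p = {v_p:.2f} V, V_n = {v_n:.2f} V, Q = {q:.3e} C")
```

Under these assumptions the N-type node receives part of its precharge from the P-type node rather than from the supply, which is the source of the active power saving; the remaining charge still comes from V<sub>dd</sub>.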

Increasing the W/L of the transistors in the charge self-compensation path raises the charging speed. However, the path itself also consumes active power, so larger transistors in the path lead to more power overhead. Hence, the total energy reduction ($E_{reduction}$) of the domino circuit is the difference between the energy saved in the process of V<sub>p</sub> charging V<sub>n</sub> ($E_{charging}$) and the energy consumed by the charge self-compensation path ($E_{path}$), which can be expressed as follows.

$$E_{charging} = Q_{charging} U$$
(3)

$$E_{reduction} = E_{charging} - E_{path}$$
(4)

where $Q_{charging}$ is the charge that is transported in the process of $V_p$ charging $V_n$.

In this paper, with the charge self-compensation technique, the dual $V_t$ technique and the multiple supply technique, a low power Zipper CMOS domino full-adder is proposed in 45nm technology.

## III. SIMULATION RESULTS

In this section, the standard Zipper CMOS domino full-adder, the dual V<sub>t</sub> Zipper CMOS domino full-adder, the multiple supply Zipper CMOS domino full-adder and the proposed Zipper CMOS domino full-adder are simulated with the HSPICE tool, based on 45nm BSIM4 models [16]. Simulations are run at a worst-case temperature of 110°C, at which the full-adder is active, and at the room temperature of 25°C, at which the full-adder is idle. Each domino gate drives a capacitive load of 8fF and is tuned to operate at a 1GHz clock frequency. The parameters of the devices are listed in Table II. The W/L of the transistors in the PDN and in the PUN is set to 8-12 and 40-60, respectively.

TABLE II PARAMETERS OF DEVICES

| Tech. node | Supply $V_{dd}$ | Supply $V_{ddl}$ | Gnd | Gndh | $V_t$ of high-$V_t$ NMOS | $V_t$ of high-$V_t$ PMOS | $V_t$ of low-$V_t$ NMOS | $V_t$ of low-$V_t$ PMOS |
|------------|-----------------|------------------|-----|------|--------------------------|--------------------------|-------------------------|-------------------------|
| 45nm       | 0.8V            | 0.7V             | 0V  | 0.1V | 0.35V                    | -0.35V                   | 0.22V                   | -0.22V                  |

The W/L of the NMOS and PMOS transistors in the self-compensation path determines its power consumption, which affects the effectiveness of the charge self-compensation technique, as described in Section II. Therefore, it is critical to find the optimal self-compensation path, in which the W/L of the NMOS and PMOS transistors minimizes the circuit power. A novel power distribution simulation method is introduced in this section. The power distribution of the proposed Zipper CMOS full-adder as the W/L of the PMOS and NMOS transistors in the charge self-compensation path varies (W/L ranges from 1 to 20) is shown in Fig.3. It can be seen that the active power of the proposed Zipper CMOS full-adder is lowest when the W/L of both the NMOS and the PMOS is 17.

Table III lists the leakage current of the four full-adders for eight input vectors and two clock states. When the clock is set high, Nc1 (low-$V_t$) and Pc2 (low-$V_t$) are turned on, while Pc1 (high-$V_t$) and Nc2 (high-$V_t$) are cut off. The total normalized leakage current of these four transistors is 229.2 (159.1+5.3+4.4+60.4=229.2, as listed in Table I). Conversely, when the clock is set low, Nc1 (low-$V_t$) and Pc2 (low-$V_t$) are cut off, while Pc1 (high-$V_t$) and Nc2 (high-$V_t$) are turned on. The total leakage current is 311.8 (126.2+124.0+56.3+5.3=311.8). Thus, the high clock signal is more effective in suppressing the leakage current. In addition, the input vector affects the state of the transistors in the PUN and PDN. In the PDN, the leakage current of each low-$V_t$ NMOS transistor that is cut off (126.2) is lower than that of a transistor that is turned on (159.1). Similarly, in the PUN, the leakage current of each low-$V_t$ PMOS transistor that is turned on (5.3) is much lower than that of a transistor that is cut off (56.3). The input vector (0,0,0) causes the NMOS transistors in the PDN to be cut off and the PMOS transistors in the PUN to be turned on. Therefore, for the proposed full-adder in the idle state, the leakage current is lowest when the clock and input vector are 1 and (0,0,0), respectively.

TABLE III LEAKAGE CURRENT OF FOUR ZIPPER CMOS DOMINO FULL-ADDERS IN EIGHT INPUT VECTORS AND TWO CLOCK STATES AT 25 °C (A)

| Full-adder          | Clock state | (0,0,0)  | (0,0,1)  | (0,1,0)  | (0,1,1)  | (1,0,0)  | (1,0,1)  | (1,1,0)  | (1,1,1)  |
|---------------------|-------------|----------|----------|----------|----------|----------|----------|----------|----------|
| Standard            | 0           | 6.945e-7 | 4.822e-7 | 4.993e-7 | 5.301e-7 | 5.025e-7 | 5.371e-7 | 5.406e-7 | 5.515e-7 |
| Standard            | 1           | 8.562e-6 | 4.447e-6 | 1.536e-6 | 1.059e-6 | 7.368e-6 | 1.058e-6 | 1.220e-6 | 1.919e-6 |
| Multiple supply     | 0           | 5.130e-7 | 4.943e-7 | 5.073e-7 | 5.329e-7 | 5.027e-7 | 5.323e-7 | 5.316e-7 | 5.395e-7 |
| Multiple supply     | 1           | 5.648e-7 | 1.778e-5 | 9.153e-7 | 1.022e-6 | 1.649e-5 | 1.021e-6 | 1.183e-6 | 1.326e-6 |
| Dual-V<sub>t</sub>  | 0           | 8.619e-7 | 6.700e-7 | 6.668e-7 | 6.976e-7 | 6.700e-7 | 7.046e-7 | 7.081e-7 | 7.190e-7 |
| Dual-V<sub>t</sub>  | 1           | 3.589e-7 | 5.039e-7 | 6.556e-7 | 8.361e-7 | 7.936e-7 | 8.355e-7 | 9.974e-7 | 1.123e-6 |
| Proposed            | 0           | 6.747e-7 | 6.560e-7 | 6.691e-7 | 6.945e-7 | 6.644e-7 | 6.939e-7 | 6.934e-7 | 7.012e-7 |
| Proposed            | 1           | 2.829e-7 | 4.199e-7 | 6.021e-7 | 7.936e-7 | 6.770e-7 | 7.930e-7 | 9.550e-7 | 4.923e-6 |
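The two clock-state totals quoted above can be reproduced directly from Table I. The sketch below reads row A of Table I as the off-state leakage ($I_{sub}$ plus $I_{gate}$) and row B as the on-state leakage ($I_{gate}$ only), which is the reading the text implies; the dictionary layout and function name are illustrative choices, not part of the original work.

```python
# Normalized leakage from Table I, keyed by (device, threshold, state).
# "off" is taken from row A, "on" from row B (an interpretation of the table).
TABLE_I = {
    ("nmos", "low", "off"): 126.2, ("nmos", "high", "off"): 60.4,
    ("pmos", "low", "off"): 56.3,  ("pmos", "high", "off"): 4.4,
    ("nmos", "low", "on"): 159.1,  ("nmos", "high", "on"): 124.0,
    ("pmos", "low", "on"): 5.3,    ("pmos", "high", "on"): 5.3,
}

# Clock-controlled transistors and their device type / threshold, as in the text.
CLOCK_TRANSISTORS = {
    "Nc1": ("nmos", "low"),
    "Pc2": ("pmos", "low"),
    "Pc1": ("pmos", "high"),
    "Nc2": ("nmos", "high"),
}

# Nc1 and Pc2 conduct when the clock is high; Pc1 and Nc2 when it is low.
ON_WHEN_CLOCK_HIGH = {"Nc1", "Pc2"}


def clock_leakage(clock):
    """Sum the normalized leakage of the four clock transistors."""
    total = 0.0
    for name, (device, vt) in CLOCK_TRANSISTORS.items():
        is_on = (name in ON_WHEN_CLOCK_HIGH) == (clock == 1)
        total += TABLE_I[(device, vt, "on" if is_on else "off")]
    return round(total, 1)


print(clock_leakage(1), clock_leakage(0))
```

The same lookup can be extended to the PDN and PUN transistors once their on/off state for a given input vector is known.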

Likewise, the lowest leakage current state of the dual V<sub>t</sub> Zipper CMOS domino full-adder is reached when the clock and input vector are 1 and (0,0,0). However, for the standard Zipper CMOS domino full-adder and the multiple supply Zipper CMOS domino full-adder, the high clock signal and the input vector (0,0,1) minimize the leakage current.
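Finding the optimal idle state amounts to a minimum search over the sixteen clock/input combinations of one full-adder. The sketch below does this for the dual V<sub>t</sub> full-adder, using the two dual V<sub>t</sub> rows of Table III as printed; the helper name and data layout are hypothetical.

```python
# Input vectors in the column order of Table III: (0,0,0), (0,0,1), ..., (1,1,1).
VECTORS = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]

# Leakage current in amperes for the dual-Vt full-adder, keyed by clock state.
DUAL_VT = {
    0: [8.619e-7, 6.700e-7, 6.668e-7, 6.976e-7,
        6.700e-7, 7.046e-7, 7.081e-7, 7.190e-7],
    1: [3.589e-7, 5.039e-7, 6.556e-7, 8.361e-7,
        7.936e-7, 8.355e-7, 9.974e-7, 1.123e-6],
}


def optimal_idle_state(rows):
    """Return (clock, input_vector, current) with the smallest leakage."""
    candidates = (
        (clock, vector, current)
        for clock, currents in rows.items()
        for vector, current in zip(VECTORS, currents)
    )
    return min(candidates, key=lambda item: item[2])


clock, vector, current = optimal_idle_state(DUAL_VT)
print(clock, vector, current)
```

For these rows the search returns the clock-high, (0,0,0) state, in agreement with the text.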



Fig.3 Active Power distribution of the proposed Zipper CMOS domino full-adder



Fig.4 The comparison of the active power and minimum leakage power of four Zipper CMOS full-adders. s\_adder: Standard Zipper CMOS domino full-adder; d\_adder: the dual  $V_t$  Zipper CMOS full-adder; m\_adder: the multiple supply Zipper CMOS domino full-adder; p\_adder: the proposed Zipper CMOS domino full-adder

The comparison of the active power and the minimum leakage power of the four Zipper CMOS full-adders, obtained at the lowest leakage current state, is shown in Fig.4. The active power of the proposed Zipper CMOS full-adder can be reduced by up to 37%, 5% and 7%, as compared to the standard, the dual threshold voltage, and the multiple supply Zipper CMOS domino full-adders with similar delay time, respectively. There are two reasons for this. First, the self-compensation path is utilized in the proposed circuit to suppress the active power. Second, the speed of the transistor (v) is [13]

$$v \propto \frac{V_{dd}^{0.3} (1 - \frac{V_T}{V_{dd}})^{1.3}}{t_{ox}^{0.5}}$$
(5)

The speed (v) has a positive dependence on the supply voltage. As shown in Fig.2, the self-compensation path is available in the precharge/predischarge phase. The two supplies ($V_{dd}$ and $V_p$) then charge $V_n$ together, which makes precharging much faster than when the single supply $V_{dd}$ charges $V_n$ without the self-compensation path. Therefore, the self-compensation path improves the speed of the circuit, and the physical size of some transistors in the proposed full-adder can be decreased while still providing a delay similar to that of the other three full-adders. The smaller transistors consume less active power, which further reduces the total active power.

As can also be seen from Fig.4, the leakage power of the proposed Zipper CMOS full-adder can be reduced by up to 41%, 20% and 43%, as compared to the standard, the dual threshold voltage, and the multiple supply Zipper CMOS domino full-adders, respectively. As discussed above, because of the small size of its transistors, the proposed Zipper CMOS domino full-adder achieves low leakage power operation, since the leakage currents defined in (1) and (6) [13] grow with the transistor dimensions.

$$I_{gate} = W_{eff} L_{eff} A_g \left(\frac{V_{ox}}{t_{ox}}\right)^2 \exp\left(\frac{-B_g \left(1 - \left(1 - V_{ox}/\Phi_{ox}\right)^{3/2}\right)}{V_{ox}/t_{ox}}\right)$$
(6)

where  $L_{eff}$  and  $W_{eff}$  are the effective channel length and width respectively, and other parameters have their usual meanings.

The multiple supply technique suppresses the power consumption by reducing the supply swing, and the dual $V_t$ technique uses high $V_t$ devices on non-critical paths to reduce the power consumption. However, both techniques lead to speed loss according to (5). Thus, to provide a delay similar to that of the proposed Zipper CMOS domino full-adder, the size of the transistors in the dual $V_t$ Zipper CMOS domino full-adder and the multiple supply Zipper CMOS domino full-adder must be increased, which leads to more leakage current according to (1) and (6). Hence, the proposed Zipper CMOS domino full-adder exhibits the best leakage current characteristics of the four.
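The speed loss implied by (5) can be illustrated with a short calculation. The sketch below evaluates the right-hand side of (5) up to its constant of proportionality, reading $V_T$ in (5) as the device threshold voltage and keeping $t_{ox}$ fixed; both are assumptions of this example. It also applies (5) with the lower supply $V_{ddl}$ in place of $V_{dd}$ as a simplified stand-in for the reduced swing. The voltages are those of Table II, and the function name is hypothetical.

```python
def relative_speed(v_dd, v_t, t_ox=1.0):
    """Right-hand side of (5), up to a constant factor."""
    return v_dd ** 0.3 * (1.0 - v_t / v_dd) ** 1.3 / t_ox ** 0.5


baseline = relative_speed(v_dd=0.8, v_t=0.22)       # full supply, low-Vt device
lower_supply = relative_speed(v_dd=0.7, v_t=0.22)   # reduced supply, low-Vt device
higher_vt = relative_speed(v_dd=0.8, v_t=0.35)      # full supply, high-Vt device

print(f"lower supply: {lower_supply / baseline:.2f} of baseline speed")
print(f"higher V_t:   {higher_vt / baseline:.2f} of baseline speed")
```

Both ratios come out below one, which is the speed penalty that has to be recovered by upsizing transistors in the dual $V_t$ and multiple supply full-adders.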

## IV. CONCLUSION

With the rapid development of CMOS integrated circuits and the dramatic increase in clock frequency, the low-power, high-speed Zipper CMOS domino full-adder, as a main building block in processors, has attracted extensive interest. In this paper, a charge self-compensation technique is presented. With this technique, the dual V<sub>t</sub> technique and the multiple supply technique, a low power Zipper CMOS domino full-adder is proposed in 45nm technology. A novel power distribution simulation indicates that, compared to the standard, the dual threshold voltage, and the multiple supply Zipper CMOS domino full-adders with similar delay time, its active power can be reduced by up to 37%, 5% and 7%, and its leakage power by up to 41%, 20% and 43%, respectively. Finally, the dependence of the leakage current on the idle state, determined by the combination of inputs and clock signal, is analyzed, and the optimal idle state, in which the clock and input vector are set to 1 and (0,0,0) respectively, is obtained.

## REFERENCES

- [1] Hung Tien Bui, Al-Sheraidah, A. Yuke Wang. "Design and analysis of 10-transistor full adders using novel XOR-XNOR gates," WCCC-ICSP 2000, Volume 1, Issue, pp:619-622.
- [2] S. Rusu and G. Singer, "The First IA-64 Microprocessor," IEEE Journal of Solid-State Circuits, Vol. 35, No. 11, pp. 1539 - 1544, November 2000.
- [3] P. E. Gronowski et al., "High-Performance Microprocessor Design," IEEE Journal of Solid-State Circuits, Vol. 33, No. 5, pp. 676 - 686, May 1998.
- [4] Zhiyu Liu, Volkan Kursun "Robust Dynamic Node Low Voltage Swing Domino Logic with Multiple Threshold Voltages," ISQED 2006: 31-36
- [5] C. M. Lee, and E. W. Szeto, "Zipper CMOS," IEEE Circuits Devices Mag., pp: 10-16, May 1986.
- [6] S. Borkar, "Low Power Design Challenges for the Decade," Proceedings of the IEEE/ACM International Design Automation Conference, pp. 293-296, June 2001.
- [7] Shams, A.M. Bayoumi, M.A. "A New Full Adder Cell for Low-Power Applications," Proceedings of the Great Lakes Symposium on VLSI '98, pp: 45-49.
- [8] International Technology Roadmap for Semiconductors, 2001, http://public.itrs.net/
- [9] R Mader, I Kourtev. "Reduced Dynamic Swing Domino Logic," Proceedings of the ACM/SIGDA Great Lakes Symposium on VLSI, 2003, pp: 33-35.
- [10] Phillip Chin, Charles A. Zukowski, George Gristede, Stephen V. Kosonocky. "Characterization of logic circuit techniques and optimization for high-leakage CMOS technologies," The VLSI journal, 2005, 38(3), pp: 491-504.
- [11] Shang-Jyh Shieh, Jinn-Shyan Wang. "Design of low-power domino circuits using multiple supply voltages," The 8th IEEE International Conference on Electronics, Circuits and Systems, 2001, pp: 711-714.
- [12] V. Kursun, E. G. Friedman. "Sleep Switch Dual Threshold Voltage Domino Logic with Reduced Standby Leakage Current," IEEE Tran. on VLSI Systems, 2004(5), pp: 485-496.
- [13] Y.Taur and T.H.Ning, Fundamentals of modern VLSI devices, Cambridge University Press, 1998.
- [14] Jinhui Wang, Na Gong, Shuqin Geng, Ligang Hou, Wuchen Wu, Limin Dong, "PN Mixed Pull-down Network Domino XOR Gate Design in 45nm Technology," Chinese Journal of Semiconductors, Vol. 29, pp. 2443-2448, December 2008.
- [15] Na Gong, Baozeng Guo, Jianzhong Lou, Jinhui Wang, "Analysis and Optimization of Leakage Current Characteristics in Sub-65nm Dual Vt Footed Domino Circuits," Microelectronics Journal. Vol. 39, pp. 1149-1155, September 2008.
- [16] Predictive Technology Model (PTM), http://www.eas.asu.edu/~ptm