## **Innovative Science and Technology Publications**

# International Journal of Future Innovative Science and Technology ISSN: 2454-194X Volume - 1, Issue - 1



Manuscript Title

# Bidirectional Barrel Shifter using Differential Cascode Pre-Resolve Adiabatic Logic

Marreddy Guru Sai Prasad Reddy, Amit Maruti Kunjir, V S Kanchana Bhaaskaran

School of Electronics Engineering VIT University Chennai

E-Mail: g.prasad1993@gmail.com, kunjir.amit2013@vit.ac.in, vskanchana@gmail.com

May - 2015

www.istpublications.com




#### **ABSTRACT**

This paper presents the design of a **Bidirectional Barrel Shifter** using Differential Cascode Pre-resolve Adiabatic Logic (DCPAL). The distinct attribute of the proposed design is its capability to perform *logical shift*, *arithmetic shift* and *rotate* operations in both directions with a single circuit, which conventional barrel shifter designs do not offer. The design is realized in the Cadence® Virtuoso simulation environment using 180 nm technology files. It is implemented in the diode-free, dual-rail DCPAL adiabatic logic for low-power operation. For a fair comparison, the equivalent static CMOS counterpart is also implemented in the same design environment. The results show an 85% reduction in static power for a 2-bit bidirectional barrel shifter using DCPAL compared with the static CMOS equivalent circuit.

Keywords—bidirectional logical shift, bidirectional arithmetic shift, binary shifter, binary rotator, DCPAL

#### Introduction

With the increasing number of transistors, more complex circuits and higher operating speed requirements, power dissipation has become a major issue. The increased power dissipation of static CMOS based circuits has led to research on lowering the power dissipation through various structural designs of logic circuits. The adiabatic or energy recovery logic circuits, such as DCPAL, 2N2P, 2N2N2P and Data-Driven Dynamic Logic (D3L) [1], are a few of the recently proposed logic styles. DCPAL is found to be more energy efficient than static CMOS and has been shown to exhibit better speed characteristics than the other adiabatic logic circuits presented in the literature [2]. Barrel shifters are employed in most processor architectures, for shifting and arithmetic operations among others. Static CMOS based circuits require a larger number of transistors in the design of the barrel shifter. This paper presents the design of a DCPAL based bidirectional shifter that is capable of six types of operations. The DCPAL circuit is chosen because of its better energy dissipation and speed performance characteristics compared with the 2N2P, 2N2N2P, PFAL and IPGL equivalent circuits [2].

An n-bit barrel shifter performs logical, arithmetic and rotate operations over a maximum of (n-1) bit positions in one clock cycle. Note that logical left and right shifts by S bits are equivalent to multiplying and dividing n-bit unsigned data by $2^S$, respectively (ignoring overflow in the case of the left shift). Thus, the barrel shifter finds many applications as a part of arithmetic, logical and floating point units. The arithmetic shift operation preserves the sign bit while shifting, which is useful when performing arithmetic operations on signed numbers. The rotate operation, on the other hand, moves the bit values cyclically from one end of the word to the other.
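The relation between shifts and multiplication or division by powers of two can be checked with a short behavioural sketch. The helper names (`sll`, `srl`, `sra`, `to_signed`) and the 8-bit word length are illustrative assumptions and are not taken from the paper; signed values are assumed to be in two's complement.

```python
def sll(a: int, s: int, n: int) -> int:
    """Logical left shift of an n-bit word; bits shifted past the MSB are lost."""
    return (a << s) & ((1 << n) - 1)


def srl(a: int, s: int, n: int) -> int:
    """Logical right shift of an n-bit word; zeros enter at the MSB side."""
    return (a & ((1 << n) - 1)) >> s


def sra(a: int, s: int, n: int) -> int:
    """Arithmetic right shift: the vacated upper s bits are copies of the sign bit."""
    mask = (1 << n) - 1
    a &= mask
    sign = a >> (n - 1)
    fill = (mask << (n - s)) & mask if sign else 0
    return (a >> s) | fill


def to_signed(x: int, n: int) -> int:
    """Interpret an n-bit pattern as a two's complement integer (assumed encoding)."""
    return x - (1 << n) if x & (1 << (n - 1)) else x


if __name__ == "__main__":
    n = 8
    print(sll(22, 2, n))                         # 22 * 4, no overflow in 8 bits
    print(srl(22, 2, n))                         # 22 // 4
    print(to_signed(sra(0b11110100, 2, n), n))   # -12 shifted right by 2, sign kept
```

In this sketch the arithmetic right shift of a negative value rounds towards minus infinity, which matches Python's floor division.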

### A. Shift and Rotate operations

The proposed design of bidirectional barrel shifter can perform the following operations: Shift Right Logical (SRL), Shift Left Logical (SLL), Shift Right Arithmetic (SRA), Shift Left Arithmetic (SLA), Rotate Right (RR) and Rotate Left (RL).

TABLE I. SHIFTS AND ROTATES

| Operation              | Output $Y_3Y_2Y_1Y_0$ |
|------------------------|-----------------------|
| Shift Right Logical    | $0\,a_3a_2a_1$        |
| Shift Left Logical     | $a_2a_1a_0\,0$        |
| Shift Right Arithmetic | $a_3a_3a_2a_1$        |
| Shift Left Arithmetic  | $a_3a_1a_0\,0$        |
| Rotate Right           | $a_0a_3a_2a_1$        |
| Rotate Left            | $a_2a_1a_0a_3$        |

Table I depicts the possible shift and rotate operations for a 4-bit barrel shifter with a 1-bit shift limit. Here, $a_3a_2a_1a_0$ is assumed to be the input, and the output bit combination $(Y_3Y_2Y_1Y_0)$ of each operation is shown in the table. The operations are explained as follows for shifting an n-bit binary word by S bits, where S indicates the number of bit positions to be shifted and n denotes the word length:



## International Journal of Future Innovative Science and Technology, ISSN: 2454- 194X Volume-1, Issue-1, May - 2015 editor@istpublications.com

- An S-bit Shift Right Logical operation shifts the word right by S bits and sets the most significant S bits of the result to zero.
- An S-bit Shift Left Logical operation shifts the word left by S bits and sets the least significant S bits of the result to zero.
- An S-bit Shift Right Arithmetic operation shifts the word right by S bits and sets the most significant S bits of the result to the MSB of the input ($a_{n-1}$).
- An S-bit Shift Left Arithmetic operation preserves the MSB while shifting the remaining bits left by S bit positions, and sets the least significant S bits to 0.
- In an S-bit Rotate Right operation, the S bits moved out of the least significant end of the input are rotated into the most significant bits of the result.
- In an S-bit Rotate Left operation, the S bits moved out of the most significant end of the input are rotated back into the least significant bits of the result.
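The six definitions above can be summarized in a minimal behavioural sketch. It models only the input/output relation of Table I, not the DCPAL circuit; the function name `barrel_shift` and the string operation codes are assumptions made for illustration.

```python
def barrel_shift(a: int, s: int, n: int, op: str) -> int:
    """Behavioural model of an S-bit shift or rotate on an n-bit word `a`.

    `op` is one of "SRL", "SLL", "SRA", "SLA", "RR", "RL" (illustrative names).
    """
    if not 0 <= s < n:
        raise ValueError("shift amount must satisfy 0 <= s < n")
    mask = (1 << n) - 1
    a &= mask
    msb = (a >> (n - 1)) & 1

    if op == "SRL":   # upper s bits of the result become 0
        return a >> s
    if op == "SLL":   # lower s bits of the result become 0
        return (a << s) & mask
    if op == "SRA":   # upper s bits of the result become copies of the MSB
        fill = (mask << (n - s)) & mask if msb else 0
        return (a >> s) | fill
    if op == "SLA":   # MSB is preserved, the remaining n-1 bits shift left
        low = (a << s) & (mask >> 1)
        return (msb << (n - 1)) | low
    if op == "RR":    # bits leaving the LSB end re-enter at the MSB end
        return ((a >> s) | (a << (n - s))) & mask
    if op == "RL":    # bits leaving the MSB end re-enter at the LSB end
        return ((a << s) | (a >> (n - s))) & mask
    raise ValueError(f"unknown operation: {op}")


if __name__ == "__main__":
    a = 0b1010  # a3 a2 a1 a0 = 1 0 1 0
    for op in ("SRL", "SLL", "SRA", "SLA", "RR", "RL"):
        print(op, format(barrel_shift(a, 1, 4, op), "04b"))
```

With n = 4 and S = 1, each result follows the corresponding row of Table I.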

The paper is organized as follows. Section II describes the design of 2:1 MUX in DCPAL. Section III explains the proposed design. Section IV presents the comparison of the static power dissipation between the conventional CMOS and the DCPAL based circuits. Section V concludes.

### II. Design of the 2:1 MUX

This section describes the design of a 2:1 multiplexer (MUX) in the DCPAL style. It is used as the basic building block of the proposed design. The truth table of a 2:1 MUX is given in Table II. One of the two inputs is passed to the output depending on the *select line* bit S: IN1 when S = 0 and IN2 when S = 1.

TABLE II. 2:1 Mux Truth Table

| S | IN1 | IN2 | OUT |
|---|-----|-----|-----|
| 0 | 0   | 0   | 0   |
| 0 | 0   | 1   | 0   |
| 0 | 1   | 0   | 1   |
| 0 | 1   | 1   | 1   |
| 1 | 0   | 0   | 0   |
| 1 | 0   | 1   | 1   |
| 1 | 1   | 0   | 0   |
| 1 | 1   | 1   | 1   |
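As a quick check of Table II, the following sketch models the MUX at the logic level only. The function name `mux2` is an assumption, and the complementary output is included because DCPAL is a dual-rail logic; the sketch says nothing about the transistor-level behaviour.

```python
from itertools import product


def mux2(s: int, in1: int, in2: int) -> tuple[int, int]:
    """Return (OUT, /OUT) of a 2:1 MUX: IN1 is selected when S = 0, IN2 when S = 1."""
    out = in2 if s else in1
    return out, 1 - out


# Enumerate all input combinations in the row order of Table II (S, IN1, IN2).
TRUTH_TABLE = [(s, in1, in2, mux2(s, in1, in2)[0])
               for s, in1, in2 in product((0, 1), repeat=3)]

if __name__ == "__main__":
    print("S IN1 IN2 OUT")
    for s, in1, in2, out in TRUTH_TABLE:
        print(s, in1, in2, out, sep="   ")
```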

## A. Design aspects of DCPAL

The important aspects while designing in DCPAL style are as follows.

- The structured tree used in the differential cascode logic can minimize the latency of the design.
- The number of transistors connected to the ground is kept to a minimum, to reduce the leakage paths and nodal capacitances.
- The use of diodes, which cause additional power dissipation, is avoided. The logic block is isolated from the power clock during the pre-resolve state and from the ground during recovery [2].

## B. Design and operation of the 2:1 multiplexer (MUX)

Figure 1 shows the multiplexer implemented in DCPAL. DCPAL is a diode-free, dual-rail logic. It uses the structured logic tree arrangement of the Differential Cascode Voltage Switch Logic (DCVSL). The nMOS logic tree of the 2:1 MUX is implemented in the differential cascode style and takes its inputs in both complemented and un-complemented form.



Figure 1. 2:1 MUX design in DCPAL style

As shown in Fig. 1, the transistors MN2 to MN7 form the structured nMOS logic tree for the MUX. The transistors MP1 and MP2 form a latch, which provides the charging and discharging paths during the evaluation and recovery phases of the power clock signal PC1. Fig. 2 depicts the waveforms of the power clocks PC1 to PC4. The four phases of operation of the DCPAL are as follows:

- Pre-resolve to zero
- Evaluate
- Hold
- Recover




Figure 2. Power Clock

Fig. 3 depicts the transient signals of the inputs, the select signal and their complements, along with the power clocks PC1 and PC3. Consider the case in which the input IN1 and /S are *high*. During the *Pre-resolve* phase, the transistor MN1 conducts, since PC3 at its gate terminal is *high*, and the output node /OUT is connected to the ground potential, i.e., it is *pre-resolved*. During the *Evaluate* phase, the power clock PC1 rises and the OUT node follows PC1; this value is *held* during the Hold phase. The Recovery phase follows, in which the charge on the OUT node is recovered by PC1. Note that the pipelining of adiabatic circuits is made possible by operating the following stage with PC2 and PC4, the third stage with PC3 and PC1, and so on.



Figure 3. 2:1 MUX output waveforms

## III. Proposed design

This section presents the design of the 2-bit bidirectional barrel shifter using the 2:1 MUX. The 2-bit design is discussed here to make the design process easier to follow; the same process can be extended to any word length. Table III depicts the various bit combinations of the control input signals and the corresponding operations realized by the barrel shifter. Note also that logic-1 represents the presence of a pulse and logic-0 represents the absence of a pulse.

The conventions used here are as follows:

- 'S' is the amount of shift.
- 'D' is the direction of shift, i.e., left or right.
- 'SRA', 'SLA', 'RR' and 'RL' are the control inputs for selecting the respective operations.

TABLE III. CONTROL INPUTS

| S | D | SRA | SLA | RR | RL | Operation |
|---|---|-----|-----|----|----|-----------|
| 0 | - | -   | -   | -  | -  | No Shift  |
| 1 | 0 | 0   | 0   | 0  | 0  | SRL       |
| 1 | 1 | 0   | 0   | 0  | 0  | SLL       |
| 1 | 0 | 1   | 0   | 0  | 0  | SRA       |
| 1 | 1 | 0   | 1   | 0  | 0  | SLA       |
| 1 | 0 | 0   | 0   | 1  | 0  | RR        |
| 1 | 1 | 0   | 0   | 0  | 1  | RL        |

Fig. 4 shows the 2-bit MUX-based bidirectional barrel shifter. When S=0, a 0-bit shift, i.e., no shift, is performed. For S=1, the input data is shifted by one bit position. D is set to 0 for all *right* direction operations and to 1 for all *left* direction operations. Depending on the operation to be performed, one of the control signals SLA, RR, RL and SRA is enabled. If none of these is enabled, then either SLL or SRL is performed, depending on the value of D.
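The control decoding of Table III can be illustrated with a behavioural sketch of the 2-bit shifter. This is not the gate-level MUX network of Fig. 4: the function name `shifter_2bit`, the keyword arguments, and the priority given to RR over SRA (and RL over SLA) when more than one control is asserted are assumptions, since Table III only lists combinations with at most one of them enabled.

```python
def shifter_2bit(a1: int, a0: int, S: int, D: int,
                 SRA: int = 0, SLA: int = 0, RR: int = 0, RL: int = 0) -> tuple[int, int]:
    """Return (Y1, Y0) for the control word of Table III (behavioural model only)."""
    if S == 0:                 # no shift: controls are don't-cares
        return a1, a0
    if D == 0:                 # right direction: Y0 always receives a1
        if RR:
            fill = a0          # rotate: the bit leaving the LSB re-enters at the MSB
        elif SRA:
            fill = a1          # arithmetic: the sign bit is replicated
        else:
            fill = 0           # logical: zero fill
        return fill, a1
    # left direction
    if RL:
        return a0, a1          # rotate: the bit leaving the MSB re-enters at the LSB
    if SLA:
        return a1, 0           # arithmetic: MSB preserved, LSB cleared
    return a0, 0               # logical: zero fill at the LSB


if __name__ == "__main__":
    # Shift Left Logical on the data word 11 with S=1, D=1
    print(shifter_2bit(1, 1, S=1, D=1))
```

For a 2-bit word, Rotate Right and Rotate Left by one position both swap the two bits, so they produce the same output in this model.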



Figure 4. 2-bit MUX-based bidirectional barrel shifter

All the inputs to each 2:1 MUX are applied in both complemented and un-complemented form. Only the un-complemented versions are shown in Fig. 4 for simplicity. Similarly, each MUX produces two complementary outputs, which are applied as inputs to the next stage.

As discussed in the previous section, each stage in the design is operated using a different set of power clocks. Note that each power clock set lags the previous one by exactly 90°. This makes the *Hold* phase of the previous stage align with the *Evaluate* phase of the current stage. Once the succeeding stage has evaluated the output of the previous stage, the charge on that output node is no longer needed and, as depicted in Fig. 2, the *Recovery* phase of the previous stage commences. This operating feature of adiabatic circuits makes pipelining and timing possible. For instance, if the SRA MUX is operated with PC2 and PC4, then the RR, RL and SLA MUXes are operated with PC1 and PC3, D is operated with PC4 and PC2, and S is operated with PC3 and PC1. The inputs applied at each stage must be in step with the power clocks of that particular stage. For example, the input $a_1$ applied to the RL stage is a delayed version of the same input applied to the SRA stage. A buffer can be used for this purpose, that is, to synchronize the cascaded stages with the input signal phases at each stage of the pipeline.
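The phase alignment between cascaded stages can be visualized with a small scheduling sketch. It assumes that the four phases are of equal duration (one quarter of the power clock period each), that successive stages lag by one quarter period (90°), and that the pipeline is in steady state; the names `PHASES`, `phase` and `schedule` are illustrative only and the sketch does not model the specific clock pair assigned to each MUX.

```python
# Cyclic order of the phases of one power clock, starting from the rising edge.
PHASES = ("Evaluate", "Hold", "Recover", "Pre-resolve")


def phase(stage: int, quarter: int) -> str:
    """Phase of pipeline stage `stage` (0-based) during quarter-period `quarter`.

    Each stage lags the previous one by one quarter period, i.e. 90 degrees.
    """
    return PHASES[(quarter - stage) % 4]


def schedule(num_stages: int, num_quarters: int) -> list[list[str]]:
    """Phase table: one row per stage, one column per quarter period."""
    return [[phase(s, q) for q in range(num_quarters)] for s in range(num_stages)]


if __name__ == "__main__":
    for s, row in enumerate(schedule(4, 8)):
        print(f"stage {s}: " + " | ".join(f"{p:<11}" for p in row))
```

In each column of the printed table, a stage in its Evaluate phase sits directly below a stage in its Hold phase, and the previous stage enters Recover one quarter period later.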

The waveforms for the 2-bit MUX-based bidirectional barrel shifter in the DCPAL style are shown in Fig. 5. Here $a_1a_0=11$, S=1, D=1 and all other control signals are *zero*, which implies a Shift Left Logical operation on the data 11. The expected output is 10, i.e., $Y_1=1$, $Y_0=0$, with the complementary outputs /$Y_1=0$ and /$Y_0=1$.



Figure 5. 2-bit MUX-based barrel shifter output waveforms

## IV. Results and Comparison

The proposed design is implemented in both the static CMOS logic style and the DCPAL style for 2-bit and 4-bit word lengths. The static power dissipation of both circuits is measured and tabulated in Table IV. The designs are simulated at frequencies up to 1 GHz. Reducing the supply voltage of static CMOS circuits leads to reduced dynamic power consumption. However, technology down-scaling also leads to lower threshold voltages and a corresponding increase in leakage currents, which raises the static power dissipation. The leakage current of the DCPAL circuits is lower because of the footer transistor, which reduces leakage through the stacking effect it introduces.

Note that the tabulated readings pertain only to the static power dissipation. The dynamic power dissipation of the static CMOS design is expected to be much larger than that of its DCPAL counterpart, since the latter allows the output nodal charge to be recovered by the power clock signals across the entire cascade of circuits.

TABLE IV. STATIC POWER DISSIPATION COMPARISON

| Logic style       | 2-bit    | 4-bit    |
|-------------------|----------|----------|
| Static CMOS Logic | 0.514 nW | 1.608 nW |
| DCPAL             | 0.072 nW | 0.500 nW |
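As a worked check, the relative reduction in static power can be computed directly from the Table IV readings. The symbols $P_{\text{CMOS}}$ and $P_{\text{DCPAL}}$ are introduced here only for this calculation.

$$
\text{Reduction} = \frac{P_{\text{CMOS}} - P_{\text{DCPAL}}}{P_{\text{CMOS}}} \times 100\%
$$

$$
\text{2-bit: } \frac{0.514 - 0.072}{0.514} \times 100\% \approx 86.0\%, \qquad
\text{4-bit: } \frac{1.608 - 0.500}{1.608} \times 100\% \approx 68.9\%
$$

These values are in line with the figures of 85% and 68% quoted in the text, which appear to be rounded down.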

#### V. Conclusions

A novel barrel shifter circuit, the *Bidirectional Barrel Shifter*, is proposed in this paper. The design focuses on achieving low-power operation by using DCPAL. The results show that the static power dissipated by the 2-bit and 4-bit bidirectional barrel shifters is reduced by 85% and 68%, respectively, in comparison with their static CMOS counterparts, which indicates the benefit of the method over the conventional approach. Other digital design goals, such as reduced latency, leakage and transistor count and an improved frequency of operation, can also be achieved by the use of the DCPAL style.

The capability to shift by more than one bit position in one cycle and to shift or rotate in either of the directions gives added flexibility for the circuit style, which can be included in the implementation of several algorithms.

#### References

- [1] Ramin Rafati, Sied Mehdi Fakhraie and Kenneth Carless Smith (2008). A 16-bit barrel-shifter implemented in Data-Driven Dynamic Logic (D3L), IEEE Transactions on Circuits and Systems, Vol. 53, No. 10.
- [2] Kanchana Bhaaskaran, V.S. and Raina, J.P. (2008). Differential cascode adiabatic logic structure for low power, Journal of Low Power Electronics, Vol. 4, No. 2, pp. 178-191.
- [3] Ioannis Voyiatzis (2008). An ALU-based BIST scheme for word-organized RAMs, IEEE Transactions on Computers, Vol. 57, No. 5, pp. 154-162.
- [4] Matthew R. Pillmeier, Michael J. Schulte and E. George Walters (2002). Design alternatives for barrel shifters, Proceedings of SPIE: Advanced Signal Processing Algorithms, Architectures, and Implementations XII, Vol. 4791, pp. 436-447.
- [5] Prasad D. Khandekar and Shaila Subbaraman. Low power 2:1 MUX for barrel shifter, International Conference on Emerging Trends in Engineering and Technology.