Design of an Energy Efficient 4-2 Compressor

In this paper, an energy efficient 4-2 compressor is designed using only 36 transistors. The architecture of the new compressor consists of 8-transistor (8T) Exclusive-OR-Exclusive-NOR (XOR-XNOR) modules and multiplexers based on transmission gate logic. The 8T XOR-XNOR module offers high speed and full voltage swing at its outputs while consuming less power. Simulations are carried out using the Cadence Virtuoso tool in 180nm CMOS technology. As the supply voltage is varied from 1V to 3V, the maximum output delay, average power dissipation and power-delay product (PDP) vary from 920.1ps to 211.4ps, 11.85µW to 123.4µW and 10.90fJ to 26.09fJ respectively. The performance parameters of the proposed compressor are further compared with those of a number of existing 4-2 compressors at a 1.8V supply. Simulation results show that the proposed compressor attains an improvement in terms of speed, power and PDP.


INTRODUCTION
The demand for high speed electronic systems is increasing continuously. Hence the development of fast and efficient system designs has been a subject of interest to VLSI design engineers for decades. A processing element called a compressor is widely used in high speed systems. Consequently, the demand for high speed compressors is increasing rapidly in many parts of a digital system, especially in digital signal processors, digital filters, general purpose microprocessors, three dimensional (3-D) graphics applications, motion estimation accelerators, etc. [1]-[3].
In the multiplication process, compressors are used to reduce the number of operands when partial products are added [4]. A multiplication process comprises three stages, namely generation of partial products, reduction of partial products and computation of the final product [5]. The second stage affects the performance of the multiplier with regard to power dissipation and speed. To improve the performance of the multiplier, high speed and low power compressors are employed in this stage [2], [6]-[7]. VLSI designers have designed various types of compressors, such as 3-2, 4-2, 4-3, 5-2, 5-3, 6-3, 7-2 and 7-3 [8]-[10]. Its regular interconnection and simple structure make the 4-2 compressor suitable for fast digital computational circuits.
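As an illustration of the reduction stage, the following Python sketch shows how one row of 4-2 compressors, one per bit position, turns four operands into two operands with the same total. It is a behavioural model only. The function names `compress_4_2` and `reduce_four_rows`, and the choice to model each compressor as two cascaded full adders, are assumptions made for this example and are not taken from the paper.

```python
def compress_4_2(m1, m2, m3, m4, cin):
    """Behavioural 4-2 compressor modelled as two full adders in series."""
    s1 = m1 ^ m2 ^ m3                              # sum of first full adder
    cout = (m1 & m2) | (m2 & m3) | (m1 & m3)       # carry of first full adder
    total = s1 ^ m4 ^ cin                          # Sum output
    carry = (s1 & m4) | (m4 & cin) | (s1 & cin)    # Carry output
    return total, carry, cout


def reduce_four_rows(rows, width):
    """Reduce four unsigned `width`-bit operands to a (sum, carry) pair.

    One compressor is used per bit position. Cout of position i feeds Cin of
    position i+1. One extra position with all-zero inputs absorbs the last Cout.
    """
    sum_word, carry_word, cin = 0, 0, 0
    for i in range(width + 1):
        bits = [(r >> i) & 1 for r in rows]
        total, carry, cout = compress_4_2(*bits, cin)
        sum_word |= total << i            # Sum has weight i
        carry_word |= carry << (i + 1)    # Carry has weight i+1
        cin = cout
    return sum_word, carry_word


if __name__ == "__main__":
    rows = (13, 7, 9, 15)
    s, c = reduce_four_rows(rows, width=4)
    print(rows, "->", (s, c), "check:", s + c == sum(rows))
```

The two resulting words still have to be added by a conventional carry-propagate adder, which corresponds to the third stage of the multiplication process.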
In this paper a low power and highly energy efficient 4-2 compressor is presented. The existing architectures and designs of the 4-2 compressor are described in Section 2. Section 3 describes the operation of the 8-transistor (8T) Exclusive-OR-Exclusive-NOR (XOR-XNOR) module and the design of the proposed 4-2 compressor. Section 4 discusses the simulation results, and Section 5 concludes the paper.

4-2 COMPRESSOR
A 4-2 compressor is a combinational circuit which compresses four partial products into two. The block diagram of a 4-2 compressor is shown in Fig. 1(a). It accepts five inputs, namely M1, M2, M3, M4 and Cin, and generates three outputs, viz. Sum, Carry and Cout. The input Cin is the output coming from the compressor in the preceding, less significant stage, and the output Cout is the input to the compressor in the next, more significant stage. M1, M2, M3, M4, Cin and Sum all have weight i, while Carry has a weight one binary order higher, i.e. i+1. An important property of the compressor is that Cout is independent of Cin, which makes it superior to the full adder. A 4-2 compressor implemented with two serially connected full adders is presented in Fig. 1(b). This implementation involves a critical path delay of four XOR gates. A 4-2 compressor has to comply with the basic equation:
$$M_1 + M_2 + M_3 + M_4 + C_{in} = Sum + 2\,(Carry + C_{out})$$
In the literature various designs have been described to implement the 4-2 compressor [8]-[13]. Fig. 2 shows two architectures which are commonly used to implement 4-2 compressors [8], [11]. The former consists of XOR gates and multiplexers (MUX). It involves a critical path delay of three XOR gates. The latter is a modified version of the former which consists of XOR-XNOR modules and MUXs. It involves a critical path delay of one XOR-XNOR module plus two MUXs. Both architectures generate Sum, Carry and Cout from XOR and multiplexing operations.
Fig. 3 shows a 4-2 compressor designed with CMOS logic style. It consumes more power, occupies more area and has a longer delay because a larger number of transistors is used [11], [12].
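The properties stated above can be checked exhaustively with a short behavioural model. The Python sketch below compares the two-full-adder implementation with an XOR and MUX formulation and verifies, for all 32 input combinations, that both satisfy the basic equation and that Cout does not depend on Cin. The MUX formulation uses the commonly quoted form Cout = MUX(M1⊕M2; M1, M3) and Carry = MUX(M1⊕M2⊕M3⊕M4; M4, Cin). This form is an assumption here, since the equations themselves are not reproduced in the text, and all function names are illustrative.

```python
from itertools import product


def full_adder(a, b, c):
    """Return (sum, carry) of a one-bit full adder."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def compressor_two_fa(m1, m2, m3, m4, cin):
    """4-2 compressor built from two serially connected full adders."""
    s1, cout = full_adder(m1, m2, m3)
    total, carry = full_adder(s1, m4, cin)
    return total, carry, cout


def mux(sel, d0, d1):
    """2-to-1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def compressor_mux(m1, m2, m3, m4, cin):
    """XOR and MUX formulation (assumed form of the standard equations)."""
    x12 = m1 ^ m2
    x1234 = x12 ^ (m3 ^ m4)
    total = x1234 ^ cin
    cout = mux(x12, m1, m3)        # M3 if M1 != M2, else M1
    carry = mux(x1234, m4, cin)    # Cin if the four-input XOR is 1, else M4
    return total, carry, cout


def satisfies_basic_equation(compressor):
    """Check the basic equation and the independence of Cout from Cin."""
    for m1, m2, m3, m4 in product((0, 1), repeat=4):
        couts = set()
        for cin in (0, 1):
            total, carry, cout = compressor(m1, m2, m3, m4, cin)
            if m1 + m2 + m3 + m4 + cin != total + 2 * (carry + cout):
                return False
            couts.add(cout)
        if len(couts) != 1:        # Cout changed with Cin
            return False
    return True


if __name__ == "__main__":
    print(satisfies_basic_equation(compressor_two_fa))
    print(satisfies_basic_equation(compressor_mux))
```

Because Cout is fixed by M1, M2 and M3 alone, a row of such compressors has no carry chain that ripples horizontally through more than one stage.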
A fully Double Pass-Transistor Logic (DPL) multiplexer based 4-2 compressor is presented in [9]. This high speed compressor is designed with the first architecture. The XOR gates are replaced by DPL XOR-XNOR modules which generate the XOR and XNOR signals simultaneously. The XOR and XNOR signals are connected to the select and complemented select inputs of the multiplexer respectively. This eliminates the need for an inverter at the select inputs of the multiplexer. However, the compressor consumes more power because the DPL XOR-XNOR modules require input inverters, which increase the switching activity.
Another 4-2 compressor is designed in [11] by employing the architecture in Fig. 2(b), where CMOS logic is used for XOR-XNOR modules 1 and 2 and for MUXs 1, 3 and 4. For MUX 2 the transmission gate logic style is used, which increases the speed of the compressor. The power consumption is also decreased as the MUXs do not require an extra inverter for their select inputs. The 4-2 compressor presented in [12] uses the 10T XOR-XNOR module and the transmission gate version of the MUX. Its power consumption is higher due to the presence of inverters at the select inputs of the MUXs. Although the XOR and XNOR signals are available, they are not properly utilized in the succeeding stage.
A new 8T XOR-XNOR module is proposed in [13], and a low power 4-2 compressor is designed using it. The compressor also occupies a small area as fewer transistors are used.

PROPOSED 4-2 COMPRESSOR
In this section a new design of a 4-2 compressor with two 8T XOR-XNOR modules and four MUXs based on transmission gate logic is described. A new 6T inverter based XNOR gate is proposed in [14]. This XNOR gate not only provides high speed with full voltage swing at its output but also consumes less power. For the XOR operation an inverter needs to be added at the output of the XNOR gate. Hence a total of eight transistors are used to implement the XOR-XNOR module, which is shown in Fig. 4(a). In the XNOR module, the inverter consisting of transistors MP1 and MN1 generates the complement of input Y. The output of this inverter controls the second inverter, consisting of transistors MP2 and MN2. This second inverter generates the XNOR function of X and Y, but its output suffers from voltage degradation for the input combinations X=0, Y=1 and X=Y=1 [15]. To avoid this problem, two level-restoring pass transistors, MP3 and MN3, are used. In this module, when X=Y=0, transistors MP1 and MP2 are turned on and the output (XNOR) is at logic high. When X=0 and Y=1, transistors MP2, MP3, MN1 and MN3 are turned on and a logic low is passed to the output node. For X=1 and Y=0, transistors MP1 and MN2 are turned on and the output node shows a logic low. Finally, when X=Y=1, transistors MP3, MN1, MN2 and MN3 are turned on and the output is at logic high.
A MUX implemented with transmission gate logic is shown in Fig. 4(b). In this design only the Sum generator (MUX 3) uses an inverter to generate the complement of its select input. Both outputs of the XOR-XNOR modules are fed to the select inputs of the Cout generator (MUX 1) and MUX 2. The output of MUX 2 and its complement are connected to the Carry generator (MUX 4). In this way inverters are eliminated, which reduces both the critical path delay of the compressor and the transistor count.
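The signal flow described in this section can be sketched at gate level in Python. The model below is behavioural and does not represent transistor-level effects such as voltage swing or delay. Several wiring details are assumptions made for illustration, because the text does not state them explicitly: which module output drives which data input of MUX 2, the use of Cin as the select input of MUX 3, and the data inputs of MUX 1 and MUX 4. The function names are hypothetical.

```python
from itertools import product


def xor_xnor(x, y):
    """8T module, behaviourally: XNOR first, XOR obtained by inverting it."""
    xnor = 1 - (x ^ y)
    return 1 - xnor, xnor


def tg_mux(sel, sel_bar, d0, d1):
    """Transmission gate MUX with complementary selects; d1 passes when sel=1."""
    assert sel_bar == 1 - sel, "select inputs must be complementary"
    return d1 if sel else d0


def proposed_compressor(m1, m2, m3, m4, cin):
    """Two XOR-XNOR modules and four MUXs (wiring partly assumed)."""
    x12, xn12 = xor_xnor(m1, m2)
    x34, xn34 = xor_xnor(m3, m4)
    cout = tg_mux(x12, xn12, m1, m3)              # MUX 1: Cout generator
    x1234 = tg_mux(x12, xn12, x34, xn34)          # MUX 2: four-input XOR
    xn1234 = 1 - x1234                            # complement of MUX 2 output
    # (how this complement is produced in the circuit is not modelled here)
    total = tg_mux(cin, 1 - cin, x1234, xn1234)   # MUX 3: Sum, inverter on select
    carry = tg_mux(x1234, xn1234, m4, cin)        # MUX 4: Carry generator
    return total, carry, cout


def is_valid_compressor(compressor):
    """Check M1+M2+M3+M4+Cin = Sum + 2(Carry+Cout) for all 32 input patterns."""
    return all(
        sum(bits) == s + 2 * (c + co)
        for bits in product((0, 1), repeat=5)
        for s, c, co in [compressor(*bits)]
    )


if __name__ == "__main__":
    print(is_valid_compressor(proposed_compressor))
```

Under these assumptions every MUX except MUX 3 receives both polarities of its select signal directly from the preceding stage, which is the reason given in the text for the eliminated inverters.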
To reduce power consumption, the transistors of the inverter are appropriately sized, and the speed of the compressor is increased by sizing the transistors of the transmission gate MUXs equally [14]. In Fig. 4

PERFORMANCE ANALYSIS OF THE PROPOSED 4-2 COMPRESSOR
The simulations are carried out using the Cadence Virtuoso tool in 180nm CMOS technology. The performance parameters of the proposed compressor are listed in Table I for supply voltages ranging from 1V to 3V. The results show that the maximum output delay, average power consumption and power-delay product (PDP) vary from 920.1ps to 211.4ps, 11.85μW to 123.4μW and 10.90fJ to 26.09fJ respectively. The transient response of the proposed compressor is shown in Fig. 6. The waveform of the power consumption is shown in Fig. 7.
Table II compares the maximum output delay, average power consumption and PDP of the proposed 4-2 compressor with those of the existing compressors at a supply voltage of 1.8V and an operating frequency of 100MHz. The proposed compressor uses only 36 transistors, so its average power consumption is the lowest among all the 4-2 compressors listed in the table. It is the fastest design among all except the compressor presented in [9]. Hence the new design attains a significant improvement in PDP. Fig. 8 shows the variation of the performance parameters of the compressors at different supply voltages.
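The reported PDP range can be cross-checked from the delay and power figures, since PDP is the product of average power and maximum output delay. The short Python sketch below does this for the two end points of the supply sweep. It assumes that the first value of each reported range belongs to the 1V supply and the second to the 3V supply. The helper name `pdp_fj` is illustrative.

```python
def pdp_fj(delay_ps, power_uw):
    """Power-delay product in fJ from a delay in ps and a power in µW."""
    # 1 ps x 1 µW = 1e-12 s x 1e-6 W = 1e-18 J = 1e-3 fJ
    return delay_ps * power_uw * 1e-3


# (delay in ps, power in µW, reported PDP in fJ), taken from the text;
# the pairing with 1V and 3V is an assumption
reported = {
    "1V": (920.1, 11.85, 10.90),
    "3V": (211.4, 123.4, 26.09),
}

if __name__ == "__main__":
    for supply, (delay_ps, power_uw, pdp_reported) in reported.items():
        computed = pdp_fj(delay_ps, power_uw)
        print(f"{supply}: computed {computed:.2f} fJ, reported {pdp_reported:.2f} fJ")
```

Both computed values agree with the reported PDP figures to within rounding, which also shows that the delay falls faster than the power rises only up to a point: over the full sweep the PDP increases with the supply voltage.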

CONCLUSION
A new high speed, low power 4-2 compressor is presented in this paper. Using only 36 transistors, the proposed design improves power and speed. Its performance is further compared with the available compressors reported in the literature. The maximum output delay of the proposed compressor is 321.9ps and its PDP is 13.10fJ.