ACTA IMEKO
ISSN: 2221-870X
June 2022, Volume 11, Number 2, 1 - 6

Low-power and high-speed approximate multiplier using higher order compressors for measurement systems

M. V. S. Ram Prasad, B. Kushwanth, P. R. D. Bharadwaj, P. T. Sai Teja

1 Department of EECE, GITAM (Deemed to be University), Visakhapatnam, AP, India

ABSTRACT
At present, approximate multipliers are used in image processing applications. These approximate multipliers are designed with higher order compressors to decrease the number of addition stages needed in the reduction of partial products. Approximate computing is an effective technique for improving power efficiency and shortening the delay path, and several compressors have been designed with it. In this paper, 10:2 compressors are designed, implemented in a 32-bit multiplier, and compared with the exact 32-bit multiplier. The proposed higher order compressors are used together with lower order compressors to reduce the delay of the design. Digital circuits of this type are significant in measurement technologies because they enable fast and accurate measurements. With approximate compressors the result may be inexact, but the power consumption and delay are reduced. The proposed multipliers are therefore intended only for digital signal processing applications where two or more signals need to be combined. An FIR filter implemented with the proposed multiplier has a delay of 27 ns, compared with 119 ns for the filter using the exact multiplier. The multipliers are also used in image processing applications, where the PSNR of the image is evaluated.

Section: RESEARCH PAPER

Keywords: Approximate computing; approximate compressors; digital circuits; partial products; measurement systems

Citation: M. V. S. Ram Prasad, B. Kushwanth, P. R. D. Bharadwaj, P. T. Sai Teja, Low-power and high-speed approximate multiplier using higher order compressors for measurement systems, Acta IMEKO, vol. 11, no. 2, article 36, June 2022, identifier: IMEKO-ACTA-11 (2022)-02-36

Section Editor: Md Zia Ur Rahman, Koneru Lakshmaiah Education Foundation, Guntur, India

Received January 30, 2022; In final form April 22, 2022; Published June 2022

Copyright: This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Corresponding author: M. V. S. Ram Prasad, e-mail: mvsrprasadgitam@gmail.com

1. INTRODUCTION

In digital signal processing (DSP) applications, multipliers play an important role in complex arithmetic operations. Approximate computing methods are used to lessen their power consumption, which has led to approximate multipliers that are now widely used in DSP applications [1]-[4]. The complex multipliers used in DSP applications are replaced with these approximate multipliers.

DSP systems perform operations such as filtering, convolution, and correlation of digital signals. Multipliers, adders, and shifters are widely used to carry out these operations. Designing the multiplier is the hardest part of a DSP design, and the multipliers consume more power than the adders and shifters. The multiplication process consists of generating the partial products, aligning them, reducing them, and finally adding them. Reducing the number of partial products costs considerable time and power. Several techniques have been applied to overcome this issue, and approximate computing gives better results than the earlier techniques. Approximate adders were developed first, and compressors were then designed for the addition of multiple bits [5]-[8]. Higher order compressors are required to lessen the delay of the addition process. With these approximate compressors, approximate multipliers are designed to improve the performance of DSP applications.

Exact multipliers consume high power and incur a large delay to obtain exact outputs. Their major drawback is that they cannot be optimized much further, even when several techniques are combined [9]-[11]. Image processing and signal processing applications, however, can accept data containing errors and still produce the modulated signals. Approximate compressors and multipliers were therefore introduced; they lessen the power consumption and also the delay caused by the carry bits in the addition process. The approximate results obtained from these multipliers are sufficient for combining two signals.

Approximate compressors are designed to reduce the computation involved in the addition process [12]-[15]. A larger number of inputs is added to produce only two outputs, the sum and carry bits, together with one further output called the carry out. In general, 4:2 compressors are widely used, and they give better results than the previous architectures. With these approximate compressors, the number of partial products and the adder count (gate count) are both reduced.
Approximate compressors are widely used in multipliers [16], [17] to reduce power and improve the performance of the circuit. These approximate multipliers need additional compressors to improve their performance further, which led to higher order compressors such as the 10:2 compressor designed here. Lower order compressors require more power, and multipliers built with higher order compressors yield better results [18], [19] than approximate multipliers built with lower order compressors. In this paper, a new 32-bit approximate multiplier is designed with higher order compressors; it achieves low power consumption and low delay along with a low error rate.

2. DESIGN OF COMPRESSORS

A compressor is a combinational circuit with multiple inputs and multiple outputs. The outputs consist of one sum bit and one carry bit, together with carry-propagate bits whose number depends on the input bit length [6], as shown in Figure 1. The compressors add multiple bits that have the same bit length and can also add inputs of different bit lengths. They are used to decrease the number of partial products generated in the multiplication process, adding multiple partial products into two partial outputs along with a carry-propagate bit. The compressors are designed according to the multiplier bit length to reduce the partial product count [9]. With these approximate compressors, the addition of the partial products becomes simple.

2.1. 4:2 compressor

The basic 4:2 compressor has four inputs and two outputs, known as the sum and carry bits. In addition to these pins, it has one more input, called carry in, and one more output, called carry out, as shown in Figure 2. The carry out generated by the compressor is propagated to the next bit position.
Hence these compressors generally have multiple inputs and multiple outputs, related by

$$X_1 + X_2 + X_3 + X_4 + C_{in} = \mathit{Sum} + 2 \cdot (\mathit{Carry} + C_{out}) \qquad (1)$$

2.2. Design of 10:2 compressor

The higher order compressor is implemented in the proposed approximate multipliers to reduce the number of partial products and the adder count. In general, a large adder count produces high power consumption [20]. In the proposed higher order compressors, XOR gates and multiplexers (MUX) are used to reduce the operation time along with the power consumption, as shown in Figure 3. The higher order compressors replace the normal adders without any change to their truth tables.

In approximate compressors the output is obtained from the input combinations: it is either taken directly from an input or produced by a small calculation. In exact compressors the output is calculated exactly using multiple gates and equations. The approximate compressors therefore perform better in terms of delay and power. A 4:2 compressor has been designed with the same approach, as shown in Figure 4. The 4:2 compressor is the basic compressor used in multiplier design. In the proposed system the delay is reduced because the carry signals are terminated and the carry is not propagated to the next bits. The proposed design is implemented with XOR gates and multiplexers to reduce the delay and power consumption. 8:2 approximate compressors are also implemented. The XOR-MUX implementation of the 8:2 compressor is shown in Figure 5.

Figure 1. N:2 Compressors schematic diagram.

Figure 2. 4:2 compressor schematic diagram.

Figure 3. Conventional implementation of 3:2 compressors.
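The relation in (1), with the carry out as the second weight-2 output, can be checked with a short behavioural model. The Python sketch below uses one common textbook XOR-MUX arrangement of an exact 4:2 compressor (two cascaded full adders whose carries are selected by multiplexers). The function names and this particular gate arrangement are assumptions made for illustration; they are not taken from Figure 4, and the model is exact rather than approximate.

```python
from itertools import product


def mux(select: int, when_one: int, when_zero: int) -> int:
    """2:1 multiplexer: returns when_one if select is 1, else when_zero."""
    return when_one if select else when_zero


def compressor_4_2(x1: int, x2: int, x3: int, x4: int, c_in: int):
    """Exact 4:2 compressor built from XOR gates and multiplexers.

    Returns (sum_bit, carry, c_out). Hypothetical model for illustration.
    """
    x12 = x1 ^ x2
    x34 = x3 ^ x4
    x1234 = x12 ^ x34
    sum_bit = x1234 ^ c_in
    # c_out is the majority of x1, x2, x3 and does not depend on c_in.
    c_out = mux(x12, x3, x1)
    # carry is the majority of (x1 ^ x2 ^ x3), x4 and c_in.
    carry = mux(x1234, c_in, x4)
    return sum_bit, carry, c_out


def check_equation_1() -> bool:
    """Exhaustively verify: number of ones == Sum + 2 * (Carry + Cout)."""
    for bits in product((0, 1), repeat=5):
        sum_bit, carry, c_out = compressor_4_2(*bits)
        if sum(bits) != sum_bit + 2 * (carry + c_out):
            return False
    return True


if __name__ == "__main__":
    print(check_equation_1())
```

Because c_out depends only on x1, x2 and x3, a carry in arriving from the neighbouring column does not ripple through to the carry out, which is the property that makes 4:2 compressors attractive in reduction trees.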
With higher order compressors a greater number of bits is processed at the same time, which reduces the power consumption and improves the performance of the circuit. With approximate computing the probability of error is high because the carry bits are reduced. Lower order compressors consume high power even though they also produce erroneous outputs: adding many bits at the same time requires a greater number of lower order circuits, which increases both the power consumption and the delay. To overcome this issue, higher order compressors are designed to perform the multi-bit addition. The 10:2 compressor has 10 inputs and gives only 2 outputs [22], [23]. It consists of fourteen XOR gates and six multiplexers. Each XOR gate receives two inputs and generates a single output. The outputs of the XOR gates propagate to the multiplexers; the final multiplexer generates the carry bit and the final XOR gate generates the sum output.

3. DESIGN OF 32-BIT APPROXIMATE MULTIPLIER

Exact multiplication requires exact compressors in the multiplier design. Approximate multipliers use approximate compressors wherever the complex multiplication process is implemented. These approximate compressors are used in DCT applications such as image processing and signal processing. Here, 32-bit approximate multipliers are implemented. In this paper the higher order compressor is a 10:2 compressor, which is designed and implemented in the 32-bit approximate multipliers. In the design of the 32-bit approximate multipliers, both higher order and lower order compressors are used to reduce the adder count and the delay of the multipliers.
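To make the accuracy cost of a two-output 10:2 compressor concrete, the following Python sketch counts how many of the 1024 input patterns can be encoded exactly when the only outputs are a weight-1 sum bit and a weight-2 carry bit. The count itself does not depend on the gate-level circuit. The output model `approx_10_2` (parity as the sum, carry set when two or more inputs are high) is a hypothetical stand-in chosen only for illustration, not the proposed compressor, and all input patterns are treated as equally likely, which is not the case for real partial products.

```python
from itertools import product
from math import comb

N_INPUTS = 10
MAX_TWO_OUTPUT_VALUE = 1 + 2  # sum (weight 1) + carry (weight 2)


def exactly_representable_patterns() -> int:
    """Input patterns whose number of ones fits in sum + 2 * carry."""
    return sum(comb(N_INPUTS, k) for k in range(MAX_TWO_OUTPUT_VALUE + 1))


def approx_10_2(bits):
    """Hypothetical approximate 10:2 model (not the paper's circuit).

    sum_bit is the parity of the inputs; carry is 1 when at least two
    inputs are high. Higher-weight carries are simply dropped.
    """
    ones = sum(bits)
    return ones & 1, 1 if ones >= 2 else 0


def error_statistics():
    """Return (error_rate, mean_error_distance) over all 1024 patterns."""
    errors = 0
    total_distance = 0
    patterns = list(product((0, 1), repeat=N_INPUTS))
    for bits in patterns:
        sum_bit, carry = approx_10_2(bits)
        distance = abs(sum(bits) - (sum_bit + 2 * carry))
        errors += distance != 0
        total_distance += distance
    return errors / len(patterns), total_distance / len(patterns)


if __name__ == "__main__":
    print(exactly_representable_patterns())
    print(error_statistics())
```

Only patterns with at most three ones (176 of 1024) can be represented without error by two such outputs, so any 10:2 compressor that drops the higher-weight carries has to approximate the remaining patterns.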
In exact multipliers the output is calculated with normal adders, which give accurate results but consume more power. In approximate multipliers, higher order circuits are used to reduce the number of partial products and to reduce the delay by neglecting the carry propagation to the next bits. Hence the power consumption of approximate multipliers is lower than that of exact multipliers.

Figure 4. Block diagram of 4:2 Compressor using XOR-MUX.

Figure 5. Block diagram of proposed 10:2 compressor based on XOR-MUX modules.

An 8-bit approximate multiplier with its reduction stages is shown in Figure 6. The proposed higher order approximate multiplier uses a higher order compressor, the 10:2 compressor, together with lower order compressors such as 8:2 and 4:2 compressors. The proposed higher order compressors are designed with multiplexers and XOR gates only. The proposed multiplier has only 3 addition stages, far fewer than multipliers designed with lower order compressors. This suggests that the previous approximate and exact multipliers are less suitable for designing FIR filters.

4. SIMULATION RESULTS

In this section, the proposed multiplier is implemented in the Xilinx ISE design suite using Verilog HDL. Figure 7 shows the simulation results of the exact multiplier obtained in the Xilinx ISIM simulator, where the inputs are a and b and the output of the multiplier is obtained at the y signal.

Figure 6. 8-bit approximate multiplier with reduction stages.

Figure 7. Simulation result of the 32-bit exact multiplier.

Figure 8. Technology schematic of the 32-bit exact multiplier.
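The reduction-stage counts compared in Section 3 can be estimated with a simple row-counting model. The Python sketch below assumes an idealized scheme in which every stage groups the remaining partial-product rows into n:2 compressors, compresses a short leftover group with a lower order compressor, and ignores carries passed between columns. `reduction_stages` is a hypothetical helper, not the authors' reduction tree, and the final carry-propagate addition of the last two rows is not counted.

```python
def reduction_stages(rows: int, n: int) -> int:
    """Stages needed to reduce `rows` partial-product rows to two rows
    using n:2 compressors (idealized row-count model)."""
    if n < 3:
        raise ValueError("an n:2 compressor needs n >= 3")
    stages = 0
    while rows > 2:
        full_groups, leftover = divmod(rows, n)
        # Each full group of n rows becomes 2 rows. A leftover group of
        # 3 or more rows is compressed to 2 by a lower order compressor;
        # 1 or 2 leftover rows pass through unchanged.
        rows = 2 * full_groups + min(leftover, 2)
        stages += 1
    return stages


if __name__ == "__main__":
    for n in (3, 4, 10):
        print(f"{n}:2 compressors, 32 rows -> {reduction_stages(32, n)} stages")
```

For the 32 rows of a 32-bit multiplier this model gives 8 stages with 3:2 compressors, 4 with 4:2 compressors, and 2 with 10:2 compressors. Adding the final addition to the 10:2 case is consistent with the three addition stages reported for the proposed multiplier, although the model is too coarse to confirm the actual design.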
The logic of the proposed design is mapped onto the look-up tables (LUTs) available in the Xilinx tools, which report how many LUTs and IOBs the design uses. The proposed multipliers are used in FIR filters to multiply the message signal and the carrier signal. Exact multipliers are generally used for this process, but they increase the power consumption and delay, and the excess terms then have to be reduced. To avoid this extra processing, the proposed approximate multipliers are used in FIR filter applications. The technology schematic of the 32-bit exact multiplier is shown in Figure 8.

Figure 9 shows the simulation results of the proposed 32-bit approximate multiplier; they differ from those of the exact multiplier because approximate compressors are used in the proposed design. The proposed approximate multiplier is designed entirely on the basis of approximate computing, in which the output is calculated approximately to reduce the power consumption and delay. Table 1 shows a comparison of 8-, 16- and 32-bit exact and approximate multipliers, and Table 2 shows a comparison of 32-bit exact and approximate multipliers in FIR filters.

5. APPLICATIONS

In this work a modified 10:2 compressor based on XOR-MUX modules is implemented, and a 32-bit multiplier is designed using it. The proposed design obtains better results. The proposed multiplier is used in multimedia processing to multiply two different images, and the Peak Signal to Noise Ratio (PSNR) is then evaluated. Data transmission in digital signal processing applications requires convolution and correlation processes rather than exact multiplication.
Hence the proposed approximate multiplier does not achieve 100 % accuracy, but it is suitable for the convolution and correlation processes.

Figure 9. Simulation result of the 32-bit approximate multiplier.

Table 1. Comparison of 8, 16 and 32-bit exact and approximate multipliers.

MULTIPLIER            DELAY (ns)   LUT COUNT
Exact 8-bit           7.664        124
Approximate 8-bit     2.969        84
Exact 16-bit          12.007       551
Approximate 16-bit    4.689        242
Exact 32-bit          20.55        2255
Approximate 32-bit    18.898       1705

Table 2. Comparison of 32-bit exact and approximate multipliers in FIR filters.

FILTER         DELAY (ns)
Accurate       119
Approximate    27

6. CONCLUSION

Approximate 32-bit multipliers are implemented using the modified 10:2 compressor. The approximate multipliers have a lower delay and LUT count than the exact multipliers (Table 1). The proposed multiplier achieves better PSNR values than the previous designs. The accuracy of the processed images shows that the proposed multipliers are very effective compared to the exact multipliers. The proposed approximate multiplier is well suited to multimedia applications where two or more signals need to be combined without requiring an exact output. These approximate multipliers are intended only for multimedia and signal processing applications. Within that scope, the proposed approximate multiplier achieves better results than the previous architectures.

REFERENCES

[1] J. Han, M. Orshansky, Approximate computing: An emerging paradigm for energy-efficient design, 2013 18th IEEE European Test Symposium (ETS), 2013, pp. 1-6. DOI: 10.1109/ETS.2013.6569370
[2] K. Roy, A. Raghunathan, Approximate Computing: An Energy-Efficient Computing Technique for Error Resilient Applications, 2015 IEEE Computer Society Annual Symposium on VLSI, 2015, pp. 473-475. DOI: 10.1109/ISVLSI.2015.130
[3] S. Surekha, M. Z. Ur Rahman, N. Gupta, A low complex spectrum sensing technique for medical telemetry system, Journal of Scientific and Industrial Research, vol. 80, no. 5, 2021, pp. 449-456. Online [Accessed 27 June 2022] http://op.niscpr.res.in/index.php/JSIR/article/view/41462
[4] V. Sze, Y.-H. Chin, T.-J. Yang, J. Emer, Efficient processing of deep neural networks: a tutorial and survey, Proc. of the IEEE, vol. 15, no. 12, 2017. DOI: 10.1109/JPROC.2017.2761740
[5] J. Emer, V. Sze, Y.-H. Chen, Hardware architectures for neural networks, Tutorial, International Symposium on Computer Architecture, 2017. Online [Accessed 27 June 2022] http://eyeriss.mit.edu/tutorial-previous.html
[6] Mantravadi, Nagesh, Md Zia Ur Rahman, Sala Surekha, Navarun Guptha, Spectrum Sensing using Energy Measurement in Wireless Telemetry Networks using Logarithmic Adaptive Learning, Acta IMEKO, vol. 11, no. 1, 2022, pp. 1-7. DOI: 10.21014/acta_imeko.v11i1.1231
[7] I. Qiqieh, R. Shafik, G. Tarawneh, D. Sokolov, A. Yakovlev, Energy-efficient approximate multiplier design using bit significance driven logic compression, 21st Design, Automation & Test in Europe Conference & Exhibition, Lausanne, Switzerland, 27-31 March 2017. DOI: 10.23919/DATE.2017.7926950
[8] T. Yang, T. Ukezono, T. Sato, A low-power high-speed accuracy controllable approximate multiplier design, 23rd Asia and South Pacific Design Automation Conference, 2018. DOI: 10.1109/ASPDAC.2018.8297389
[9] D. Esposito, D. De Caro, E. Napoli, N. Petra, A. G. M. Strollo, On the use of approximate adders in carry-save multiplier-accumulators, International Symposium on Circuits and Systems, 2017. DOI: 10.1109/ISCAS.2017.8050437
[10] M. Z. U. Rahman, S. Surekha, K. P. Satamraju, S. S. Mirza, A. Lay-Ekuakille, A Collateral Sensor Data Sharing Framework for Decentralized Healthcare Systems, IEEE Sensors Journal, vol. 21, no. 124, 2021, pp. 227848-27857. DOI: 10.1109/JSEN.2021.3125529
[11] R. Marimuthu, D. Bansal, S. Balamurugan, P. S. Mallick, Design of 8:4 and 9:4 Compressor For High Speed Multiplication, American Journal of Applied Sciences, 10(8), 2013, p. 893. DOI: 10.3844/ajassp.2013.893.900
[12] B. Silveira, G. Paiim, B. Abreu, M. Grellert, C. M. Diniz, E. A. C. da Costa, S. Bampii, Power efficient sum of absolute differences architecture using adder compressors for integer motion estimation design, IEEE Transactions on Circuits and Systems I: Regular Papers, 64(112), 2017, pp. 326-337. DOI: 10.1109/TCSI.2017.2728802
[13] T. Schiiavon, G. Paiim, M. Fonsieca, E. Costta, S. Almeiida, Exploiting adder's compressor for power-efficient 2-D approximate DCT, 2016 IEEE 7th Latin American Symposium on Circuits & Systems (LASCAS), 2016, pp. 3183-3186. DOI: 10.1109/LASCAS.2016.7451090
[14] M. V. S. Ramprasad, Pradeep Vinaik Kodavanti, Design of high speed and area efficient 16-bit mac architecture using hybrid adders for sustainable applications, JGE, vol. 10, issue 11, Nov. 2020.
[15] M. V. S. Ramprasad, B. Suribabu Naick, Zaamin Zainuddin Aarif, Design and implementation of high speed 16-bit approximate multiplier, IJITEE, vol. 8, issue 4, Feb. 2019.
[16] A. Momeni, J. Han, P. Montuschi, F. Lombardi, Design and Analysis of Approximate Compressors for Multiplication, IEEE Trans. on Computers, vol. 64, no. 4, 2015, pp. 984-994. DOI: 10.1109/TC.2014.2308214
[17] C. Liu, J. Han, F. Lombardi, A Low-Power, High-Performance Approximate Multiplier with Configurable Partial Error Recovery, Proc. of IEEE Design, Automation & Test in Europe Conference & Exhibition (DATE), 2014. DOI: 10.7873/DATE.2014.108
[18] G. Zervakis, K. Tsoumanis, S. Xydis, D. Soudris, K. Pekmestzi, Design-Efficient Approximate Multiplication Circuits Through Partial Product Perforation, IEEE Trans. on Very Large Scale Integration (VLSI) Systems, vol. 24, no. 10, 2016, pp. 3105-3117. DOI: 10.1109/TVLSI.2016.2535398
[19] A. Tarannum, M. Z. Ur Rahman, T. Srinivasulu, An efficient multi-mode three phase biometric data security framework for cloud computing-based servers, International Journal of Engineering Trends and Technology, 68(9), 2020, pp. 10-17. DOI: 10.14445/22315381/IJETT-V68I9P203
[20] A. Cilardo, D. De Caro, N. Petra, F. Caserta, N. Mazzocca, A. G. M. Strollo, E. Napoli, High speed speculative multipliers based on speculative carry-save tree, IEEE Transactions in Circuits and Systems I, vol. 61, no. 12, 2014. DOI: 10.1109/TCSI.2014.2337231
[21] J. Liang, J. Han, F. Lombardi, New Metrics for The Reliability of Approximate and Probabilistic Adders, IEEE Trans. on Computers, vol. 62, no. 9, 2013, pp. 1760-1771. DOI: 10.1109/TC.2012.146
[22] P. Kulkarni, P. Gupta, M. D. Ercegovac, Trading accuracy for power in a multiplier architecture, J. Low Power Electron., vol. 7, no. 4, 2011, pp. 490-501. DOI: 10.1109/VLSID.2011.51
[23] C.-H. Lin, C. Lin, High accuracy approximate multiplier with error correction, in Proc. IEEE 31st Int. Conf. Computer. Design, Sep. 2013, pp. 33-38. DOI: 10.1109/ICCD.2013.6657022