|
Abstract.
This article addresses the design of a self-timed multiplier with accumulation (multiply-accumulate unit) that complies with the IEEE 754 standard. It analyzes and compares two manually designed variants of the multiply-accumulate block. In the first variant, multiplication and the subsequent addition or subtraction of operands are performed in a ternary self-timed code. The second variant uses dual-rail coding of the operands and operation results for the same purposes. In both variants, the final stages of result normalization and rounding use dual-rail coding for all data. The article shows that, when implemented in complementary metal-oxide-semiconductor (CMOS) technology, the hardware cost of the dual-rail variant is 16% lower than that of the ternary variant because its one-bit adder is 2.8 times simpler to implement, even though its Wallace tree has 7 summation stages instead of the 4 in the ternary multiplier. As a result, the layout of the ternary variant, when manufactured in a 65-nm CMOS process, occupies 26% more chip area than the layout of the dual-rail variant. Consequently, the performance of the ternary variant, taking into account the parasitic parameters extracted from the layout, is 11% worse than that of the dual-rail variant.
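The stage counts quoted above (7 versus 4) can be reproduced with a short sketch under assumptions that the abstract itself does not state. The sketch assumes a 53-bit significand with radix-4 Booth recoding, which gives 27 partial products. It also assumes that the dual-rail tree is built from 3:2 carry-save adders and that the ternary tree first pairs the partial products into 14 redundant-digit numbers and then adds them two at a time. The function names and both operand counts are hypothetical illustrations, not the authors' design data.

```python
def csa_stages(rows: int) -> int:
    """Count stages of 3:2 carry-save adders needed to reduce `rows` operands to 2."""
    stages = 0
    while rows > 2:
        # Each full group of 3 rows becomes 2 rows; leftover rows pass through.
        rows = 2 * (rows // 3) + rows % 3
        stages += 1
    return stages


def pairwise_stages(rows: int) -> int:
    """Count stages of 2:1 adders needed to reduce `rows` operands to 1."""
    stages = 0
    while rows > 1:
        # Each pair of rows becomes 1 row; an odd row passes through.
        rows = (rows + 1) // 2
        stages += 1
    return stages


# Assumed operand counts (not given in the abstract):
DUAL_RAIL_ROWS = 27  # radix-4 Booth partial products of a 53-bit significand
TERNARY_ROWS = 14    # the same partial products paired into redundant-digit numbers

dual_rail_depth = csa_stages(DUAL_RAIL_ROWS)     # 7
ternary_depth = pairwise_stages(TERNARY_ROWS)    # 4
```

Under these assumptions the ternary tree is shallower, but each of its adders is more complex, which is consistent with the trade-off between tree depth and one-bit adder cost described in the abstract.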
Keywords:
self-timed circuit, multiplier with accumulation, dual-rail code, ternary code, ECAD, cell library.
DOI 10.14357/20718632260111
EDN TMIEDQ
PP. 122-132.
|