A Survey of Approximate Computing: From Arithmetic Units Design to High-Level Applications

Que, Hao-Hua; Jin, Yu; Wang, Tong; Liu, Ming-Kai; Yang, Xing-Hua; Qiao, Fei

doi:10.1007/s11390-023-2537-y

Survey
Published: 30 March 2023

Journal of Computer Science and Technology, Volume 38, pages 251–272 (2023)

Hao-Hua Que¹, Yu Jin¹, Tong Wang¹, Ming-Kai Liu¹, Xing-Hua Yang¹ & Fei Qiao²

Abstract

Realizing a high-performance, energy-efficient circuit system is one of the critical tasks for circuit designers. Conventionally, researchers have concentrated on the tradeoffs between energy and performance in circuit and system design based on accurate computing. However, as video/image processing and machine learning algorithms have become widespread, approximate computing in these applications has become a hot topic. The errors caused by approximate computing can be tolerated by these applications through specific processing or algorithms, and large improvements in performance or power savings can be achieved at the cost of some acceptable loss in final output quality. This paper presents a survey of approximate computing, from arithmetic unit design to high-level applications, in which we try to give researchers a comprehensive and insightful understanding of the field. We believe that approximate computing will play an important role in future circuit and system design, especially with the rapid development of artificial intelligence algorithms and their related applications.
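To make the accuracy-for-efficiency trade described in the abstract concrete, the following is a minimal Python sketch of one well-known style of approximate adder, in which the lower bits are combined with a bitwise OR instead of a carry chain. It is an illustration only and is not taken from the article: the function names `loa_add` and `mean_error_distance`, the 8-bit width, and the split point `k` are all assumptions chosen for the example.

```python
def loa_add(a: int, b: int, width: int = 8, k: int = 4) -> int:
    """Approximate unsigned add (hypothetical helper, illustration only).

    The upper (width - k) bits are added exactly. The lower k bits are
    approximated by a bitwise OR, which needs no carry chain. A single
    carry into the upper part is guessed from the most significant bits
    of the two lower parts.
    """
    if k == 0:
        return a + b  # no approximate part: behaves like an exact adder
    low_mask = (1 << k) - 1
    low = (a | b) & low_mask                        # OR replaces addition
    carry_in = (a >> (k - 1)) & (b >> (k - 1)) & 1  # AND of the lower-part MSBs
    high = (a >> k) + (b >> k) + carry_in           # exact upper addition
    return ((high << k) | low) & ((1 << (width + 1)) - 1)


def mean_error_distance(width: int = 8, k: int = 4) -> float:
    """Average |approximate - exact| over all pairs of width-bit inputs."""
    n = 1 << width
    total = sum(
        abs(loa_add(a, b, width, k) - (a + b))
        for a in range(n)
        for b in range(n)
    )
    return total / (n * n)


if __name__ == "__main__":
    # Larger k shortens the exact carry chain but raises the average error.
    for k in (0, 2, 4, 6):
        print(k, mean_error_distance(8, k))
```

In this sketch the error magnitude never exceeds 2^(k-1), so `k` acts as the knob between output quality and the length of the exact carry chain. In hardware, a shorter carry chain is what would be expected to save delay and energy.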

Author information

Authors and Affiliations

College of Science, Beijing Forestry University, Beijing, 100091, China
Hao-Hua Que, Yu Jin, Tong Wang, Ming-Kai Liu & Xing-Hua Yang
Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China
Fei Qiao

Corresponding author

Correspondence to Xing-Hua Yang.

Supplementary Information

ESM 1

(PDF 148 kb)

About this article

Cite this article

Que, HH., Jin, Y., Wang, T. et al. A Survey of Approximate Computing: From Arithmetic Units Design to High-Level Applications. J. Comput. Sci. Technol. 38, 251–272 (2023). https://doi.org/10.1007/s11390-023-2537-y

Received: 28 May 2022
Accepted: 16 March 2023
Published: 30 March 2023
Issue Date: April 2023
DOI: https://doi.org/10.1007/s11390-023-2537-y
