Abstract
Convolution is widely used as a core operation in digital image processing applications. Its performance is limited by the large number of memory accesses and arithmetic operations it requires. Most previously proposed convolvers compute exact results. Exact convolvers preserve full accuracy, but giving up a negligible amount of accuracy can sometimes improve performance. Approximate computing is an emerging technique for reducing computation overhead. This paper presents approximate 2D convolvers that reduce the memory access rate and the number of computations by a specific factor of the multiply-and-accumulate (MAC) terms. To retain the flexibility to support different accuracy requirements, the proposed approximate convolvers are combined with exact designs through real-time pre-processing stages, using methods that limit the hardware overhead. Compared with conventional convolvers, the proposed designs reduce the number of active resources, which significantly lowers power consumption. For a 3 × 3 kernel, evaluation on the Xilinx Virtex-7 (XC7V2000t) FPGA device shows power reductions of 34% and 20% for the proposed approximate and combined convolvers, respectively, compared with the exact convolver (EC). This improvement grows with the kernel size. Finally, a comparison based on root-mean-square error (RMSE) and peak signal-to-noise ratio (PSNR) for different sample images and filters shows that the error and the loss of image quality are acceptable for many real-time image processing applications.
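The evaluation idea in the abstract (compare an approximate convolver against an exact one using RMSE and PSNR) can be illustrated with a minimal software sketch. This is not the proposed hardware design: the abstract does not specify the approximation scheme, so the sketch substitutes a simple stand-in that zeroes the low-order bits of each pixel. All function names (`convolve2d_valid`, `truncate_pixels`, `rmse`, `psnr`) and the `dropped_bits` parameter are hypothetical and chosen only for illustration.

```python
import math


def convolve2d_valid(image, kernel, divisor=1):
    """Exact 2D convolution over the 'valid' region (no padding).

    Each output value is one sum of kh*kw multiply-and-accumulate (MAC)
    terms, divided by `divisor` to normalize the result.
    """
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    out = []
    for r in range(h - kh + 1):
        row = []
        for c in range(w - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    # Flip the kernel so this is convolution, not correlation.
                    acc += image[r + i][c + j] * kernel[kh - 1 - i][kw - 1 - j]
            row.append(acc / divisor)
        out.append(row)
    return out


def truncate_pixels(image, dropped_bits):
    """Stand-in approximation (an assumption, not the paper's method):
    zero the lowest `dropped_bits` bits of every non-negative pixel."""
    mask = ~((1 << dropped_bits) - 1)
    return [[p & mask for p in row] for row in image]


def rmse(a, b):
    """Root-mean-square error between two equally sized 2D lists."""
    sq = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return math.sqrt(sum(sq) / len(sq))


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB, assuming 8-bit data by default."""
    e = rmse(a, b)
    return math.inf if e == 0 else 20.0 * math.log10(peak / e)


# Small demonstration with a 3 x 3 averaging kernel.
image = [
    [10, 20, 30, 40],
    [50, 61, 73, 87],
    [90, 101, 115, 127],
    [130, 143, 159, 255],
]
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

exact = convolve2d_valid(image, box, divisor=9)
approx = convolve2d_valid(truncate_pixels(image, 2), box, divisor=9)
print("RMSE:", rmse(exact, approx))
print("PSNR (dB):", psnr(exact, approx))
```

With `dropped_bits=2`, each pixel changes by at most 3, so for an averaging kernel the RMSE of the filtered output cannot exceed 3. The same two metrics can be applied unchanged to any other approximation scheme.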
Data and materials availability
All materials, including the VHDL code, the experimental road map, and the collected data, are available and will be provided to the reviewers on request.
Acknowledgements
This publication was supported by grant No. RD-51-9911-0039 from the R&D Center of Mobile Telecommunication Company of Iran (MCI) for advancing information and communications technologies.
Funding
This research and its publication were supported by grant No. RD-51-9911-0039 from the R&D Center of Mobile Telecommunication Company of Iran (MCI) for advancing information and communications technologies.
Contributions
All authors contributed to writing and reviewing the manuscript; the level of contribution follows the order of the author list in the paper.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Ethical approval
Not applicable.
About this article
Cite this article
Ramezanzad, A., Rezaei, M., Nikmehr, H. et al. Real-time approximate and combined 2D convolvers for FPGA-based image processing. J Supercomput 79, 18910–18946 (2023). https://doi.org/10.1007/s11227-023-05377-y
DOI: https://doi.org/10.1007/s11227-023-05377-y