
Real-time approximate and combined 2D convolvers for FPGA-based image processing

The Journal of Supercomputing

Abstract

Convolution has been widely used as a central operation in digital image processing applications. Its performance is challenged by the large number of memory accesses and the heavy computation it requires. Many previously proposed convolvers are based on exact computation. Although exact convolvers keep the accuracy of the convolution operation at its highest level, performance can sometimes be improved by giving up a negligible amount of accuracy. Approximate computing is a technique for reducing this kind of computational overhead. In this paper, approximate 2D convolvers are presented that reduce the memory access rate and the number of computations by a specific factor of the multiply-and-accumulate (MAC) terms. To preserve the flexibility to support different accuracy requirements, the proposed approximate convolvers are also combined with the exact designs through real-time pre-processing stages, using methods that limit the hardware overhead. Compared with conventional convolvers, the proposed designs reduce the number of active resources, which leads to a significant reduction in power consumption. For a 3 × 3 kernel, evaluation on the Xilinx Virtex-7 (XC7V2000t) FPGA device shows power reductions of 34% and 20% for the proposed approximate and combined convolvers, respectively, relative to the exact convolver (EC). This improvement grows as the kernel size increases. Finally, a comparison based on RMSE and PSNR for different sample images and filters shows that the error rate and the loss in image quality are acceptable for many real-time image processing applications.
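
To make the quantities in the abstract concrete, the following Python sketch shows an exact 2D convolution written as explicit MAC terms, together with the RMSE and PSNR metrics used to compare an approximate output against the exact one. It is only an illustration under stated assumptions: the function names, the `shift` normalization, and the bit-truncation step in `truncate_pixels` are hypothetical stand-ins chosen for this example, and the truncation is not the approximation scheme proposed in the paper.

```python
import math


def conv2d_valid(image, kernel, shift=0):
    """Exact 2D convolution over the 'valid' region (no padding).

    image and kernel are lists of lists of non-negative integers.
    The accumulated sum is right-shifted by `shift` bits, a simple
    fixed-point normalization (assumption made for this sketch).
    """
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - k + 1):
        out_row = []
        for c in range(cols - k + 1):
            acc = 0
            for i in range(k):
                for j in range(k):
                    # One multiply-and-accumulate (MAC) term; the kernel
                    # is flipped, as true convolution requires.
                    acc += image[r + i][c + j] * kernel[k - 1 - i][k - 1 - j]
            out_row.append(acc >> shift)
        out.append(out_row)
    return out


def truncate_pixels(image, dropped_bits):
    """Hypothetical stand-in approximation: clear the lowest bits of each pixel.

    This is NOT the paper's method; it only produces an inexact input
    so that the error metrics below have something to measure.
    """
    mask = ~((1 << dropped_bits) - 1)
    return [[p & mask for p in row] for row in image]


def rmse(a, b):
    """Root-mean-square error between two equally sized 2D arrays."""
    sq = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return math.sqrt(sum(sq) / len(sq))


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the arrays match."""
    e = rmse(a, b)
    return math.inf if e == 0 else 20.0 * math.log10(peak / e)


# Example: 3 x 3 Gaussian-like kernel whose weights sum to 16 (shift by 4).
gaussian = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
exact = conv2d_valid(image, gaussian, shift=4)
approx = conv2d_valid(truncate_pixels(image, 2), gaussian, shift=4)
print("RMSE:", rmse(exact, approx), "PSNR (dB):", psnr(exact, approx))
```

Each output pixel of a k × k kernel costs k² MAC terms in the exact form, which is why a reduction in MAC terms translates directly into fewer memory accesses and fewer active resources. RMSE and PSNR then quantify how much image quality is lost in exchange.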

Data and materials availability

All materials, including the VHDL code, the experimental roadmap, and the collected data, are available and can be provided at the reviewers' request.

Acknowledgements

This publication was supported by grant No. RD-51-9911-0039 from the R&D Center of Mobile Telecommunication Company of Iran (MCI) for advancing information and communications technologies.

Funding

This research and its publication were supported by grant No. RD-51-9911-0039 from the R&D Center of Mobile Telecommunication Company of Iran (MCI) for advancing information and communications technologies.

Author information

Contributions

All authors contributed to writing and reviewing the manuscript text; the level of contribution follows the order of the author list in the paper.

Corresponding author

Correspondence to Mehran Rezaei.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ramezanzad, A., Rezaei, M., Nikmehr, H. et al. Real-time approximate and combined 2D convolvers for FPGA-based image processing. J Supercomput 79, 18910–18946 (2023). https://doi.org/10.1007/s11227-023-05377-y

  • DOI: https://doi.org/10.1007/s11227-023-05377-y
