The Journal of Supercomputing, Volume 75, Issue 3, pp 1336–1349

High-performance architecture for digital transform processing

  • H. Mora
  • M. T. Signes-Pont
  • A. Jimeno-Morenilla
  • J. L. Sánchez-Romero

Abstract

Digital transforms are intensive in multiplication and accumulation operations, which have a high computational cost. Advances in computer arithmetic and digital technologies simplify the processing of complex algorithms when they are implemented in modern circuits. New computation techniques can be explored to provide efficient operational methods for implementing algorithms while avoiding many of the complex and costly mathematical operations. This work aims to design a high-performance architecture for computing some common digital transforms. The proposed architecture has been compared with other methods, using the discrete cosine transform (DCT) as the example transform. The results show that the proposal offers performance comparable to or better than the best-known methods.
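
To illustrate why transforms such as the DCT are described as intensive in multiplication and accumulation, the following minimal sketch evaluates an N-point DCT-II directly from its textbook definition and counts the multiply-accumulate steps. It is not the architecture proposed in the article; the function name `dct_ii_direct`, the orthonormal scaling, and the operation counter are assumptions chosen only for illustration.

```python
import math


def dct_ii_direct(x):
    """Direct N-point DCT-II with orthonormal scaling (illustrative only).

    Returns the coefficient list and the number of multiply-accumulate
    steps performed in the inner loop.
    """
    n = len(x)
    coeffs = []
    macs = 0
    for k in range(n):
        acc = 0.0
        for i in range(n):
            # One multiplication and one accumulation per input sample.
            acc += x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            macs += 1
        # Orthonormal scaling (an assumption; other conventions exist).
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * acc)
    return coeffs, macs


if __name__ == "__main__":
    samples = [1.0] * 8  # hypothetical constant input block
    coeffs, macs = dct_ii_direct(samples)
    print("multiply-accumulate steps:", macs)
    print("DC coefficient:", coeffs[0])
```

The direct form needs N × N multiply-accumulate steps for an N-point block (64 for N = 8), which is the cost that fast algorithms and multiplierless hardware designs try to reduce.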

Keywords

Digital transform implementation; Computational techniques design; Computer arithmetic; DCT

Notes

Acknowledgements

This work was supported by the Spanish Research Agency (AEI) and the European Regional Development Fund (FEDER) under project “CloudDriver4Industry” TIN2017-89266-R.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Science Technology and Computation, University of Alicante, Alicante, Spain
