
Versatile Quaternion Multipliers Based on Distributed Arithmetic


This paper introduces the idea of versatile circuits for multiplying four-dimensional hypercomplex numbers in hardware. Depending on the settings of such a device, a variable quaternion can be left- or right-multiplied by a constant coefficient or by its conjugate, as all of these operations are useful in transform-type algorithms. Multiplierless circuits based on distributed arithmetic (DA), which compute quaternion products using only additions and bit shifts, are reviewed. It is shown that they can be made versatile by extending the memory of partial results, but that a better solution is our method for preprocessing the bits used to address this memory. The method allows the same partial results to be used to compute the different inner products that are related to a quaternion multiplication. Consequently, versatile multipliers can be implemented with memory optimized either to save area or to speed up reprogramming compared to the basic DA-based circuit. This has been demonstrated by hardware design experiments, which show that 13–69% of the area can be saved in the case of an ASIC implementation, whereas for an FPGA implementation, spending only 11% more logic resources allows a multiplier to be reprogrammed 75% faster. Additionally, it is explained how versatile multipliers can be used to realize low-area analysis/synthesis filter banks.
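To illustrate the basic DA principle mentioned above, the following is a minimal software sketch, not the circuit described in the paper. It assumes integer coefficients and 8-bit two's complement inputs, and it models only left multiplication by a constant quaternion. The function names (`left_matrix`, `build_lut`, `da_inner_product`, `da_quaternion_multiply`) are hypothetical and chosen for this example only. Each output component is an inner product of one matrix row with the input vector; a 16-entry table of partial results is addressed by one bit taken from each of the four inputs, and the table outputs are combined by shifts and additions.

```python
def left_matrix(q):
    """4x4 real matrix of left multiplication by q = a + b*i + c*j + d*k."""
    a, b, c, d = q
    return [[a, -b, -c, -d],
            [b,  a, -d,  c],
            [c,  d,  a, -b],
            [d, -c,  b,  a]]


def build_lut(row):
    """Partial results for one row: entry `addr` is the sum of the
    coefficients whose index k has bit k set in `addr`."""
    return [sum(c for k, c in enumerate(row) if (addr >> k) & 1)
            for addr in range(16)]


def da_inner_product(lut, x, bits=8):
    """Bit-serial inner product using only table look-ups, shifts, and adds.

    Assumes every x[k] fits in `bits`-bit two's complement.
    """
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if any(not lo <= xk <= hi for xk in x):
        raise ValueError("input does not fit in the assumed word length")
    acc = 0
    for j in range(bits):
        # Gather bit j of all four inputs into a 4-bit table address.
        addr = 0
        for k, xk in enumerate(x):
            addr |= ((xk >> j) & 1) << k
        term = lut[addr] << j
        # The sign bit carries negative weight in two's complement.
        acc += -term if j == bits - 1 else term
    return acc


def da_quaternion_multiply(q, x, bits=8):
    """Compute the components of q*x as four DA inner products."""
    return [da_inner_product(build_lut(row), x, bits)
            for row in left_matrix(q)]


print(da_quaternion_multiply((1, 2, 3, 4), (5, -6, 7, -8)))
# → [28, -48, 14, 44]
```

In this sketch, supporting right multiplication or the conjugate coefficient would require building a separate set of tables for each operation, which corresponds to the memory-extension approach mentioned above. The address-preprocessing method proposed in the paper is not modeled here.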



  1. \(\tilde{\boldsymbol{\Lambda}}(z)\) and \(\boldsymbol{\Lambda}(z)\) are inverses of each other up to a delay: \(\tilde{\boldsymbol{\Lambda}}(z) = \boldsymbol{\Lambda}^{-1}(z)\, z^{-1}\). This makes the synthesis filters causal at the price of reconstructing a delayed copy of the input signal.
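
The relation in the footnote can be checked in one line. The second part of the sketch below assumes, purely for illustration, a block-diagonal delay matrix \(\boldsymbol{\Lambda}(z) = \operatorname{diag}(\mathbf{I}, z^{-1}\mathbf{I})\); this specific form is a hypothetical example and is not taken from the text above.

```latex
% Product of the two matrices is a pure one-sample delay:
\[
\tilde{\boldsymbol{\Lambda}}(z)\,\boldsymbol{\Lambda}(z)
  = z^{-1}\,\boldsymbol{\Lambda}^{-1}(z)\,\boldsymbol{\Lambda}(z)
  = z^{-1}\,\mathbf{I}.
\]
% Illustration under the assumed form of \Lambda(z):
\[
\boldsymbol{\Lambda}(z) =
  \begin{bmatrix} \mathbf{I} & \mathbf{0} \\ \mathbf{0} & z^{-1}\mathbf{I} \end{bmatrix},
\qquad
\boldsymbol{\Lambda}^{-1}(z) =
  \begin{bmatrix} \mathbf{I} & \mathbf{0} \\ \mathbf{0} & z\,\mathbf{I} \end{bmatrix},
\qquad
\tilde{\boldsymbol{\Lambda}}(z) = z^{-1}\boldsymbol{\Lambda}^{-1}(z) =
  \begin{bmatrix} z^{-1}\mathbf{I} & \mathbf{0} \\ \mathbf{0} & \mathbf{I} \end{bmatrix}.
\]
```

Under this assumption, the exact inverse contains the advance \(z\) and is therefore noncausal, whereas the delayed version contains only \(z^{-1}\) and constants, which is why the reconstructed signal is delayed by one sample.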



Author information


Corresponding author

Correspondence to Sang Yoon Park.

Additional information

This work was supported by Bialystok University of Technology under Grant S/WI/3/2018 and by the 2016 Research Fund of Myongji University.


Cite this article

Parfieniuk, M., Park, S.Y. Versatile Quaternion Multipliers Based on Distributed Arithmetic. Circuits Syst Signal Process 37, 4880–4906 (2018).


  • Quaternion
  • Hypercomplex
  • Multiplier
  • Distributed arithmetic
  • FPGA
  • Circuit