Abstract
This paper presents a novel unified and programmable 2-D Discrete Wavelet Transform (DWT) system architecture, implemented using a Field Programmable Gate Array (FPGA)-based Nios II soft-core processor working in combination with custom hardware accelerators generated through high-level synthesis. The proposed system architecture, synthesized on an Altera DE3 Stratix III FPGA board, was developed through an iterative design space exploration methodology using Altera’s C2H compiler. Experimental results show that the proposed architecture achieves real-time video processing performance for grayscale image resolutions of up to 1920 × 1080 (1080p) when run on the Altera DE3 board, and that it outperforms existing 2-D DWT architecture implementations reported in the literature by a considerable margin in terms of throughput. In addition to satisfying real-time performance constraints, the proposed architecture performs both forward and inverse DWT, supports a number of popular DWT filters used for image and video compression, and is programmable in terms of the number of decomposition levels as well as image width and height. Based on the design principles used to implement the proposed 2-D DWT system architecture, a system design guideline can be formulated for SoC designs that plan to incorporate dedicated 2-D DWT hardware acceleration.
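To make the forward/inverse pairing concrete, the following is a minimal software sketch of one 1-D lifting step, assuming the reversible 5/3 integer filter used in JPEG 2000. The abstract does not name the supported filters, so this choice is an assumption. The function names `dwt53_forward`, `dwt53_inverse`, and `floor_div` are hypothetical, and the code is plain C for illustration, not the accelerator code described in the paper.

```c
#include <assert.h>
#include <stddef.h>

/* Floor division by a positive divisor (C's '/' truncates toward zero). */
static int floor_div(int a, int b)
{
    int q = a / b;
    if ((a % b != 0) && (a < 0)) {
        q--;
    }
    return q;
}

/* Forward 1-D 5/3 lifting on x[0..n-1]; n must be even and >= 2.
 * Writes n/2 low-pass and n/2 high-pass coefficients.
 * Boundaries use symmetric extension (an assumption of this sketch). */
void dwt53_forward(const int *x, size_t n, int *low, int *high)
{
    assert(n >= 2 && n % 2 == 0);
    size_t half = n / 2;

    /* Predict step: odd samples minus the floor-average of even neighbours. */
    for (size_t i = 0; i < half; i++) {
        int left = x[2 * i];
        int right = (2 * i + 2 < n) ? x[2 * i + 2] : x[n - 2];
        high[i] = x[2 * i + 1] - floor_div(left + right, 2);
    }

    /* Update step: even samples plus a rounded quarter of neighbouring details. */
    for (size_t i = 0; i < half; i++) {
        int prev = (i > 0) ? high[i - 1] : high[0];
        low[i] = x[2 * i] + floor_div(prev + high[i] + 2, 4);
    }
}

/* Inverse 1-D 5/3 lifting: undoes the update step, then the predict step. */
void dwt53_inverse(const int *low, const int *high, size_t n, int *x)
{
    assert(n >= 2 && n % 2 == 0);
    size_t half = n / 2;

    /* Undo update: recover the even samples first. */
    for (size_t i = 0; i < half; i++) {
        int prev = (i > 0) ? high[i - 1] : high[0];
        x[2 * i] = low[i] - floor_div(prev + high[i] + 2, 4);
    }

    /* Undo predict: recover the odd samples from the restored even samples. */
    for (size_t i = 0; i < half; i++) {
        int left = x[2 * i];
        int right = (2 * i + 2 < n) ? x[2 * i + 2] : x[n - 2];
        x[2 * i + 1] = high[i] + floor_div(left + right, 2);
    }
}
```

Under these assumptions, a 2-D transform would apply the 1-D step along every row and then every column, and a multi-level decomposition would repeat the process on the low-low subband. Because the inverse runs the same integer steps in reverse order with opposite signs, the round trip is exact.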
Cite this article
Sameen, I., Chang, Y.C., Ng, M.S. et al. A Unified FPGA-Based System Architecture for 2-D Discrete Wavelet Transform. J Sign Process Syst 71, 123–142 (2013). https://doi.org/10.1007/s11265-012-0687-1
DOI: https://doi.org/10.1007/s11265-012-0687-1