A New Overlap Save Algorithm for Fast Block Convolution and Its Implementation Using FFT
Convolution of data with a long-tap filter is often implemented by the overlap-save algorithm (OSA) using the fast Fourier transform (FFT). The traditional OSA, however, performs redundant computations, because the FFT is applied to the overlapped data (the concatenation of the previous block and the current block) even though the DFT computations are recursive. In this paper, we first analyze the redundancy by decomposing the OSA into two processes, one related to the previous block and one to the current block. We then eliminate the redundant computations by introducing a new transform that is applied only to the current data, not to the whole overlapped data, so the transform size is half that of the traditional OSA. The new transform has the form of a DFT and can be implemented by defining a new butterfly structure. In this paper, however, we implement it as a cascade of twiddle-factor multiplication and a conventional FFT, so that existing FFT libraries for PCs and DSPs can be used. The computational complexity of this implementation is analyzed and compared with that of existing methods. In the experiments, the proposed method is applied to several block convolutions and partitioned-block convolutions. The CPU time is reduced by more than the arithmetic analysis predicts, which implies that the smaller transform size gives an additional advantage in data manipulation.
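As a rough illustration of the setting described in the abstract, the following Python sketch shows a textbook overlap-save loop in which a 2N-point FFT is taken over the previous and current blocks, followed by a standard DFT identity that splits that spectrum into a term depending only on the previous block and a term depending only on the current block. The function names (`overlap_save`, `half_spectrum`), the block length `N`, and the use of NumPy are assumptions made for this example. The identity is generic and is not claimed to be the paper's transform, and the sketch makes no claim about its operation count.

```python
import numpy as np

def overlap_save(x, h, N):
    """Textbook overlap-save with N new samples per step and a 2N-point FFT.

    Assumes real x and h, with len(h) <= N + 1 so that the last N outputs
    of each circular convolution are free of wrap-around.
    """
    assert len(h) <= N + 1
    n = len(x)
    H = np.fft.fft(h, 2 * N)                 # filter spectrum, zero-padded to 2N
    x = np.concatenate([x, np.zeros((-n) % N)])
    prev = np.zeros(N)                       # block preceding the first one
    out = []
    for start in range(0, len(x), N):
        cur = x[start:start + N]
        X = np.fft.fft(np.concatenate([prev, cur]))  # FFT of overlapped data
        y = np.fft.ifft(X * H)
        out.append(y[N:].real)               # keep the N alias-free samples
        prev = cur
    return np.concatenate(out)[:n]

def half_spectrum(block):
    """2N-point DFT of an N-point block (zero-padded), via two N-point FFTs.

    Even bins are the plain N-point FFT; odd bins are the N-point FFT of the
    block multiplied by the twiddle factors exp(-j*pi*n/N).
    """
    N = len(block)
    tw = np.exp(-1j * np.pi * np.arange(N) / N)
    C = np.empty(2 * N, dtype=complex)
    C[0::2] = np.fft.fft(block)
    C[1::2] = np.fft.fft(block * tw)
    return C

def overlapped_spectrum(prev_half, cur_half):
    """Spectrum of [prev, cur] from the two per-block terms.

    The current block is delayed by N samples, which multiplies bin k
    by (-1)**k.
    """
    k = np.arange(len(cur_half))
    sign = np.where(k % 2 == 0, 1.0, -1.0)
    return prev_half + sign * cur_half
```

In a streaming loop, the `half_spectrum` of the current block could be stored and reused as the previous-block term at the next step, which is one way to see the recursion the abstract refers to.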
Keywords: Overlap save algorithm, QDFT, Block convolution
This research was performed for the Intelligent Robotics Development Program, one of the 21st Century Frontier R&D Programs funded by the Ministry of Knowledge Economy (MKE).