Abstract
Audio fingerprinting is widely used in the analysis and processing of digital signals, particularly in speech recognition, one of the most active areas of intelligent multimedia and artificial intelligence. Traditional audio fingerprint extraction algorithms are based on wavelet packet decomposition and reconstruction, which demands substantial computation and memory. This paper proposes an algorithm that uses the lifting wavelet packet together with an improved optimal-basis selection to find the optimal wavelet packet coefficients, and then adopts the average of the logarithmic energy entropy as the characteristic parameter. Because the lifting wavelet packet is better suited to online speech processing and to the design of intelligent multimedia systems, the algorithm requires less computation and memory than the traditional approach. Experimental results indicate that the algorithm is robust to audio that has undergone several kinds of processing, reflects the overall characteristics of the audio well, and distinguishes well between different audio signals.
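The pipeline named in the abstract (lifting wavelet packet, optimal-basis selection, averaged logarithmic energy entropy) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' method: the Haar wavelet as the lifting filter, the classic bottom-up entropy comparison of Coifman and Wickerhauser in place of the paper's "improved" selection, the convention that zero coefficients contribute nothing to the entropy, and the hypothetical function names.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_lifting_step(x):
    """One in-place-style lifting step (assumed Haar filter): split, predict, update, scale."""
    even, odd = x[0::2], x[1::2]
    detail = [o - e for e, o in zip(even, odd)]           # predict odd samples from even ones
    approx = [e + d / 2.0 for e, d in zip(even, detail)]  # update even samples to keep the mean
    # Scaling makes the step orthonormal, so costs are comparable across tree levels.
    return [a * SQRT2 for a in approx], [d / SQRT2 for d in detail]


def log_energy_entropy(coeffs):
    """Logarithmic energy entropy: sum of log(c^2), with zero coefficients skipped (assumed convention)."""
    return sum(math.log(c * c) for c in coeffs if c * c > 0.0)


def best_basis(node, max_depth):
    """Return (cost, leaves) of the lowest-cost wavelet packet subtree rooted at this node."""
    cost = log_energy_entropy(node)
    if max_depth == 0 or len(node) < 2 or len(node) % 2:
        return cost, [node]
    approx, detail = haar_lifting_step(node)
    cost_a, leaves_a = best_basis(approx, max_depth - 1)
    cost_d, leaves_d = best_basis(detail, max_depth - 1)
    # Keep the split only when the children are cheaper than the parent.
    if cost_a + cost_d < cost:
        return cost_a + cost_d, leaves_a + leaves_d
    return cost, [node]


def frame_feature(frame, max_depth=3):
    """Hypothetical per-frame feature: mean log energy entropy over the selected basis nodes."""
    _, leaves = best_basis(frame, max_depth)
    return sum(log_energy_entropy(leaf) for leaf in leaves) / len(leaves)
```

Under these assumptions, a fingerprint could be formed by calling `frame_feature` on successive fixed-length frames of the signal (for example 64 samples each) and collecting the resulting values into a sequence. The lifting step works on the split even and odd samples directly rather than convolving with filter banks, which is the source of the computation and memory savings the abstract refers to.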
Acknowledgments
The authors would like to thank all anonymous reviewers for their insightful comments and constructive suggestions, which improved the quality of this paper. This research was supported by the Shanghai Science and Technology Innovation Action Plan Project (16111107502, 17511107203), the National Natural Science Foundation of China (61502220), the Zhejiang Provincial Natural Science Foundation of China (No. LY14F020044), the program for tackling key problems in Henan science and technology (No. 172102310636) and the Nanjing Leading Science and Technology Entrepreneurial Talents Introduction Program Funded Project (2014A090002).
Funding
This study was supported by the Shanghai Science and Technology Innovation Action Plan Project (16111107502, 17511107203), the National Natural Science Foundation of China (61502220), the Zhejiang Provincial Natural Science Foundation of China (No. LY14F020044), the program for tackling key problems in Henan science and technology (No. 172102310636) and the Nanjing Leading Science and Technology Entrepreneurial Talents Introduction Program Funded Project (2014A090002).
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Jiang, Y., Wu, C., Deng, K. et al. An audio fingerprinting extraction algorithm based on lifting wavelet packet and improved optimal-basis selection. Multimed Tools Appl 78, 30011–30025 (2019). https://doi.org/10.1007/s11042-018-6802-y
DOI: https://doi.org/10.1007/s11042-018-6802-y