An audio fingerprinting extraction algorithm based on lifting wavelet packet and improved optimal-basis selection

Jiang, Yuantao; Wu, Chunxue; Deng, Kaifa; Wu, Yan

doi:10.1007/s11042-018-6802-y

Published: 20 November 2018

Volume 78, pages 30011–30025, (2019)

Multimedia Tools and Applications

Abstract

Audio fingerprinting is widely used in the analysis and processing of digital signals, particularly in speech recognition, one of the most active areas of intelligent multimedia and artificial intelligence. Traditional audio fingerprint extraction algorithms are based on wavelet packet decomposition and reconstruction, which demands substantial computation and memory. This paper proposes an algorithm that uses the lifting wavelet packet together with an improved optimal-basis selection to find the optimal wavelet packet coefficients, and then adopts the average logarithmic energy entropy as the characteristic parameter. Because the lifting wavelet packet needs less computation and memory than the traditional algorithm, the method is better suited to online speech processing and to the design of intelligent multimedia systems. Experimental results indicate that the algorithm is robust to audio that has undergone several kinds of processing, reflects the overall characteristics of the audio well, and distinguishes well between different audio signals.
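The pipeline named in the abstract (lifting wavelet packet, optimal-basis selection, logarithmic energy entropy) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the wavelet, the improved selection rule, or how the entropy is averaged. The sketch assumes a Haar wavelet in lifting form, the classical Coifman-Wickerhauser bottom-up cost comparison in place of the paper's improved selection, and an average taken over the selected sub-bands. All function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_lift(x):
    """One in-place-style Haar lifting step (assumed wavelet): split, predict, update, scale."""
    even, odd = x[0::2], x[1::2]
    detail = [o - e for e, o in zip(even, odd)]          # predict odd samples from even ones
    approx = [e + d / 2 for e, d in zip(even, detail)]   # update: keep the pairwise mean
    # Scaling makes the transform orthonormal, so sub-band energies are comparable.
    return [a * SQRT2 for a in approx], [d / SQRT2 for d in detail]

def haar_unlift(approx, detail):
    """Invert haar_lift by undoing the scale, update, and predict steps in reverse order."""
    a = [v / SQRT2 for v in approx]
    d = [v * SQRT2 for v in detail]
    even = [ai - di / 2 for ai, di in zip(a, d)]
    odd = [e + di for e, di in zip(even, d)]
    x = [0.0] * (2 * len(even))
    x[0::2], x[1::2] = even, odd
    return x

def log_energy_entropy(coeffs):
    """Log-energy cost: sum of log(c^2), using the common convention log(0) = 0."""
    return sum(2.0 * math.log(abs(c)) for c in coeffs if c != 0.0)

def best_basis(x, max_level):
    """Classical best-basis search (assumed stand-in for the paper's improved rule).

    Returns (cost, leaves), where leaves are the coefficient lists of the chosen sub-bands.
    """
    own = log_energy_entropy(x)
    if max_level == 0 or len(x) < 2 or len(x) % 2:
        return own, [x]
    approx, detail = haar_lift(x)
    cost_a, leaves_a = best_basis(approx, max_level - 1)
    cost_d, leaves_d = best_basis(detail, max_level - 1)
    if own <= cost_a + cost_d:      # splitting does not lower the cost: keep this node
        return own, [x]
    return cost_a + cost_d, leaves_a + leaves_d

def average_log_energy(frame, max_level=3):
    """Assumed feature: mean log-energy entropy over the selected sub-bands."""
    _, leaves = best_basis(frame, max_level)
    return sum(log_energy_entropy(leaf) for leaf in leaves) / len(leaves)

if __name__ == "__main__":
    frame = [1.0, 3.0, 2.0, 6.0, -1.0, 0.5, 4.0, 2.0]
    print(average_log_energy(frame))
```

The lifting form computes each step from neighbouring samples without separate filter convolutions, which is the kind of saving in computation and memory the abstract attributes to the lifting wavelet packet.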

Acknowledgments

The authors thank the anonymous reviewers for their insightful comments and constructive suggestions, which improved the quality of this paper. This research was supported by the Shanghai Science and Technology Innovation Action Plan Project (16111107502, 17511107203), the National Natural Science Foundation of China (61502220), the Zhejiang Provincial Natural Science Foundation of China (No. LY14F020044), the program for tackling key problems in Henan science and technology (No. 172102310636) and the Nanjing Leading Science and Technology Entrepreneurial Talents Introduction Program Funded Project (2014A090002).

Funding

This study was supported by the Shanghai Science and Technology Innovation Action Plan Project (16111107502, 17511107203), the National Natural Science Foundation of China (61502220), the Zhejiang Provincial Natural Science Foundation of China (No. LY14F020044), the program for tackling key problems in Henan science and technology (No. 172102310636) and the Nanjing Leading Science and Technology Entrepreneurial Talents Introduction Program Funded Project (2014A090002).

Author information

Authors and Affiliations

School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, China
Yuantao Jiang & Chunxue Wu
School of Art and Design, Shanghai University of Engineering Science, Shanghai, China
Kaifa Deng
School of Public and Environmental Affairs, Indiana University Bloomington, Bloomington, IN, USA
Yan Wu

Corresponding author

Correspondence to Chunxue Wu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jiang, Y., Wu, C., Deng, K. et al. An audio fingerprinting extraction algorithm based on lifting wavelet packet and improved optimal-basis selection. Multimed Tools Appl 78, 30011–30025 (2019). https://doi.org/10.1007/s11042-018-6802-y

Received: 11 May 2018
Revised: 29 July 2018
Accepted: 23 October 2018
Published: 20 November 2018
Issue Date: November 2019
DOI: https://doi.org/10.1007/s11042-018-6802-y
