An audio fingerprinting extraction algorithm based on lifting wavelet packet and improved optimal-basis selection

Multimedia Tools and Applications

Abstract

Audio fingerprinting is widely used in digital signal analysis and processing, particularly in speech recognition, one of the most active areas of intelligent multimedia and artificial intelligence. Traditional audio fingerprint extraction algorithms are based on wavelet packet decomposition and reconstruction, which demands considerable computation and memory. This paper proposes an algorithm that uses the lifting wavelet packet together with an improved optimal-basis selection to find the optimal wavelet packet coefficients. The average of the logarithmic energy entropy is then adopted as the characteristic parameter. Because the lifting wavelet packet is better suited to online speech processing and to the design of intelligent multimedia systems, the algorithm requires less computation and memory than the traditional approach. Experimental results indicate that the algorithm is robust to audio that has undergone several kinds of processing, reflects the overall characteristics of the audio well, and distinguishes well between different audio.
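
The pipeline summarized in the abstract (lifting wavelet packet decomposition, basis selection, averaged logarithmic energy entropy) can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the Haar wavelet as the lifting filter, the classic bottom-up entropy comparison of Coifman and Wickerhauser standing in for the paper's improved optimal-basis selection (which the abstract does not describe), the convention that zero coefficients contribute nothing to the entropy, and averaging over the selected nodes. The function names `haar_lift`, `log_energy_entropy`, `best_basis`, and `frame_feature` are hypothetical.

```python
import math


def haar_lift(x):
    """One normalized Haar lifting step (split, predict, update, scale).

    Assumes len(x) is even. Returns (approximation, detail) coefficients.
    """
    even, odd = x[0::2], x[1::2]
    detail = [o - e for e, o in zip(even, odd)]          # predict odd from even
    approx = [e + d / 2 for e, d in zip(even, detail)]   # update: pairwise mean
    root2 = math.sqrt(2.0)
    # Scaling makes the step orthonormal, so energy is preserved.
    return [a * root2 for a in approx], [d / root2 for d in detail]


def log_energy_entropy(coeffs):
    """Sum of log(c^2) over coefficients, with log(0) treated as 0 (assumed)."""
    return sum(2.0 * math.log(abs(c)) for c in coeffs if c != 0)


def best_basis(x, max_level):
    """Bottom-up basis search: split a node only if its children cost less.

    Unlike a plain wavelet transform, the packet tree also splits the
    detail branch. Returns (list of leaf coefficient blocks, total cost).
    """
    cost = log_energy_entropy(x)
    if max_level == 0 or len(x) < 2 or len(x) % 2:
        return [x], cost
    approx, detail = haar_lift(x)
    left, left_cost = best_basis(approx, max_level - 1)
    right, right_cost = best_basis(detail, max_level - 1)
    if left_cost + right_cost < cost:
        return left + right, left_cost + right_cost
    return [x], cost


def frame_feature(frame, max_level=3):
    """Average log energy entropy over the selected nodes (assumed averaging)."""
    leaves, _ = best_basis(frame, max_level)
    return sum(log_energy_entropy(c) for c in leaves) / len(leaves)
```

Under these assumptions, calling `frame_feature(samples)` on each fixed-length frame of an audio signal would yield one value per frame, and the sequence of values would play the role of the fingerprint. The lifting step works on the split even and odd samples with simple arithmetic rather than filter convolutions, which is the usual source of the memory and computation savings attributed to lifting schemes.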

Acknowledgments

The authors thank the anonymous reviewers for their insightful comments and constructive suggestions, which improved the quality of this paper. This research was supported by the Shanghai Science and Technology Innovation Action Plan Project (16111107502, 17511107203), the National Natural Science Foundation of China (61502220), the Zhejiang Provincial Natural Science Foundation of China (No. LY14F020044), the program for tackling key problems in Henan science and technology (No. 172102310636), and the Nanjing Leading Science and Technology Entrepreneurial Talents Introduction Program Funded Project (2014A090002).

Funding

This study was supported by the Shanghai Science and Technology Innovation Action Plan Project (16111107502, 17511107203), the National Natural Science Foundation of China (61502220), the Zhejiang Provincial Natural Science Foundation of China (No. LY14F020044), the program for tackling key problems in Henan science and technology (No. 172102310636), and the Nanjing Leading Science and Technology Entrepreneurial Talents Introduction Program Funded Project (2014A090002).

Author information

Corresponding author

Correspondence to Chunxue Wu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jiang, Y., Wu, C., Deng, K. et al. An audio fingerprinting extraction algorithm based on lifting wavelet packet and improved optimal-basis selection. Multimed Tools Appl 78, 30011–30025 (2019). https://doi.org/10.1007/s11042-018-6802-y

  • DOI: https://doi.org/10.1007/s11042-018-6802-y
