An automatic approach of audio feature engineering for the extraction, analysis and selection of descriptors

  • Regular Paper
International Journal of Multimedia Information Retrieval

Abstract

Finding the right features in an audio signal is critical to analyzing the information it contains. This paper analyzes several types of audio features from different points of view, including time series and sound engineering. In particular, describing audio as a set of time series is uncommon in the literature, and it is one of the aspects studied here. The paper proposes an automated method for audio feature engineering that extracts, analyzes and selects the best features for a given context. Specifically, it develops a hybrid scheme for extracting audio descriptors based on different principles, and defines an automatic approach for analyzing and selecting these descriptors in a given audio context. Finally, the approach was tested on grouping tasks and compared with previous works on audio classification problems, with encouraging results.
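As a rough illustration of the extract, analyze and select loop described above, the following minimal sketch treats each clip as a framed time series, computes three simple time-domain descriptors, and ranks them by how much they vary across clips. It is not the method of the paper: the descriptor set, the frame sizes, the coefficient-of-variation score, and all function names are assumptions chosen only for illustration.

```python
import math
import statistics


def frame_signal(signal, frame_size, hop):
    """Split a 1-D signal into overlapping frames (the time-series view)."""
    return [signal[i:i + frame_size]
            for i in range(0, len(signal) - frame_size + 1, hop)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def rms_energy(frame):
    """Root-mean-square amplitude of the frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def lag1_autocorrelation(frame):
    """Lag-1 autocorrelation, a common time-series descriptor."""
    mean = sum(frame) / len(frame)
    den = sum((x - mean) ** 2 for x in frame)
    if den == 0:
        return 0.0
    num = sum((a - mean) * (b - mean) for a, b in zip(frame, frame[1:]))
    return num / den


# Hypothetical descriptor set; a real system would use many more.
DESCRIPTORS = {
    "zero_crossing_rate": zero_crossing_rate,
    "rms_energy": rms_energy,
    "lag1_autocorrelation": lag1_autocorrelation,
}


def extract(signal, frame_size=64, hop=32):
    """Return {descriptor name: mean value over all frames} for one clip."""
    frames = frame_signal(signal, frame_size, hop)
    return {name: statistics.fmean(fn(f) for f in frames)
            for name, fn in DESCRIPTORS.items()}


def select(rows, k):
    """Keep the k descriptors with the largest coefficient of variation
    across clips (an assumed, deliberately simple selection criterion)."""
    scores = {}
    for name in rows[0]:
        column = [row[name] for row in rows]
        mean = statistics.fmean(column)
        scores[name] = statistics.pstdev(column) / abs(mean) if mean else 0.0
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Synthetic "clips": unit-amplitude sine tones with k cycles per 64 samples.
clips = [[math.sin(2 * math.pi * k * n / 64) for n in range(256)]
         for k in (1, 2, 4, 8, 16)]
rows = [extract(clip) for clip in clips]
selected = select(rows, k=2)
print(selected)
```

In this toy context every clip has the same loudness, so the energy descriptor carries almost no information and is dropped, while the descriptors that track frequency content are kept. A different context (for example, clips that differ mainly in loudness) would lead to a different selection, which is the point of choosing descriptors per context.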

Acknowledgements

This work has been supported by the project 64366: “Contenidos de aprendizaje inteligentes a través del uso de herramientas de Big Data, Analítica Avanzada e IA”, Ministry of Science, Government of Antioquia, Republic of Colombia.

Author information

Corresponding author

Correspondence to Jose Aguilar.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jiménez, M., Aguilar, J., Monsalve-Pulido, J. et al. An automatic approach of audio feature engineering for the extraction, analysis and selection of descriptors. Int J Multimed Info Retr 10, 33–42 (2021). https://doi.org/10.1007/s13735-020-00202-1

  • DOI: https://doi.org/10.1007/s13735-020-00202-1
