Content based audio classification: a neural network approach

Mitra, Vikramjit; Wang, Chia-Jiu

doi:10.1007/s00500-007-0241-4

Published: 10 October 2007

Volume 12, pages 639–646, (2008)

Soft Computing

Vikramjit Mitra¹ &
Chia-Jiu Wang²


Abstract

Content-based music genre classification is a key component of next-generation multimedia search agents. This paper introduces an audio classification technique based on audio content analysis. Artificial neural networks (ANNs), specifically multi-layered perceptrons (MLPs), are implemented to perform the classification task. Windowed audio files of finite length are analyzed to generate multiple feature sets, which are used as input vectors to a parallel neural architecture that performs the classification. This paper examines a combination of linear predictive coding (LPC), mel frequency cepstrum coefficients (MFCCs), Haar wavelet, Daubechies wavelet and Symlet coefficients as feature sets for the proposed audio classifier. Alongside the MLP, a Gaussian radial basis function (GRBF) based ANN is also implemented and analyzed. The ANN prediction values are processed by a rule-based inference engine (IE) that presents the final decision. The obtained prediction accuracy of 87.3% in determining the audio genres demonstrates the effectiveness of the proposed architecture.
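
The pipeline outlined in the abstract (windowing, wavelet feature extraction, and a rule-based decision over several network outputs) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give window lengths, decomposition depths, network sizes, or the actual inference rules, so every function name and parameter below (`frame_signal`, `haar_band_energies`, `fuse_predictions`, `min_margin`) is a hypothetical stand-in, and the fusion rule (average the class scores, abstain when the top two are close) is an assumption chosen only to show the idea. The neural networks themselves are omitted; their outputs are represented as plain score dictionaries.

```python
import math


def frame_signal(signal, frame_len, hop):
    """Split a sequence into overlapping frames, dropping an incomplete tail."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def haar_step(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def haar_band_energies(frame, levels):
    """Energy of the detail band at each level, then the final approximation.

    Assumes len(frame) is divisible by 2**levels. Band energies are one
    possible wavelet feature; the paper's exact features are not stated
    in the abstract.
    """
    energies = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies


def fuse_predictions(per_network_scores, min_margin=0.1):
    """Hypothetical inference rule over several networks' class scores.

    Averages the scores per class and returns the top class, or None
    (undecided) when the two best averages differ by less than min_margin.
    """
    n = len(per_network_scores)
    classes = list(per_network_scores[0].keys())
    avg = {c: sum(s[c] for s in per_network_scores) / n for c in classes}
    ranked = sorted(avg, key=avg.get, reverse=True)
    if len(ranked) > 1 and avg[ranked[0]] - avg[ranked[1]] < min_margin:
        return None
    return ranked[0]
```

In this sketch, each frame from `frame_signal` would be turned into one feature vector per feature set, each feature set would feed its own network, and `fuse_predictions` would combine the per-network scores into a single genre label.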



Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, 20742, USA
Vikramjit Mitra
Department of Electrical and Computer Engineering, University of Colorado at Colorado Springs, Colorado Springs, CO, 80933, USA
Chia-Jiu Wang

Corresponding author

Correspondence to Vikramjit Mitra.


About this article

Cite this article

Mitra, V., Wang, CJ. Content based audio classification: a neural network approach. Soft Comput 12, 639–646 (2008). https://doi.org/10.1007/s00500-007-0241-4


Published: 10 October 2007
Issue Date: May 2008
DOI: https://doi.org/10.1007/s00500-007-0241-4
