Abstract
Advances in computer storage and network technology have created a pressing need to index digital music recordings automatically. In this chapter, state-of-the-art acoustic features for automatic timbre indexing are explored in order to construct efficient classification models, such as decision trees and k-nearest neighbors (KNN). The authors built a database containing more than one million musical instrument sound slices, each described by a large number of features, including standard MPEG-7 audio descriptors, features used in speech recognition, and many new audio features developed by the authors, spanning the temporal and spectral domains. Each classification model was tuned with feature selection based on its distinct characteristics for the blind sound separation system. Based on the experimental results, the authors proposed a new framework for music information retrieval (MIR) with multiple classifiers trained on different features. Inspired by human recognition experience, timbre estimation based on the hierarchical structure of musical instrument families was investigated, and a framework for automatic timbre indexing based on a cascade classification system was proposed. The authors also discussed the selection of features and classifiers during the cascade classification process.
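The idea of a cascade with a different classifier and feature subset at each level can be illustrated with a minimal sketch. It is not the authors' system: the toy data, the feature indices, the choice of KNN for the family level and a nearest-centroid rule for the instrument level, and every function name below are assumptions made only for illustration.

```python
from collections import Counter
from math import dist

# Toy training data, invented for illustration only.
# Each sample is (features, family, instrument). The feature meanings are
# hypothetical: 0 = spectral centroid, 1 = log attack time, 2 = flatness.
TRAIN = [
    ((0.20, 0.10, 0.30), "string", "violin"),
    ((0.25, 0.12, 0.70), "string", "cello"),
    ((0.22, 0.11, 0.35), "string", "violin"),
    ((0.24, 0.13, 0.75), "string", "cello"),
    ((0.80, 0.60, 0.20), "wind", "flute"),
    ((0.85, 0.65, 0.90), "wind", "oboe"),
    ((0.82, 0.62, 0.25), "wind", "flute"),
    ((0.84, 0.66, 0.85), "wind", "oboe"),
]

# Assumed feature subsets: one per level of the cascade.
FAMILY_FEATURES = (0, 1)
INSTRUMENT_FEATURES = (2,)


def project(x, idx):
    """Keep only the feature positions listed in idx."""
    return tuple(x[i] for i in idx)


def knn_predict(samples, query, k=3):
    """Majority vote among the k nearest (vector, label) samples."""
    nearest = sorted(samples, key=lambda s: dist(s[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def centroid_predict(samples, query):
    """Return the label whose mean vector is closest to the query."""
    groups = {}
    for vec, label in samples:
        groups.setdefault(label, []).append(vec)
    centroids = {
        label: tuple(sum(col) / len(vecs) for col in zip(*vecs))
        for label, vecs in groups.items()
    }
    return min(centroids, key=lambda label: dist(centroids[label], query))


def cascade_predict(x):
    """Level 1: family via KNN. Level 2: instrument within that family."""
    family_samples = [(project(f, FAMILY_FEATURES), fam) for f, fam, _ in TRAIN]
    family = knn_predict(family_samples, project(x, FAMILY_FEATURES))

    # Only samples from the predicted family reach the second level.
    inst_samples = [
        (project(f, INSTRUMENT_FEATURES), inst)
        for f, fam, inst in TRAIN
        if fam == family
    ]
    instrument = centroid_predict(inst_samples, project(x, INSTRUMENT_FEATURES))
    return family, instrument


if __name__ == "__main__":
    print(cascade_predict((0.21, 0.10, 0.72)))
```

Restricting the second level to samples from the predicted family keeps each classifier's task small, which is one plausible reading of the hierarchical approach described in the abstract.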
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Jiang, W., Zhang, X., Cohen, A., Raś, Z.W. (2010). Multiple Classifiers for Different Features in Timbre Estimation. In: Ras, Z.W., Tsay, LS. (eds) Advances in Intelligent Information Systems. Studies in Computational Intelligence, vol 265. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05183-8_14
DOI: https://doi.org/10.1007/978-3-642-05183-8_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05182-1
Online ISBN: 978-3-642-05183-8
eBook Packages: Engineering, Engineering (R0)