Multiple Classifiers for Different Features in Timbre Estimation

Jiang, Wenxin; Zhang, Xin; Cohen, Amanda; Raś, Zbigniew W.

doi:10.1007/978-3-642-05183-8_14

Part of the book series: Studies in Computational Intelligence ((SCI,volume 265))

Abstract

Advances in computer storage and networking have created a pressing need for automatic indexing of digital music recordings. This chapter explores state-of-the-art acoustic features for automatic timbre indexing and uses them to construct efficient classification models such as decision trees and k-nearest neighbors (KNN). The authors built a database of more than one million musical instrument sound slices, each described by a large number of features: standard MPEG-7 audio descriptors, features used in speech recognition, and many new audio features developed by the authors, spanning the temporal and spectral domains. Each classification model was tuned by feature selection according to its distinct characteristics for use in a blind sound separation system. Based on the experimental results, the authors propose a new framework for music information retrieval (MIR) in which multiple classifiers are trained on different features. Inspired by how humans recognize sounds, they investigate timbre estimation based on the hierarchical structure of musical instrument families and propose a framework for automatic timbre indexing based on a Cascade Classification System. They also discuss the selection of features and classifiers during the cascade classification process.
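
The cascade idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional feature vectors, the instrument labels, the per-level feature index sets, and the names `knn_predict` and `CascadeClassifier` are all hypothetical, and a plain KNN stands in for whatever classifier would be selected at each level. The sketch first predicts the instrument family using one feature subset, then predicts the instrument within that family using a family-specific subset.

```python
import math
from collections import Counter


def knn_predict(train, query, feature_idx, k=1):
    """Plain KNN vote restricted to the feature indices in feature_idx.

    train is a list of (feature_vector, label) pairs.
    """
    def dist(vec):
        return math.sqrt(sum((vec[i] - query[i]) ** 2 for i in feature_idx))

    nearest = sorted(train, key=lambda row: dist(row[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


class CascadeClassifier:
    """Two-level cascade: instrument family first, then instrument."""

    def __init__(self, samples, family_features, instrument_features, k=1):
        # samples: list of (feature_vector, family, instrument)
        self.k = k
        self.family_features = family_features          # indices for level 1
        self.instrument_features = instrument_features  # family -> indices
        self.family_train = [(vec, fam) for vec, fam, _ in samples]
        self.instrument_train = {}
        for vec, fam, inst in samples:
            self.instrument_train.setdefault(fam, []).append((vec, inst))

    def predict(self, query):
        family = knn_predict(self.family_train, query,
                             self.family_features, self.k)
        instrument = knn_predict(self.instrument_train[family], query,
                                 self.instrument_features[family], self.k)
        return family, instrument


# Hypothetical toy data: feature 0 separates the families, feature 1
# separates the strings, and feature 2 separates the winds.
samples = [
    ([0.1, 0.2, 0.5], "string", "violin"),
    ([0.2, 0.1, 0.9], "string", "violin"),
    ([0.1, 0.9, 0.1], "string", "cello"),
    ([0.2, 0.8, 0.7], "string", "cello"),
    ([0.9, 0.5, 0.1], "wind", "flute"),
    ([0.8, 0.4, 0.2], "wind", "flute"),
    ([0.9, 0.6, 0.9], "wind", "oboe"),
    ([0.8, 0.5, 0.8], "wind", "oboe"),
]

cascade = CascadeClassifier(
    samples,
    family_features=[0],
    instrument_features={"string": [1], "wind": [2]},
)

print(cascade.predict([0.15, 0.85, 0.30]))
print(cascade.predict([0.85, 0.50, 0.15]))
```

In this sketch each level sees only the features assigned to it, so a feature that is uninformative for one family cannot distort the distance computation in another. Choosing those subsets, and the classifier used at each level, is the selection problem the abstract refers to.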

Author information

Authors and Affiliations

Dept. of Computer Science, University of North Carolina, Charlotte, NC, 28223, USA
Wenxin Jiang, Amanda Cohen & Zbigniew W. Raś
Dept. of Math. and Computer Science, University of North Carolina, Pembroke, Pembroke, NC, 28372, USA
Xin Zhang
Institute of Computer Science, Polish Academy of Sciences, 01-237, Warsaw, Poland
Zbigniew W. Raś

Editor information

Editors and Affiliations

Department of Computer Science, University of North Carolina, Charlotte, NC 28223, USA
Zbigniew W. Ras
Department of Electronics, Computer & Information Technology, NC A&T State University, Greensboro, NC 27411, USA
Li-Shiang Tsay

About this chapter

Cite this chapter

Jiang, W., Zhang, X., Cohen, A., Raś, Z.W. (2010). Multiple Classifiers for Different Features in Timbre Estimation. In: Ras, Z.W., Tsay, LS. (eds) Advances in Intelligent Information Systems. Studies in Computational Intelligence, vol 265. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05183-8_14

DOI: https://doi.org/10.1007/978-3-642-05183-8_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05182-1
Online ISBN: 978-3-642-05183-8
eBook Packages: Engineering, Engineering (R0)
