Summary
Recognizing and separating sounds played by different instruments is very useful for labeling audio files with semantic information. It is a non-trivial task that requires sound analysis, but the results can aid automatic indexing and browsing of music data when searching for melodies played by user-specified instruments. In this chapter, we describe all stages of this process: sound parameterization, instrument identification, and separation of layered sounds. Our parameterization is based on the power amplitude spectrum, but we also perform comparative experiments with a parameterization based mainly on spectrum-related sound attributes, including mel-frequency cepstral coefficients (MFCC), parameters describing the shape of the power spectrum of the sound waveform, and time-domain parameters. Various classification algorithms have been applied, including k-nearest neighbor (KNN), which yielded good results. The experiments on polyphonic (polytimbral) recordings and the results discussed in this chapter allow us to draw conclusions about the directions of further experiments on this subject, which can be of interest to any user of music audio data sets.
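To make the idea of matching short-term spectra with KNN more concrete, the following is a minimal toy sketch and not the method used in the chapter. It assumes synthetic "instruments" built from a few harmonics, a naive DFT in place of a real FFT library, and Euclidean distance between normalized power spectra. All names (`power_spectrum`, `knn_predict`, `synth_frame`, the labels "bright" and "mellow") are hypothetical and chosen only for illustration.

```python
import cmath
import math
from collections import Counter


def power_spectrum(frame):
    """Naive DFT power spectrum of one short frame (bins 0..N/2), normalized to sum to 1."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(coeff) ** 2)
    total = sum(spectrum) or 1.0
    # Normalizing removes the overall loudness, so only the spectral shape is compared.
    return [p / total for p in spectrum]


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to the query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def synth_frame(harmonic_weights, gain=1.0, phase=0.0, f0_bin=4, n=64):
    """Hypothetical stand-in for a recorded frame: a weighted sum of harmonics."""
    return [
        gain * sum(
            w * math.sin(2 * math.pi * f0_bin * (h + 1) * t / n + phase)
            for h, w in enumerate(harmonic_weights)
        )
        for t in range(n)
    ]


# Assumed toy timbres: "bright" has strong upper harmonics, "mellow" has almost none.
TIMBRES = {"bright": [1.0, 0.8, 0.6], "mellow": [1.0, 0.1, 0.0]}

# Build a small labeled training set from frames with different phases.
train = [
    (power_spectrum(synth_frame(weights, phase=phase)), label)
    for label, weights in TIMBRES.items()
    for phase in (0.0, 0.5, 1.0)
]

# Query frame: louder and with slightly different harmonic weights than the training data.
query = power_spectrum(synth_frame([0.9, 0.7, 0.5], gain=2.0, phase=0.3))
print(knn_predict(train, query, k=3))
```

In a real system the frames would come from recordings, an FFT routine would replace the naive DFT, and polyphonic input would require the separation stage that the chapter describes; none of that is modeled here.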
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Jiang, W., Wieczorkowska, A., Raś, Z.W. (2009). Music Instrument Estimation in Polyphonic Sound Based on Short-Term Spectrum Match. In: Hassanien, AE., Abraham, A., Herrera, F. (eds) Foundations of Computational Intelligence Volume 2. Studies in Computational Intelligence, vol 202. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01533-5_10
DOI: https://doi.org/10.1007/978-3-642-01533-5_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01532-8
Online ISBN: 978-3-642-01533-5
eBook Packages: Engineering (R0)