Abstract
To achieve highly accurate content-based music information retrieval (MIR), it is necessary to compensate for the differing bit rates of the encoded songs stored in a music collection, since bit rate differences are expected to degrade content-based MIR results. In this paper, we examine how bit rate differences affect MIR results, propose methods to normalize MFCC features extracted from files encoded at various bit rates, and show that these methods stabilize MIR results.
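The abstract does not say which normalization methods the paper proposes. As a rough illustration of what normalizing MFCC features can look like, the sketch below applies per-coefficient mean and variance normalization over the frames of one track, a generic technique from noise-robust speech recognition. The function name `normalize_mfcc`, the `eps` parameter, and the toy values are assumptions for illustration, not the authors' method.

```python
from statistics import fmean, pstdev


def normalize_mfcc(frames, eps=1e-8):
    """Per-coefficient mean and variance normalization over one track.

    frames: list of MFCC vectors (one list of floats per frame),
            all of the same length.
    eps:    small constant (an assumption) that avoids division by zero
            when a coefficient is constant across frames.
    """
    if not frames:
        return []
    # Transpose so that each column holds one coefficient across all frames.
    columns = list(zip(*frames))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    # Shift each coefficient to zero mean and scale it to unit variance.
    return [
        [(x - m) / (s + eps) for x, m, s in zip(frame, means, stds)]
        for frame in frames
    ]


# Hypothetical toy input: three frames with two coefficients each.
toy_frames = [[1.0, 10.0], [2.0, 20.0], [3.0, 40.0]]
normalized = normalize_mfcc(toy_frames)
```

Under this scheme, a constant offset or gain applied to a coefficient across the whole track is removed. Whether the distortion introduced by low bit rate encoding behaves like that is an empirical question that a sketch of this kind does not answer.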
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hamawaki, S., Funasawa, S., Katto, J., Ishizaki, H., Hoashi, K., Takishima, Y. (2009). Feature Analysis and Normalization Approach for Robust Content-Based Music Retrieval to Encoded Audio with Different Bit Rates. In: Huet, B., Smeaton, A., Mayer-Patel, K., Avrithis, Y. (eds) Advances in Multimedia Modeling. MMM 2009. Lecture Notes in Computer Science, vol 5371. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92892-8_32
DOI: https://doi.org/10.1007/978-3-540-92892-8_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-92891-1
Online ISBN: 978-3-540-92892-8
eBook Packages: Computer Science, Computer Science (R0)