Group Delay Function from All-Pole Models for Musical Instrument Recognition

Diment, Aleksandr; Rajan, Padmanabhan; Heittola, Toni; Virtanen, Tuomas

doi:10.1007/978-3-319-12976-1_37


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8905)

Included in the following conference series:

International Symposium on Computer Music Multidisciplinary Research


Abstract

This work proposes a feature based on the group delay function from all-pole models (APGD) for pitched musical instrument recognition. Conventional spectrum-related features take into account only the magnitude information, whereas the phase is often overlooked because it is difficult to interpret. However, the phase often carries additional information that could be beneficial for recognition. The APGD is an elegant approach to inferring phase information that avoids the issues related to interpreting the phase and does not require extensive parameter adjustment. Having been shown to be applicable to speech-related problems, it is explored here for instrument recognition. The evaluation is performed with various instrument sets and shows absolute accuracy gains of up to 7% compared to the baseline of mel-frequency cepstral coefficients (MFCCs). Combined with the MFCCs and with feature selection, APGD outperforms the baseline on all the evaluated sets.
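
As a rough illustration of the idea named in the abstract, the following sketch fits an all-pole model to one signal frame and evaluates the group delay of that model. It is an assumption-laden sketch rather than the authors' implementation: the function names `lpc_autocorr` and `allpole_group_delay`, the autocorrelation (Levinson-Durbin) fitting method, and the parameter `n_fft` are illustrative choices, and any later processing of the group delay into a feature vector is not shown because the abstract does not describe it.

```python
import numpy as np


def lpc_autocorr(frame, order):
    """Fit all-pole coefficients [1, a1, ..., ap] to a frame.

    Uses the autocorrelation method with the Levinson-Durbin
    recursion (an assumed choice, not taken from the paper).
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a


def allpole_group_delay(a, n_fft=512):
    """Group delay (in samples) of H(z) = 1 / A(z) on the rfft grid.

    For A(w) = sum_n a[n] exp(-j w n), the group delay of A is
    Re{B(w) / A(w)} with B the DFT of n * a[n]; the all-pole
    filter 1 / A has the negated group delay.
    """
    a = np.asarray(a, dtype=float)
    ramp = np.arange(len(a))
    A = np.fft.rfft(a, n_fft)
    B = np.fft.rfft(ramp * a, n_fft)
    return -np.real(B / A)


if __name__ == "__main__":
    # Synthetic frame: impulse response of a one-pole filter with pole at 0.5.
    frame = 0.5 ** np.arange(64)
    coeffs = lpc_autocorr(frame, order=1)
    gd = allpole_group_delay(coeffs, n_fft=8)
    print(coeffs, gd[0], gd[-1])
```

Because the group delay is computed from the fitted polynomial rather than from the unwrapped phase of the raw spectrum, no phase unwrapping is needed in this sketch, which is one way to read the abstract's remark that the approach avoids the difficulties of interpreting the phase directly.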

This research has been funded by the Academy of Finland, project numbers 258708, 253120 and 265024.

Author information

Authors and Affiliations

Department of Signal Processing, Tampere University of Technology, Tampere, Finland
Aleksandr Diment, Toni Heittola & Tuomas Virtanen
School of Computing and Electrical Engineering, Indian Institute of Technology, Mandi, Himachal Pradesh, India
Padmanabhan Rajan

Corresponding author

Correspondence to Aleksandr Diment.

Editor information

Editors and Affiliations

CNRS - LMA, Marseille, France
Mitsuko Aramaki
Toulon-Var University and CNRS - LMA, Marseille, France
Olivier Derrien
CNRS - LMA, Marseille, France
Richard Kronland-Martinet
CNRS - LMA, Marseille, France
Sølvi Ystad

About this paper

Cite this paper

Diment, A., Rajan, P., Heittola, T., Virtanen, T. (2014). Group Delay Function from All-Pole Models for Musical Instrument Recognition. In: Aramaki, M., Derrien, O., Kronland-Martinet, R., Ystad, S. (eds) Sound, Music, and Motion. CMMR 2013. Lecture Notes in Computer Science, vol 8905. Springer, Cham. https://doi.org/10.1007/978-3-319-12976-1_37

DOI: https://doi.org/10.1007/978-3-319-12976-1_37
Published: 05 December 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12975-4
Online ISBN: 978-3-319-12976-1
eBook Packages: Computer Science, Computer Science (R0)
