
Computational Statistics, Volume 26, Issue 2, pp 279–291

Feature clustering for instrument classification

  • Uwe Ligges
  • Sebastian Krey
Original Paper

Abstract

We propose a method for classifying instruments from a piece of sound. Features are derived from a pre-filtered time series divided into small windows. Features from the (transformed) spectrum, Perceptual Linear Prediction (PLP), and Mel Frequency Cepstral Coefficients (MFCCs), as known from speech processing, are then selected. k-means clustering is applied to these features, yielding a reduced number of features for the classification task. An SVM classifier with a polynomial kernel yields good results: the misclassification error is roughly 19% for 59 different classes of instruments. As expected, the misclassification error is smaller for a problem with fewer classes. The functionality of the rastamat library (Ellis 2005) has been ported from Matlab to R, so feature extraction as known from speech processing is now easily available in the statistical programming language R. This software has been used on a cluster of machines for the computationally intensive evaluation of the proposed method.
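The windowing and feature clustering steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (their software is written in R and ports rastamat), and everything in it is an assumption made for illustration: the function names, the window length and hop size, the two toy per-window features (RMS energy and zero-crossing rate, standing in for the spectral, PLP, and MFCC features), and the choice to use the sorted k-means centroids as the reduced feature vector. The SVM classification step is not shown.

```python
import math
import random


def frame_signal(signal, frame_len, hop):
    """Split a sequence into overlapping windows of frame_len samples."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def frame_features(frame):
    """Toy per-window features: RMS energy and zero-crossing rate.

    Assumption: these stand in for the spectral/PLP/MFCC features
    named in the abstract, which are far richer.
    """
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return [rms, crossings / (len(frame) - 1)]


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def clustered_feature_vector(signal, frame_len=256, hop=128, k=3):
    """Reduce all per-window features of one sound to k sorted centroids,
    flattened into a single fixed-length vector."""
    feats = [frame_features(f) for f in frame_signal(signal, frame_len, hop)]
    centroids = sorted(kmeans(feats, k))
    return [value for centroid in centroids for value in centroid]


# Hypothetical input: one second of a decaying 440 Hz tone at 8 kHz.
tone = [math.exp(-t / 2000) * math.sin(2 * math.pi * 440 * t / 8000)
        for t in range(8000)]
vector = clustered_feature_vector(tone)
```

Whatever the length of the recording, the resulting vector has `k` times the number of per-window features as its length, which is what makes it usable as input to a classifier that expects fixed-length observations.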

Keywords

Feature clustering, SVM, Classification, Timbre, Instrument, Music


References

  1. Bischl B, Wornowizki M, Borg K (2009) The mlr package: machine learning in R. http://www.algorithm-forge.com/bischl/mlr/
  2. Davis SB, Mermelstein P (1980) Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans Acoust Speech Signal Process ASSP 28(4): 357–366
  3. Ellis DPW (2005) PLP and RASTA (and MFCC, and inversion) in Matlab. http://www.ee.columbia.edu/~dpwe/resources/matlab/rastamat/, online web resource
  4. Halkidi M, Batistakis Y, Vazirgiannis M (2001) On clustering validation techniques. J Intell Inf Syst 17(2–3): 107–145
  5. Hastie TJ, Tibshirani RJ, Friedman J (2001) The elements of statistical learning. Data mining, inference, and prediction. Springer, New York
  6. Hermansky H (1990) Perceptual linear predictive (PLP) analysis of speech. J Acoust Soc Am 87(4): 1738–1752
  7. Hsu CW, Chang CC, Lin CJ (2009) A practical guide to support vector classification. National Taiwan University, Taipei, http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
  8. Karatzoglou A, Smola A, Hornik K, Zeileis A (2004) kernlab - an S4 package for kernel methods in R. J Stat Softw 11(9):1–20, http://www.jstatsoft.org/v11/i09/
  9. Klapuri A, Davy M (2006) Signal processing methods for music transcription. Springer, New York
  10. Krey S (2008) SVM basierte Klangklassifikation. Diplomarbeit, TU Dortmund, Dortmund
  11. Li S (2010) FNN: Fast nearest neighbor search algorithms and applications. http://CRAN.R-project.org/package=FNN
  12. Liaw A, Wiener M (2002) Classification and regression by randomForest. R News 2(3):18–22, http://CRAN.R-project.org/doc/Rnews/
  13. Opolko F, Wapnick J (1987) McGill University master samples (CDs)
  14. R Development Core Team (2009) R: A language and environment for statistical computing. Vienna, Austria, http://www.r-project.org, ISBN 3-900051-07-0
  15. Roever C (2003) Musikinstrumentenerkennung mit Hilfe der Hough-Transformation. Universität Dortmund, Fakultät Statistik, http://www.aei.mpg.de/~chroev/publications/RoeverDiplom.pdf
  16. Slaney M (1998) Auditory toolbox: A MATLAB Toolbox for auditory modeling work version 2. Tech. Rep. 1998-010, http://rvl4.ecn.purdue.edu/~malcolm/interval/1998-010/
  17. Traunmüller H (1990) Analytical expressions for the tonotopic sensory scale. J Acoust Soc Am 88: 97–100
  18. Venables WN, Ripley BD (2002) Modern applied statistics with S, 4th edn. Springer, New York, http://www.stats.ox.ac.uk/pub/MASS4
  19. Walker JS (1996) Fast Fourier transforms, 2nd edn. CRC Press, Boca Raton
  20. Weihs C, Reuter C, Ligges U (2005) Register classification by timbre. In: Weihs C, Gaul W (eds) Classification: the ubiquitous challenge. Springer, Berlin, pp 624–631
  21. Weihs C, Szepannek G, Ligges U, Luebke K, Raabe N (2006) Local models in register classification by timbre. In: Batagelj V, Bock HH, Ferligoj A, Žiberna A (eds) Data science and classification. Springer, Berlin, pp 315–322
  22. Weihs C, Ligges U, Mörchen F, Müllensiefen D (2007) Classification in music research. Adv Data Anal Classif 1(3): 255–291

Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  1. Fakultät Statistik, Technische Universität Dortmund, Dortmund, Germany
