Feature clustering for instrument classification
We propose a method for classifying instruments from a piece of sound. Features are derived from a pre-filtered time series divided into small windows. Features from the (transformed) spectrum, Perceptual Linear Prediction (PLP), and Mel Frequency Cepstral Coefficients (MFCCs), as known from speech processing, are then selected. k-means clustering is applied to reduce the number of features for the classification task. An SVM classifier with a polynomial kernel yields good results: the misclassification error is roughly 19% for 59 different classes of instruments. As expected, the misclassification error is smaller for problems with fewer classes. The functionality of the rastamat library (Ellis 2005) has been ported from Matlab to R, so feature extraction as known from speech processing is now easily available in the statistical programming language R. This software has been used on a cluster of machines for the computationally intensive evaluation of the proposed method.
Keywords: Feature clustering, SVM, Classification, Timbre, Instrument, Music
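The windowing and clustering steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-window features here (RMS energy and zero-crossing rate) are simple stand-ins for the spectral, PLP, and MFCC features, and the way cluster centres are concatenated into one vector per sound is an assumption about how k-means reduces the feature count. All function names and parameter values (`frame_signal`, `window_features`, `kmeans`, `sound_to_feature_vector`, window size, hop, `k`) are hypothetical.

```python
import math
import random


def frame_signal(signal, size, hop):
    """Split a sequence into overlapping windows of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def window_features(window):
    """Toy per-window features (stand-ins for spectral/PLP/MFCC features)."""
    rms = math.sqrt(sum(x * x for x in window) / len(window))
    crossings = sum(1 for a, b in zip(window, window[1:]) if (a < 0) != (b < 0))
    return [rms, crossings / (len(window) - 1)]


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns k centres, sorted for a stable order."""
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centres[j])),
            )
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centre.
        new_centres = [
            [sum(col) / len(c) for col in zip(*c)] if c else centres[j]
            for j, c in enumerate(clusters)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return sorted(centres)


def sound_to_feature_vector(signal, size=256, hop=128, k=3):
    """Assumed reduction: many window features -> k centres -> one flat vector."""
    feats = [window_features(w) for w in frame_signal(signal, size, hop)]
    return [value for centre in kmeans(feats, k) for value in centre]


# Synthetic decaying 440 Hz tone, one second at 8 kHz (illustrative input only).
sr = 8000
tone = [math.exp(-3 * t / sr) * math.sin(2 * math.pi * 440 * t / sr) for t in range(sr)]
vec = sound_to_feature_vector(tone)
print(len(vec))  # k centres times 2 features per window
```

Each sound is thereby mapped to a vector of fixed length regardless of its duration, which is the form a classifier such as the SVM with polynomial kernel mentioned in the abstract would expect as input.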
- Bischl B, Wornowizki M, Borg K (2009) The mlr package: machine learning in R. http://www.algorithm-forge.com/bischl/mlr/
- Ellis DPW (2005) PLP and RASTA (and MFCC, and inversion) in Matlab. http://www.ee.columbia.edu/~dpwe/resources/matlab/rastamat/, online web resource
- Hsu CW, Chang CC, Lin CJ (2009) A practical guide to support vector classification. National Taiwan University, Taipei, http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
- Krey S (2008) SVM basierte Klangklassifikation. Diplomarbeit, TU Dortmund, Dortmund
- Li S (2010) FNN: Fast nearest neighbor search algorithms and applications. http://CRAN.R-project.org/package=FNN
- Opolko F, Wapnick J (1987) McGill University master samples (CDs)
- R Development Core Team (2009) R: A language and environment for statistical computing. Vienna, Austria, http://www.r-project.org, ISBN 3-900051-07-0
- Roever C (2003) Musikinstrumentenerkennung mit Hilfe der Hough-Transformation. Universität Dortmund, Fakultät Statistik, http://www.aei.mpg.de/~chroev/publications/RoeverDiplom.pdf
- Slaney M (1998) Auditory toolbox: A MATLAB Toolbox for auditory modeling work version 2. Tech. Rep. 1998-010, http://rvl4.ecn.purdue.edu/~malcolm/interval/1998-010/
- Venables WN, Ripley BD (2002) Modern applied statistics with S, 4th edn. Springer, New York, http://www.stats.ox.ac.uk/pub/MASS4