Abstract
This study applied multidimensional scaling to isolate a number of perceptual dimensions from a dataset of human similarity judgements for short (800 ms) excerpts of recorded popular music. Two dimensions were well identified by two of the twelve timbral coefficients from the Echo Nest’s Analyze service. One of these was also identified by MFCC features from the Queen Mary Vamp plugin set. A third dimension, however, could not be mapped by either feature set and may represent a musical feature other than timbre. Implications are discussed in the context of existing research into music cognition, and suggestions are given for further research on individual differences in sound perception.
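As a rough illustration of how multidimensional scaling turns pairwise dissimilarities into coordinates on a small number of dimensions, the following sketch implements classical (Torgerson) MDS in plain Python. The abstract does not say which MDS variant the study used, so this is a stand-in technique, not the authors' procedure. The function name `classical_mds` and the toy rectangle data are hypothetical, and the code assumes a symmetric, roughly Euclidean dissimilarity matrix.

```python
import math
import random


def classical_mds(dissim, k=2, iters=500, seed=0):
    """Embed n items in k dimensions from a symmetric n x n dissimilarity matrix.

    Illustrative classical (Torgerson) MDS; assumes near-Euclidean input.
    """
    n = len(dissim)
    # Double-centre the squared dissimilarities: B = -1/2 * J * D^2 * J.
    sq = [[d * d for d in row] for row in dissim]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    rng = random.Random(seed)
    coords = [[0.0] * k for _ in range(n)]
    for dim in range(k):
        # Power iteration: converges to the dominant remaining eigenvector of B.
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        start_norm = math.sqrt(sum(x * x for x in v))
        v = [x / start_norm for x in v]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigval = sum(v[i] * bv[i] for i in range(n))
        # Coordinates on this dimension are the eigenvector scaled by sqrt(eigenvalue).
        scale = math.sqrt(max(eigval, 0.0))
        for i in range(n):
            coords[i][dim] = v[i] * scale
        # Deflate B so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


# Toy data (hypothetical): distances between the corners of a 3 x 4 rectangle.
points = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (3.0, 4.0)]
dissim = [[math.dist(p, q) for q in points] for p in points]
embedding = classical_mds(dissim, k=2)
```

The recovered coordinates are determined only up to rotation and reflection, so what can be compared with the input is the set of pairwise distances in `embedding`, not the coordinates themselves. With real similarity judgements the ratings would first need converting to dissimilarities, and a non-metric variant may be more appropriate for ordinal data.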
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Musil, J.J., Elnusairi, B., Müllensiefen, D. (2013). Perceptual Dimensions of Short Audio Clips and Corresponding Timbre Features. In: Aramaki, M., Barthet, M., Kronland-Martinet, R., Ystad, S. (eds) From Sounds to Music and Emotions. CMMR 2012. Lecture Notes in Computer Science, vol 7900. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41248-6_12
DOI: https://doi.org/10.1007/978-3-642-41248-6_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41247-9
Online ISBN: 978-3-642-41248-6
eBook Packages: Computer Science (R0)