
Perceptual Dimensions of Short Audio Clips and Corresponding Timbre Features

  • Conference paper
From Sounds to Music and Emotions (CMMR 2012)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 7900))


Abstract

This study applied multidimensional scaling to isolate a number of perceptual dimensions from a dataset of human similarity judgements for short excerpts (800 ms) of recorded popular music. Two dimensions were well identified by two of the twelve timbral coefficients from the Echo Nest’s Analyze service. One of these was also identified by MFCC features from the Queen Mary Vamp plugin set; however, a third dimension could not be mapped by either feature set and may represent a musical feature other than timbre. Implications are discussed within the context of existing research into music cognition, and suggestions are given for further research on individual differences in sound perception.
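As a rough illustration of the scaling step, the sketch below recovers low-dimensional coordinates from a matrix of pairwise dissimilarities. It uses classical (Torgerson) multidimensional scaling as a stand-in, which is not necessarily the variant used in the study, and the function name `classical_mds` and the toy data are hypothetical.

```python
import numpy as np

def classical_mds(dissimilarity, n_dims=3):
    """Embed items in n_dims dimensions from a symmetric dissimilarity matrix.

    Classical (Torgerson) MDS, used here as an assumption for illustration;
    the study itself may have used a different (e.g. non-metric) variant.
    """
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    # Double-centre the squared dissimilarities to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the largest n_dims.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]
    # Clip small negative eigenvalues that can arise from non-Euclidean input.
    scales = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scales

# Hypothetical toy data: distances among four points on a line,
# standing in for averaged listener dissimilarity ratings.
points = np.array([[0.0], [1.0], [3.0], [6.0]])
toy = np.abs(points - points.T)

coords = classical_mds(toy, n_dims=1)
# Distances between the recovered coordinates should match the input.
recovered = np.abs(coords - coords.T)
```

In a setting like the one described, each recovered dimension could then be compared with candidate audio features (for example by correlation) to see which feature, if any, tracks it.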




© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Musil, J.J., Elnusairi, B., Müllensiefen, D. (2013). Perceptual Dimensions of Short Audio Clips and Corresponding Timbre Features. In: Aramaki, M., Barthet, M., Kronland-Martinet, R., Ystad, S. (eds) From Sounds to Music and Emotions. CMMR 2012. Lecture Notes in Computer Science, vol 7900. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41248-6_12


  • DOI: https://doi.org/10.1007/978-3-642-41248-6_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41247-9

  • Online ISBN: 978-3-642-41248-6

  • eBook Packages: Computer Science, Computer Science (R0)
