Perceptual Dimensions of Short Audio Clips and Corresponding Timbre Features

Musil, Jason Jiří; Elnusairi, Budr; Müllensiefen, Daniel

doi:10.1007/978-3-642-41248-6_12


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7900)

Included in the following conference series:

International Symposium on Computer Music Modeling and Retrieval


Abstract

This study applied multidimensional scaling to isolate a number of perceptual dimensions from a dataset of human similarity judgements for short (800 ms) excerpts of recorded popular music. Two dimensions were well identified by two of the twelve timbral coefficients from the Echo Nest’s Analyze service. One of these was also identified by MFCC features from the Queen Mary Vamp plugin set; however, a third dimension could not be mapped by either feature set and may represent a musical feature other than timbre. Implications are discussed in the context of existing research into music cognition, and suggestions are given for further research on individual differences in sound perception.
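As a rough illustration of the kind of analysis the abstract describes, the sketch below embeds items in a low-dimensional space from a matrix of pairwise dissimilarities. It uses classical (Torgerson) metric multidimensional scaling, a textbook variant chosen here for brevity; it is not claimed to be the procedure used in the study. The function name `classical_mds`, its parameters, and the toy data are hypothetical, and the input is assumed to be symmetric and roughly Euclidean.

```python
import math
import random


def classical_mds(dissim, k=2, iters=500, seed=0):
    """Embed n items in k dimensions from a symmetric n x n dissimilarity matrix."""
    n = len(dissim)
    # Double-centre the squared dissimilarities: B = -1/2 * J * D^2 * J.
    sq = [[d * d for d in row] for row in dissim]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    def matvec(m, v):
        return [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]

    rng = random.Random(seed)
    coords = [[0.0] * k for _ in range(n)]
    for dim in range(k):
        # Power iteration: converge on the dominant remaining eigenvector of B.
        v = [rng.random() for _ in range(n)]
        for _ in range(iters):
            w = matvec(b, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                break
            v = [x / norm for x in w]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        eigval = sum(vi * wi for vi, wi in zip(v, matvec(b, v)))
        # Coordinates are eigenvector entries scaled by sqrt(eigenvalue);
        # negative eigenvalues (non-Euclidean input) are clamped to zero.
        scale = math.sqrt(max(eigval, 0.0))
        for i in range(n):
            coords[i][dim] = v[i] * scale
        # Deflate so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


# Hypothetical toy data: four "clips" whose dissimilarities are distances
# between made-up 2-D points (not data from the study).
points = [(0.0, 0.0), (3.0, 0.0), (3.0, 1.0), (0.0, 1.0)]
toy = [[math.hypot(p[0] - q[0], p[1] - q[1]) for q in points] for p in points]
for clip, (x, y) in enumerate(classical_mds(toy, k=2)):
    print(f"clip {clip}: ({x:+.3f}, {y:+.3f})")
```

The sign of each recovered axis is arbitrary, so relating a dimension to an audio feature would need a further step such as correlation or regression against that feature.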



Author information

Authors and Affiliations

Goldsmiths, University of London, UK
Jason Jiří Musil, Budr Elnusairi & Daniel Müllensiefen


Editor information

Editors and Affiliations

CNRS - LMA, 31 Chemin Joseph Aiguier, 13402, Marseille Cedex 20, France
Mitsuko Aramaki, Richard Kronland-Martinet & Sølvi Ystad
Centre for Digital Music, Queen Mary University of London, Mile End Road, E1 4NS, London, UK
Mathieu Barthet


About this paper

Cite this paper

Musil, J.J., Elnusairi, B., Müllensiefen, D. (2013). Perceptual Dimensions of Short Audio Clips and Corresponding Timbre Features. In: Aramaki, M., Barthet, M., Kronland-Martinet, R., Ystad, S. (eds) From Sounds to Music and Emotions. CMMR 2012. Lecture Notes in Computer Science, vol 7900. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41248-6_12


DOI: https://doi.org/10.1007/978-3-642-41248-6_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41247-9
Online ISBN: 978-3-642-41248-6
eBook Packages: Computer Science, Computer Science (R0)
