Abstract
Musical experience has often been linked to the forming of expectations and to their fulfillment or denial. In terms of information theory, expectancies and predictions serve to reduce uncertainty about the future and can be used to represent and “compress” data efficiently. In this chapter we present an information-theoretic model of musical listening based on the idea that expectations arising from past musical material frame our appraisal of what comes next, and that this process eventually results in the creation of emotions or feelings. Using a notion of “information rate”, we measure the amount of information shared between past and present in the musical signal on different time scales, using statistics of sound spectral features. Several musical pieces are analyzed in terms of short- and long-term information rate dynamics and compared with an analysis of musical form and its structural functions. The findings suggest that a relation exists between information dynamics and musical structure, and that this relation eventually leads to the human listening experience and to feelings such as “wow” and “aha”.
Notes
- 1.
This is known as relative entropy, or Kullback-Leibler distance.
- 2.
This derivation summarizes the derivation that appeared in [5] and corrects a sign error in it.
- 3.
If a single observation carries a lot of information about the model, then \(I(x_n,\theta) \approx H(\theta)\) and model-IR becomes \(H(\theta) - E[D(\theta||\theta^*)]\).
- 4.
Note that, due to the indexing convention chosen for the transition matrix, the Markov process operates by left-side matrix multiplication. The stationary vector is then a left (row) eigenvector with an eigenvalue equal to one.
- 5.
The smoothing was done using a linear-phase low-pass filter with frequency cutoff at 0.3 of the window advance rate.
- 6.
- 7.
The dramatic power of a silence is of course well known to performers, who create suspense by delaying a continuation. What is new here is the fact that this effect is captured by SR in terms of the introduction of new spectral content.
- 8.
The partition function gives a measure of the spread of configurations of the different signal types.
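Notes 1 and 4 can be illustrated numerically. The following is a minimal sketch, assuming a first-order, row-stochastic Markov chain and taking information rate to be the mutual information between the present symbol and its past; the function names and the two-state matrix are hypothetical and are not taken from the chapter.

```python
import numpy as np

def kl_divergence(p, q):
    """Relative entropy D(p||q) in bits for discrete distributions.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p[i] == 0 contribute nothing
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def stationary_distribution(P):
    """Left (row) eigenvector of P with eigenvalue one, normalised to sum to 1."""
    P = np.asarray(P, dtype=float)
    # Right eigenvectors of P.T are left eigenvectors of P.
    eigvals, eigvecs = np.linalg.eig(P.T)
    k = np.argmin(np.abs(eigvals - 1.0))
    pi = np.real(eigvecs[:, k])
    return pi / pi.sum()

def markov_information_rate(P):
    """I(present; past) for a stationary first-order chain.

    Computed as H(pi) minus the entropy rate sum_i pi[i] * H(P[i, :]).
    """
    P = np.asarray(P, dtype=float)
    pi = stationary_distribution(P)
    entropy_rate = sum(w * entropy(row) for w, row in zip(pi, P))
    return entropy(pi) - entropy_rate

# Hypothetical two-state chain; each row sums to one.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
pi = stationary_distribution(P)   # satisfies pi @ P == pi (left multiplication)
ir = markov_information_rate(P)   # information rate in bits per symbol
```

With this convention the stationary vector multiplies the transition matrix from the left (`pi @ P`), which is the situation Note 4 describes. A chain whose rows are all identical has no memory, so its information rate under this definition is zero.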
References
Berns G (2005) Satisfaction: the science of finding true fulfillment. Henry Holt, New York
Burns K (2006) Bayesian beauty: on the art of EVE and the act of enjoyment. In: Proceedings of the AAAI06 workshop on computational aesthetics
Casey MA (2001) MPEG-7 sound recognition tools. IEEE Trans Circuits Syst Video Technol 11(6):737–747
Csikszentmihalyi M (1990) Flow: the psychology of optimal experience. Harper & Row, New York, NY
Dubnov S (2008) Unified view of prediction and repetition structure in audio signals with application to interest point detection. IEEE Trans Audio Speech Lang Process 16(2):327–337
Dubnov S (2006) Analysis of musical structure in audio and MIDI using information rate. In: Proceedings of the international computer music conference, New Orleans, LA
Dubnov S (2006) Spectral anticipations. Comput Music J 30(2):63–83
Foote J, Cooper M (2001) Visualizing musical structure and rhythm via self-similarity. In: Proceedings of the international computer music conference, pp 419–422
Fraisse P (1957) Psychologie du temps [Psychology of time]. Presses Universitaires de France, Paris
Goffman E (1974) Frame analysis: an essay on the organization of experience. Harvard University Press, Cambridge, MA
Huron D (2006) Sweet anticipation: music and the psychology of expectation. MIT Press, Cambridge, MA
Kohs EB (1976) Musical form: studies in analysis and synthesis. Houghton Mifflin, Boston, MA
Lewicki MS (2002) Efficient coding of natural sounds. Nature Neurosci 5(4):356–363
McAdams S (1989) Psychological constraints on form-bearing dimensions in music. Contemp Music Rev 4:181–198
Moreau N, Kim H, Sikora T (2004) Audio classification based on MPEG-7 spectral basis representations. IEEE Trans Circuits Syst Video Technol 14(5):716–725
Narmour E (1990) The analysis and cognition of basic melodic structures: the implication-realization model. University of Chicago Press, Chicago, IL
Nemenman I, Bialek W, Tishby N (2001) Predictability, complexity and learning. Neural Comput 13:2409–2463
Oppenheim AV, Schafer RW (1989) Discrete-time signal processing. Prentice Hall, Upper Saddle River, NJ
Reynolds R (2005) Mind models. Routledge, New York, NY
Reynolds R (2005) Form and method: composing music. Routledge, New York, NY
Reynolds R, Dubnov S, McAdams S (2006) Structural and affective aspects of music from statistical audio signal analysis. J Am Soc Inf Sci Technol 57(11):1526–1536
Stein L (1962) Structure and style: the study and analysis of musical forms. Summy-Birchard, Evanston, IL
Tversky A, Kahneman D (1981) The framing of decisions and the psychology of choice. Science 211:453–458
Zhang H-J, Lu L, Wenyin L (2004) Audio textures: theory and applications. IEEE Trans Speech Audio Process 12(2):156–167
Appendix
The probability of the observations (data) depends on a parameter that describes the distribution and on the probability of occurrence of the parameter itself.
Considering an approximation of the probability around an empirical distribution θ [17],
the entropy of a block of samples can be written in terms of the conditional entropy given the model θ and the logarithm of the partition function (see Note 8) \(Z_n(\theta) = \int P(\alpha) e^{-nD(\theta||\alpha)} d\alpha\),
where we used the fact that \(\int P(x_1^n | \theta) dx_1^n = 1\) independent of θ. With entropy of a single observation expressed in terms of conditional entropy and mutual information
we express IR in terms of data, model and configuration factors
Assuming that the space of models comprises several peaks centered around distinct parameter values, the partition function \(Z_n(\theta)\) can be written, through Laplace’s method of saddle point approximation, in terms of a function proportional to its argument at an extremal value \(\theta = \theta^*\). This allows writing the right-hand side of (7.11) as
resulting in the equation for model-based IR
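The steps described in this appendix can be sketched as the following chain of equations. This is a reconstruction under stated assumptions, not the chapter's own displayed equations: it assumes that IR is the mutual information between the present sample and its past, \(\rho(x_1^n) = H(x_n) + H(x_1^{n-1}) - H(x_1^n)\), that the approximation of [17] takes the exponential form shown, and that slowly varying prefactors in the Laplace approximation can be neglected. The symbols \(\rho\), \(\rho_\theta\) and \(E_\theta\) are notation introduced here for illustration.

```latex
% Reconstruction sketch; \rho, \rho_\theta and E_\theta are assumed notation.
\begin{align*}
P(x_1^n) &= \int P(x_1^n \mid \alpha)\, P(\alpha)\, d\alpha \\
P(x_1^n \mid \alpha) &\approx P(x_1^n \mid \theta)\, e^{-n D(\theta||\alpha)}
  \quad\Rightarrow\quad P(x_1^n) \approx P(x_1^n \mid \theta)\, Z_n(\theta) \\
H(x_1^n) &\approx H(x_1^n \mid \theta) - E_\theta\!\left[\log Z_n(\theta)\right] \\
H(x_n) &= H(x_n \mid \theta) + I(x_n, \theta) \\
\rho(x_1^n) &= H(x_n) + H(x_1^{n-1}) - H(x_1^n) \\
  &\approx E_\theta\!\left[\rho_\theta(x_1^n)\right] + I(x_n, \theta)
     + E_\theta\!\left[\log \frac{Z_n(\theta)}{Z_{n-1}(\theta)}\right] \\
Z_n(\theta) &\propto e^{-n D(\theta||\theta^*)}
  \quad\Rightarrow\quad \log \frac{Z_n(\theta)}{Z_{n-1}(\theta)} \approx -D(\theta||\theta^*) \\
\rho(x_1^n) &\approx E_\theta\!\left[\rho_\theta(x_1^n)\right] + I(x_n, \theta)
     - E_\theta\!\left[D(\theta||\theta^*)\right]
\end{align*}
```

Here \(E_\theta[\rho_\theta(x_1^n)] = H(x_n \mid \theta) + H(x_1^{n-1} \mid \theta) - H(x_1^n \mid \theta)\) plays the role of the data term, \(I(x_n, \theta)\) the model term, and the partition-function ratio the configuration term. Under these assumptions the last two terms of the final line, \(I(x_n,\theta) - E[D(\theta||\theta^*)]\), agree with the model-IR expression of Note 3, which reduces to \(H(\theta) - E[D(\theta||\theta^*)]\) when \(I(x_n,\theta) \approx H(\theta)\).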
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Dubnov, S. (2010). Information Dynamics and Aspects of Musical Perception. In: Argamon, S., Burns, K., Dubnov, S. (eds) The Structure of Style. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12337-5_7
DOI: https://doi.org/10.1007/978-3-642-12337-5_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12336-8
Online ISBN: 978-3-642-12337-5
eBook Packages: Computer Science (R0)