Abstract
In this paper, we review computational methods for the representation and similarity computation of musical rhythms in both symbolic and sub-symbolic (e.g., audio) domains. Both tasks are fundamental to multiple application scenarios, from indexing and browsing to retrieving music, namely navigating musical archives at scale. From the literature review, we identify three main rhythmic representations: string (a sequence of alphanumeric symbols denoting the temporal organization of events), geometric (a spatio-temporal pictorial representation of events), and feature lists (a transformation of audio into a temporal series of features or descriptors), as well as two categories of distance metrics for similarity computation: feature-based and transformation-based. Furthermore, we address the gap between explicit (symbolic) and implicit (musical audio) rhythmic representations, stressing that greater interaction across modalities would promote a holistic view of temporal phenomena in music. We conclude the article by outlining avenues for future work on (1) hierarchical, (2) multi-attribute, and (3) rhythmic layering models grounded in methodologies across disciplines, such as perception, cognition, mathematics, signal processing, and music.
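As a rough illustration of the string and geometric representations named in the abstract, the sketch below encodes one cyclic pattern in three ways: a binary onset string, a list of inter-onset intervals, and polygon vertices on a unit circle. The function names, the 16-pulse grid, and the onset positions used for the son clave are assumptions made for this example and are not taken from the chapter; feature lists derived from audio are not covered here.

```python
import math

def onsets_to_binary(onsets, steps):
    """String representation: '1' marks an onset, '0' a silent pulse."""
    onset_set = set(onsets)
    return "".join("1" if i in onset_set else "0" for i in range(steps))

def onsets_to_iois(onsets, steps):
    """Inter-onset intervals in pulses, wrapping from the last onset
    back to the first because the pattern is treated as cyclic."""
    ordered = sorted(onsets)
    n = len(ordered)
    # '% steps' handles the wrap-around; 'or steps' covers a single onset.
    return [((ordered[(i + 1) % n] - ordered[i]) % steps) or steps
            for i in range(n)]

def onsets_to_polygon(onsets, steps):
    """Geometric representation: onsets as vertices on a unit circle,
    pulse 0 at twelve o'clock, advancing clockwise."""
    vertices = []
    for onset in sorted(onsets):
        angle = 2 * math.pi * onset / steps
        vertices.append((math.sin(angle), math.cos(angle)))
    return vertices

# Assumed onset positions of the son clave on a 16-pulse grid.
SON_CLAVE = [0, 3, 6, 10, 12]

binary = onsets_to_binary(SON_CLAVE, 16)
iois = onsets_to_iois(SON_CLAVE, 16)
polygon = onsets_to_polygon(SON_CLAVE, 16)

print(binary)  # 1001001000101000
print(iois)    # [3, 3, 4, 2, 4]
```

The three outputs describe the same five events. The interval list sums to the cycle length, and the polygon is the same onset set drawn on a circle.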
Notes
1. A short segment of audio, e.g., a measure of a drum beat, that is set to repeat so that the sound sustains longer than the original sample lasts (Gallagher, 2009).
2. The concept of beat often takes on different meanings depending on discipline and context. In our work, we follow Weihs et al. (2019), who define it as “any of the events or accents in the music, [...] characterized by as what listeners typically entrain to as they tap their foot or dance along with a piece of music.”
3. Sethares (2007) refers to these degrees of information as symbolic and literal, and denotes rhythmic representations as spatial metaphors for the temporal dimension, analogous to the text manifestation in the written and spoken domains.
4. Toussaint (2006, 2019) provides an in-depth study of rhythmic representations. Historically, geometric representations of rhythm existed before Western musical notation, e.g., the pie slice (Toussaint, 2006). Conversely, string representations fit within traditional pattern recognition problems in computer science.
5. In the specific case of a loop-based rhythmic representation, the lack of the activation in time is a relevant limitation, as the repeating pattern typically ties the last and first events.
6. AIS and TEDAS have a dual manifestation as formal strings and geometric representations.
7. Comprehensive list available online at: https://www.humdrum.org/rep/.
8. Further rhythm-related schemes: **takt to represent temporal moments within a recurring cycle or pattern; **recip to represent durations according to the traditional system of beat-proportions; **metpos to represent the position in the metric hierarchy; **dur to encode a sequence of time-spans or successive durations; **synco to encode numerical values that indicate the degree of metric syncopation for successive moments in a musical passage; **simil to encode numerical values that indicate the Damerau-Levenshtein edit distance between two Humdrum representations.
9. Equidistant diagonal line between the X and Y axes, as shown in Fig. 3.
10. For example, the gahu and the son clave are the only patterns that do not have self-intersections (Toussaint, 2019). The former equally results in a simple polygon. The shiko, bossa-nova, son clave, and fume-fume patterns are symmetric along the isotropy line.
11. Please refer to Gouyon et al. (2005) for a comprehensive list of features adopted in rhythmic periodicity functions.
12. Besides the approaches identified by Toussaint (2019), computational geometry proposes another classification for difference estimation and shape matching: (1) transformational (based on the comparison of curve descriptions, e.g., Fourier descriptors, turning functions, etc.), (2) geometrical (optimizing position and scaling for the match and calculating the difference in the areas [i.e., TEDAS], boundaries, or contour line segments using an optimal correspondence), (3) structural (based on string or graph matching using the curve representation, or on decomposing the curve into parts and then applying a transformational approach), and (4) quantitative (based on various shape descriptors, such as the ratio perimeter/area², average angle changes, and the ratio of perpendicular chords) (Cakmakov & Celakoska, 2004).
13. A comparative summary of the current review can be accessed online at: https://sites.google.com/view/mrrd.
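Note 8 mentions the Damerau-Levenshtein edit distance as a similarity value between two representations. The following is a minimal sketch of that idea applied to binary rhythm strings. It implements the restricted variant (optimal string alignment), which counts insertions, deletions, substitutions, and swaps of adjacent symbols. It is not the Humdrum implementation, and the function name and the 16-pulse encodings of the son and rumba claves are assumptions made for this example.

```python
def osa_distance(a, b):
    """Optimal string alignment distance (restricted Damerau-Levenshtein):
    insertions, deletions, substitutions, and adjacent transpositions
    each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition, e.g. "10" <-> "01".
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]

# Assumed 16-pulse binary encodings ('1' = onset, '0' = silent pulse).
SON = "1001001000101000"
RUMBA = "1001000100101000"

print(osa_distance(SON, RUMBA))  # 1
```

Under these encodings the two patterns differ only in the third onset, which falls one pulse later in the rumba. The transposition operation treats that shift as a single edit, where plain Levenshtein distance would count two substitutions.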
References
Akhtaruzzaman, M. (2008). Representation of musical rhythm and its classification system based on mathematical and geometrical analysis. In Proceedings of the ICCCE (pp. 466–471).
Aloupis, G., Fevens, T., Langerman, S., Matsui, T., Mesa, A., Nuñez, Y., et al. (2006). Algorithms for computing geometric measures of melodic similarity. Computer Music Journal, 30(3), 67–76.
Arkin, E. M., Chew, L. P., Huttenlocher, D. P., Kedem, K., & Mitchell, J. S. B. (1991). An efficiently computable metric for comparing polygonal shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(3), 209–216.
Aucouturier, J. J., & Pachet, F. (2002). Music similarity measures: What’s the use? In Proceedings of the ISMIR.
Bello, J. P., Daudet, L., Abdallah, S., Duxbury, C., Davies, M., & Sandler, M. B. (2005). A tutorial on onset detection in music signals. IEEE Transactions on Speech and Audio Processing, 13(5), 1035–1047.
Böck, S., & Schedl, M. (2011). Enhanced beat tracking with context-aware neural networks. In Proceedings of International Conference on Digital Audio Effects (pp. 135–139).
Bosteels, K., & Kerre, E. E. (2008). Fuzzy audio similarity measures based on spectrum histograms and fluctuation patterns. In Computational intelligence in multimedia processing: Recent advances (pp. 213–231). Springer.
Bouwer, F. L., Burgoyne, J. A., Odijk, D., Honing, H., & Grahn, J. A. (2018). What makes a rhythm complex? The influence of musical training and accent type on beat perception. PLOS ONE, 13(1), 1–26.
Bruford, F., Lartillot, O., McDonald, S., & Sandler, M. B. (2020). Multidimensional similarity modelling of complex drum loops using the GrooveToolbox. In Proceedings of ISMIR (pp. 263–270).
Cakmakov, D., & Celakoska, E. (2004). Estimation of curve similarity using turning functions. International Journal of Applied Mathematics, 15, 403–416.
Clarke, E. F. (1999). Rhythm and timing in music. In D. Deutsch (Ed.), The psychology of music (2nd ed., pp. 473–500). San Diego, US: Academic Press.
Colannino, J., Damian, M., Hurtado, F., Langerman, S., Meijer, H., Ramaswami, S., et al. (2007). Efficient many-to-many point matching in one dimension. Graphs and Combinatorics, 23(1), 169–178.
Davies, M. E., Madison, G., Silva, P., & Gouyon, F. (2013). The effect of microtiming deviations on the perception of groove in short rhythms. Music Perception, 30(5), 497–510.
Dixon, S., Gouyon, F., & Widmer, G. (2004). Towards characterisation of music via rhythmic patterns. In Proceedings of the ISMIR.
Dixon, S., Pampalk, E., & Widmer, G. Classification of dance music by periodicity patterns. In Proceedings of the ISMIR.
Dixon, S., Pampalk, E., & Widmer, G. (2004). Evaluating rhythmic descriptors for musical genre classification. In AES: 25th International Conference on Metadata for audio. Audio Engineering Society.
Foote, J., & Cooper, M. (2001). Visualizing musical structure and rhythm via self-similarity. Proceedings of the ICMC, 1, 423–430.
Foote, J., Cooper, M., & Nam, U. (2002). Audio retrieval by rhythmic similarity. In Proceedings of the ISMIR.
Foote, J., & Uchihashi, S. (2001). The beat spectrum: A new approach to rhythm analysis. IEEE International Conference on Multimedia and Expo (ICME) (pp. 881–884).
Gallagher, M. (2009). The music tech dictionary—A glossary of audio-related terms and technologies (1st ed.). Boston, USA: Course Technology PTR.
Goto, M. (2001). An audio-based real-time beat tracking system for music with or without drum-sounds. Journal of New Music Research, 30, 159–171.
Gouyon, F., & Dixon, S. (2004). Dance music classification: A tempo-based approach. In Proceedings of the ISMIR.
Gouyon, F., et al. (2005). A computational approach to rhythm description: Audio features for the computation of rhythm periodicity functions and their use in tempo induction and music content processing (Ph.D. thesis). Universitat Pompeu Fabra.
Gustafson, K. (1988). The graphical representation of rhythm (PROPH). Progress Reports from Oxford Phonetics, 3, 6–26.
Hofmann-Engl, L. (2002). Rhythmic similarity: A theoretical and empirical approach. In Proceedings of the 7th International Conference on Music Perception and Cognition.
Hofmann-Engl, L. (2003). Atomic notation and melodic similarity. In Computer music modelling and retrieval. Montpellier, France.
Holzapfel, A., & Stylianou, Y. (2008). Rhythmic similarity of music based on dynamic periodicity warping. In 2008 IEEE ICASSP (pp. 2217–2220). IEEE.
Holzapfel, A., & Stylianou, Y. (2009). A scale transform based method for rhythmic similarity of music. In IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 317–320). IEEE.
Huron, D. (2002). Music information processing using the humdrum toolkit: Concepts, examples, and lessons. Computer Music Journal, 26(2), 11–26.
Johnson-Laird, P. N. (1991). Rhythm and meter: A theory at the computational level. Psychomusicology: A Journal of Research in Music Cognition, 10(2), 88–106.
Kirk, J., & Nicholson, N. Visualizing Euclidean rhythms using Tangle theory. Polymath: An Interdisciplinary Arts and Sciences Journal, 6(1), 1–10.
Lerdahl, F., & Jackendoff, R. (1996). A Generative Theory of Tonal Music. MIT Press: The MIT Press Series.
Levitin, D. (2006). This is your brain on music: The science of a human obsession. NY, USA: Dutton.
Lidy, T., & Rauber, A. (2005). Evaluation of feature extractors and psycho-acoustic transformations for music genre classification. In Proceedings of the ISMIR (pp. 34–4).
Liu, Y., & Toussaint, G. T. (2012). Mathematical notation, representation, and visualization of musical rhythm: A comparative perspective. International Journal of Machine Learning and Computing, 2(3), 261–265.
Logan, B., & Salomon, A. (2001). A music similarity function based on signal analysis. In ICME.
McKinney, M. F., Moelants, D., Davies, M. E., & Klapuri, A. (2007). Evaluation of audio beat tracking and music tempo extraction algorithms. Journal of New Music Research, 36(1), 1–16.
Pampalk, E., Dixon, S., & Widmer, G. (2003). On the evaluation of perceptual similarity measures for music. In Proceedings of the DAFx-03 (pp. 7–12).
Parncutt, R. (1994). A perceptual model of pulse salience and metrical accent in musical rhythms. Music Perception, 11(4), 409–464.
Paulus, J., & Klapuri, A. (2002). Measuring the similarity of rhythmic patterns. In Proceedings of the ISMIR.
Pohle, T., Schnitzer, D., Schedl, M., Knees, P., & Widmer, G. (2009). On rhythm and general music similarity. In Proceedings of the ISMIR (pp. 525–530).
Ravignani, A. (2017). Visualizing and interpreting rhythmic patterns using phase space plots. Music Perception, 34(5), 557–568.
Ravignani, A., Thompson, B., Lumaca, M., & Grube, M. (2018). Why do durations in musical rhythms conform to small integer ratios? Frontiers in Computational Neuroscience, 12, 86.
Roads, C. (1996). The computer music tutorial. Cambridge, MA, USA: MIT Press.
Roads, C. (2001). Microsound (1st ed.). The MIT Press.
Rohrmeier, M. (2020). Towards a formalization of musical rhythm. In Proceedings of the ISMIR (pp. 621–629). Montréal, Canada.
Sakai, K., Hikosaka, O., Miyauchi, S., Takino, R., Tamada, T., Iwata, N. K., et al. (1999). Neural representation of a rhythm depends on its interval ratio. Journal of Neuroscience, 19(22), 10074–10081.
Schillinger, J. (1946). The Schillinger system of musical composition. NY: Carl Fischer Inc.
Schillinger, J. (1948). Mathematical basis of the arts. New York: Philosophical Library.
Sethares, W. A. (2007). Rhythm and transforms (1st ed.). London: Springer.
Shmulevich, I., Yli-Harja, O., Coyle, E., Povel, D. J., & Lemström, K. (2001). Perceptual issues in music pattern recognition: Complexity of rhythm and key finding. Computers and the Humanities, 35(1), 23–35.
Sioros, G., Davies, M. E., & Guedes, C. (2017). A generative model for the characterization of musical rhythms. JNMR, 47(2), 114–128.
Sioros, G., Miron, M., Cocharro, D., Guedes, C., & Gouyon, F. (2013). Syncopalooza: Manipulating the syncopation in rhythmic performances. In CMMR (pp. 454–469).
Snyder, B. (2001). Music and memory: An introduction. Massachusetts, USA: MIT Press.
Tenney, J., & Polansky, L. (1980). Temporal Gestalt perception in music. Journal of Music Theory, 24(2), 205–241.
Toussaint, G. (2006). A comparison of rhythmic dissimilarity measures. Forma, 21, 129–149.
Toussaint, G. (2010). Computational geometric aspects of rhythm, melody, and voice-leading. Computational Geometry, 43(1), 2–22.
Toussaint, G. T. (2019). The geometry of musical rhythm: What makes a “good” rhythm good? (2nd ed.). Chapman and Hall/CRC.
Toussaint, G. T., Campbell, M., & Brown, N. (2011). Computational models of symbolic rhythm similarity: Correlation with human judgments. Analytical Approaches to World Music, 1(2).
Tzanetakis, G., & Cook, P. (2002). Musical genre classification of audio signals. IEEE Transactions on Speech and Audio Processing, 10(5), 293–302.
Veltkamp, R. C. (2001). Shape matching: similarity measures and algorithms. In Proceedings of International Conference on Shape Modeling and Applications (pp. 188–197).
Volotão, C., Santos, R., Erthal, G., & Dutra, L. (2010). Shape characterization with turning functions. In 17th International Conference on Systems, Signals and Image Processing. EdUFF, Rio de Janeiro.
Weihs, C., Jannach, D., Vatolkin, I., & Rudolph, G. (Eds.). (2019). Music data analysis: Foundations and applications (1st ed.). Abingdon, UK: Chapman and Hall/CRC.
Wen, O. X., & Krumhansl, C. L. (2019). Perception of pitch and time in the structurally isomorphic standard rhythmic and diatonic scale patterns. Music & Science, 2.
Wu, X., Westanmo, A., Zhou, L., & Pan, J. (2013). Serial binary interval ratios improve rhythm reproduction. Frontiers in Psychology, 4, 512.
Acknowledgements
This research was supported by the Portuguese Foundation for Science and Technology (FCT) under the doctoral grant SFRH/BD/132188/2017 and “Experimentation in music in Portuguese culture: History, contexts and practices in the 20th and 21st centuries” (POCI-01-0145-FEDER-031380) co-funded by the European Union through the Operational Program Competitiveness and Internationalization, in its ERDF component, and by national funds, through the Portuguese Foundation for Science and Technology. We thank George Sioros for the comments, which greatly contributed to the quality of the paper.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Cocharro, D., Bernardes, G., Bernardo, G., Lemos, C. (2021). A Review of Musical Rhythm Representation and (Dis)similarity in Symbolic and Audio Domains. In: Correia Castilho, L., Dias, R., Pinho, J.F. (eds) Perspectives on Music, Sound and Musicology. Current Research in Systematic Musicology, vol 10. Springer, Cham. https://doi.org/10.1007/978-3-030-78451-5_10
DOI: https://doi.org/10.1007/978-3-030-78451-5_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-78450-8
Online ISBN: 978-3-030-78451-5
eBook Packages: Literature, Cultural and Media Studies (R0)