
Music Synchronization

  • Meinard Müller
Chapter

Abstract

Music can be described and represented in many different ways, including sheet music, symbolic representations, and audio recordings. For each of these representations, there may exist different versions that correspond to the same musical work. For example, for Beethoven’s Fifth Symphony one can find a large number of music recordings performed by different orchestras and conductors. The general goal of music synchronization is to automatically link the various data streams, thus interrelating the multiple information sets related to a given musical work. More precisely, synchronization is taken to mean a procedure which, for a given position in one representation of a piece of music, determines the corresponding position within another representation.

Keywords

Dynamic Time Warping · Longest Common Subsequence · Music Information Retrieval · Musical Work · Chroma Feature
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
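The keywords above already point to one common way of realizing such a position mapping. The following is a minimal illustrative sketch, not the chapter’s own algorithm: it assumes that both versions of a piece have been converted into sequences of chroma-like feature vectors and uses classical dynamic time warping (DTW) with a cosine-based local cost to compute an alignment path linking corresponding positions. The function name, the cost measure, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def dtw_alignment(X, Y):
    """Compute a DTW alignment path between feature sequences X (N x d) and Y (M x d).

    Returns a list of index pairs (n, m) linking position n in X to position m in Y,
    using the standard step sizes (1, 0), (0, 1), (1, 1).
    """
    N, M = len(X), len(Y)
    # Local cost matrix: 1 - cosine similarity between normalized feature vectors.
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-9)
    Yn = Y / (np.linalg.norm(Y, axis=1, keepdims=True) + 1e-9)
    C = 1.0 - Xn @ Yn.T

    # Accumulated cost matrix D via dynamic programming.
    D = np.full((N, M), np.inf)
    D[0, 0] = C[0, 0]
    for n in range(N):
        for m in range(M):
            if n == 0 and m == 0:
                continue
            candidates = []
            if n > 0:
                candidates.append(D[n - 1, m])
            if m > 0:
                candidates.append(D[n, m - 1])
            if n > 0 and m > 0:
                candidates.append(D[n - 1, m - 1])
            D[n, m] = C[n, m] + min(candidates)

    # Backtracking from (N-1, M-1) to (0, 0) yields the optimal warping path.
    n, m = N - 1, M - 1
    path = [(n, m)]
    while (n, m) != (0, 0):
        if n == 0:
            m -= 1
        elif m == 0:
            n -= 1
        else:
            step = np.argmin([D[n - 1, m - 1], D[n - 1, m], D[n, m - 1]])
            if step == 0:
                n, m = n - 1, m - 1
            elif step == 1:
                n -= 1
            else:
                m -= 1
        path.append((n, m))
    return path[::-1]

# Toy example: two 12-dimensional chroma-like sequences of different length.
rng = np.random.default_rng(0)
X = rng.random((8, 12))
Y = rng.random((11, 12))
print(dtw_alignment(X, Y))
```

Each pair (n, m) on the returned path links a frame of the first version to a frame of the second; in a real system the feature sequences would come from the actual score and audio representations rather than random data.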



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. International Audio Laboratories Erlangen, Erlangen, Germany
