Music Technology and Education

  • Estefanía Cano
  • Christian Dittmar
  • Jakob Abeßer
  • Christian Kehling
  • Sascha Grollmisch
Part of the Springer Handbooks book series (SHB)


In this chapter, the application of music information retrieval (MIR) technologies in the development of music education tools is addressed. First, the relationship between technology and music education is described from a historical point of view, starting with the earliest attempts to use audio technology for education and ending with the latest developments and current research conducted in the field. Second, three MIR technologies used within a music education context are presented:
  1. The use of pitch-informed solo and accompaniment separation as a tool for the creation of practice content
  2. Drum transcription for real-time music practice
  3. Guitar transcription with plucking style and expression style detection.

In each case, the proposed methods are described and evaluated. Objective perceptual quality metrics were used to evaluate the proposed method for solo/accompaniment separation on a dataset of 17 real-world multitrack recordings. Mean overall perceptual scores (OPS) of 24.68 and 34.68 were obtained for the solo and accompaniment tracks, respectively. These scores are on par with state-of-the-art methods with respect to the perceptual quality of separated music signals. In the drum sound detection task, an F-measure of 0.96 was obtained for snare drum, kick drum, and hi-hat detection. This evaluation used a dataset of 30 manually annotated real-world drum loops and an onset tolerance of 50 ms. For the guitar plucking style and guitar expression style detection tasks, F-measures of 0.93 and 0.83 were obtained, respectively. This evaluation used a dataset of 261 recordings containing isolated notes as well as monophonic and polyphonic melodies, all with note-wise annotations. To conclude the chapter, the remaining challenges that must be addressed to use MIR technologies more effectively in the development of music education applications are described.
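
As an illustration of how an F-measure with an onset tolerance of 50 ms can be computed, the following minimal sketch matches detected onsets to annotated onsets one-to-one and derives precision, recall, and F-measure from the number of matches. The function name `onset_f_measure`, the greedy matching strategy, and the example onset times are assumptions made for this illustration; the abstract does not specify the exact matching procedure or how scores were aggregated across the three drum instruments.

```python
def onset_f_measure(reference, detected, tolerance=0.05):
    """Return (f_measure, precision, recall) for onset times given in seconds.

    A detected onset counts as a hit if it lies within +/- tolerance of a
    reference onset that has not been matched yet (greedy, one-to-one).
    This is an illustrative sketch, not the chapter's evaluation code.
    """
    ref = sorted(reference)
    det = sorted(detected)
    i = j = hits = 0
    while i < len(ref) and j < len(det):
        diff = det[j] - ref[i]
        if abs(diff) <= tolerance:
            hits += 1      # matched pair: consume both onsets
            i += 1
            j += 1
        elif diff < 0:
            j += 1         # detection precedes the window: false positive
        else:
            i += 1         # reference has no detection in range: miss
    precision = hits / len(det) if det else 0.0
    recall = hits / len(ref) if ref else 0.0
    if precision + recall == 0:
        return 0.0, precision, recall
    f_measure = 2 * precision * recall / (precision + recall)
    return f_measure, precision, recall


# Hypothetical annotated and detected onsets (seconds) for one drum loop.
annotated = [0.5, 1.0, 1.5, 2.0]
detected = [0.52, 1.08, 1.49, 2.6]
f, p, r = onset_f_measure(annotated, detected)
print(f, p, r)  # 0.5 0.5 0.5
```

In this example, two of the four detections fall within 50 ms of an annotated onset, so precision and recall are both 0.5. A wider tolerance would accept the detection at 1.08 s as well, which is why the tolerance has to be reported alongside the score.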




artifact-related perceptual score
blind harmonic adaptive decomposition
common amplitude modulation
differential evolution
hidden Markov model
independent component analysis
instantaneous frequency
interactive music tuition system
interference-related perceptual score
independent subspace analysis
informed source separation
Kullback–Leibler divergence
Entwicklung und empirische Validierung eines Modells musikpraktischer Kompetenzen (development and empirical validation of a model of practical music competencies)
latent harmonic allocation
Mel-frequency cepstral coefficient
musical instrument digital interface
music information retrieval
nonnegative independent component analysis
nonnegative matrix factorization
nonnegative tensor factorization
overall perceptual score
principal component analysis
perceptual evaluation methods for audio source separation
prior subspace analysis
short-term Fourier transform/short-time Fourier transform
support vector machine
target-related perceptual score


Copyright information

© Springer-Verlag Berlin Heidelberg 2018

Authors and Affiliations

  • Estefanía Cano (1)
  • Christian Dittmar (2)
  • Jakob Abeßer (3, corresponding author)
  • Christian Kehling (4)
  • Sascha Grollmisch (3)
  1. Fraunhofer IDMT, Ilmenau, Germany
  2. International Audio Laboratories Erlangen, Erlangen, Germany
  3. Fraunhofer IDMT, Ilmenau, Germany
  4. Neways Technologies, Erfurt, Germany
