Statistical Approach to Automatic Expressive Rendition of Polyphonic Piano Music

  • Tae Hun Kim
  • Satoru Fukayama
  • Takuya Nishimoto
  • Shigeki Sagayama


In this chapter, we discuss how to render expressive polyphonic piano music through a statistical approach. Generating polyphonic expression is an important element in achieving automatic expressive piano performance, since the piano is a polyphonic instrument. We begin by discussing the features of polyphonic piano expression and present a method for modeling it based on an approximation involving melodies and harmonies. An experimental evaluation indicates that performances generated with the proposed method achieved polyphonic expression and created an impression of expressiveness. In addition, performances generated with models trained on different performances were perceptually distinguishable by human listeners. Finally, we introduce an automatic expressive piano performance system called Polyhymnia, which won first place in the autonomous section of the Performance Rendering Contest for Computer Systems (RenCon) in 2010.
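The statistical approach sketched in the abstract can be illustrated in miniature. The toy model below is not the chapter's actual model (which, per the keywords, is based on conditional random fields over score features); it is a hypothetical linear-chain decoder in the same spirit: each note's score feature (here, a single beat-strength value) scores candidate dynamics labels, a transition term encourages smooth dynamics, and Viterbi decoding picks the most likely label sequence. All names, features, and weights are illustrative assumptions.

```python
# Toy illustration of statistical expression rendering:
# score dynamics labels per note from a score feature, add a
# smoothness term between adjacent notes, decode with Viterbi.

LABELS = ["p", "mf", "f"]

def emission(label, feat):
    # feat: beat strength in [0, 1]; louder labels prefer strong beats.
    target = {"p": 0.2, "mf": 0.5, "f": 0.9}[label]
    return -(feat - target) ** 2

def transition(prev, cur):
    # Small penalty for changing dynamics between adjacent notes.
    return 0.0 if prev == cur else -0.1

def viterbi(feats):
    # best[l]: score of the best label path ending in l at the current note.
    best = {l: emission(l, feats[0]) for l in LABELS}
    back = []  # backpointers, one dict per note after the first
    for f in feats[1:]:
        ptr, nxt = {}, {}
        for cur in LABELS:
            prev = max(LABELS, key=lambda p: best[p] + transition(p, cur))
            ptr[cur] = prev
            nxt[cur] = best[prev] + transition(prev, cur) + emission(cur, f)
        best = nxt
        back.append(ptr)
    # Recover the best path by backtracking.
    path = [max(best, key=best.get)]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

# Two strong beats followed by two weak ones: the decoder assigns
# loud then soft dynamics.
print(viterbi([0.9, 0.9, 0.2, 0.2]))  # → ['f', 'f', 'p', 'p']
```

In the chapter's setting, the emission and transition scores would instead be learned feature functions over melodies and harmonies, trained on recorded performances; the decoding idea is the same.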


Keywords: Conditional Random Field · Score Feature · Expressive Performance · Musical Symbol · Note Expression



This work was partially funded by the CrestMuse Project of the Japan Science and Technology Agency and supported by the Samsung Scholarship Foundation.



Copyright information

© Springer-Verlag London 2013

Authors and Affiliations

  • Tae Hun Kim (1)
  • Satoru Fukayama (2)
  • Takuya Nishimoto (3)
  • Shigeki Sagayama (2)
  1. Audio Communication Group, Technische Universität Berlin, Berlin, Germany
  2. Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan
  3. Olarbee Japan, Hiroshima, Japan
