k-Best Unit Selection Strategies for Musical Concatenative Synthesis

  • Cárthach Ó Nuanáin
  • Perfecto Herrera
  • Sergi Jordà
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11265)


Concatenative synthesis is a sample-based approach to sound creation used frequently in speech synthesis and, increasingly, in musical contexts. Unit selection, a key component, is the process by which sounds are chosen from the corpus of samples. Because they can match target units while also preserving continuity, Hidden Markov Models are often chosen for this task, but a common criticism is that they output a single path, which is considered too restrictive when variations are desired. In this article, we propose framing the problem as k-best path solving in order to generate lists of alternative candidate solutions, and we summarise our implementations along with some practical examples.
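To make the idea of k-best unit selection concrete, the following is a minimal sketch of a list-Viterbi-style search over a cost trellis. It is not the authors' implementation: the function name `k_best_paths`, the two cost matrices, and the use of additive costs (for example, negative log probabilities) in place of HMM probabilities are all assumptions made for illustration.

```python
import heapq
from typing import List, Sequence, Tuple

Hypothesis = Tuple[float, Tuple[int, ...]]  # (total cost, unit indices)


def k_best_paths(
    target_cost: Sequence[Sequence[float]],
    concat_cost: Sequence[Sequence[float]],
    k: int,
) -> List[Hypothesis]:
    """Return up to k lowest-cost unit sequences, cheapest first.

    target_cost[t][j]: hypothetical cost of corpus unit j at target position t.
    concat_cost[i][j]: hypothetical cost of unit i being followed by unit j.
    """
    n_steps = len(target_cost)
    n_units = len(target_cost[0])

    # beams[j] keeps the best partial paths (at most k) that end in unit j.
    beams: List[List[Hypothesis]] = [
        [(target_cost[0][j], (j,))] for j in range(n_units)
    ]

    for t in range(1, n_steps):
        new_beams: List[List[Hypothesis]] = []
        for j in range(n_units):
            # Extend every surviving hypothesis from every predecessor unit.
            candidates = (
                (cost + concat_cost[i][j] + target_cost[t][j], path + (j,))
                for i in range(n_units)
                for cost, path in beams[i]
            )
            # Keeping k entries per state (instead of 1) is what turns
            # ordinary Viterbi decoding into a k-best search.
            new_beams.append(heapq.nsmallest(k, candidates))
        beams = new_beams

    # Merge the per-state lists and keep the k cheapest complete paths.
    return heapq.nsmallest(k, (hyp for beam in beams for hyp in beam))
```

With k = 1 this reduces to ordinary single-path Viterbi decoding. Larger values of k return progressively more expensive alternatives that a musician could audition as variations.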


Hidden Markov Models, Concatenative synthesis, Artificial intelligence, Musical signal processing



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Cárthach Ó Nuanáin (1)
  • Perfecto Herrera (1)
  • Sergi Jordà (1)
  1. Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain
