k-Best Unit Selection Strategies for Musical Concatenative Synthesis

Nuanáin, Cárthach Ó; Herrera, Perfecto; Jordá, Sergi

doi:10.1007/978-3-030-01692-0_6

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11265)

Included in the following conference series:

International Symposium on Computer Music Multidisciplinary Research

Abstract

Concatenative synthesis is a sample-based approach to sound creation used frequently in speech synthesis and, increasingly, in musical contexts. Unit selection, a key component, is the process by which sounds are chosen from the corpus of samples. Because they can match target units while preserving continuity, Hidden Markov Models are often chosen for this task, but a common criticism is that their single-path output is too restrictive when variations are desired. In this article, we propose treating the problem as one of k-Best path solving, which generates lists of alternative candidate solutions, and we summarise our implementations along with some practical examples.
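
To illustrate what k-Best path solving means for a Hidden Markov Model, the following is a minimal sketch and not the authors' implementation: a list-style Viterbi decoder that keeps up to k partial paths per state at every step instead of one. The function name k_best_viterbi, the toy two-state model, and all probability values are assumptions chosen purely for illustration.

```python
import heapq
import itertools


def k_best_viterbi(pi, A, B, obs, k):
    """Return up to k (probability, path) pairs, most probable first.

    pi[s]      : initial probability of state s
    A[i][j]    : transition probability from state i to state j
    B[s][o]    : probability of emitting observation o in state s
    obs        : sequence of observation indices
    """
    n_states = len(pi)
    # beams[s] holds up to k (probability, path) pairs for paths ending in s.
    beams = [[(pi[s] * B[s][obs[0]], (s,))] for s in range(n_states)]
    for o in obs[1:]:
        new_beams = []
        for s in range(n_states):
            # Extend every surviving path from every predecessor state into s.
            candidates = (
                (p * A[prev][s] * B[s][o], path + (s,))
                for prev in range(n_states)
                for p, path in beams[prev]
            )
            # Keep only the k most probable extensions that end in s.
            new_beams.append(heapq.nlargest(k, candidates, key=lambda c: c[0]))
        beams = new_beams
    # Merge the per-state lists and keep the k best complete paths.
    finals = itertools.chain.from_iterable(beams)
    return heapq.nlargest(k, finals, key=lambda c: c[0])


# Toy model with two hidden states and three observation symbols
# (hypothetical values, for illustration only).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
obs = [0, 1, 2]

for prob, path in k_best_viterbi(pi, A, B, obs, k=3):
    print(path, prob)
```

With k=1 this reduces to ordinary Viterbi decoding; larger values of k return alternative paths that can be offered as variations. A practical implementation would typically work with log-probabilities to avoid underflow on long sequences.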

Notes

1. https://github.com/carthach/kBestViterbi/blob/master/kBestViterbi.py.
2. https://en.wikipedia.org/wiki/Viterbi_algorithm.
3. A heap queue is a binary tree with the special condition that every parent has a value less than or equal to that of its children (this is a minimum queue; a maximum queue is naturally the inverse). The important function in our case is the push function, which adds items to the tree and maintains the heap property in O(log n) time.
4. In fact, the system is sufficiently decoupled that any of these logical stages can be performed separately for their own purpose. For example, the tool can be used solely for slicing sounds or for performing batch feature analysis on a library for the purposes of MIR.
5. http://www.breakfastquay.com/rubberband/.
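
The heap queue described in note 3 can be sketched with Python's standard heapq module, which provides a minimum queue. This is an illustrative example only; the helper name push_all and the sample cost values are assumptions. A maximum queue is emulated here by pushing negated values.

```python
import heapq


def push_all(values):
    """Build a heap by pushing items one at a time (O(log n) per push)."""
    heap = []
    for v in values:
        heapq.heappush(heap, v)
    return heap


costs = [5.0, 1.5, 3.2, 0.7, 4.1]  # hypothetical path costs

min_heap = push_all(costs)
max_heap = push_all([-c for c in costs])  # negate values to get a maximum queue

smallest = min_heap[0]   # the root of a minimum heap is the smallest item
largest = -max_heap[0]   # undo the negation to recover the largest item

# Popping repeatedly yields the items in ascending order.
ascending = [heapq.heappop(min_heap) for _ in range(len(costs))]
print(smallest, largest, ascending)
```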

Author information

Authors and Affiliations

Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain
Cárthach Ó Nuanáin, Perfecto Herrera & Sergi Jordá

Corresponding author

Correspondence to Cárthach Ó Nuanáin.

Editor information

Editors and Affiliations

Laboratoire PRISM, AMU-CNRS, Marseille, France
Mitsuko Aramaki
INESC TEC, Porto, Portugal
Matthew E. P. Davies
Laboratoire PRISM, AMU-CNRS, Marseille, France
Richard Kronland-Martinet
Laboratoire PRISM, AMU-CNRS, Marseille, France
Sølvi Ystad

About this paper

Cite this paper

Nuanáin, C.Ó., Herrera, P., Jordá, S. (2018). k-Best Unit Selection Strategies for Musical Concatenative Synthesis. In: Aramaki, M., Davies, M., Kronland-Martinet, R., Ystad, S. (eds) Music Technology with Swing. CMMR 2017. Lecture Notes in Computer Science, vol 11265. Springer, Cham. https://doi.org/10.1007/978-3-030-01692-0_6

DOI: https://doi.org/10.1007/978-3-030-01692-0_6
Published: 24 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01691-3
Online ISBN: 978-3-030-01692-0
eBook Packages: Computer Science, Computer Science (R0)