Polyphonic Transcription: Exploring a Hybrid of Tone Models and Particle Swarm Optimisation

Phon-Amnuaisuk, Somnuk

doi:10.1007/978-3-642-29142-5_19

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7247)

Included in the following conference series:

International Conference on Evolutionary and Biologically Inspired Music and Art

Abstract

Polyphonic transcription could be formulated as a supervised classification task if classifiers for all possible polyphonic combinations could be learned beforehand. In practice, however, learning every such model is infeasible because the number of possible polyphonic combinations grows exponentially. Here, we describe a novel polyphonic transcription approach that applies a hybrid of Particle Swarm Optimisation (PSO) and Tone-model techniques. This hybrid approach exploits the strengths of both the heuristic-search and the model-based approaches. In our work, only the monophonic Tone-models of all pitches are learned; these are used to calculate the first-pass output of the polyphonic transcription, which is then refined in a second pass by PSO. The experimental results show that the proposed hybrid approach outperforms the competing Non-negative Matrix Factorisation (NMF) approach. This paper presents and discusses the design and the experimental results of this novel approach.
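
The two-pass idea in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not specify the tone-model representation, the PSO fitness function, or any parameters, so everything below is an assumption made for illustration. The hypothetical tone models are synthetic harmonic templates, the first pass projects the observed spectrum onto each template, and the second pass uses a standard global-best PSO to minimise the squared reconstruction error. All function names and constants are invented for this example.

```python
import numpy as np

def make_tone_models(n_pitches=12, n_bins=256, n_harmonics=5):
    """Hypothetical monophonic tone models: one harmonic template per pitch."""
    bins = np.arange(n_bins)
    models = np.zeros((n_pitches, n_bins))
    for p in range(n_pitches):
        f0 = 10.0 * 2 ** (p / 12)  # assumed fundamental, in frequency-bin units
        for h in range(1, n_harmonics + 1):
            # Gaussian peak at each harmonic, amplitude decaying as 1/h
            models[p] += np.exp(-0.5 * (bins - h * f0) ** 2) / h
    return models / np.linalg.norm(models, axis=1, keepdims=True)

def first_pass(spectrum, models):
    """Coarse pitch activations: projection of the spectrum on each tone model."""
    return np.clip(models @ spectrum, 0.0, 1.0)

def fitness(acts, spectrum, models):
    """Squared error between the spectrum and a weighted sum of tone models."""
    return np.sum((spectrum - acts @ models) ** 2, axis=-1)

def pso_refine(init, spectrum, models, n_particles=30, n_iters=200, seed=0):
    """Second pass: global-best PSO started around the first-pass estimate."""
    rng = np.random.default_rng(seed)
    dim = init.size
    pos = np.clip(init + 0.1 * rng.standard_normal((n_particles, dim)), 0.0, 1.0)
    pos[0] = init  # keep the first-pass estimate itself in the swarm
    vel = np.zeros_like(pos)
    pbest, pbest_f = pos.copy(), fitness(pos, spectrum, models)
    gbest = pbest[np.argmin(pbest_f)].copy()
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration weights (assumed)
    for _ in range(n_iters):
        r1, r2 = rng.random((2, n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 0.0, 1.0)
        f = fitness(pos, spectrum, models)
        better = f < pbest_f
        pbest[better], pbest_f[better] = pos[better], f[better]
        gbest = pbest[np.argmin(pbest_f)].copy()
    return gbest

# Usage on a synthetic three-note mixture (pitch indices 0, 4 and 7)
models = make_tone_models()
truth = np.zeros(12)
truth[[0, 4, 7]] = 1.0
spectrum = truth @ models
coarse = first_pass(spectrum, models)
refined = pso_refine(coarse, spectrum, models)
print("first-pass error:", fitness(coarse, spectrum, models))
print("refined error:   ", fitness(refined, spectrum, models))
```

Because the first-pass estimate is placed in the initial swarm, the refined solution in this sketch can never have a larger reconstruction error than the first-pass one. How much it improves depends on the assumed templates and PSO settings.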

Author information

Authors and Affiliations

Music Informatics Research Group, Faculty of Creative Industries, Universiti Tunku Abdul Rahman, Petaling Jaya Campus, Selangor Darul Ehsan, Malaysia
Somnuk Phon-Amnuaisuk

Editor information

Editors and Affiliations

Faculty of Sciences and Technology, Department of Informatics Engineering, University of Coimbra, Pólo II - Pinhal de Marrocos, 3030, Coimbra, Portugal
Penousal Machado
School of Computer Science, Department of Communications and Information Technologies, University of A Coruña, Campus de Elviña, 15071 A Coruña, Spain
Juan Romero & Adrian Carballal

About this paper

Cite this paper

Phon-Amnuaisuk, S. (2012). Polyphonic Transcription: Exploring a Hybrid of Tone Models and Particle Swarm Optimisation. In: Machado, P., Romero, J., Carballal, A. (eds) Evolutionary and Biologically Inspired Music, Sound, Art and Design. EvoMUSART 2012. Lecture Notes in Computer Science, vol 7247. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29142-5_19

DOI: https://doi.org/10.1007/978-3-642-29142-5_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29141-8
Online ISBN: 978-3-642-29142-5
eBook Packages: Computer Science, Computer Science (R0)
