Abstract
Polyphonic transcription could be formulated as a supervised classification task if classifiers for all possible polyphonic combinations could be learned beforehand. However, learning all of these classification models is impractical because the number of possible polyphonic combinations grows exponentially. Here, we describe a novel polyphonic transcription approach that applies a hybrid of Particle Swarm Optimisation (PSO) and Tone-model techniques. This hybrid approach exploits the strengths of both the heuristic-search and the model-based approaches. In our work, only the monophonic Tone-models of all pitches are learned; they are used to calculate the first-pass output of polyphonic transcription, which is then refined in a second pass by PSO. The experimental results show that the proposed hybrid approach outperforms the competing Non-negative Matrix Factorisation (NMF) approach. This paper presents and discusses the design and the experimental results of this approach.
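The two-pass scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of the Tone-models, the first-pass scoring rule, the PSO objective, or any parameter values, so everything below is an assumption made for illustration: Tone-models are taken to be fixed magnitude spectra, the first pass thresholds a normalised projection, and the second pass runs a standard global-best PSO over per-pitch activation weights in [0, 1] to minimise squared reconstruction error. All function names and the toy spectra are hypothetical.

```python
import random

def reconstruct(weights, templates):
    """Weighted sum of the (assumed) monophonic tone-model spectra."""
    n_bins = len(templates[0])
    return [sum(w * t[b] for w, t in zip(weights, templates)) for b in range(n_bins)]

def error(weights, templates, observed):
    """Squared error between the observed spectrum and its reconstruction."""
    est = reconstruct(weights, templates)
    return sum((o - e) ** 2 for o, e in zip(observed, est))

def first_pass(templates, observed, threshold=0.3):
    """Assumed first pass: a pitch is active when the normalised projection
    of the observed spectrum onto its tone model reaches the threshold."""
    guess = []
    for t in templates:
        score = sum(o * x for o, x in zip(observed, t)) / sum(x * x for x in t)
        guess.append(1.0 if score >= threshold else 0.0)
    return guess

def pso_refine(init, templates, observed, n_particles=20, n_iter=100, seed=0):
    """Assumed second pass: global-best PSO seeded with the first-pass guess."""
    rng = random.Random(seed)
    dim = len(init)
    clip = lambda x: min(1.0, max(0.0, x))
    # Particle 0 is the first-pass guess itself; the rest are perturbed copies.
    pos = [list(init)] + [[clip(x + rng.uniform(-0.5, 0.5)) for x in init]
                          for _ in range(n_particles - 1)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [list(p) for p in pos]
    pbest_err = [error(p, templates, observed) for p in pos]
    g = min(range(n_particles), key=pbest_err.__getitem__)
    gbest, gbest_err = list(pbest[g]), pbest_err[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration weights (illustrative values)
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = clip(pos[i][d] + vel[i][d])
            e = error(pos[i], templates, observed)
            if e < pbest_err[i]:
                pbest[i], pbest_err[i] = list(pos[i]), e
                if e < gbest_err:
                    gbest, gbest_err = list(pos[i]), e
    return gbest, gbest_err

# Toy data (hypothetical): three pitches over six spectral bins.
templates = [
    [1.0, 0.0, 0.5, 0.0, 0.25, 0.0],
    [0.0, 1.0, 0.0, 0.5, 0.0, 0.25],
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.5],
]
observed = [1.0, 0.0, 0.5, 0.0, 0.25, 0.0]  # only the first pitch is sounding

init = first_pass(templates, observed)       # over-detects the third pitch
refined, refined_err = pso_refine(init, templates, observed)
notes = [w >= 0.5 for w in refined]          # threshold weights to note decisions
```

In this toy case the third pitch shares a spectral bin with the first, so the assumed first pass marks it active by mistake. Because the swarm includes the first-pass guess as one of its particles, the refined error can never exceed the first-pass error; whether the refinement recovers the true note set depends on the objective and on the swarm settings.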
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Phon-Amnuaisuk, S. (2012). Polyphonic Transcription: Exploring a Hybrid of Tone Models and Particle Swarm Optimisation. In: Machado, P., Romero, J., Carballal, A. (eds) Evolutionary and Biologically Inspired Music, Sound, Art and Design. EvoMUSART 2012. Lecture Notes in Computer Science, vol 7247. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29142-5_19
DOI: https://doi.org/10.1007/978-3-642-29142-5_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29141-8
Online ISBN: 978-3-642-29142-5
eBook Packages: Computer Science, Computer Science (R0)