Polyphonic Transcription: Exploring a Hybrid of Tone Models and Particle Swarm Optimisation

  • Conference paper
Evolutionary and Biologically Inspired Music, Sound, Art and Design (EvoMUSART 2012)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7247)

Abstract

Polyphonic transcription could be formulated as a supervised classification task if classifiers for all possible polyphonic combinations could be learned beforehand. In practice, however, it is infeasible to learn every classification model, because the number of possible polyphonic combinations grows exponentially. Here, we describe a novel polyphonic transcription approach that applies a hybrid of Particle Swarm Optimisation (PSO) and Tone-model techniques. This hybrid approach exploits the strengths of both the heuristic-search and the model-based approaches. In our work, only the monophonic Tone-models of all pitches are learned and employed to calculate the first-pass output of polyphonic transcription, which is then refined in the second pass by PSO. The experimental results show that the proposed hybrid approach outperforms the competing Non-negative Matrix Factorisation (NMF) approach. This paper presents and discusses the design and the experimental results of this novel approach.
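
The two-pass idea in the abstract can be illustrated with a small sketch. The abstract does not give the paper's tone-model representation, fitness function or PSO settings, so everything below is an assumption made for illustration: the toy spectra in `TONE_MODELS`, the projection-based `first_pass`, the squared-error fitness in `error`, and the inertia and acceleration constants in `pso_refine` are all hypothetical names and values, not the author's method.

```python
import random

# Hypothetical monophonic tone models: one toy spectrum per pitch
# (a fundamental plus one weaker harmonic that overlaps a neighbour).
TONE_MODELS = [
    [1.0, 0.5, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0, 0.5, 0.0],
]

def clamp(x):
    return min(1.0, max(0.0, x))

def error(weights, frame, tone_models):
    """Squared error between the frame and the weighted sum of tone models."""
    total = 0.0
    for b, observed in enumerate(frame):
        recon = sum(w * model[b] for w, model in zip(weights, tone_models))
        total += (recon - observed) ** 2
    return total

def first_pass(frame, tone_models):
    """Assumed first pass: project the frame onto each tone model independently."""
    weights = []
    for model in tone_models:
        proj = sum(f * m for f, m in zip(frame, model)) / sum(m * m for m in model)
        weights.append(clamp(proj))
    return weights

def pso_refine(frame, tone_models, start, n_particles=20, n_iters=100, seed=0):
    """Assumed second pass: a standard global-best PSO seeded with the first-pass estimate."""
    rng = random.Random(seed)
    dim = len(start)
    # Particle 0 is the first-pass estimate; the rest are jittered copies of it.
    positions = [list(start)] + [
        [clamp(x + rng.uniform(-0.3, 0.3)) for x in start]
        for _ in range(n_particles - 1)
    ]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    pbest = [list(p) for p in positions]
    pbest_err = [error(p, frame, tone_models) for p in positions]
    g = min(range(n_particles), key=pbest_err.__getitem__)
    gbest, gbest_err = list(pbest[g]), pbest_err[g]
    inertia, c1, c2 = 0.7, 1.5, 1.5  # illustrative constants, not from the paper
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c1 * rng.random() * (pbest[i][d] - positions[i][d])
                    + c2 * rng.random() * (gbest[d] - positions[i][d])
                )
                positions[i][d] = clamp(positions[i][d] + velocities[i][d])
            e = error(positions[i], frame, tone_models)
            if e < pbest_err[i]:
                pbest[i], pbest_err[i] = list(positions[i]), e
                if e < gbest_err:
                    gbest, gbest_err = list(positions[i]), e
    return gbest

# A toy polyphonic frame: pitches 0 and 2 sounding together.
frame = [a + b for a, b in zip(TONE_MODELS[0], TONE_MODELS[2])]
rough = first_pass(frame, TONE_MODELS)           # includes a spurious weight on pitch 1
refined = pso_refine(frame, TONE_MODELS, rough)  # never worse than the first pass
active = [k for k, w in enumerate(refined) if w > 0.5]
```

In this toy setup the first pass assigns some weight to pitch 1 because its fundamental coincides with a harmonic of pitch 0. Since the swarm contains the first-pass estimate as one of its particles, the refined reconstruction error cannot exceed the first-pass error, and the search is free to push that spurious weight down.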

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Phon-Amnuaisuk, S. (2012). Polyphonic Transcription: Exploring a Hybrid of Tone Models and Particle Swarm Optimisation. In: Machado, P., Romero, J., Carballal, A. (eds) Evolutionary and Biologically Inspired Music, Sound, Art and Design. EvoMUSART 2012. Lecture Notes in Computer Science, vol 7247. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29142-5_19

  • DOI: https://doi.org/10.1007/978-3-642-29142-5_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29141-8

  • Online ISBN: 978-3-642-29142-5

  • eBook Packages: Computer Science (R0)
