Temporally-Constrained Convolutive Probabilistic Latent Component Analysis for Multi-pitch Detection

  • Emmanouil Benetos
  • Simon Dixon
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7191)


In this paper, a method for multi-pitch detection which exploits the temporal evolution of musical sounds is presented. The proposed method extends the shift-invariant probabilistic latent component analysis algorithm by introducing temporal constraints using multiple Hidden Markov Models, while supporting multiple-instrument spectral templates. Thus, this model can support the representation of sound states such as attack, sustain, and decay, while the shift-invariance across log-frequency can be utilized for multi-pitch detection in music signals that contain frequency modulations or tuning changes. For note tracking, pitch-specific Hidden Markov Models are also employed in a postprocessing step. The proposed system was tested on recordings from the RWC database, the MIREX multi-F0 dataset, and on recordings from a Disklavier piano. Experimental results using a variety of error metrics show that the proposed system outperforms a non-temporally constrained model. The proposed system also outperforms state-of-the-art transcription algorithms for the RWC and Disklavier datasets.
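The note-tracking step mentioned in the abstract can be illustrated with a minimal sketch: a two-state (off/on) HMM decoded with the Viterbi algorithm, applied to the frame-wise activation of a single pitch. This is not the authors' implementation. The function name `viterbi_two_state`, the self-transition probability `p_stay`, the uniform prior, and the choice to treat each activation value directly as the probability of the "on" state are all assumptions made for illustration.

```python
import math


def viterbi_two_state(activations, p_stay=0.9, eps=1e-9):
    """Decode a binary off/on path (0 = off, 1 = on) for one pitch.

    activations: per-frame values in [0, 1], interpreted here (as an
    assumption) as the observation probability of the "on" state.
    p_stay: hypothetical probability of remaining in the same state.
    """
    if not activations:
        return []

    # Log transition matrix: log_trans[prev][cur].
    log_trans = [
        [math.log(p_stay), math.log(1.0 - p_stay)],
        [math.log(1.0 - p_stay), math.log(p_stay)],
    ]

    def log_obs(a):
        # Clip to avoid log(0); returns [log P(obs | off), log P(obs | on)].
        a = min(max(a, eps), 1.0 - eps)
        return [math.log(1.0 - a), math.log(a)]

    # Initialise with a uniform prior over the two states.
    score = [math.log(0.5) + o for o in log_obs(activations[0])]
    backptr = []

    # Forward pass: keep the best predecessor for each state and frame.
    for a in activations[1:]:
        obs = log_obs(a)
        new_score, ptr = [], []
        for cur in (0, 1):
            cands = [score[prev] + log_trans[prev][cur] for prev in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            ptr.append(best)
            new_score.append(cands[best] + obs[cur])
        score = new_score
        backptr.append(ptr)

    # Backtrack from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # A brief dip inside a sustained note and an isolated spike.
    print(viterbi_two_state([0.1, 0.1, 0.9, 0.4, 0.9, 0.9, 0.1, 0.1]))
    print(viterbi_two_state([0.1, 0.1, 0.6, 0.1, 0.1]))
```

With a high `p_stay`, the decoder bridges short dips within a note and suppresses isolated spikes, which is the usual motivation for HMM smoothing over a fixed threshold. In a full system one such chain would be run independently for each pitch.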


Keywords: Music signal analysis; probabilistic latent component analysis; hidden Markov models





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Emmanouil Benetos (1)
  • Simon Dixon (1)
  1. Centre for Digital Music, Queen Mary University of London, London, UK
