Temporally-Constrained Convolutive Probabilistic Latent Component Analysis for Multi-pitch Detection
This paper presents a method for multi-pitch detection that exploits the temporal evolution of musical sounds. The proposed method extends the shift-invariant probabilistic latent component analysis algorithm by introducing temporal constraints using multiple Hidden Markov Models, while supporting multiple-instrument spectral templates. The model can thus represent sound states such as attack, sustain, and decay, while shift-invariance across log-frequency allows multi-pitch detection in music signals that contain frequency modulations or tuning changes. For note tracking, pitch-specific Hidden Markov Models are also employed in a postprocessing step. The proposed system was tested on recordings from the RWC database, the MIREX multi-F0 dataset, and recordings from a Disklavier piano. Experimental results using a variety of error metrics show that the proposed system outperforms a non-temporally constrained model. The proposed system also outperforms state-of-the-art transcription algorithms on the RWC and Disklavier datasets.
Keywords: Music signal analysis, probabilistic latent component analysis, hidden Markov models
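The note-tracking postprocessing step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a two-state (off/on) HMM per pitch, decoded with the Viterbi algorithm over a per-frame pitch activation in [0, 1]. The function name `track_note`, the self-transition probability `p_stay`, and the observation model (the activation itself is treated as the probability of the "on" state) are illustrative assumptions and are not taken from the paper.

```python
import math


def track_note(activation, p_stay=0.9, eps=1e-6):
    """Viterbi-decode a two-state HMM (0 = off, 1 = on) for a single pitch.

    activation: per-frame pitch activations in [0, 1] (hypothetical input).
    p_stay:     assumed probability of remaining in the same state.
    Returns a list of 0/1 states, one per frame.
    """
    if not activation:
        return []
    log_stay = math.log(p_stay)
    log_switch = math.log(1.0 - p_stay)

    def log_obs(state, a):
        # Assumed observation model: P(obs | on) = a, P(obs | off) = 1 - a.
        a = min(max(a, eps), 1.0 - eps)
        return math.log(a if state == 1 else 1.0 - a)

    # Uniform prior over the initial state.
    score = [math.log(0.5) + log_obs(s, activation[0]) for s in (0, 1)]
    back = []  # back[t][s] = best predecessor of state s at frame t + 1
    for a in activation[1:]:
        new_score, ptr = [], []
        for s in (0, 1):
            cands = [
                score[p] + (log_stay if p == s else log_switch)
                for p in (0, 1)
            ]
            best = 0 if cands[0] >= cands[1] else 1
            ptr.append(best)
            new_score.append(cands[best] + log_obs(s, a))
        score = new_score
        back.append(ptr)

    # Backtrack from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path
```

With a high `p_stay`, the decoder tends to bridge brief dips inside a sustained note and to suppress isolated activation spikes, which is the smoothing behaviour usually wanted from HMM-based note tracking. In a full system, one such decoder would be run independently for each pitch.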