Abstract
In this chapter, I briefly introduce a multivariate analysis technique called non-negative matrix factorization (NMF), which has attracted considerable attention in audio signal processing in recent years. I cover some basic properties of NMF, the effects induced by the non-negativity constraints, the derivation of an iterative algorithm for NMF, and several attempts to apply NMF to audio processing problems.
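As a rough illustration of the kind of iterative algorithm referred to above, the following minimal sketch assumes the squared Euclidean distance as the fitting criterion and uses the well-known multiplicative update rules for that criterion. The function name nmf_multiplicative, the symbols Y, H and U, and all parameter defaults are assumptions made for illustration; they are not taken from the chapter, which may use other criteria and notation.

```python
import numpy as np


def nmf_multiplicative(Y, rank, n_iter=200, eps=1e-12, seed=0):
    """Approximate a non-negative matrix Y (K x N) by H @ U with H, U >= 0.

    Illustrative sketch: multiplicative updates for the squared
    Euclidean distance ||Y - H U||^2. Names and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    K, N = Y.shape
    # Strictly positive random initialization keeps every entry updatable.
    H = rng.random((K, rank)) + eps   # basis vectors (e.g. spectral templates)
    U = rng.random((rank, N)) + eps   # activations (e.g. gains over time)
    errors = []
    for _ in range(n_iter):
        # Each factor is multiplied element-wise by a non-negative ratio,
        # so non-negativity is preserved without any explicit projection.
        U *= (H.T @ Y) / (H.T @ H @ U + eps)
        H *= (Y @ U.T) / (H @ U @ U.T + eps)
        errors.append(float(np.sum((Y - H @ U) ** 2)))
    return H, U, errors


if __name__ == "__main__":
    # Toy data: a non-negative matrix that is exactly rank 2.
    rng = np.random.default_rng(1)
    Y = rng.random((6, 2)) @ rng.random((2, 8))
    H, U, errors = nmf_multiplicative(Y, rank=2)
    print("first error:", errors[0])
    print("last error: ", errors[-1])
```

In an audio setting, Y would typically be a magnitude or power spectrogram, with the columns of H read as spectral templates and the rows of U as their time-varying gains. The small constant eps only guards against division by zero and is a design choice of this sketch.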
Copyright information
© 2016 The Author(s)
About this chapter
Cite this chapter
Kameoka, H. (2016). Non-negative Matrix Factorization and Its Variants for Audio Signal Processing. In: Sakata, T. (ed.) Applied Matrix and Tensor Variate Data Analysis. SpringerBriefs in Statistics. Springer, Tokyo. https://doi.org/10.1007/978-4-431-55387-8_2
DOI: https://doi.org/10.1007/978-4-431-55387-8_2
Publisher Name: Springer, Tokyo
Print ISBN: 978-4-431-55386-1
Online ISBN: 978-4-431-55387-8
eBook Packages: Mathematics and Statistics (R0)