Non-negative Matrix Factorization and Its Variants for Audio Signal Processing

Kameoka, Hirokazu

doi:10.1007/978-4-431-55387-8_2

Part of the book series: SpringerBriefs in Statistics (JSSRES)

Abstract

In this chapter, I briefly introduce a multivariate analysis technique called non-negative matrix factorization (NMF), which has attracted a lot of attention in the field of audio signal processing in recent years. I will mention some basic properties of NMF, effects induced by the non-negative constraints, how to derive an iterative algorithm for NMF, and some attempts that have been made to apply NMF to audio processing problems.

Author information

Authors and Affiliations

The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
Hirokazu Kameoka
Nippon Telegraph and Telephone Corporation, 3-1 Morinosato Wakamiya, Atsugi, Kanagawa, 243-0198, Japan
Hirokazu Kameoka

Corresponding author

Correspondence to Hirokazu Kameoka.

Editor information

Editors and Affiliations

Faculty of Design, Kyushu University, Fukuoka, Fukuoka, Japan
Toshio Sakata

About this chapter

Cite this chapter

Kameoka, H. (2016). Non-negative Matrix Factorization and Its Variants for Audio Signal Processing. In: Sakata, T. (eds) Applied Matrix and Tensor Variate Data Analysis. SpringerBriefs in Statistics. Springer, Tokyo. https://doi.org/10.1007/978-4-431-55387-8_2

DOI: https://doi.org/10.1007/978-4-431-55387-8_2
Published: 03 February 2016
Publisher Name: Springer, Tokyo
Print ISBN: 978-4-431-55386-1
Online ISBN: 978-4-431-55387-8
eBook Packages: Mathematics and Statistics (R0)
