A new discriminant NMF algorithm and its application to the extraction of subtle emotional differences in speech

  • Research Article
  • Cognitive Neurodynamics

Abstract

In this study we propose a new feature extraction algorithm, discriminant non-negative matrix factorization (dNMF), which learns subtle class-related differences while maintaining an accurate generative capability. Whereas the standard non-negative matrix factorization (NMF) algorithm only minimizes the representation error, the dNMF algorithm additionally increases the between-class variance to gain discriminant power. The multiplicative NMF learning algorithm was modified to accommodate this additional constraint. The cost function was designed so that extracting the feature coefficients of a single test pattern with pre-trained feature vectors is a convex quadratic optimization problem in non-negative space, so that the solution is unique. This design also resolves issues with previous discriminant NMF algorithms. The dNMF algorithm was applied to emotion recognition in speech, a task that requires emphasizing emotional differences while de-emphasizing the dominant phonetic components. On an emotional speech database, the dNMF algorithm successfully extracted subtle emotional differences, achieved much better recognition performance, and showed a smaller representation error.
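To make the coefficient-extraction step concrete, the following is a minimal sketch of the standard (non-discriminant) case only: with the feature vectors W held fixed, the coefficients h of one test pattern x are found by minimizing ||x - W h||^2 subject to h >= 0, which is a convex quadratic problem. The sketch uses the well-known Lee and Seung multiplicative update restricted to h. It is not the dNMF cost function, which is not given in the abstract, and all names (extract_coefficients, W, x, h) and the toy data are assumptions made for illustration.

```python
# Illustrative sketch: plain NMF coefficient extraction with a fixed basis.
# This is NOT the paper's dNMF cost function; names and data are hypothetical.


def matvec(W, h):
    """Return W @ h for a matrix stored as a list of rows."""
    return [sum(w_ij * h_j for w_ij, h_j in zip(row, h)) for row in W]


def transposed_matvec(W, v):
    """Return W.T @ v without building the transpose explicitly."""
    n_rows, n_cols = len(W), len(W[0])
    return [sum(W[i][j] * v[i] for i in range(n_rows)) for j in range(n_cols)]


def squared_error(W, h, x):
    """Representation error ||x - W h||^2."""
    return sum((x_i - r_i) ** 2 for x_i, r_i in zip(x, matvec(W, h)))


def extract_coefficients(W, x, n_iter=500, eps=1e-12):
    """Minimize ||x - W h||^2 over h >= 0 with W fixed.

    Uses the multiplicative update h <- h * (W.T x) / (W.T W h), which keeps
    h non-negative as long as W, x and the starting point are non-negative.
    """
    h = [1.0] * len(W[0])            # strictly positive starting point
    Wt_x = transposed_matvec(W, x)   # constant across iterations
    for _ in range(n_iter):
        Wt_W_h = transposed_matvec(W, matvec(W, h))
        h = [h_j * num / (den + eps) for h_j, num, den in zip(h, Wt_x, Wt_W_h)]
    return h


# Toy example: three-dimensional pattern, two pre-trained feature vectors.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
x = [2.0, 3.0, 5.0]                  # constructed as W @ [2, 3]
h = extract_coefficients(W, x)
print(h, squared_error(W, h, x))
```

Because the toy W has full column rank, the quadratic objective is strictly convex and the non-negative minimizer is unique, which is the property the abstract relies on for test-time feature extraction.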

Acknowledgments

The main portion of this research was conducted while S. Y. Lee was visiting the RIKEN Brain Science Institute, Japan. This research was supported by the Korea Research Foundation Grant funded by the Korean Government (MOEHRD) (KRF-2008-013-D00091), and later by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2009-0092812 and 2010-0028722).

Author information

Corresponding author

Correspondence to Soo-Young Lee.

About this article

Cite this article

Lee, SY., Song, HA. & Amari, Si. A new discriminant NMF algorithm and its application to the extraction of subtle emotional differences in speech. Cogn Neurodyn 6, 525–535 (2012). https://doi.org/10.1007/s11571-012-9213-1

  • DOI: https://doi.org/10.1007/s11571-012-9213-1
