Extraction of novel features for emotion recognition

  • Information Technology
Journal of Shanghai University (English Edition)

Abstract

The Hilbert-Huang transform (HHT) has been widely used since its inception because of its advantages in a variety of application areas. The resulting Hilbert spectrum accurately reflects the distribution of signal energy across a number of scales. In this paper, a novel feature called ECC is proposed, extracted from the Hilbert energy spectrum, which describes the distribution of the instantaneous energy. The experimental results demonstrate that ECC outperforms the traditional short-term average energy. Combining ECC with mel frequency cepstral coefficients (MFCC) captures the distribution of energy in both the time domain and the frequency domain, and this feature group achieves better recognition than the combination of short-term average energy, pitch and MFCC. Further improvements of ECC are then developed: TECC is obtained by combining ECC with the Teager energy operator, and EFCC is obtained by introducing the instantaneous frequency into the energy. In the experiments, seven emotional states are selected for recognition; the highest recognition rate achieved is 83.57%, with the classification accuracy for boredom reaching 100%. The numerical results indicate that the proposed features ECC, TECC and EFCC can substantially improve the performance of speech emotion recognition.
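
The abstract names two standard energy measures, the short-term average energy used as the baseline and the Teager energy operator used in TECC, without giving formulas. As a rough illustration only, the sketch below uses the common discrete form of the Teager energy operator, psi[n] = x[n]^2 - x[n-1] * x[n+1], and a plain frame-wise mean of squared samples. The function names, frame length, hop size and test tone are assumptions made for this example; the abstract does not specify how the paper computes ECC, TECC or EFCC, and this code does not attempt to reproduce them.

```python
import math

def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns values for the interior samples only (n = 1 .. len(x) - 2).
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def short_term_average_energy(x, frame_len, hop):
    """Mean of the squared samples in each frame (frame_len and hop are
    illustrative parameters, not values taken from the paper)."""
    return [
        sum(s * s for s in x[start:start + frame_len]) / frame_len
        for start in range(0, len(x) - frame_len + 1, hop)
    ]

# Hypothetical test signal: a pure tone A * cos(omega * n).
# For such a tone the operator gives the constant A^2 * sin(omega)^2,
# so it depends on both amplitude and frequency.
A, omega = 0.5, 0.2
tone = [A * math.cos(omega * n) for n in range(400)]

teo = teager_energy(tone)
ste = short_term_average_energy(tone, frame_len=100, hop=50)
```

For the pure tone, every value in `teo` equals A^2 * sin(omega)^2 up to rounding, whereas each frame in `ste` stays close to A^2 / 2 whatever the frequency. This is one way to see why a Teager-based measure carries frequency information that the short-term average energy does not.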

Author information

Corresponding author

Correspondence to Xin Li (李 昕).

Additional information

Project supported by the State Key Laboratory of Robotics and System (Grant No. SKLS-2009-MS-10) and the Shanghai Leading Academic Discipline Project (Grant No. J50103).

About this article

Cite this article

Li, X., Zheng, Y. & Li, X. Extraction of novel features for emotion recognition. J. Shanghai Univ. (Engl. Ed.) 15, 479–486 (2011). https://doi.org/10.1007/s11741-011-0772-3

  • DOI: https://doi.org/10.1007/s11741-011-0772-3
