Ensemble Models for Enhancement of an Arabic Speech Emotion Recognition System

  • Conference paper
Advances in Information and Communication (FICC 2019)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 70)

Abstract

Ensemble classification models have been widely used in machine learning to enhance the performance of single classifiers. In this paper, we study the effect of employing five ensemble models, namely Bagging, AdaBoost, LogitBoost, Random Subspace and Random Committee, on a vocal emotion recognition system. The system recognizes happy, angry, and surprised emotions from natural Arabic speech, where the highest accuracy among single classifiers, 95.52%, is obtained by SMO. After applying the ensemble models to 19 single classifiers, the best enhanced accuracy is 95.95%, also achieved by SMO. The highest improvement in accuracy, 19.09%, was achieved by the Boosting technique with Naïve Bayes Multinomial as the base classifier.
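
The idea of wrapping a single classifier in an ensemble can be illustrated with a minimal sketch of Bagging, one of the five models named above. This is not the authors' implementation: the nearest-centroid base learner, the two-dimensional Gaussian toy data, and every function name are assumptions chosen to keep the example self-contained, and the accuracies it prints are unrelated to the figures reported in the paper.

```python
import random
from collections import Counter


def train_centroid(samples):
    """Hypothetical base learner: store the mean feature vector of each class."""
    sums, counts = {}, Counter()
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroid(model, features):
    """Predict the class whose centroid is closest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: dist(model[label]))


def train_bagging(samples, n_models=15, seed=0):
    """Bagging: fit each base model on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(samples) for _ in samples]  # sample with replacement
        models.append(train_centroid(bootstrap))
    return models


def predict_bagging(models, features):
    """Combine the base models' predictions by majority vote."""
    votes = Counter(predict_centroid(m, features) for m in models)
    return votes.most_common(1)[0][0]


def make_toy_data(n_per_class=60, seed=1):
    """Assumed stand-in data: three Gaussian blobs, labelled with the three emotions."""
    rng = random.Random(seed)
    centers = {"happy": (0.0, 0.0), "angry": (4.0, 0.0), "surprise": (2.0, 4.0)}
    data = []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            data.append(([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)], label))
    rng.shuffle(data)
    return data


def accuracy(predict, samples):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(f) == y for f, y in samples) / len(samples)


data = make_toy_data()
train, test = data[:120], data[120:]
single = train_centroid(train)
ensemble = train_bagging(train)
single_acc = accuracy(lambda f: predict_centroid(single, f), test)
bagged_acc = accuracy(lambda f: predict_bagging(ensemble, f), test)
print(f"single: {single_acc:.3f}  bagged: {bagged_acc:.3f}")
```

On data this easy the two accuracies are expected to be close; bagging tends to help most when the base classifier is unstable, which a nearest-centroid rule is not. Swapping in a different base learner only requires replacing `train_centroid` and `predict_centroid`.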

Correspondence to Samira Klaylat.

© 2020 Springer Nature Switzerland AG

Cite this paper

Zantout, R., Klaylat, S., Hamandi, L., Osman, Z. (2020). Ensemble Models for Enhancement of an Arabic Speech Emotion Recognition System. In: Arai, K., Bhatia, R. (eds) Advances in Information and Communication. FICC 2019. Lecture Notes in Networks and Systems, vol 70. Springer, Cham. https://doi.org/10.1007/978-3-030-12385-7_15
