Emotion recognition in Arabic speech

  • Samira Klaylat
  • Ziad Osman
  • Lama Hamandi
  • Rached Zantout

Abstract

Automatic emotion recognition from speech signals, without relying on linguistic cues, has become an important research area. Integrating emotion into human–computer interaction is essential for simulating real-life scenarios effectively. Research has focused largely on recognizing emotions in acted speech, while little work has addressed natural, real-life utterances. English, French, German, and Chinese corpora have been built for this purpose, whereas no natural Arabic corpus has been available to date. In this paper, emotion recognition in spoken Arabic data is studied for the first time. A realistic speech corpus is collected from Arabic TV shows, and the videos are labeled with their perceived emotion: happy, angry, or surprised. Prosodic features are extracted and thirty-five classification methods are applied. The results are analyzed, and conclusions and recommendations for future work are drawn.
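The pipeline the abstract describes, prosodic feature extraction followed by supervised classification, can be illustrated with a minimal sketch. The abstract does not detail the authors' toolchain or feature set, so everything below is an assumption for illustration only: librosa and scikit-learn as the libraries, pitch and energy statistics as the prosodic features, an SVM as one candidate classifier, and all file names and labels as hypothetical placeholders.

```python
# Minimal sketch of a prosodic-feature + classifier pipeline.
# NOT the authors' implementation: the libraries, features, file
# names, and classifier choice are all illustrative assumptions.
import numpy as np
import librosa                      # audio loading, pitch and energy
from sklearn.svm import SVC         # one of many possible classifiers

def prosodic_features(path):
    """Collapse pitch (F0) and energy contours into fixed-size statistics."""
    y, sr = librosa.load(path, sr=16000)
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)  # frame-level pitch track
    rms = librosa.feature.rms(y=y)[0]              # frame-level energy
    f0 = f0[np.isfinite(f0)]                       # drop undefined frames
    stats = lambda x: [np.mean(x), np.std(x), np.min(x), np.max(x)]
    return np.array(stats(f0) + stats(rms))

# Hypothetical labeled clips; a real corpus needs many clips per emotion.
corpus = [("clip01.wav", "happy"),
          ("clip02.wav", "angry"),
          ("clip03.wav", "surprised")]

X = np.vstack([prosodic_features(path) for path, _ in corpus])
labels = [emotion for _, emotion in corpus]

clf = SVC(kernel="rbf")
clf.fit(X, labels)                  # with real data, cross-validate instead
print(clf.predict(X))
```

In practice one would extract a far richer prosodic feature set and compare many classifiers under cross-validation, as the paper does with thirty-five classification methods.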

Keywords

Emotion recognition · Arabic speech · Natural corpus · Prosodic features


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Science, Beirut Arab University, Beirut, Lebanon
  2. Electrical and Computer Engineering Department, Beirut Arab University, Beirut, Lebanon
  3. Electrical and Computer Engineering Department, American University of Beirut, Beirut, Lebanon
  4. Electrical and Computer Engineering Department, Rafik Hariri University, Mechref, Lebanon