Abstract
Automatic emotion recognition from speech signals, without linguistic cues, has become an important research area. Integrating emotion into human–computer interaction is essential for effectively simulating real-life scenarios. Most research has focused on recognizing emotions in acted speech, while little work has addressed natural, real-life utterances. English, French, German, and Chinese corpora have been used for this purpose, whereas no natural Arabic corpus has been available to date. In this paper, emotion recognition in spoken Arabic is studied for the first time. A realistic speech corpus is collected from Arabic TV shows, and the videos are labeled with their perceived emotions: happy, angry, or surprised. Prosodic features are extracted and thirty-five classification methods are applied. The results are analyzed, and conclusions and future recommendations are presented.
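The pipeline the abstract describes — extract utterance-level prosodic statistics (pitch, energy) and feed them to a classifier — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the frame sizes, the autocorrelation pitch estimator, and the synthetic "happy"/"angry" signals are all assumptions made for the example.

```python
import numpy as np

def prosodic_features(signal, sr=16000, frame=400, hop=160):
    """Per-frame energy and a crude autocorrelation pitch estimate,
    summarized into utterance-level statistics (mean, std, range)."""
    frames = [signal[i:i + frame] for i in range(0, len(signal) - frame, hop)]
    energies, pitches = [], []
    for f in frames:
        energies.append(float(np.sum(f ** 2)))
        # Autocorrelation pitch estimate, restricted to 80-400 Hz lags.
        ac = np.correlate(f, f, mode="full")[frame - 1:]
        lo, hi = sr // 400, sr // 80
        lag = lo + int(np.argmax(ac[lo:hi]))
        pitches.append(sr / lag)
    feats = []
    for x in (np.array(energies), np.array(pitches)):
        feats += [x.mean(), x.std(), x.max() - x.min()]
    return np.array(feats)  # [e_mean, e_std, e_range, p_mean, p_std, p_range]

# Synthetic stand-ins: a higher-pitched "happy" utterance and a
# louder, lower-pitched "angry" one (illustrative labels only).
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000
happy = np.sin(2 * np.pi * 300 * t) + 0.05 * rng.standard_normal(16000)
angry = 3.0 * np.sin(2 * np.pi * 120 * t) + 0.05 * rng.standard_normal(16000)

f_happy = prosodic_features(happy)
f_angry = prosodic_features(angry)
```

In a real system the six-dimensional feature vector per utterance would then be passed to any of the thirty-five classifiers the paper evaluates (SVM, decision trees, instance-based learners, etc.).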
Cite this article
Klaylat, S., Osman, Z., Hamandi, L. et al. Emotion recognition in Arabic speech. Analog Integr Circ Sig Process 96, 337–351 (2018). https://doi.org/10.1007/s10470-018-1142-4