Tracking the Expression of Annoyance in Call Centers

  • Jon Irastorza
  • María Inés Torres
Part of the Topics in Intelligent Engineering and Informatics book series (TIEI, volume 13)


Machine learning researchers have long worked on identifying emotional cues in speech, because it is a research domain with a large number of potential applications. Many acoustic parameters have been analyzed in the search for cues that identify emotional categories, and both classical classifiers and more advanced computational approaches have been developed. Experiments have been carried out mainly on induced emotions, although research has recently been shifting toward spontaneous emotions. In this context, it is worth noting that the expression of spontaneous emotions depends on cultural factors, on the particular individual and on the specific situation. In this work, we were interested in emotional shifts during conversation. In particular, we aimed to track the annoyance shifts that appear in phone conversations with complaint services. To this end, we analyzed a set of audio files showing different ways of expressing annoyance. The call center operators identified disappointment, impotence and anger as expressions of annoyance. However, our experiments showed that variations of parameters derived from intensity, combined with some spectral information and suprasegmental features, are very robust for each speaker and annoyance rate. The work also discussed the annotation problem that arises when dealing with human labelling of subjective events. We proposed an extended rating scale in order to include annotators' disagreements. Our frame classification results validated the chosen annotation procedure. Experimental results also showed that shifts in customer annoyance rates could potentially be tracked during phone calls.
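As an illustration of what "variations of parameters derived from intensity" could look like at the frame level, the following is a minimal sketch and not the authors' pipeline. It assumes a mono signal held in a NumPy array, and the 25 ms frame length, 10 ms hop and the function names `frame_intensity_db` and `intensity_deltas` are hypothetical choices made only for this example.

```python
import numpy as np


def frame_intensity_db(signal, sample_rate, frame_ms=25.0, hop_ms=10.0, eps=1e-10):
    """Return the RMS intensity (in dB) of each frame of a mono signal.

    Frame and hop sizes are illustrative assumptions, not values from the chapter.
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if len(signal) < frame_len:
        return np.empty(0)
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    intensity = np.empty(n_frames)
    for i in range(n_frames):
        start = i * hop_len
        frame = signal[start:start + frame_len]
        rms = np.sqrt(np.mean(frame ** 2))
        # eps avoids log(0) on silent frames
        intensity[i] = 20.0 * np.log10(rms + eps)
    return intensity


def intensity_deltas(intensity_db):
    """Frame-to-frame variation of intensity; the first delta is 0 by construction."""
    intensity_db = np.asarray(intensity_db, dtype=float)
    return np.diff(intensity_db, prepend=intensity_db[:1])
```

Per-frame values of this kind, together with their deltas, could serve as inputs to a frame classifier; the spectral and suprasegmental features mentioned above would have to be computed separately.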


Annoyance Ratings · Frame Classification · Customer Annoyance · Annotation Procedure · Call Centre Operators
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



This work has been partially funded by the Spanish Science Ministry under grant TIN2014-54288-C4-4-R and by the EU H2020 project EMPATHIC under grant No. 769872.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2019

Authors and Affiliations

  1. Speech Interactive Research Group, Universidad del País Vasco UPV/EHU, Leioa, Spain
