Analysis of Speech Under Stress and Cognitive Load in USAR Operations

Conference paper


This paper presents ongoing work on the analysis of speech under stress and cognitive load in recordings of Urban Search and Rescue (USAR) training operations. During these operations, several team members communicate with other members in the field and with members at the control command over a single radio channel. The stress encountered in the USAR domain, and more specifically in human team communication, includes both physical or psychological stress and cognitive task load: physical stress arises from the real situation, and cognitive task load from the tele-operation of robots and equipment. We were able to annotate these two types of stress in the recordings and to identify their acoustic correlates. Traditional prosodic features and acoustic features extracted at the sub-band level proved to be robust in discriminating among the different types of stress and neutral data.
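
To make the idea of an acoustic feature "extracted at the sub-band level" concrete, the following is a minimal sketch of one such feature, a per-band spectral entropy. It is an illustration under stated assumptions, not the feature extraction used in this work: the function names `power_spectrum` and `subband_spectral_entropy`, the equal-width band split, and the naive DFT are all choices made here for the example.

```python
import cmath
import math


def power_spectrum(frame):
    """Power of DFT bins 0 .. N/2 - 1 of a real-valued frame.

    A naive O(N^2) DFT keeps the sketch dependency-free; a real
    pipeline would normally use an FFT on windowed frames.
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def subband_spectral_entropy(frame, n_bands):
    """Shannon entropy (in bits) of the power distribution in each sub-band.

    Assumption: the bands are contiguous and of equal width in DFT bins;
    leftover bins at the top of the spectrum are ignored.
    """
    power = power_spectrum(frame)
    width = len(power) // n_bands
    entropies = []
    for b in range(n_bands):
        band = power[b * width:(b + 1) * width]
        total = sum(band)
        if total <= 0.0:
            # A band with no energy has no defined distribution; report 0.
            entropies.append(0.0)
            continue
        # Normalise the band power so it can be read as a probability mass.
        probs = [p / total for p in band]
        entropies.append(-sum(p * math.log2(p) for p in probs if p > 0.0))
    return entropies


# A unit impulse has a flat spectrum, so every 8-bin band reaches the
# maximum entropy log2(8) = 3 bits.
print(subband_spectral_entropy([1.0] + [0.0] * 63, 4))  # [3.0, 3.0, 3.0, 3.0]
```

In this sketch a band dominated by a single spectral peak yields an entropy near zero, while a band with evenly spread energy approaches the maximum, which is why features of this kind can be informative about changes in spectral shape.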


Keywords: Cognitive Load, Unmanned Aerial Vehicle, Acoustic Feature, Spectral Entropy, Unmanned Ground Vehicle



The work reported in this paper has received funding from the EU-FP7 ICT 247870 NIFTi project. We would like to thank Holmer Hemsen for assistance with data annotation.



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. DFKI GmbH, Language Technology Lab, Berlin, Germany
  2. DFKI GmbH, Language Technology Lab, Saarbrücken, Germany
