The Acoustic Front-End in Scenarios of Interaction Research

  • Rüdiger Hoffmann
  • Lutz-Michael Alisch
  • Uwe Altmann
  • Thomas Fehér
  • Rico Petrick
  • Sören Wittenberg
  • Rico Hermkes
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5042)


This paper addresses problems raised by the growing interest in social interaction research, insofar as they can be solved by engineers in acoustics and speech technology. First, the importance of nonverbal and paraverbal modalities is discussed for two prototypical scenarios: face-to-face interactions in psychotherapeutic consulting and side-by-side interactions of children cooperating in a computer game. Some signal processing challenges are then stated for both scenarios. The following acoustic signal processing technologies are discussed: (a) analysis of the influence of the room impulse response on the recognition rate, (b) an adaptive two-channel microphone, (c) localization and separation of sound sources in rooms, and (d) single-channel noise suppression.


Human-human interaction, multimodality, acoustic front-end, acoustic signal processing





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Rüdiger Hoffmann (1)
  • Lutz-Michael Alisch (2)
  • Uwe Altmann (2)
  • Thomas Fehér (1)
  • Rico Petrick (1)
  • Sören Wittenberg (1)
  • Rico Hermkes (2)

  1. Department of Electrical Engineering and Information Technology, Technische Universität Dresden, Dresden, Germany
  2. Faculty of Education, Technische Universität Dresden, Dresden, Germany
