Intelligent Systems: Models and Applications

Volume 3 of the series Topics in Intelligent Engineering and Informatics pp 215-235

Attention and Emotion Based Adaption of Dialog Systems

  • Sebastian Hommel (email author), affiliated with Computer Science Institute, University of Applied Sciences Ruhr West
  • Ahmad Rabie, affiliated with Computer Science Institute, University of Applied Sciences Ruhr West
  • Uwe Handmann, affiliated with Computer Science Institute, University of Applied Sciences Ruhr West



This work describes methods for the individual adaption of a dialog system. First, an automatic, real-time capable visual estimation of user attention for face-to-face human-machine interaction is described. Furthermore, an emotion estimation is presented that combines a visual and an acoustic method. Both the attention estimation and the visual emotion estimation are based on Active Appearance Models (AAMs). For the attention estimation, Multilayer Perceptrons (MLPs) are used to map the Active Appearance Parameters (AAM parameters) onto the current head pose. Afterwards, the chronology of the head poses is classified as attention or inattention. In the visual emotion estimation, the AAM parameters are classified by a Support Vector Machine (SVM). The acoustic emotion estimation also uses an SVM to classify emotion-related audio signal features into the five basic emotions (neutral, happy, sad, anger, surprise). Afterwards, a Bayes network is used to combine the results of the visual and the acoustic estimation at the decision level. The visual attention estimation as well as the emotion estimation will be used in service robotics to allow a more natural and human-like dialog. Furthermore, the human head pose is interpreted very efficiently as head nodding or shaking by the use of adaptive statistical moments. The head movement of many people with dementia is restricted, so they often use only their eyes to look around. For that reason, this work examines a simple gaze estimation with the help of an ordinary webcam. Moreover, an appearance-based full-body user re-identification method is described, which allows fast re-identification of people over a short time span and thus an individual state estimation, with individual parameters, for several people in highly dynamic situations.
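
The decision-level combination of the visual and the acoustic emotion estimates can be illustrated with a minimal sketch. It is not the Bayes network used in this work: it assumes the simplest possible structure, in which both modalities are conditionally independent given the emotion and the prior over emotions is uniform. The function name `fuse_posteriors` and the example probabilities are hypothetical and chosen only for illustration.

```python
# Five basic emotions named in the abstract.
EMOTIONS = ["neutral", "happy", "sad", "anger", "surprise"]


def fuse_posteriors(visual, acoustic, prior=None):
    """Combine two per-class probability lists at the decision level.

    Assumption (not from the source): the modalities are conditionally
    independent given the emotion e, so
        P(e | v, a) is proportional to P(e | v) * P(e | a) / P(e).
    """
    n = len(EMOTIONS)
    if prior is None:
        prior = [1.0 / n] * n  # uniform prior, an assumption
    scores = [v * a / p for v, a, p in zip(visual, acoustic, prior)]
    total = sum(scores)
    if total == 0:
        # Both classifiers ruled out every class; fall back to the prior.
        return list(prior)
    return [s / total for s in scores]


# Hypothetical classifier outputs, e.g. calibrated SVM scores per emotion.
visual = [0.10, 0.60, 0.10, 0.10, 0.10]
acoustic = [0.20, 0.40, 0.10, 0.10, 0.20]

fused = fuse_posteriors(visual, acoustic)
best = EMOTIONS[max(range(len(fused)), key=fused.__getitem__)]
print(best, [round(p, 3) for p in fused])
```

In this sketch, agreement between the two modalities sharpens the fused estimate, while disagreement flattens it. A full Bayes network could additionally model the reliability of each modality.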


Multilayer Perceptron, Bayes network, active appearance model, support-vector-machine