Multimodal HCI: exploratory studies on effects of first impression and single modality ratings in retrospective evaluation


This paper examines the relevance of the first impression of interactive systems, formed from short, passive visual and auditory stimuli of system output. Four exploratory experiments study the individual consistency between such impressions and retrospective user ratings obtained directly after real interaction. All four systems allow voice input. Two systems are considered multimodal because they support input modalities in addition to speech (e.g., gesture), whereas the other two, which offer speech as the sole input modality, are multimedia systems. The first impression of each system is based on screenshots of typical display views and selected prompts of the system’s speech output. The measures used were pragmatic quality (i.e., functional aspects of a system, such as efficiency and effectiveness, that are closely related to the concept of usability) and hedonic qualities (i.e., the system’s non-instrumental aspects, such as its ability to provide stimulation and identification and thereby to evoke the psychological well-being of the user). We tested whether the consistency found for websites can also be found for speech-based systems. Here, this consistency was assessed within systems rather than between systems. Results indicate that users’ first impression of system output correlates with ratings collected after the interaction for each of the four systems. For the two truly multimodal systems, ratings after interaction with a single input modality (e.g., only voice, only touch screen) also correlate with ratings of a multimodal interaction with the same system. This result confirms data from the literature. However, our assumption of lower correlations for the first impression of pragmatic quality, expected due to its experience-based character, is not supported. Instead, pragmatic quality seems to represent a construct with low consistency in general. A reason might be that the benefit in pragmatic quality experienced during multimodal interaction is neither covered by unimodal interaction nor predictable from a first impression. Additional multiple regression analyses for the two systems with multiple input modalities show that the first impression of the visual system output can complement predictors from the single-modality interactions to model post-usage multimodal ratings. However, which of the output channels has a relevant impact was found to be highly system dependent.
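
The two kinds of analysis named in the abstract, correlation between rating sets and multiple regression on post-usage multimodal ratings, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the ratings are synthetic values for eight hypothetical participants, the variable names (`voice`, `touch`, `visual_first`, `multimodal`) are invented, and the target is built from known weights so the fit can be checked. It is not the authors' data, model, or analysis pipeline.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equally long rating lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(predictors, target):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    rows = [[1.0, *vals] for vals in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic ratings (assumption): one value per hypothetical participant.
voice = [3.0, 4.5, 2.0, 5.0, 3.5, 4.0, 2.5, 4.5]         # after voice-only use
touch = [4.0, 3.5, 2.5, 4.5, 5.0, 3.0, 2.0, 4.0]         # after touch-only use
visual_first = [3.5, 4.0, 3.0, 4.5, 3.0, 5.0, 2.0, 3.5]  # first impression, visual

# Target constructed from known weights (assumption) so the fit is verifiable.
multimodal = [0.5 + 0.4 * v + 0.3 * t + 0.2 * f
              for v, t, f in zip(voice, touch, visual_first)]

r_first = pearson(visual_first, multimodal)
coefs = ols([voice, touch, visual_first], multimodal)

print("r(first impression, multimodal):", round(r_first, 3))
print("intercept and weights:", [round(c, 3) for c in coefs])
```

Because the synthetic target is an exact linear combination of the three predictors, the regression recovers the weights used to build it. With real ratings, the residual variance and the significance of each predictor would need to be examined as well.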



  1. Open source toolkit for speech recognition. Version 4. Carnegie Mellon University.

  2. Optimsys s.r.o., Thinking Head version. Optimtalk voice browser.

  3. Nuance Communications.




We would like to thank the T-Labs project team, especially Uwe Hillmann and Thomas Buchholz, for making the smartphone app available and supporting the evaluation, and Elyce for additional proofreading. This work was partially supported financially by the DFG (German Research Foundation, Grant MO 1038/6-1), BMWi (German Federal Ministry for Economic Affairs and Energy, Grant 01MG13001G) and the EIT (ICT RIHA Multimodal Interfaces).

Author information



Corresponding author

Correspondence to Benjamin Weiss.


Cite this article

Weiss, B., Wechsung, I., Hillmann, S. et al. Multimodal HCI: exploratory studies on effects of first impression and single modality ratings in retrospective evaluation. J Multimodal User Interfaces 11, 115–131 (2017).


  • Multimodal dialog system
  • User factors
  • User experience
  • Evaluation
  • Quality aspects