
Run-time model based framework for automatic evaluation of multimodal interfaces

  • Original Paper
  • Journal on Multimodal User Interfaces

Abstract

Multimodal interfaces are expected to improve the input and output capabilities of increasingly sophisticated applications. Several approaches aim to describe multimodal interaction formally. However, they rarely treat it as a continuous flow of actions, preserving its dynamic nature and considering all modalities at the same level. This work proposes a model-based approach called Practice-oriented Analysis and Description of Multimodal Interaction (PALADIN) that describes sequential multimodal interaction while avoiding these problems. It arranges a set of parameters to quantify multimodal interaction as a whole, minimising the differences between modalities. Furthermore, interaction is described stepwise to preserve the dynamic nature of the dialogue process. PALADIN defines a common notation to describe interaction in different multimodal contexts, providing a framework to assess and compare the usability of systems. Our approach was integrated into four real applications to conduct two experiments with users. The experiments show the validity and effectiveness of the proposed model for analysing and evaluating multimodal interaction.


Notes

  1. Table 10 in Appendix B gives an overview of the 9 guidelines.

  2. Tables 13, 14, 15 and 16 describe the different interaction parameters, including the modalities to which they apply (Mod.), the interaction level at which they are collected (Int. Lev.) and the measurement method (Meas. meth.). Table 12 further describes these abbreviations and their values.

  3. A barge-in attempt occurs when the user intentionally addresses the system while it is still producing output: speaking, displaying information in a GUI, performing a gesture or sending information via another modality (a minimal detection sketch follows these notes).

  4. A facade is an object that provides a simplified interface to a larger body of code, such as a class library or a software framework (see the Java sketch after these notes).

  5. An open-source implementation of the Android HCI Extractor [36] can be downloaded from http://code.google.com/p/android-hci-extractor. More information related to this tool and its integration with the model and the framework described above can be found in [35].
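
The following minimal Java sketch illustrates how an interaction logger might flag the barge-in attempts defined in note 3. It is not taken from the paper's implementation; all class and method names are hypothetical.

    // Hypothetical sketch: a user input counts as a barge-in attempt if it
    // arrives while a system output turn (speech, GUI update, gesture, ...)
    // is still in progress in any modality.
    class BargeInDetector {

        private boolean systemTurnActive = false;

        // Called by the instrumentation when the system starts or finishes
        // producing output in any modality.
        void onSystemOutputStarted()  { systemTurnActive = true; }
        void onSystemOutputFinished() { systemTurnActive = false; }

        // Called for each user input; true means it is a barge-in attempt.
        boolean isBargeInAttempt() {
            return systemTurnActive;
        }
    }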
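
To illustrate the facade pattern from note 4, here is a minimal, hypothetical Java sketch; the subsystem classes are invented for illustration and are not part of the PALADIN framework.

    // Subsystem classes, each with its own detailed interface.
    class AudioCapture {
        void open() { /* acquire the microphone */ }
        byte[] read() { return new byte[0]; } // stub: no real audio
    }

    class SpeechRecognizer {
        String decode(byte[] audio) { return ""; } // stub: no real decoding
    }

    // The facade: a single, simple entry point that hides the subsystem
    // wiring from client code.
    class SpeechInputFacade {
        private final AudioCapture capture = new AudioCapture();
        private final SpeechRecognizer recognizer = new SpeechRecognizer();

        String listen() {
            capture.open();
            return recognizer.decode(capture.read());
        }
    }

A client then calls new SpeechInputFacade().listen() instead of wiring AudioCapture and SpeechRecognizer together itself, which is exactly the simplification the pattern provides.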

References

  1. Araki M, Kouzawa A, Tachibana K (2005) Proposal of a multimodal interaction description language for various interactive agents. IEICE Trans Inf Syst E88-D(11):2469–2476

  2. Balbo S, Coutaz J, Salber D (1993) Towards automatic evaluation of multimodal user interfaces. In: Proceedings of the 1st international conference on intelligent user interfaces, IUI ’93. ACM, New York, NY, USA, pp 201–208

  3. Balme L, Demeure A, Barralon N, Coutaz J, Calvary G (2004) Cameleon-rt: a software architecture reference model for distributed, migratable, and plastic user interfaces. In: Markopoulos P, Eggen B, Aarts EHL, Crowley JL (eds) EUSAI. Lecture Notes in Computer Science, vol 3295. Springer, Berlin, pp 291–302. http://dblp.uni-trier.de/db/conf/eusai/eusai2004.html#BalmeDBCC04

  4. Bayer S, Damianos LE, Kozierok R, Mokwa J (1999) The MITRE multi-modal logger: its use in evaluation of collaborative systems. ACM Comput Surv 31(2es):17

  5. Beringer N, Kartal U, Louka K, Schiel F, Türk U (2002) PROMISE—a procedure for multimodal interactive system evaluation. In: Proceedings of multimodal resources and multimodal systems evaluation workshop (LREC 2002), pp 77–80

  6. Bernsen NO, Dybkjær L (2009) Multimodal usability. Springer, Berlin

  7. Bourguet ML (2003) Designing and prototyping multimodal commands. In: Rauterberg M, Menozzi M, Wesson J (eds) INTERACT. IOS Press, Amsterdam. http://dblp.uni-trier.de/db/conf/interact/interact2003.html#Bourguet03

  8. Carey R, Bell G (1997) The annotated VRML 2.0 reference manual. Addison-Wesley, Boston

  9. Cohen PR, McGee DR (2004) Tangible multimodal interfaces for safety-critical applications. Commun ACM 47(1):41–46

  10. Coutaz J, Nigay L, Salber D, Blandford A, May J, Young RM (1995) Four easy pieces for assessing the usability of multimodal interaction: the CARE properties. In: Arnesen SA, Gilmore D (eds) Proceedings of INTERACT’95 conference. Chapman & Hall, London, pp 115–120

  11. Damianos LE, Drury J, Fanderclai T, Hirschman L, Kurtz J, Oshika B (2000) Evaluating multi-party multimodal systems. In: Proceedings of the second international conference on language resources and evaluation, vol 3. MIT Media Laboratory, pp 1361–1368

  12. Diefenbach S, Hassenzahl M (2011) Handbuch zur Fun-ni Toolbox [Manual for the Fun-ni toolbox]. Manual, Folkwang Universität der Künste. Retrieved 16 Oct 2013. http://fun-ni.org/wp-content/uploads/Diefenbach+Hassenzahl_2010_HandbuchFun-niToolbox.pdf

  13. ISO (2006) Ergonomics of human-system interaction. Part 110: Dialogue principles (ISO 9241-110:2006)

  14. Dumas B, Lalanne D, Ingold R (2010) Description languages for multimodal interaction: a set of guidelines and its illustration with SMUIML. J Multimodal User Interfaces 3:237–247

  15. Dybkjær L, Bernsen NO, Minker W (2004) Evaluation and usability of multimodal spoken language dialogue systems. Speech Commun 43:33–54

  16. Engelbrecht KP, Kruppa M, Möller S, Quade M (2008) MeMo Workbench for semi-automated usability testing. In: Proceedings of Interspeech 2008 incorporating SST 2008. International Speech Communication Association, Brisbane, Australia, pp 1662–1665

  17. Fraser NM (1997) Spoken dialogue system evaluation: a first framework for reporting results. In: EUROSPEECH-1997, pp 1907–1910

  18. Fraser NM, Gilbert G (1991) Simulating speech systems. Comput Speech Lang 5(1):81–99

  19. Göbel S, Hartmann F, Kadner K, Pohl C (2006) A device-independent multimodal mark-up language. In: Hochberger C, Liskowsky R (eds) INFORMATIK 2006. Informatik für Menschen, LNI, vol 94. Gesellschaft für Informatik, pp 170–177

  20. Gong XG, Engelbrecht KP (2013) The influence of user characteristics on the quality of judgment prediction models for tablet applications. In: 10th Berliner Werkstatt, pp 198–204

  21. GNU general public license. http://www.gnu.org/licenses/gpl.html

  22. Grice HP (1975) Logic and conversation. Syntax Semant 3:41–58

  23. Johnston M (2009) EMMA: extensible multimodal annotation markup language. W3C recommendation, W3C (2009) http://www.w3.org/TR/2009/REC-emma-20090210/

  24. Jöst M, Häußler J, Merdes M, Malaka R (2005) Multimodal interaction for pedestrians: an evaluation study. In: Amant RS, Riedl J, Jameson A (eds) Proceedings of the 10th international conference on intelligent user interfaces. ACM, New York, pp 59–66

  25. Jouault F, Allilaire F, Bézivin J, Kurtev I (2008) ATL: a model transformation tool. Sci Comput Program 72(1–2):31–39

  26. Kranstedt A, Kopp S, Wachsmuth I (2002) MURML: a multimodal utterance representation markup language for conversational agents. In: Proceedings of AAMAS02 workshop on embodied conversational agents—let’s specify and evaluate them

  27. Kühnel C, Weiss B, Möller S (2010) Parameters describing multimodal interaction—definitions and three usage scenarios. In: Kobayashi T, Hirose K, Nakamura S (eds) Proceedings of 11th annual conference on ISCA (Interspeech 2010). ISCA, Makuhari, pp 2014–2017

  28. Larson JA, Raggett D, Raman TV (2003) W3C multimodal interaction framework. W3C note, W3C (2003). http://www.w3.org/TR/2003/NOTE-mmi-framework-20030506/

  29. Leech G, Wilson A (1996) EAGLES: recommendations for the morphosyntactic annotation of corpora. http://www.ilc.cnr.it/EAGLES96/annotate/annotate.html

  30. Lemmelä S, Vetek A, Mäkelä K, Trendafilov D (2008) Designing and evaluating multimodal interaction for mobile contexts. In: Digalakis V, Potamianos A, Turk M, Pieraccini R, Ivanov Y (eds) Proceedings of the 10th international conference on multimodal interfaces. ACM, New York, pp 265–272

  31. Limbourg Q, Vanderdonckt J, Michotte B, Bouillon L, López-Jaquero V (2005) USIXML: a language supporting multi-path development of user interfaces. In: Bastide R, Palanque P, Roth J (eds) Engineering human computer interaction and interactive systems. Lecture Notes in Computer Science, vol 3425, chap. 12. Springer, Berlin, pp 134–135. doi:10.1007/11431879_12

  32. Malhotra A, Biron PV (2004) XML schema part 2: datatypes second edition. W3C recommendation, W3C. http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/

  33. Manca M, Paternò F (2010) Supporting multimodality in service-oriented model-based development environments. In: Bernhaupt R, Forbrig P, Gulliksen J, Lárusdóttir M (eds) HCSE. Lecture Notes in Computer Science, vol 6409. Springer, Berlin, pp 135–148. http://dblp.uni-trier.de/db/conf/hcse/hcse2010.html#MancaP10

  34. Martin JC, Kipp M (2002) Annotating and measuring multimodal behaviour—Tycoon metrics in the Anvil tool. In: LREC. European Language Resources Association. http://dblp.uni-trier.de/db/conf/lrec/lrec2002.html#MartinK02

  35. Mateo P (2012) Android HCI extractor and the MIM project: integration and usage tutorial. http://www.catedrasaes.org/wiki/MIM. Accessed 04 Nov 2013

  36. Mateo P, Hillmann S (2012) Android HCI Extractor. http://code.google.com/p/android-hci-extractor. Accessed 04 Nov 2013

  37. Mateo P, Hillmann S (2013) Instantiation framework for the PALADIN interaction model. https://github.com/pedromateo/paladin_instantiation. Accessed 04 Nov 2013

  38. Mateo P, Hillmann S (2013) PALADIN: a run-time model for automatic evaluation of multimodal interfaces. https://github.com/pedromateo/paladin. Accessed 04 Nov 2013

  39. Mateo Navarro PL, Martínez Pérez G, Sevilla Ruiz D (2014) A context-aware interaction model for the analysis of users' QoE in mobile environments. Int J Hum Comput Interact, Taylor & Francis (in press)

  40. Möller S (2005) Parameters describing the interaction with spoken dialogue systems. ITU-T Recommendation Supplement 24 to P-Series, International Telecommunication Union, Geneva, Switzerland. Based on ITU-T Contr. COM 12–17 (2009)

  41. Möller S (2005) Quality of telephone-based spoken dialogue systems. Springer, New York

  42. Möller S (2011) Parameters describing the interaction with multimodal dialogue systems. ITU-T Recommendation Supplement 25 to P-Series Rec., International Telecommunication Union, Geneva, Switzerland

  43. Nigay L, Coutaz J (1993) A design space for multimodal systems: concurrent processing and data fusion. In: Ashlund S, Mullet K, Henderson A, Hollnagel E, White TN (eds) Proceedings of INTERACT ’93 and CHI ’93 conference on human factors in computing systems. ACM, New York, pp 172–178

  44. Olmedo-Rodríguez H, Escudero-Mancebo D, Cardeñoso Payo V (2009) Evaluation proposal of a framework for the integration of multimodal interaction in 3D worlds. In: Proceedings of the 13th international conference on human–computer interaction. Part II: Novel Interaction Methods and Techniques. Springer, Berlin, pp 84–92

  45. Oshry M, Baggia P, Rehor K, Young M, Akolkar R, Yang X, Barnett J, Hosn R, Auburn R, Carter J, McGlashan S, Bodell M, Burnett DC (2009) Voice extensible markup language (VoiceXML) 3.0. W3C working draft, W3C. http://www.w3.org/TR/2009/WD-voicexml30-20091203/

  46. Oviatt S (1999) Ten myths of multimodal interaction. Commun ACM 42:74–81

  47. Oviatt S (2003) Advances in robust multimodal interface design. IEEE Comput Graph Appl 23:62–68

  48. Palanque PA, Schyn A (2003) A model-based approach for engineering multimodal interactive systems. In: Rauterberg M, Menozzi M, Wesson J (eds) INTERACT’03. IOS Press, Amsterdam, pp 543–550

  49. Paternò F, Santoro C, Spano LD (2009) MARIA: a universal, declarative, multiple abstraction-level language for service-oriented applications in ubiquitous environments. ACM Trans Comput Hum Interact 16(4):1–30. doi:10.1145/1614390.1614394

  50. Pelachaud C (2005) Multimodal expressive embodied conversational agents. In: Zhang H, Chua TS, Steinmetz R, Kankanhalli MS, Wilcox L (eds) ACM multimedia. ACM, New York, pp 683–689

  51. Perakakis M, Potamianos A (2007) The effect of input mode on inactivity and interaction times of multimodal systems. In: Massaro DW, Takeda K, Roy D, Potamianos A (eds) Proceedings of the 9th international conference on multimodal interfaces (ICMI 2007). ACM, New York, pp 102–109

  52. Perakakis M, Potamianos A (2008) Multimodal system evaluation using modality efficiency and synergy metrics. In: Proceedings of the 10th international conference on multimodal interfaces (ICMI’08). ACM, New York, pp 9–16

  53. Schatzmann J, Georgila K, Young S (2005) Quantitative evaluation of user simulation techniques for spoken dialogue systems. In: Dybkjær L, Minker W (eds) Proceedings of the 6th SIGdial workshop on discourse and dialogue. Special Interest Group on Discourse and Dialogue (SIGdial), Association for Computational Linguistics (ACL), pp 45–54

  54. Schatzmann J, Young S (2009) The hidden agenda user simulation model. IEEE Trans Audio Speech Lang Process 17(4):733–747

  55. Schmidt S, Engelbrecht KP, Schulz M, Meister M, Stubbe J, Töppel M, Möller S (2010) Identification of interactivity sequences in interactions with spoken dialog systems. In: Proceedings of the 3rd international workshop on perceptual quality of systems. Chair of Communication Acoustics, TU Dresden, pp 109–114

  56. Serrano M, Nigay L (2010) A wizard of Oz component-based approach for rapidly prototyping and testing input multimodal interfaces. J Multimodal User Interfaces 3(3):215–225. doi:10.1007/s12193-010-0042-4

  57. Serrano M, Nigay L, Demumieux R, Descos J, Losquin P (2006) Multimodal interaction on mobile phones: development and evaluation using ACICARE. In: Nieminen M, Röykkee M (eds) MobileHCI ’06: Proceedings of the 8th conference on human–computer interaction with mobile devices and services. ACM, New York, pp 129–136

  58. Sonntag D (2012) Collaborative multimodality. KI 26(2):161–168 http://dblp.uni-trier.de/db/journals/ki/ki26.html#Sonntag12

  59. Steinberg D, Budinsky F, Paternostro M, Merks E (2009) EMF: Eclipse Modeling Framework, 2nd edn. Addison-Wesley, Upper Saddle River

  60. Sturm J, Bakx I, Cranen B, Terken J, Wang F (2002) Usability evaluation of a Dutch multimodal system for train timetable information. In: Rodriguez MG, Araujo CS (eds) Proceedings of LREC 2002. 3rd international conference on language resources and evaluation, pp 255–261

  61. Sutcliffe A (2008) Multimedia user interface design, chap. 20. Lawrence Erlbaum Associates, New Jersey, pp 393–410

  62. Thompson HS, Maloney M, Beech D, Mendelsohn N (2004) XML schema part 1: Structures second edition. W3C recommendation, W3C. http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/

  63. Vanacken D, Boeck JD, Raymaekers C, Coninx K (2006) NIMMIT: a notation for modeling multimodal interaction techniques. In: Braz J, Jorge JA, Dias M, Marcos A (eds) GRAPP. INSTICC—Institute for Systems and Technologies of Information, Control and Communication, pp 224–231. http://dblp.uni-trier.de/db/conf/grapp/grapp2006.html#VanackenBRC06

  64. Walker M, Litman D, Kamm C, Abella A (1997) PARADISE: a framework for evaluating spoken dialogue agents. In: Proceedings of the 35th annual meeting of the association for computational linguistics, ACL 97, pp 262–270

  65. Wechsung I, Engelbrecht KP, Kühnel C, Möller S, Weiss B (2012) Measuring the quality of service and quality of experience of multimodal human–machine interaction. J Multimodal User Interfaces 6:73–85

Acknowledgments

This work has been supported by the Cátedra SAES (http://www.catedrasaes.org), a private initiative of the University of Murcia (http://www.um.es) and SAES (Sociedad Anónima de Electrónica Submarina) (http://www.electronica-submarina.com), as well as by the Telekom Innovation Laboratories (http://www.laboratories.telekom.com) within Technische Universität Berlin (http://www.tu-berlin.de).

Author information


Corresponding author

Correspondence to Pedro Luis Mateo Navarro.

Appendices

Appendix A: Translations

See Tables 6, 7, 8 and 9.

Table 6 Translations and meanings of German phrases in the Figs. 4a and 6a, b
Table 7 Translations and meanings of German phrases in Fig. 7a
Table 8 Translations and meanings of German phrases in Fig. 7b
Table 9 Translations and meanings of German phrases in Fig. 7c

Appendix B: Guidelines on features of multimodal description languages

See Table 10.

Table 10 Guidelines on potential features of multimodal interaction description languages as described in [14]

Appendix C: Parameters used in the model

The tables in this section give an overview of all parameters that are modified or newly introduced in PALADIN compared to ITU-T Suppl. 25 to P-Series Rec. [42]. Table 12 explains the abbreviations used in the subsequent tables. Furthermore, Table 11 provides an index listing each parameter (by its abbreviation) and the table or reference describing it.

See Tables 11, 12, 13, 14, 15 and 16.

Table 11 Index of parameters and the tables containing them
Table 12 Glossary of abbreviations used in Tables 13, 14, 15 and 16
Table 13 Dialogue and communication-related interaction parameters
Table 14 Modality-related interaction parameters
Table 15 Meta-communication-related interaction parameters
Table 16 Keyboard- and mouse-input-related interaction parameters

About this article

Cite this article

Mateo Navarro, P.L., Hillmann, S., Möller, S. et al. Run-time model based framework for automatic evaluation of multimodal interfaces. J Multimodal User Interfaces 8, 399–427 (2014). https://doi.org/10.1007/s12193-014-0170-3

