Part of the book series: Human–Computer Interaction Series (HCIS)

Abstract

Predicting human performance in interaction tasks allows designers and developers to understand the expected performance of a target interface without testing it with real users. In this chapter, we discuss how deep learning methods can aid human performance prediction in the context of HCI, using three case studies. In the first case study, we discuss deep models for goal-driven human visual search on arbitrary web pages. In the second, we show that deep learning models can successfully capture human learning effects from repetitive interaction with vertical menus. In the third, we describe how deep models can be combined with analytical understanding to capture high-level interaction strategies and low-level behaviors in touchscreen grid interfaces on mobile devices. Across these studies, we show that deep learning provides the capacity to model complex interaction behaviors that would be extremely difficult to capture with traditional heuristic-based models. Furthermore, we showcase different ways to analyze a learned deep model, both to improve its interpretability and to gain an understanding of human behavior that advances the science.

Arianna Yuan and Ken Pfeuffer conducted the work during an internship at Google Research.

Author information

Corresponding author

Correspondence to Yang Li.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Yuan, A., Pfeuffer, K., Li, Y. (2021). Human Performance Modeling with Deep Learning. In: Li, Y., Hilliges, O. (eds) Artificial Intelligence for Human Computer Interaction: A Modern Approach. Human–Computer Interaction Series. Springer, Cham. https://doi.org/10.1007/978-3-030-82681-9_1

  • DOI: https://doi.org/10.1007/978-3-030-82681-9_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-82680-2

  • Online ISBN: 978-3-030-82681-9

  • eBook Packages: Computer Science, Computer Science (R0)
