
Optimal Control to Support High-Level User Goals in Human-Computer Interaction

  • Chapter
  • First Online:
Artificial Intelligence for Human Computer Interaction: A Modern Approach

Part of the book series: Human–Computer Interaction Series ((HCIS))


Abstract

With emerging technologies like robots, mixed-reality systems, or mobile devices, machine-provided capabilities are increasing, and so is the complexity of their control and display mechanisms. To address this tension, we propose optimal control as a framework to support users in achieving their high-level goals in human-computer tasks. We reason that it will improve user support over common approaches for adaptive interfaces because its formalism implicitly captures the iterative nature of human-computer interaction. We conduct two case studies to test this hypothesis. First, we propose a model-predictive-control-based optimization scheme that supports end users in planning and executing robotic aerial videos. Second, we introduce a reinforcement-learning-based method that adapts mixed-reality augmentations based on users’ preferences or tasks, learned from their gaze interactions with a UI. Our results show that optimal control can support users’ high-level goals in human-computer tasks better than common approaches. Optimal control models human-computer interaction as a sequential decision problem, a representation that matches the nature of interaction and hence yields better predictability of user behavior than other methods. In addition, our work highlights that optimization-based and learning-based optimal control have complementary strengths with respect to interface adaptation.
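
To make this framing concrete, the underlying problem can be written in the standard finite-horizon optimal control form (a generic textbook formulation, not the chapter's specific objective):

\[
\min_{\mathbf{u}_{0},\dots ,\mathbf{u}_{N-1}} \; \sum_{i=0}^{N-1} c(\mathbf{x}_i,\mathbf{u}_i) + c_N(\mathbf{x}_N) \quad \text{s.t.} \quad \mathbf{x}_{i+1} = f(\mathbf{x}_i,\mathbf{u}_i),
\]

where \(\mathbf{x}_i\) is the system (or interaction) state, \(\mathbf{u}_i\) the control input, \(c\) a stage cost encoding the user's high-level goal, and \(f\) the system dynamics. Model predictive control re-solves this problem over a receding horizon at every step, whereas reinforcement learning maximizes the analogous expected cumulative reward when \(f\) is unknown and must be learned from interaction.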


Notes

  1.

    We refer the reader to [30] for details on the results and experimental design of both studies.

  2.

    This also prevents solutions with infinitely long trajectories, where adding time steps with \(\mathbf {u}_i\approx 0\) is free w.r.t. Eq. (10).

  3.

    These points can be seen in Fig. 3; they are the intersections of the blue dotted lines.

  4.

    For more details on experimental design and results, see [33].

  5.

    For a more detailed version of this section, we refer the interested reader to [28].

  6.

    Myopic policies only consider the attainable reward in the next state and neglect all other future states when selecting an action (see the brief illustration after these notes).
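
As a brief illustration in standard MDP notation (a generic formulation, not the chapter's specific reward model), a myopic policy selects actions as

\[
\pi_{\text{myopic}}(s) = \arg\max_{a} \; R(s,a),
\]

whereas a far-sighted policy also accounts for the discounted value of future states,

\[
\pi^{*}(s) = \arg\max_{a} \; \Big[ R(s,a) + \gamma \, \mathbb{E}_{s' \sim P(\cdot \mid s,a)}\big[ V^{*}(s') \big] \Big].
\]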

References

  1. Abbeel P, Dolgov D, Ng AY, Thrun S (2008) Apprenticeship learning for motion planning with application to parking lot navigation. In: IEEE international conference on intelligent robots and systems 2008. IROS ’08. IEEE, pp 1083–1090

    Google Scholar 

  2. Abbeel P, Ng AY (2004) Apprenticeship learning via inverse reinforcement learning. p 1

    Google Scholar 

  3. Athukorala K, Medlar A, Oulasvirta A, Jacucci G, Głowacka D (2016) Beyond relevance: adapting exploration/exploitation in information retrieval. Association for Computing Machinery, New York, NY, USA

    Google Scholar 

  4. Audronis T (2014) How to get cinematic drone shots

    Google Scholar 

  5. Aytar Y, Pfaff T, Budden D, Le Paine T, Wang Z, de Freitas N (2018) Playing hard exploration games by watching YouTube. In: Advances in neural information processing systems

    Google Scholar 

  6. Bailly G, Oulasvirta A, Kötzing T, Hoppe S (2013) MenuOptimizer: interactive optimization of menu systems. pp 331–342

    Google Scholar 

  7. Banovic N, Buzali T, Chevalier F, Mankoff J, Dey AK (2016) Modeling and understanding human routine behavior. In: Proceedings of the 2016 CHI conference on human factors in computing systems, CHI ’16. ACM, pp 248–260

    Google Scholar 

  8. Bemporad A, Morari M, Dua V, Pistikopoulos EN (2002) The explicit linear quadratic regulator for constrained systems. Automatica 38(1):3–20

    Google Scholar 

  9. Bertsekas DP, Tsitsiklis JN (1995) Neuro-dynamic programming: an overview, vol 1. IEEE, pp 560–564

    Google Scholar 

  10. Bronner S, Shippen J (2015) Biomechanical metrics of aesthetic perception in dance. Exp Brain Res 233(12):3565–3581

    Google Scholar 

  11. Chapanis A (1976) Engineering psychology. Rand McNally, Chicago

    Google Scholar 

  12. Chen M, Beutel A, Covington P, Jain S, Belletti F, Chi EH (2019) Top-K off-policy correction for a REINFORCE recommender system. In: Proceedings of the twelfth ACM international conference on web search and data mining, WSDM ’19. ACM, pp 456–464

    Google Scholar 

  13. Chen X, Bailly G, Brumby DP, Oulasvirta A, Howes A (2015) The emergence of interactive behavior: A model of rational menu search. In: Proceedings of the 33rd annual ACM conference on human factors in computing systems, CHI ’15, pp 4217-4226, New York, NY, USA. Association for Computing Machinery

    Google Scholar 

  14. Chen X, Starke SD, Baber C, Howes A (2017) A cognitive model of how people make decisions through interaction with visual displays. Association for Computing Machinery, New York, NY, USA

    Google Scholar 

  15. Cheng E (2016) Aerial photography and videography using drones, vol 1. Peachpit Press

    Google Scholar 

  16. Chipalkatty R, Droge G, Egerstedt MB (2013) Less is more: mixed-initiative model-predictive control with human inputs. IEEE Trans Rob 29(3):695–703

    Article  Google Scholar 

  17. Chipalkatty R, Egerstedt M (2010) Human-in-the-loop: Terminal constraint receding horizon control with human inputs. pp 2712–2717

    Google Scholar 

  18. Christiano PF, Leike J, Brown T, Martic M, Legg S, Amodei D (2017) Deep reinforcement learning from human preferences. In: Advances in neural information processing systems, pp 4299–4307

    Google Scholar 

  19. Clarke DW, Mohtadi C, Tuffs PS (1987) Generalized predictive control. Part I: The basic algorithm. Automatica 23(2):137–148

    Article  Google Scholar 

  20. Coates A, Abbeel P, Ng AY (2009) Apprenticeship learning for helicopter control. Commun ACM 52(7):97–105

    Article  Google Scholar 

  21. Cutler CR, Ramaker BL (1980) Dynamic matrix control - a computer control algorithm. In: Joint automatic control conference, vol 17, p 72

    Google Scholar 

  22. Dulac-Arnold G, Evans R, van Hasselt H, Sunehag P, Lillicrap T, Hunt J, Mann T, Weber T, Degris T, Coppin B (2015). Deep reinforcement learning in large discrete action spaces. arXiv:1512.07679

  23. Engbert R, Kliegl R (2003) Microsaccades uncover the orientation of covert attention. Vis Res 43(9):1035–1045

    Article  Google Scholar 

  24. Findlater L, Gajos KZ (2009) Design space and evaluation challenges of adaptive graphical user interfaces. AI Mag 30(4):68–68

    Google Scholar 

  25. Frans K, Ho J, Chen X, Abbeel P, Schulman J (2017) Meta learning shared hierarchies. arXiv:1710.09767

  26. Fritsch FN, Carlson RE (1980) Monotone piecewise cubic interpolation. SIAM J Numer Anal 17(2):238–246

    Article  MathSciNet  Google Scholar 

  27. Gašić M, Young S (2014) Gaussian processes for POMDP-based dialogue manager optimization. IEEE Trans Audio Speech Lang Process 22(1):28–40

    Article  Google Scholar 

  28. Gebhardt C, Hecox B, van Opheusden B, Wigdor D, Hillis J, Hilliges O, Benko H (2019) Learning cooperative personalized policies from gaze data. In: Proceedings of the 32nd annual ACM symposium on user interface software and technology, UIST ’19, New York, NY, US. ACM

    Google Scholar 

  29. Gebhardt C, Hepp B, Nägeli T, Stevšić S, Hilliges O (2016) Airways: optimization-based planning of quadrotor trajectories according to high-level user goals. In: ACM SIGCHI conference on human factors in computing systems, CHI ’16, New York, NY, USA. ACM

    Google Scholar 

  30. Gebhardt C, Hilliges O (2018) WYFIWYG: investigating effective user support in aerial videography. arXiv:1801.05972

  31. Gebhardt C, Hilliges O (2020) Optimizing for cinematographic quadrotor camera target framing. In: Submission to ACM SIGCHI

    Google Scholar 

  32. Gebhardt C, Oulasvirta A, Hilliges O (2020) Hierarchical Reinforcement Learning as a Model of Human Task Interleaving. arXiv:2001.02122

  33. Gebhardt C, Stevšić S, Hilliges O (2018) Optimizing for aesthetically pleasing quadrotor camera motion. ACM Trans Graph (Proc ACM SIGGRAPH) 37(4):90:1–90:11

    Google Scholar 

  34. Ghadirzadeh A, Bütepage J, Maki A, Kragic D, Björkman M (2016) A sensorimotor reinforcement learning framework for physical human-robot interaction. pp 2682–2688

    Google Scholar 

  35. Głowacka D, Ruotsalo T, Konyushkova K, Athukorala K, Kaski S, Jacucci G (2013) Directing exploratory search: reinforcement learning from user interactions with keywords. pp 117–128

    Google Scholar 

  36. Görges D (2017) Relations between model predictive control and reinforcement learning. IFAC-PapersOnLine 50(1):4920–4928

    Google Scholar 

  37. Grieder P, Borrelli F, Torrisi F, Morari M (2004) Computation of the constrained infinite time linear quadratic regulator. Automatica 40(4):701–708

    Google Scholar 

  38. Hadfield-Menell D, Russell SJ, Abbeel P, Dragan A (2016) Cooperative inverse reinforcement learning. In: Advances in neural information processing systems, pp 3909–3917

    Google Scholar 

  39. Hennessy J (2015) 13 powerful tips to improve your aerial cinematography

    Google Scholar 

  40. Ho B-J, Balaji B, Koseoglu M, Sandha S, Pei S, Srivastava M (2020) Quick question: Interrupting users for microtasks with reinforcement learning. arXiv:2007.09515

  41. Hogan N (1984) Adaptive control of mechanical impedance by coactivation of antagonist muscles. IEEE Trans Autom Control 29(8):681–690

    Google Scholar 

  42. Horvitz EJ, Breese JS, Heckerman D, Hovel D, Rommelse K (2013) The Lumière project: Bayesian user modeling for inferring the goals and needs of software users. arXiv:1301.7385

  43. Howes A, Chen X, Acharya A, Lewis RL (2018) Interaction as an emergent property of a partially observable Markov decision process. Computational interaction design. pp 287–310

    Google Scholar 

  44. Hu Z, Liang Y, Zhang J, Li Z, Liu Y (2018) Inference aided reinforcement learning for incentive mechanism design in crowdsourcing. In: Advances in neural information processing systems, NIPS ’18, pp 5508–5518

    Google Scholar 

  45. Hwangbo J, Lee J, Dosovitskiy A, Bellicoso D, Tsounis V, Koltun V, Hutter M (2019) Learning agile and dynamic motor skills for legged robots. Sci Robot 4(26)

    Google Scholar 

  46. Jameson A, Gajos KZ (2012) Systems that adapt to their users. In: The human-computer interaction handbook: fundamentals, evolving technologies and emerging applications. CRC Press, Boca Raton, FL

    Google Scholar 

  47. Johansen TA (2004) Approximate explicit receding horizon control of constrained nonlinear systems. Automatica 40(2):293–300

    Article  MathSciNet  Google Scholar 

  48. Jorgensen SJ, Campbell O, Llado T, Kim D, Ahn J, Sentis L (2017) Exploring model predictive control to generate optimal control policies for hri dynamical systems. arXiv:1701.03839

  49. Joubert N, Roberts M, Truong A, Berthouzoz F, Hanrahan P (2015) An interactive tool for designing quadrotor camera shots. ACM Trans Graph 34(6):238:1–238:11

    Google Scholar 

  50. Julier S, Lanzagorta M, Baillot Y, Rosenblum L, Feiner S, Hollerer T, Sestito S (2000) Information filtering for mobile augmented reality. In: Proceedings IEEE and ACM international symposium on augmented reality (ISAR 2000). IEEE, pp 3–11

    Google Scholar 

  51. Kartoun U, Stern H, Edan Y (2010) A human-robot collaborative reinforcement learning algorithm. J Intell Robot Syst 60(2):217–239

    Article  Google Scholar 

  52. Kirches C (2011) Fast numerical methods for mixed-integer nonlinear model-predictive control. Springer

    Google Scholar 

  53. Krishnan S, Garg A, Liaw R, Miller L, Pokorny FT, Goldberg K (2016) Hirl: hierarchical inverse reinforcement learning for long-horizon tasks with delayed rewards. arXiv:1604.06508

  54. Kushlev K, Proulx J, Dunn EW (2016) “Silence your phones”: smartphone notifications increase inattention and hyperactivity symptoms. pp 1011–1020

    Google Scholar 

  55. Lam D, Manzie C, Good MC (2013) Multi-axis model predictive contouring control. Int J Control 86(8):1410–1424

    Article  MathSciNet  Google Scholar 

  56. Langerak T, Zárate J, Vechev V, Lindlbauer D, Panozzo D, Hilliges O (2020) Optimal control for electromagnetic haptic guidance systems

    Google Scholar 

  57. Lee SJ, Popović Z (2010) Learning behavior styles with inverse reinforcement learning. In: ACM transactions on graphics (TOG), vol 29. ACM, p 122

    Google Scholar 

  58. Lee Y, Wampler K, Bernstein G, Popović J, Popović Z (2010) Motion fields for interactive character locomotion. In: ACM transactions on graphics (TOG), vol 29. ACM, p 138

    Google Scholar 

  59. Liebman E, Saar-Tsechansky M, Stone P (2015) Dj-mc: a reinforcement-learning agent for music playlist recommendation. In: Proceedings of the 2015 international conference on autonomous agents and multiagent systems, AAMAS ’15, pp 591–599

    Google Scholar 

  60. Lillicrap TP, Hunt JJ, Pritzel A, Heess N, Erez T, Tassa Y, Silver D, Wierstra D (2015) Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971

  61. Liniger A, Domahidi A, Morari M (2015) Optimization-based autonomous racing of 1:43 scale RC cars. Opt Control Appl Methods 36(5):628–647

    Article  MathSciNet  Google Scholar 

  62. Liu F, Tang R, Li X, Zhang W, Ye Y, Chen H, Guo H, Zhang Y (2018) Deep reinforcement learning based recommendation with explicit user-item interactions modeling. arXiv:1810.12027

  63. Lo W-Y, Zwicker M (2008) Real-time planning for parameterized human motion. In: Proceedings of the 2008 ACM SIGGRAPH/eurographics symposium on computer animation, SCA ’08, pp 29–38

    Google Scholar 

  64. Matejka J, Li W, Grossman T, Fitzmaurice G (2009) CommunityCommands: command recommendations for software applications. pp 193–202

    Google Scholar 

  65. McCann J, Pollard N (2007) Responsive characters from motion fragments. In: ACM transactions on graphics (TOG), vol 26. ACM, p 6

    Google Scholar 

  66. McRuer DT, Jex HR (1967) A review of quasi-linear pilot models

    Google Scholar 

  67. Michalska H, Mayne DQ (1993) Robust receding horizon control of constrained nonlinear systems. IEEE Trans Autom Control 38(11):1623–1633

    Google Scholar 

  68. Migge B, Kunz A (2010) User model for predictive calibration control on interactive screens. pp 32–37

    Google Scholar 

  69. Mitsunaga N, Smith C, Kanda T, Ishiguro H, Hagita N (2006) Robot behavior adaptation for human-robot interaction based on policy gradient reinforcement learning. J Robot Soc Jpn 24(7):820–829

    Article  Google Scholar 

  70. Modares H, Ranatunga I, Lewis FL, Popa DO (2015) Optimized assistive human-robot interaction using reinforcement learning. IEEE Trans Cybernet 46(3):655–667

    Article  Google Scholar 

  71. Müller J, Oulasvirta A, Murray-Smith R (2017) Control theoretic models of pointing. ACM Trans Comput-Hum Interact (TOCHI) 24(4):1–36

    Article  Google Scholar 

  72. Murray-Smith R (2018) Control theory, dynamics and continuous interaction

    Google Scholar 

  73. Nägeli T, Alonso-Mora J, Domahidi A, Rus D, Hilliges O (2017) Real-time motion planning for aerial videography with dynamic obstacle avoidance and viewpoint optimization. IEEE Robot Autom Lett PP(99):1–1

    Google Scholar 

  74. Nägeli T, Meier L, Domahidi A, Alonso-Mora J, Hilliges O (2017) Real-time planning for automated multi-view drone cinematography. ACM Trans Graph 36(4):132:1–132:10

    Google Scholar 

  75. Nescher T, Huang Y-Y, Kunz A (2014) Planning redirection techniques for optimal free walking experience using model predictive control. pp 111–118

    Google Scholar 

  76. Ng AY, Russell SJ (2000) Algorithms for inverse reinforcement learning. In: Proceedings of the seventeenth international conference on machine learning, ICML ’00, pp 663–670

    Google Scholar 

  77. Oliff H, Liu Y, Kumar M, Williams M, Ryan M (2020) Reinforcement learning for facilitating human-robot-interaction in manufacturing. J Manuf Syst 56:326–340

    Article  Google Scholar 

  78. Park S, Gebhardt C, Rädle R, Feit A, Vrzakova H, Dayama N, Yeo H-S, Klokmose C, Quigley A, Oulasvirta A, Hilliges O (2018) AdaM: adapting multi-user interfaces for collaborative environments in real-time. In: ACM SIGCHI conference on human factors in computing systems, CHI ’18, New York, NY, USA. ACM

    Google Scholar 

  79. Peng XB, Abbeel P, Levine S, van de Panne M (2018) DeepMimic: example-guided deep reinforcement learning of physics-based character skills. ACM Trans Graph 37(4)

    Google Scholar 

  80. Peng XB, Kanazawa A, Malik J, Abbeel P, Levine S (2018) SFV: reinforcement learning of physical skills from videos. ACM Trans Graph 37

    Google Scholar 

  81. Purves D, Fitzpatrick D, Katz LC, Lamantia AS, McNamara JO, Williams SM, Augustine GJ (2000) Neuroscience. Sinauer Associates

    Google Scholar 

  82. Richalet J, Rault A, Testud JL, Papon J (1978) Model predictive heuristic control: applications to industrial processes. Automatica 14(5):413–428

    Article  Google Scholar 

  83. Rahman SMM, Sadrfaridpour B, Wang Y (2015) Trust-based optimal subtask allocation and model predictive control for human-robot collaborative assembly in manufacturing, vol 57250. American Society of Mechanical Engineers, p V002T32A004

    Google Scholar 

  84. Rajeswaran A, Lowrey K, Todorov EV, Kakade SM (2017) Towards generalization and simplicity in continuous control. In: Advances in neural information processing systems, NIPS ’17, pp 6550–6561

    Google Scholar 

  85. Roberts M, Hanrahan P (2016) Generating dynamically feasible trajectories for quadrotor cameras. ACM Trans Graph 35(4):61:1–61:11

    Google Scholar 

  86. Safavi A, Zadeh MH (2017) Teaching the user by learning from the user: personalizing movement control in physical human-robot interaction. IEEE/CAA J Autom Sinica 4(4):704–713

    Article  Google Scholar 

  87. Sheridan TB, Ferrell WR (1974) Man-machine systems: information, control, and decision models of human performance. MIT Press

    Google Scholar 

  88. Silver D, Schrittwieser J, Simonyan K, Antonoglou I, Huang A, Guez A, Hubert T, Baker L, Lai M, Bolton A et al (2017) Mastering the game of go without human knowledge. Nature 550(7676):354–359

    Article  Google Scholar 

  89. Su P-H, Budzianowski P, Ultes S, Gasic M, Young S (2017) Sample-efficient actor-critic reinforcement learning with supervised data for dialogue management. arXiv:1707.00130

  90. Sutton RS, Barto AG, Williams RJ (1992) Reinforcement learning is direct adaptive optimal control. IEEE Control Syst Mag 12(2):19–22

    Article  Google Scholar 

  91. Rowan S, Kieran F, Owen C (2019) A reinforcement learning and synthetic data approach to mobile notification management. pp 155–164

    Google Scholar 

  92. Teramae T, Noda T, Morimoto J (2018) Emg-based model predictive control for physical human-robot interaction: application for assist-as-needed control. IEEE Robot Autom Lett 3(1):210–217

    Article  Google Scholar 

  93. Tjomsland J, Shafti A, Aldo Faisal A (2019) Human-robot collaboration via deep reinforcement learning of real-world interactions. arXiv:1912.01715

  94. Treuille A, Lee Y, Popović Z (2007) Near-optimal character animation with continuous control. ACM Trans Graph 26(3):7

    Article  Google Scholar 

  95. Watkins CJCH (1989) Learning from delayed rewards. PhD thesis, University of Cambridge

    Google Scholar 

  96. Wiener N (2019) Cybernetics or control and communication in the animal and the machine. MIT Press

    Google Scholar 


Author information


Corresponding author

Correspondence to Christoph Gebhardt.



Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Gebhardt, C., Hilliges, O. (2021). Optimal Control to Support High-Level User Goals in Human-Computer Interaction. In: Li, Y., Hilliges, O. (eds) Artificial Intelligence for Human Computer Interaction: A Modern Approach. Human–Computer Interaction Series. Springer, Cham. https://doi.org/10.1007/978-3-030-82681-9_2


  • DOI: https://doi.org/10.1007/978-3-030-82681-9_2

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-82680-2

  • Online ISBN: 978-3-030-82681-9

  • eBook Packages: Computer Science, Computer Science (R0)
