Deep Learning-Based Hand Posture Recognition for Pen Interaction Enhancement

Part of the Human–Computer Interaction Series book series (HCIS)

Abstract

This chapter examines how digital pen interaction can be expanded by detecting the different postures that the hand forms, primarily while it grips the pen. Three systems using different types of sensors are considered: an EMG armband, the raw capacitive image of the touchscreen, and a pen-top fisheye camera. In each case, deep neural networks perform classification or regression to detect hand postures and gestures. Additional analyses demonstrate the benefit of deep learning over conventional machine-learning methods and explore how model accuracy is affected by the number of postures to be recognised, the choice of user-dependent versus user-independent models, and the amount of training data. Examples of posture-based pen interaction in applications are discussed, and a number of usability aspects identified in user evaluations are presented. The chapter concludes with perspectives on the recognition and design of posture-based pen interaction for future systems.

  • DOI: 10.1007/978-3-030-82681-9_7
  • Chapter length: 33 pages

Acknowledgements

We would like to acknowledge our co-authors on the three publications on which this article is based: Drini Cami, Brian Vogel, Richard G. Calland, Naoki Kimura and Riku Arakawa. While all evaluations of the neural networks presented in this article are new, we wish to recognise their contributions in the original publications.

Author information

Correspondence to Fabrice Matulic.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Matulic, F., Vogel, D. (2021). Deep Learning-Based Hand Posture Recognition for Pen Interaction Enhancement. In: Li, Y., Hilliges, O. (eds) Artificial Intelligence for Human Computer Interaction: A Modern Approach. Human–Computer Interaction Series. Springer, Cham. https://doi.org/10.1007/978-3-030-82681-9_7

  • DOI: https://doi.org/10.1007/978-3-030-82681-9_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-82680-2

  • Online ISBN: 978-3-030-82681-9

  • eBook Packages: Computer Science, Computer Science (R0)